Misplaced Pages

Internal sort

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
This article does not cite any sources. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.
Find sources: "Internal sort" – news · newspapers · books · scholar · JSTOR (December 2022) (Learn how and when to remove this message)

An internal sort is any data sorting process that takes place entirely within the main memory of a computer. This is possible whenever the data to be sorted is small enough to all be held in the main memory. like a hard-disk. Any reading or writing of data to and from this slower media can slow the sortation process considerably. This issue has implications for different sort algorithms.

Some common internal sorting algorithms include:

  1. Bubble Sort
  2. Insertion Sort
  3. Quick Sort
  4. Heap Sort
  5. Radix Sort
  6. Selection sort

Consider a Bubblesort, where adjacent records are swapped in order to get them into the right order, so that records appear to “bubble” up and down through the dataspace. If this has to be done in chunks, then when we have sorted all the records in chunk 1, we move on to chunk 2, but we find that some of the records in chunk 1 need to “bubble through” chunk 2, and vice versa (i.e., there are records in chunk 2 that belong in chunk 1, and records in chunk 1 that belong in chunk 2 or later chunks). This will cause the chunks to be read and written back to disk many times as records cross over the boundaries between them, resulting in a considerable degradation of performance. If the data can all be held in memory as one large chunk, then this performance hit is avoided.

On the other hand, some algorithms handle external sorting rather better. A Merge sort breaks the data up into chunks, sorts the chunks by some other algorithm (maybe bubblesort or Quick sort) and then recombines the chunks two by two so that each recombined chunk is in order. This approach minimises the number or reads and writes of data-chunks from disk, and is a popular external sort method.

Category: