count sort

http://en.wikipedia.org/wiki/Counting_sort

In summary, the algorithm loops over the items, computing a histogram of the number of times each key occurs within the input collection. It then performs a prefix sum computation (a second loop, over the range of possible keys) to determine, for each key, the starting position in the output array of the items having that key. Finally, it loops over the items again, moving each item into its sorted position in the output array.[1][2][3]

In pseudocode, this may be expressed as follows:

''' calculate histogram: '''
# k is the right end of keys' range 
# allocate an array Count[0..k] ; THEN
# initialize each array cell to zero ; THEN
for each input item x:
    increment Count[key(x)]

''' calculate starting index for each key: '''
total = 0
for i = 0, 1, ... k:
    oldCount = Count[i]
    Count[i] = total
    total = total + oldCount

''' copy inputs into output array in order: '''
# allocate an output array Output[0..n-1] ; THEN
for each input item x:
    Output[Count[key(x)]] = x
    increment Count[key(x)]
return Output

After the first for loop, Count[i] stores the number of items with key equal to i. After the second for loop, it instead stores the number of items with key less than i, which is the same as the first index at which an item with key i should be stored in the output array. Throughout the third loop, Count[i] always stores the next position in the output array into which an item with key i should be stored, so each item is moved into its correct position in the output array.[1][2][3] The relative order of items with equal keys is preserved here; i.e., this is a stable sort.

Analysis

Because the algorithm uses only simple for loops, without recursion or subroutine calls, it is straightforward to analyze. The initialization of the Count array, and the second for loop which performs a prefix sum on the count array, each iterate at most k + 1 times and therefore take O(k) time. The other two for loops, and the initialization of the output array, each take O(n) time. Therefore the time for the whole algorithm is the sum of the times for these steps, O(n + k).[1][2]

Because it uses arrays of length k + 1 and n, the total space usage of the algorithm is also O(n + k).[1] For problem instances in which the maximum key value is significantly smaller than the number of items, counting sort can be highly space-efficient, as the only storage it uses other than its input and output arrays is the Count array which uses space O(k).[5]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s