Counting Sort

June 1, 2008

The counting sort algorithm is a type of sort algorithm that can be used to sort collections of values whose range is known. By that, we mean that the algorithm is based on the assumption that we know all possible values of the sort key beforehand. That’s not as terrible as it sounds – it may be as simple as knowing that our sort key will be a byte value, and hence be limited to the range of numbers between 0 and 255, for example.

Unlike the insertion sort, the counting sort is not based on the comparison of one sort key against another. Instead, when we start to sort using this algorithm, we first create an index of all possible values of the key. If we are going to sort a collection of bytes for instance, we’d create an array with 256 positions. This index is used to store the number of items in the collection we are sorting, that is smaller than or equal to the given key.

All this looks hideously complicated when you put it down in words, but it’s really rather simple, Let’s assume we have the following array of bytes: {9, 8, 4, 6, 3, 2, 1, 0, 4}

We first create the index:

[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ] … [255]
[ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ] … [ 0 ]

Next, we iterate through the input collection, and add 1 to the value of the key matching the value we get from the input array:

input[0] is 9, so:
[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ] … [255]
[ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 1 ] … [ 0 ]

input[1] is 8:
[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ] … [255]
[ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 0 ][ 1 ][ 1 ] … [ 0 ]

input[2] is 4:
[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ] … [255]
[ 0 ][ 0 ][ 0 ][ 0 ][ 1 ][ 0 ][ 0 ][ 0 ][ 1 ][ 1 ] … [ 0 ]

.. and so on, until after input 8 we have …

input[8] is 4:
[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ] … [255]
[ 1 ][ 1 ][ 1 ][ 1 ][ 2 ][ 0 ][ 1 ][ 0 ][ 1 ][ 1 ] … [ 0 ]

Right now this tells us what how many instances of each key we have in our input array. What we really want, though, is to have a count of how many elements are smaller than or equal to a given key, so what we do next is, starting from the SECOND element in our index, we add the number of items from the previous key:

index[1] = index[1] + index[0]
[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ] … [255]
[ 1 ][ 2 ][ 1 ][ 1 ][ 2 ][ 0 ][ 1 ][ 0 ][ 1 ][ 1 ] … [ 0 ]

index[2] = index[2] + index[1]
[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ] … [255]
[ 1 ][ 2 ][ 3 ][ 1 ][ 2 ][ 0 ][ 1 ][ 0 ][ 1 ][ 1 ] … [ 0 ]

index[3] = index[3] + index[2]
[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ] … [255]
[ 1 ][ 2 ][ 3 ][ 4 ][ 2 ][ 0 ][ 1 ][ 0 ][ 1 ][ 1 ] … [ 0 ]

index[4] = index[4] + index[3]
[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ] … [255]
[ 1 ][ 2 ][ 3 ][ 4 ][ 6 ][ 0 ][ 1 ][ 0 ][ 1 ][ 1 ] … [ 0 ]

… until, a long time later …

index[255] = index[255] + index[254]
[ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ][ 6 ][ 7 ][ 8 ][ 9 ] … [255]
[ 1 ][ 2 ][ 3 ][ 4 ][ 6 ][ 6 ][ 7 ][ 7 ][ 8 ][ 9 ] … [ 9 ]

Which tells us that in the array there are 7 values less than or equal to 7, 6 values less than or equal to 4, and so on. We can use this information to create our output array, and reserve space for our keys. If we know that there are 3 values that are less than or equal to 2, for example, we know that, if we place, say, 3 in our output array, we must leave space for those 3 values.

Now we can iterate through our input array backwards, and use the index to see where it fits in the output array. Every time we do this, we subtract 1 from its count in the index array – we’ve already handled one instance of this key, so that’s one less to worry about.

Now, the bit about iterating backwards can sound a bit weird – why backwards? Well, to make sure we have enough space, counting sort drops the value into the output array at the last available space reserved for it. If we were to iterate forward, the first instance of the key would be placed *after* the last instance of the same key. Consider the following scenario:

Input collection {{Name=”Bob”, Age=”10″}, {Name=”Emilyn”,Age=”3″}, {Name=”Tim”,Age=”10″}}

If we sort by age and iterate forward, we get back {{Name=”Emilyn”,Age=”3″}, {Name=”Tim”,Age=”10″}, {Name=”Bob”, Age=”10″}}
If instead we iterate backwards, we get {{Name=”Emilyn”,Age=”3″}, {Name=”Bob”,Age=”10″}, {Name=”Tim”, Age=”10″}}

Since some uses of counting sort assume that items with equal key values will be returned in the same order as they were given in, we should play nice and satisfy that assumption. When a sort algorithm behaves in this way (equal value items returned in same order as given), that algorithm is said to be stable.

As we said, every time we drop a value in its place inside the output collection, we decrease the value of its count in the index by 1. This way, the next time we encounter the same value, the index will be pointing to it’s new home in the output collection, just next to its friend.

Now all that may look terribly wordy – and it is – but the code itself is pretty nice and sweet:

   1: public byte[] Sort(byte[] input)
   2: {
   3:     int[] index = new int[byte.MaxValue + 1];
   4:
   5:     for (int j = 0; j < input.Length; j++)
   6:      index[input[j]]++;
   7:
   8:     for (int i = 1; i < index.Length; i++)
   9:      index[i] = index[i] + index[i - 1];
  10:
  11:     byte[] output = new byte[input.Length];
  12:
  13:     for (int i = input.Length - 1; i >= 0; i--)
  14:     {
  15:      output[index[input[i]] -1] = input[i];
  16:      index[input[i]]--;
  17:     }
  18:
  19:     return output;
  20: }

As you can see, unlike comparison sort, this algorithm will always iterate through the input array twice, and will always iterate the index once – for a given size of input array, we will always have the same number of operations being performed. This means that, barring external factors, a search on an array of a given size will always take roughly the same amount of time.

We can also see that we’re creating two arrays here. This method will be consuming 256 bytes, plus the size of the input array in bytes when it creates the output array. This also means that we can know how much memory the method will consume beforehand, if we need to know that.

kick it on DotNetKicks.com

Advertisements

Insertion Sort

May 27, 2008

Insertion sort is one of the many algorithms that we will cover in this blog. I choose this algorithm to start with because I think that this is a fairly easy to understand (better start with an easy one 🙂 )

To understand the insertion sort imagine that you have a deck of cards that is not sorted. You put this deck of cards on the table and start picking a card one by one; each time putting the card in your left hand. Each time you pick a card you must compare that card with the previous cards until you find a card that is greater than the one you have. By doing so you just sorted the deck of cards on your left hand.

Insertion sort is quite similar. Basically you have a list and you start looping linearly through that list, each time checking with the previous items if the selected item (the key) is smaller than the items before it. If so move that item forward. If the previous item is smaller then you stop checking.

The below figures should make this more clear. (I will refer to the item being sorted as the key.

insertion1 (FIG 1) Start from the element “2”. Check it with the previous element “5”. Since “5” > “2”, move “5” forward (Result shown in Fig 2).

insertion2 (FIG 2) Select the next element “4”. Check it with the previous element “5”. Since “5” > “4”, move “5” forward. Check “4” with “2”. Since “2” < “4” then “2” should not move and thus we stop checking with previous values. Move “4” to the proper index (Result shown in Fig 3)

insertion3 (FIG 3)Select the next element “6”. Check it with the previous element “5”. Since “5” < “6”, “5” should not move and thus we stop checking with previous values (Result shown in Fig 4)

insertion3 (FIG 4)Select the next element “1”. Check it with the previous element “6”. Since 6 > 1 move 6 forward. Check the key(“1”) with the other item “5” since “5” > “1”, move “5” forward. continue the same process with all other elements (Result shown in Fig 5)

insertion4 (FIG 5)

As you can see from the figures above the insertion sort is very simple. Lets now try it out in C#

   1: //unsorted array
   2: int[] list = new int[] { 5, 2, 4, 6, 1 };
   3:
   4: // the key element being sorted
   5: int key;
   6:
   7: //start looping starting from the second element
   8: for (int i = 1; i < list.Length; i++)
   9: {
  10:     key = list[i];//store the key
  11:     int j = i - 1;//get the previous index
  12:
  13:     //loop until you meet a smaller number or 0
  14:     while (j >= 0 && list[j] > key)
  15:     {
  16:         //move the greater number forward
  17:         list[j + 1] = list[j];
  18:         //decrement the counter so that we get the previous element
  19:         j--;
  20:     }
  21:     //set the key in the proper index
  22:     list[j + 1] = key;
  23: }

Conclusion

The insertion sort is a fairly efficient algorithm for sorting small arrays. As you could see in the above explanation you must compare the key with all other elements that are in front of the key (This could suffer performance with large arrays)

The best case scenario for this algorithm would be if the elements in the array are already sorted. This would give us the best performance because there would be no need to compare the key with all the other element in front of it.

The worse case scenario would be if the array is in reverse order. This would force us check each element every time.

I hope I gave a good explanation of this algorithm. Feel free to leave comments/suggestions….

P.S This is my first algorithms article so please be kind; I promise I’ll try to improve 😀

kick it on DotNetKicks.com


Hello people

May 26, 2008

So this is my new blog…. Here I will publish post on how to implement algorithms in C#… I am no algo specialist, in fact I have no idea what these monsters are… Yet I am willing to learn by blogging 😀

So the idea of this blog is that while I am learning the art of Algorithms, i will blog on each one I find so that I share my learnings with you and perhaps even learn from mistakes I do in the long run….. I am sure that the readers will teach me more than I am trying to 😀

Looking forward for this…..