JavaScript Sorting Algorithm - Radix Sort and Bucket Sort
Comparison based sorting algorithms
We have discussed all popular comparison based sorting algorithms: insertion sort and selection sort, bubble sort and merge sort, Quicksort and Heap Sort.
All these algorithms are great under most of conditions, but their lower bound is O(nlogn), they can not do better.
If for some reason, we want to achieve better than nLogn, we have to find some other ways.
About why the lower bound of all comparison based sorting algorithms is nlogn ?
Why nlogn
For any array with n elements, it has n! possible orders. Since we are using comparison based, suppose we are building a decision tree for all these possible orders, and h is the height of the decision tree and also is how many number of comparisons we need to get down to the leaves. Then we should have:
2^h >= n! which means: the total number of leaves should be enough to cover all possible orders of our array.
Then we would get h >= nlogn for approximation.
Non-Comparison based sorting algorithms
So if we want to get over the nlogn, we have to ask help from some non-comparison based ways.
If you knew or heard about the sorting algorithms, you should know we have several famous non-comparison based sorting algorithms: counting sort, bucket sort and radix sort.
And lets talk about them today, one by one :)
Counting Sort
Imagine this situation:
You have a deck of playing cards in random order(without jokers), and you want to sort them into the ascending order from A to K. What you gonna do ? Possibly you will count the cards and group them into 13 groups from A to K, and then combine these groups from A to K.
This is a classical counting sort. Once we are sure how many groups we have in our array or we know that the elements in our array are coming from a distribution from a to b. Then we can group them into (b - a + 1) buckets and loop over the array, put elements into proper buckets and then combine them.
Here is the code:
1 | // counting sort |
Bucket Sort
The counting sort is amazing fast : O(n+k), k is the number of buckets we have. But it costs too much space, and if we don’t know the distribution of elements, it maybe have a lot of empty buckets which is a waste of space.
Bucket sort is an optimization of counting sort, instead of only assigning same elements into one bucket, it will put several elements into one bucket but make sure the it is ascending from the point of buckets which means: for i,j buckets, if i < j, we know any elements in i will smaller than any elements in j.
By doing this, we can divide the entire array into a lot of small subarrays, and now we can just use any comparison based sorting algorithm to sort the small arrays.
Same as counting sort, if we have a wonderful distribution of our elements, it would be O(n + klogb), k is the number of buckets and b is the number of elements in one bucket, to sort the entire array. But the worst case, all elements assigned into the same buckets, it will degrade to the comparison based sorting algorithm we use, but only when you choose a really bad method to group.
1 | // bucketSort |
Radix Sort
Counting sort and Bucket sort are great, but they are too space-consuming and sometimes they are even slower than comparison based ones. Like if we have a really sparse array coming from 0 to n^2, then counting sort would down to O(n^2), and also if we don’t know the distribution of all elements in the array, we might choose an unefficient way to do the hash part for bucket sort, we could still get O(n^2).
Radix is here to help us out of this trouble. The idea of Radix Sort is to do digit by digit sort starting from least significant digit to most significant digit. Radix sort uses counting sort as a subroutine to sort.
For example: we have: [101, 203, 5, 87, 76, 48], using radix sort:
[101,203,5,76,87,48] <- last digits
[101,203,5,48,76,87] <- second last digits
[5,48,76,87,101,203] <- the first digits
Using zero when the number doesn’t have this digit.
Now lets show the code:
1 | // helper function to get the last nth digit of a number |
The time complexity for radix sort is : O(d*(n+b)), d is the number of digits the array has, b is the buckets we have, normal it is 10 for base 10 system.
Cool ha :)
BTW
Since I combine radix sort with bucket sort and counting sort, so this is the last post about sorting algorithms. But for this serie, I think I will have another post talking about when we should use which algorithm.
See ya.
Oh, and also, I combine all these codes together and create a gist for it: Soting Algorithms in JS.