"""
This is a pure Python implementation of the radix sort algorithm
Source: https://en.wikipedia.org/wiki/Radix_sort
"""
from __future__ import annotations

RADIX = 10


def radix_sort(list_of_ints: list[int]) -> list[int]:
    """
    Sort a list of non-negative integers in place and return it.

    Examples:
    >>> radix_sort([0, 5, 3, 2, 2])
    [0, 2, 2, 3, 5]
    >>> radix_sort(list(range(15))) == sorted(range(15))
    True
    >>> radix_sort(list(range(14,-1,-1))) == sorted(range(15))
    True
    >>> radix_sort([1,100,10,1000]) == sorted([1,100,10,1000])
    True
    """
    # max() raises ValueError on an empty list, so return early
    if not list_of_ints:
        return list_of_ints
    placement = 1
    max_digit = max(list_of_ints)
    while placement <= max_digit:
        # declare and initialize empty buckets
        buckets: list[list[int]] = [[] for _ in range(RADIX)]
        # split list_of_ints between the buckets
        for i in list_of_ints:
            # integer division avoids float rounding errors on large values
            tmp = (i // placement) % RADIX
            buckets[tmp].append(i)
        # put each bucket's contents back into list_of_ints
        a = 0
        for b in range(RADIX):
            for i in buckets[b]:
                list_of_ints[a] = i
                a += 1
        # move to the next digit
        placement *= RADIX
    return list_of_ints


if __name__ == "__main__":
    import doctest

    doctest.testmod()
The lower bound for comparison-based sorting algorithms (Merge Sort, Heap Sort, Quick Sort, etc.) is Ω(n log n), i.e., they cannot do better than n log n comparisons in the worst case.
Counting sort is a linear-time sorting algorithm that sorts in O(n + k) time when the elements are in the range from 1 to k.
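As a rough illustration of the O(n + k) bound, here is a minimal counting sort sketch. The function name `counting_sort` and its `k` parameter are chosen for this example only, and the sketch assumes every element lies in the range 1 to k. It makes one pass over the n elements and one pass over the k possible values.

```python
def counting_sort(values: list[int], k: int) -> list[int]:
    """Sort integers assumed to lie in the range 1..k in O(n + k) time."""
    counts = [0] * (k + 1)  # counts[v] = number of occurrences of v
    for v in values:  # one pass over the n elements
        counts[v] += 1
    result: list[int] = []
    for v in range(1, k + 1):  # one pass over the k possible values
        result.extend([v] * counts[v])
    return result


print(counting_sort([3, 1, 4, 1, 5, 2], k=5))  # [1, 1, 2, 3, 4, 5]
```

This simple variant rebuilds the output from the counts alone, which is enough for bare integers. It does not keep track of which original element is which, so it is not the stable form that radix sort relies on.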
What if the elements are in the range from 1 to n^2? We can’t use counting sort, because it would take O(n^2) time, which is worse than comparison-based sorting algorithms. Can we sort such an array in linear time?
Radix Sort is the answer. The idea of Radix Sort is to sort digit by digit, starting from the least significant digit and ending with the most significant digit. Radix Sort uses counting sort as a subroutine to sort each digit.
Do the following for each digit i, where i varies from the least significant digit to the most significant digit: sort the input array using counting sort (or any stable sort) according to the i’th digit.
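The procedure described above can be sketched with a stable counting sort keyed on a single digit. This is one possible arrangement, not the only one: the names `counting_sort_by_digit` and `radix_sort_lsd` are made up for this example, and the sketch assumes non-negative integers.

```python
def counting_sort_by_digit(values: list[int], place: int, base: int = 10) -> list[int]:
    """Stable counting sort of non-negative ints, keyed on the digit at `place`."""
    counts = [0] * base
    for v in values:
        counts[(v // place) % base] += 1
    # prefix sums: counts[d] becomes the index just past the last slot for digit d
    for d in range(1, base):
        counts[d] += counts[d - 1]
    output = [0] * len(values)
    # walk backwards so elements with equal digits keep their relative order
    for v in reversed(values):
        d = (v // place) % base
        counts[d] -= 1
        output[counts[d]] = v
    return output


def radix_sort_lsd(values: list[int], base: int = 10) -> list[int]:
    """Return a sorted copy of a list of non-negative ints."""
    if not values:
        return []
    result = list(values)
    largest = max(result)
    place = 1
    while place <= largest:  # one stable pass per digit, least significant first
        result = counting_sort_by_digit(result, place, base)
        place *= base
    return result


print(radix_sort_lsd([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

Stability matters here: each pass must preserve the order established by the earlier, less significant digits, which is why the output array is filled while walking the input backwards.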
Example:
Original, unsorted list:
170, 45, 75, 90, 802, 24, 2, 66
Sorting by least significant digit (1s place) gives:
170, 90, 802, 2, 24, 45, 75, 66
[*Notice that we keep 802 before 2, because 802 occurred before 2 in the original list, and similarly for pairs 170 & 90 and 45 & 75.]
Sorting by next digit (10s place) gives:
802, 2, 24, 45, 66, 170, 75, 90
[*Notice that 802 again comes before 2 as 802 comes before 2 in the previous list.]
Sorting by the most significant digit (100s place) gives:
2, 24, 45, 66, 75, 90, 170, 802
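To reproduce the example pass by pass, the following sketch yields the list after each digit is processed. The generator name `radix_passes` is hypothetical, and it uses simple per-digit buckets (a stable distribution step) on non-negative integers.

```python
def radix_passes(values: list[int], base: int = 10):
    """Yield (place, list) after each stable pass over non-negative ints."""
    current = list(values)
    place = 1
    while current and place <= max(current):
        buckets: list[list[int]] = [[] for _ in range(base)]
        for v in current:
            # appending preserves the current order within each bucket
            buckets[(v // place) % base].append(v)
        current = [v for bucket in buckets for v in bucket]
        yield place, current
        place *= base


for place, snapshot in radix_passes([170, 45, 75, 90, 802, 24, 2, 66]):
    print(f"{place}s place: {snapshot}")
# 1s place: [170, 90, 802, 2, 24, 45, 75, 66]
# 10s place: [802, 2, 24, 45, 66, 170, 75, 90]
# 100s place: [2, 24, 45, 66, 75, 90, 170, 802]
```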
Let there be d digits in the input integers. Radix Sort takes O(d * (n + b)) time, where b is the base for representing numbers; for example, for the decimal system, b is 10.
What is the value of d? If k is the maximum possible value, then d is O(log_b(k)), so the overall time complexity is O((n + b) * log_b(k)). For a large k, this looks worse than the time complexity of comparison-based sorting algorithms. Let us first limit k: let k <= n^c, where c is a constant. In that case, the complexity becomes O(n log_b(n)) for a fixed base b, which still doesn’t beat comparison-based sorting algorithms. If we choose b = n, however, then log_b(k) <= c is a constant and the complexity becomes O(n).
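The effect of the base on the number of passes d can be checked numerically. The helper `digit_count` below is a hypothetical name for this sketch; it counts how many base-b digits are needed for the largest key. With keys below n^c and c = 2, a fixed base such as 10 needs a number of passes that grows with log n, while base n needs only c passes.

```python
def digit_count(k: int, base: int) -> int:
    """Number of base-`base` digits needed to write k (k >= 0, base >= 2)."""
    digits = 1
    while k >= base:
        k //= base
        digits += 1
    return digits


n = 1_000
k = n**2 - 1  # largest key below n^c, with c = 2 (assumed for illustration)

print(digit_count(k, 10))  # 6 passes in base 10
print(digit_count(k, n))   # 2 passes in base n
```

Each pass costs O(n + b), so with b = n and a constant number of passes the total stays O(n), at the price of n buckets per pass.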
If we have log2(n) bits for every digit, the running time of Radix Sort appears to be better than that of Quick Sort for a wide range of input numbers. However, the constant factors hidden in the asymptotic notation are higher for Radix Sort, and Quick Sort uses hardware caches more effectively. Also, Radix Sort uses counting sort as a subroutine, and counting sort takes extra space to sort numbers.
Video reference: https://youtu.be/nu4gDuFabIM