Bucket sort is an efficient sorting algorithm that uses buckets to sort a collection of items. It is one of the most efficient sorting algorithms, with an average time complexity of O(n) and a worst-case time complexity of O(n^2). Bucket sort is often used in data structures and algorithms with Python due to its efficient performance and ease of implementation.
In this article, we will cover the basics of bucket sort, including how the algorithm works, its time and space complexity, and a Python code example to illustrate the algorithm. We will also provide some coding exercises at the end of the article to test the reader’s understanding of bucket sort.
What is Bucket Sort?
Bucket sort is an efficient sorting algorithm that works by dividing a collection of items into buckets, then sorting each bucket independently. It is a comparison-based sorting algorithm, meaning it uses comparisons between elements to determine the order of the elements.
Bucket sort is often used to sort a collection of items with a small range of values, such as integers between 0 and 10. The algorithm works by first creating a number of buckets, then distributing the items into the buckets based on the item’s value. Once the items are distributed into the buckets, the buckets are sorted using a sorting algorithm of choice (typically insertion sort or selection sort). Finally, the sorted buckets are combined into a single sorted collection.
How Does Bucket Sort Work?
Bucket sort works by first dividing a collection of items into buckets, then sorting each bucket independently. The algorithm works as follows:
- Create a number of buckets.
- Distribute the items into the buckets based on the item’s value.
- Sort each bucket using a sorting algorithm of choice (typically insertion sort or selection sort).
- Combine the sorted buckets into a single sorted collection.
Let’s look at an example to illustrate how bucket sort works. Suppose we have the following collection of integers:
[7, 3, 5, 2, 1, 9, 4, 8, 6]
We can divide the collection into 8 buckets, with each bucket containing a range of values. For example, the first bucket could contain all integers from 0 to 1, the second from 2 to 3, and so on. The buckets would look like this:
Bucket 1 [0, 1]
Bucket 2 [2, 3]
Bucket 3 [4, 5]
Bucket 4 [6, 7]
Bucket 5 [8, 9]
Once the buckets are created, we can distribute the items into the buckets based on their value. For example, the number 7 would go into the fourth bucket, 3 into the second bucket, and so on. The buckets would now look like this:
Bucket 1 [1]
Bucket 2 [2, 3]
Bucket 3 [4, 5]
Bucket 4 [6, 7]
Bucket 5 [8, 9]
Next, we can sort each bucket using a sorting algorithm of choice. For this example, we will use insertion sort to sort each bucket. After sorting, the buckets would look like this:
Bucket 1 [1]
Bucket 2 [2, 3]
Bucket 3 [4, 5]
Bucket 4 [6, 7]
Bucket 5 [8, 9]
Finally, we can combine the sorted buckets into a single sorted collection. The resulting collection would look like this:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Time Complexity of Bucket Sort
The time complexity of bucket sort is dependent on the sorting algorithm used to sort each bucket. In general, the best-case time complexity of bucket sort is O(n), while the worst-case time complexity is O(n^2).
The best-case time complexity of bucket sort occurs when the buckets are already sorted, meaning the sorting algorithm used to sort each bucket has a time complexity of O(1). In this scenario, the time complexity of bucket sort is O(n), since it only takes one pass through the buckets to combine them into a single sorted collection.
The worst-case time complexity of bucket sort occurs when the buckets are not sorted, meaning the sorting algorithm used to sort each bucket has a time complexity of O(n^2). In this scenario, the time complexity of bucket sort is O(n^2), since it takes n passes through the buckets to combine them into a single sorted collection.
Space Complexity of Bucket Sort
The space complexity of bucket sort is O(n). This is because, in the worst case, the algorithm requires additional space to store the buckets and the sorted items.
Python Code Example
Now that we have covered the basics of bucket sort, let’s look at a Python code example to illustrate the algorithm. The following code implements bucket sort using the insertion sort algorithm to sort each bucket.
# Function to sort an array using bucket sort
def bucket_sort(arr):
# Get the maximum and minimum values in the array
max_val = max(arr)
min_val = min(arr)
# Create buckets for each value
buckets = [[] for _ in range(min_val, max_val + 1)]
# Distribute the array values into the buckets
for i in range(len(arr)):
buckets[arr[i] - min_val].append(arr[i])
# Sort each bucket using insertion sort
for i in range(len(buckets)):
insertion_sort(buckets[i])
# Combine the sorted buckets
sorted_arr = []
for i in range(len(buckets)):
sorted_arr.extend(buckets[i])
return sorted_arr
# Function to sort a bucket using insertion sort
def insertion_sort(arr):
for i in range(1, len(arr)):
key = arr[i]
j = i-1
while j >= 0 and key < arr[j] :
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = key
# Driver code
arr = [7, 3, 5, 2, 1, 9, 4, 8, 6]
sorted_arr = bucket_sort(arr)
# Print the sorted array
print(sorted_arr)
# Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
Conclusion
In this article, we have covered the basics of bucket sort, including how the algorithm works, its time and space complexity, and a Python code example to illustrate the algorithm. Bucket sort is an efficient sorting algorithm with an average time complexity of O(n) and a worst-case time complexity of O(n^2). It is often used to sort a collection of items with a small range of values, such as integers between 0 and 10.
Exercises
Write a Python function that implements bucket sort using selection sort to sort each bucket.
# Function to sort an array using bucket sort
def bucket_sort(arr):
# Get the maximum and minimum values in the array
max_val = max(arr)
min_val = min(arr)
# Create buckets for each value
buckets = [[] for _ in range(min_val, max_val + 1)]
# Distribute the array values into the buckets
for i in range(len(arr)):
buckets[arr[i] - min_val].append(arr[i])
# Sort each bucket using selection sort
for i in range(len(buckets)):
selection_sort(buckets[i])
# Combine the sorted buckets
sorted_arr = []
for i in range(len(buckets)):
sorted_arr.extend(buckets[i])
return sorted_arr
# Function to sort a bucket using selection sort
def selection_sort(arr):
for i in range(len(arr)):
# Find the minimum element in the unsorted array
min_idx = i
for j in range(i+1, len(arr)):
if arr[min_idx] > arr[j]:
min_idx = j
# Swap the found minimum element with the first element
arr[i], arr[min_idx] = arr[min_idx], arr[i]
Write a Python function that implements bucket sort using quick sort to sort each bucket.
# Function to sort an array using bucket sort
def bucket_sort(arr):
# Get the maximum and minimum values in the array
max_val = max(arr)
min_val = min(arr)
# Create buckets for each value
buckets = [[] for _ in range(min_val, max_val + 1)]
# Distribute the array values into the buckets
for i in range(len(arr)):
buckets[arr[i] - min_val].append(arr[i])
# Sort each bucket using quick sort
for i in range(len(buckets)):
quick_sort(buckets[i], 0, len(buckets[i])-1)
# Combine the sorted buckets
sorted_arr = []
for i in range(len(buckets)):
sorted_arr.extend(buckets[i])
return sorted_arr
# Function to sort a bucket using quick sort
def quick_sort(arr, low, high):
if low < high:
# Partition the array
pi = partition(arr, low, high)
# Sort the elements before and after the partition
quick_sort(arr, low, pi-1)
quick_sort(arr, pi+1, high)
# Function to partition the array
def partition(arr, low, high):
i = (low-1) # Index of smaller element
pivot = arr[high] # Pivot
for j in range(low, high):
# If current element is smaller than or
# equal to pivot
if arr[j] <= pivot:
# Increment index of smaller element
i = i+1
arr[i],arr[j] = arr[j],arr[i]
arr[i+1],arr[high] = arr[high],arr[i+1]
return (i+1)
Write a Python program to generate a collection of random integers between 0 and 10 and sort them using bucket sort.
# Function to sort an array using bucket sort
def bucket_sort(arr):
# Get the maximum and minimum values in the array
max_val = max(arr)
min_val = min(arr)
# Create buckets for each value
buckets = [[] for _ in range(min_val, max_val + 1)]
# Distribute the array values into the buckets
for i in range(len(arr)):
buckets[arr[i] - min_val].append(arr[i])
# Sort each bucket using insertion sort
for i in range(len(buckets)):
insertion_sort(buckets[i])
# Combine the sorted buckets
sorted_arr = []
for i in range(len(buckets)):
sorted_arr.extend(buckets[i])
return sorted_arr
# Function to sort a bucket using insertion sort
def insertion_sort(arr):
for i in range(1, len(arr)):
key = arr[i]
j = i-1
while j >= 0 and key < arr[j] :
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = key
# Function to generate a collection of random integers
def generate_random_ints(n):
arr = []
for _ in range(n):
arr.append(random.randint(0, 10))
return arr
# Driver code
arr = generate_random_ints(10)
sorted_arr = bucket_sort(arr)
# Print the sorted array
print(sorted_arr)
Write a Python program to generate a collection of random items and sort them using bucket sort.
import random
# Function to sort an array using bucket sort
def bucket_sort(arr):
# Get the maximum and minimum values in the array
max_val = max(arr)
min_val = min(arr)
# Create buckets for each value
buckets = [[] for _ in range(min_val, max_val + 1)]
# Distribute the array values into the buckets
for i in range(len(arr)):
buckets[arr[i] - min_val].append(arr[i])
# Sort each bucket using insertion sort
for i in range(len(buckets)):
insertion_sort(buckets[i])
# Combine the sorted buckets
sorted_arr = []
for i in range(len(buckets)):
sorted_arr.extend(buckets[i])
return sorted_arr
# Function to sort a bucket using insertion sort
def insertion_sort(arr):
for i in range(1, len(arr)):
key = arr[i]
j = i-1
while j >= 0 and key < arr[j] :
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = key
# Function to generate a collection of random items
def generate_random_items(n):
arr = []
for _ in range(n):
arr.append(random.randint(0, 10))
return arr
# Driver code
arr = generate_random_items(10)
sorted_arr = bucket_sort(arr)
# Print the sorted array
print(sorted_arr)
What is the space complexity (worst) of Bucket Sort?
O(n)