Heapsort – heap sort. Time complexity – O(n log n), fast eh? I would call this sorting – sorting of falling stones. It seems to me that the easiest way to explain it is visually.
The input is a list of numbers, for example:
5, 0, 7, 2, 3, 9, 4
From left to right, a data structure is made – a binary tree, or as I call it – a pyramid. Pyramid elements can have a maximum of two child elements, with only one top element.
Let’s make a binary tree:
⠀⠀5
⠀0⠀7
2 3 9 4
If you look at the pyramid for a long time, you can see that these are just numbers from the array, going one after another, the number of elements in each floor is multiplied by two.
Then the fun begins, we sort the pyramid from bottom to top, using the falling pebbles method (heapify). Sorting could be started from the last floor (2 3 9 4 ), but it makes no sense because there is no floor below where one could fall.
Therefore, we start dropping elements from the penultimate floor (0 7)
⠀⠀5
⠀0⠀7
2 3 9 4
The first element to fall is selected on the right, in our case it is 7, then we look at what is under it, and below it are 9 and 4, nine is more than four, so also nine is more than seven! We drop 7 on 9, and raise 9 to place 7.
⠀⠀5
⠀0⠀9
2 3 7 4
Further, we understand that the seven has nowhere to fall below, go to the number 0, which is located on the penultimate floor on the left:
⠀⠀5
⠀0⠀9
2 3 7 4
We look at what is under it – 2 and 3, two is less than three, three is greater than zero, so we change zero and three in places:
⠀⠀5
⠀3⠀9
2 0 7 4
When you get to the end of the floor, go to the floor above and drop everything there if you can.
The result is a data structure – a heap (heap), namely max heap, because at the top is the largest element:
⠀⠀9
⠀3⠀7
2 0 5 4
If you return it to an array representation, you get a list:
[9, 3, 7, 2, 0, 5, 4]
From this we can conclude that by swapping the first and last element, we will get the first number in the final sorted position, namely 9 should be at the end of the sorted list, swap:
[4, 3, 7, 2, 0, 5, 9]
Let’s look at the binary tree:
⠀⠀4
⠀3⠀7
2 0 5 9
The result is a situation in which the lower part of the tree is sorted, you just need to drop 4 to the correct position, repeat the algorithm, but do not take into account the already sorted numbers, namely 9:
⠀⠀4
⠀3⠀7
2 0 5 9
⠀⠀7
⠀3⠀4
2 0 5 9
⠀⠀7
⠀3⠀5
2 0 4 9
It turned out that we, having dropped 4, raised the next largest number after 9 – 7. Swap the last unsorted number (4) and the largest number (7)
⠀⠀4
⠀3⠀5
2 0 7 9
It turned out that now we have two numbers in the correct final position:
4, 3, 5, 2, 0, 7, 9
Next, we repeat the sorting algorithm, ignoring those already sorted, as a result we get heap:
⠀⠀0
⠀2⠀3
4 5 7 9
Or as a list:
0, 2, 3, 4, 5, 7, 9
Implementation
The algorithm is usually divided into three functions:
- Heap creation
- Sifting algorithm (heapify)
- Replacing the last unsorted element and the first one
A heap is created by traversing the penultimate row of the binary tree using the heapify function, from right to left to the end of the array. Then the first replacement of numbers is made in the cycle, after which the first element falls / remains in place, as a result of which the largest element falls into first place, the cycle repeats with a decrease in participants by one, because after each pass, the sorted numbers remain at the end of the list.
Heapsort example in Ruby:
DEMO = true
module Colors
BLUE = "\033[94m"
RED = "\033[31m"
STOP = "\033[0m"
end
def heapsort(rawNumbers)
numbers = rawNumbers.dup
def swap(numbers, from, to)
temp = numbers[from]
numbers[from] = numbers[to]
numbers[to] = temp
end
def heapify(numbers)
count = numbers.length()
lastParentNode = (count - 2) / 2
for start in lastParentNode.downto(0)
siftDown(numbers, start, count - 1)
start -= 1
end
if DEMO
puts "--- heapify ends ---"
end
end
def siftDown(numbers, start, rightBound)
cursor = start
printBinaryHeap(numbers, cursor, rightBound)
def calculateLhsChildIndex(cursor)
return cursor * 2 + 1
end
def calculateRhsChildIndex(cursor)
return cursor * 2 + 2
end
while calculateLhsChildIndex(cursor) <= rightBound
lhsChildIndex = calculateLhsChildIndex(cursor)
rhsChildIndex = calculateRhsChildIndex(cursor)
lhsNumber = numbers[lhsChildIndex]
biggerChildIndex = lhsChildIndex
if rhsChildIndex <= rightBound
rhsNumber = numbers[rhsChildIndex]
if lhsNumber < rhsNumber
biggerChildIndex = rhsChildIndex
end
end
if numbers[cursor] < numbers[biggerChildIndex]
swap(numbers, cursor, biggerChildIndex)
cursor = biggerChildIndex
else
break
end
printBinaryHeap(numbers, cursor, rightBound)
end
printBinaryHeap(numbers, cursor, rightBound)
end
def printBinaryHeap(numbers, nodeIndex = -1, rightBound = -1)
if DEMO == false
return
end
perLineWidth = (numbers.length() * 4).to_i
linesCount = Math.log2(numbers.length()).ceil()
xPrinterCount = 1
cursor = 0
spacing = 3
for y in (0..linesCount)
line = perLineWidth.times.map { " " }
spacing = spacing == 3 ? 4 : 3
printIndex = (perLineWidth / 2) - (spacing * xPrinterCount) / 2
for x in (0..xPrinterCount - 1)
if cursor >= numbers.length
break
end
if nodeIndex != -1 && cursor == nodeIndex
line[printIndex] = "%s%s%s" % [Colors::RED, numbers[cursor].to_s, Colors::STOP]
elsif rightBound != -1 && cursor > rightBound
line[printIndex] = "%s%s%s" % [Colors::BLUE, numbers[cursor].to_s, Colors::STOP]
else
line[printIndex] = numbers[cursor].to_s
end
cursor += 1
printIndex += spacing
end
print line.join()
xPrinterCount *= 2
print "\n"
end
end
heapify(numbers)
rightBound = numbers.length() - 1
while rightBound > 0
swap(numbers, 0, rightBound)
rightBound -= 1
siftDown(numbers, 0, rightBound)
end
return numbers
end
numbersCount = 14
maximalNumber = 10
numbers = numbersCount.times.map { Random.rand(maximalNumber) }
print numbers
print "\n---\n"
start = Time.now
sortedNumbers = heapsort(numbers)
finish = Time.now
heapSortTime = start - finish
start = Time.now
referenceSortedNumbers = numbers.sort()
finish = Time.now
referenceSortTime = start - finish
print "Reference sort: "
print referenceSortedNumbers
print "\n"
print "Reference sort time: %f\n" % referenceSortTime
print "Heap sort: "
print sortedNumbers
print "\n"
if DEMO == false
print "Heap sort time: %f\n" % heapSortTime
else
print "Disable DEMO for performance measure\n"
end
if sortedNumbers != referenceSortedNumbers
puts "Validation failed"
exit 1
else
puts "Validation success"
exit 0
end
Without visualization, this algorithm is not easy to understand, so the first thing I recommend is to write a function that will print the current view of the binary tree.
Links
https://gitlab.com/demensdeum/algorithms/-/blob/master/sortAlgorithms/heapsort/heapsort.rb
References
http://rosettacode.org/wiki/Sorting_algorithms/Heapsort
https://www.youtube.com/watch?v=LbB357_RwlY
https://habr.com/ru/company/otus/blog/460087/
https://ru.wikipedia.org/wiki/Пирамидальная_сортировка
https://neerc.ifmo.ru/wiki/index.php?title=Сортировка_кучей
https://wiki5.ru/wiki/Heapsort
https://wiki.c2.com/?HeapSort
https://ru.wikipedia.org/wiki/Дерево (структура данных)
https://ru.wikipedia.org/wiki/Куча (структура данных)
https://www.youtube.com/watch?v=2DmK_H7IdTo
https://www.youtube.com/watch?v=kU4KBD4NFtw
https://www.youtube.com/watch?v=DU1uG5310x0
https://www.youtube.com/watch?v=BzQGPA_v-vc
https://www.geeksforgeeks.org/array-representation-of-binary-heap/
https://habr.com/ru/post/112222/
https://www.cs.usfca.edu/~galles/visualization/BST.html
https://www.youtube.com/watch?v=EQzqHWtsKq4
https://medium.com/@dimko1/%D0%B0%D0%BB%D0%B3%D0%BE%D1%80%D0%B8%D1%82%D0%BC%D1%8B-%D1%81%D0%BE%D1%80%D1%82%D0%B8%D1%80%D0%BE%D0%B2%D0%BA%D0%B8-heapsort-796ba965018b
https://ru.wikibrief.org/wiki/Heapsort
https://www.youtube.com/watch?v=GUUpmrTnNbw