Heapsort is a pyramid sort. The time complexity of the algorithm is O(n log n), fast, right? I would call this sorting – sorting of falling stones. It seems to me that the easiest way to explain it is visually.
The input is a list of numbers, for example:
5, 0, 7, 2, 3, 9, 4
From left to right, a data structure is made – a binary tree, or as I call it – a pyramid. Pyramid elements can have a maximum of two child elements, but only one top element.
Let’s make a binary tree:
⠀⠀5
⠀0⠀7
2 3 9 4
If you look at the pyramid for a long time, you can see that it is simply numbers from an array, following one after another, the number of elements on each floor is multiplied by two.
Next comes the most interesting part, we sort the pyramid from the bottom up, using the falling pebbles method (heapify). Sorting could start from the last floor (2 3 9 4), but there is no point because there is no floor below to fall to.
Therefore, we begin to drop elements from the penultimate floor (0 7)
⠀⠀5
⠀0⠀7
2 3 9 4
The first element to fall is chosen from the right, in our case it is 7, then we look at what is under it, and under it is 9 and 4, nine is greater than four, and nine is also greater than seven! We drop 7 on 9, and lift 9 to the place of 7.
⠀⠀5
⠀0⠀9
2 3 7 4
Next, we understand that the seven has nowhere to fall lower, we move on to the number 0, which is on the penultimate floor on the left:
⠀⠀5
⠀0⠀9
2 3 7 4
We look at what’s underneath it – 2 and 3, two is less than three, three is greater than zero, so we swap zero and three:
⠀⠀5
⠀3⠀9
2 0 7 4
When you reach the end of the floor, go to the floor above and drop everything there if you can.
The result will be a data structure – a heap, namely a max heap, since the largest element is at the top:
⠀⠀9
⠀3⠀7
2 0 5 4
If you return to the array representation, you will get a list:
[9, 3, 7, 2, 0, 5, 4]
From this we can conclude that by swapping the first and last elements, we will get the first number in the final sorted position, namely 9 should be at the end of the sorted list, swap them:
[4, 3, 7, 2, 0, 5, 9]
Let’s look at a binary tree:
⠀⠀4
⠀3⠀7
2 0 5 9
We have a situation where the bottom of the tree is sorted, we just need to drop 4 to the correct position, we repeat the algorithm, but we do not take into account the already sorted numbers, namely 9:
⠀⠀4
⠀3⠀7
2 0 5 9
⠀⠀7
⠀3⠀4
2 0 5 9
⠀⠀7
⠀3⠀5
2 0 4 9
It turned out that, having dropped 4, we raised the next largest number after 9 – 7. We swap the last unsorted number (4) and the largest number (7)
⠀⠀4
⠀3⠀5
2 0 7 9
It turns out that now we have two numbers in the correct final position:
4, 3, 5, 2, 0, 7, 9
Then we repeat the sorting algorithm, ignoring those already sorted, and as a result we get a heap of the following type:
⠀⠀0
⠀2⠀3
4 5 7 9
Or as a list:
0, 2, 3, 4, 5, 7, 9
Implementation
The algorithm is usually divided into three functions:
- Creating a heap
- Heapify Algorithm
- Replace the last unsorted element with the first
The heap is created by going through the penultimate row of the binary tree using the heapify function, from right to left until the end of the array. Then the first number swap is made in the loop, after which the first element falls/stays in place, as a result of which the largest element gets to the first place, the loop is repeated with the participants decreasing by one, since after each pass at the end of the list there are sorted numbers.
Heapsort example in Ruby:
module Colors
BLUE = "\033[94m"
RED = "\033[31m"
STOP = "\033[0m"
end
def heapsort(rawNumbers)
numbers = rawNumbers.dup
def swap(numbers, from, to)
temp = numbers[from]
numbers[from] = numbers[to]
numbers[to] = temp
end
def heapify(numbers)
count = numbers.length()
lastParentNode = (count - 2) / 2
for start in lastParentNode.downto(0)
siftDown(numbers, start, count - 1)
start -= 1
end
if DEMO
puts "--- heapify ends ---"
end
end
def siftDown(numbers, start, rightBound)
cursor = start
printBinaryHeap(numbers, cursor, rightBound)
def calculateLhsChildIndex(cursor)
return cursor * 2 + 1
end
def calculateRhsChildIndex(cursor)
return cursor * 2 + 2
end
while calculateLhsChildIndex(cursor) <= rightBound
lhsChildIndex = calculateLhsChildIndex(cursor)
rhsChildIndex = calculateRhsChildIndex(cursor)
lhsNumber = numbers[lhsChildIndex]
biggerChildIndex = lhsChildIndex
if rhsChildIndex <= rightBound
rhsNumber = numbers[rhsChildIndex]
if lhsNumber < rhsNumber
biggerChildIndex = rhsChildIndex
end
end
if numbers[cursor] < numbers[biggerChildIndex]
swap(numbers, cursor, biggerChildIndex)
cursor = biggerChildIndex
else
break
end
printBinaryHeap(numbers, cursor, rightBound)
end
printBinaryHeap(numbers, cursor, rightBound)
end
def printBinaryHeap(numbers, nodeIndex = -1, rightBound = -1)
if DEMO == false
return
end
perLineWidth = (numbers.length() * 4).to_i
linesCount = Math.log2(numbers.length()).ceil()
xPrinterCount = 1
cursor = 0
spacing = 3
for y in (0..linesCount)
line = perLineWidth.times.map { " " }
spacing = spacing == 3 ? 4 : 3
printIndex = (perLineWidth / 2) - (spacing * xPrinterCount) / 2
for x in (0..xPrinterCount - 1)
if cursor >= numbers.length
break
end
if nodeIndex != -1 && cursor == nodeIndex
line[printIndex] = "%s%s%s" % [Colors::RED, numbers[cursor].to_s, Colors::STOP]
elsif rightBound != -1 && cursor > rightBound
line[printIndex] = "%s%s%s" % [Colors::BLUE, numbers[cursor].to_s, Colors::STOP]
else
line[printIndex] = numbers[cursor].to_s
end
cursor += 1
printIndex += spacing
end
print line.join()
xPrinterCount *= 2
print "\n"
end
end
heapify(numbers)
rightBound = numbers.length() - 1
while rightBound > 0
swap(numbers, 0, rightBound)
rightBound -= 1
siftDown(numbers, 0, rightBound)
end
return numbers
end
numbersCount = 14
maximalNumber = 10
numbers = numbersCount.times.map { Random.rand(maximalNumber) }
print numbers
print "\n---\n"
start = Time.now
sortedNumbers = heapsort(numbers)
finish = Time.now
heapSortTime = start - finish
start = Time.now
referenceSortedNumbers = numbers.sort()
finish = Time.now
referenceSortTime = start - finish
print "Reference sort: "
print referenceSortedNumbers
print "\n"
print "Reference sort time: %f\n" % referenceSortTime
print "Heap sort: "
print sortedNumbers
print "\n"
if DEMO == false
print "Heap sort time: %f\n" % heapSortTime
else
print "Disable DEMO for performance measure\n"
end
if sortedNumbers != referenceSortedNumbers
puts "Validation failed"
exit 1
else
puts "Validation success"
exit 0
end
Without visualization, this algorithm is not easy to understand, so the first thing I recommend is to write a function that will print the current form of the binary tree.
Links
https://gitlab.com/demensdeum/algorithms/-/blob/master/sortAlgorithms/heapsort/heapsort.rb
Sources
http://rosettacode.org/wiki/Sorting_algorithms/Heapsort
https://www.youtube.com/watch?v=LbB357_RwlY
https://habr.com/ru/company/ otus/blog/460087/
https://ru.wikipedia.org/wiki/Пыramidal_sorting
https://neerc.ifmo.ru/wiki /index.php?title=Heap_sort
https://wiki5.ru/wiki/Heapsort
https://wiki.c2.com/?HeapSort p>
https://ru.wikipedia.org/wiki/Tree (data structure)
https://ru.wikipedia.org/wiki/Heap (data structure)
https://www.youtube.com/watch?v=2DmK_H7IdTo
https://www.youtube.com/watch?v=kU4KBD4NFtw
https://www.youtube.com/watch?v=DU1uG5310x0
https://www.youtube.com/watch?v =BzQGPA_v-vc
https://www.geeksforgeeks.org/ array-representation-of-binary-heap/
https://habr.com/ru/post/112222/ a>
https://www.cs.usfca. edu/~galles/visualization/BST.html
https://www.youtube.com/watch?v=EQzqHWtsKq4
https://ru.wikibrief.org/wiki/Heapsort
https://www.youtube.com/watch?v=GUUpmrTnNbw