Heapsort

Heapsort – heap sort. Time complexity – O(n log n), fast eh? I would call this sorting – sorting of falling stones. It seems to me that the easiest way to explain it is visually.

The input is a list of numbers, for example:
5, 0, 7, 2, 3, 9, 4

From left to right, a data structure is made – a binary tree, or as I call it – a pyramid. Pyramid elements can have a maximum of two child elements, with only one top element.

Let’s make a binary tree:
⠀⠀5
⠀0⠀7
2 3 9 4

If you look at the pyramid for a long time, you can see that these are just numbers from the array, going one after another, the number of elements in each floor is multiplied by two.

Then the fun begins, we sort the pyramid from bottom to top, using the falling pebbles method (heapify). Sorting could be started from the last floor (2 3 9 4 ), but it makes no sense because there is no floor below where one could fall.

Therefore, we start dropping elements from the penultimate floor (0 7)
⠀⠀5
⠀0⠀7
2 3 9 4

The first element to fall is selected on the right, in our case it is 7, then we look at what is under it, and below it are 9 and 4, nine is more than four, so also nine is more than seven! We drop 7 on 9, and raise 9 to place 7.
⠀⠀5
⠀0⠀9
2 3 7 4

Further, we understand that the seven has nowhere to fall below, go to the number 0, which is located on the penultimate floor on the left:
⠀⠀5
0⠀9
2 3 7 4

We look at what is under it – 2 and 3, two is less than three, three is greater than zero, so we change zero and three in places:
⠀⠀5
⠀3⠀9
2 0 7 4

When you get to the end of the floor, go to the floor above and drop everything there if you can.
The result is a data structure – a heap (heap), namely max heap, because at the top is the largest element:
⠀⠀9
⠀3⠀7
2 0 5 4

If you return it to an array representation, you get a list:
[9, 3, 7, 2, 0, 5, 4]

From this we can conclude that by swapping the first and last element, we will get the first number in the final sorted position, namely 9 should be at the end of the sorted list, swap:
[4, 3, 7, 2, 0, 5, 9]

Let’s look at the binary tree:
⠀⠀4
⠀3⠀7
2 0 5 9

The result is a situation in which the lower part of the tree is sorted, you just need to drop 4 to the correct position, repeat the algorithm, but do not take into account the already sorted numbers, namely 9:
⠀⠀4
⠀3⠀7
2 0 5 9

⠀⠀7
⠀3⠀4
2 0 5 9

⠀⠀7
⠀3⠀5
2 0 4 9

It turned out that we, having dropped 4, raised the next largest number after 9 – 7. Swap the last unsorted number (4) and the largest number (7)
⠀⠀4
⠀3⠀5
2 0 7 9

It turned out that now we have two numbers in the correct final position:
4, 3, 5, 2, 0, 7, 9

Next, we repeat the sorting algorithm, ignoring those already sorted, as a result we get heap:
⠀⠀0
⠀2⠀3
4 5 7 9

Or as a list:
0, 2, 3, 4, 5, 7, 9

Implementation

The algorithm is usually divided into three functions:

  1. Heap creation
  2. Sifting algorithm (heapify)
  3. Replacing the last unsorted element and the first one

A heap is created by traversing the penultimate row of the binary tree using the heapify function, from right to left to the end of the array. Then the first replacement of numbers is made in the cycle, after which the first element falls / remains in place, as a result of which the largest element falls into first place, the cycle repeats with a decrease in participants by one, because after each pass, the sorted numbers remain at the end of the list.

Heapsort example in Ruby:

DEMO = true

module Colors
    BLUE = "\033[94m"
    RED = "\033[31m"
    STOP = "\033[0m"
end

def heapsort(rawNumbers)
    numbers = rawNumbers.dup

    def swap(numbers, from, to)
        temp = numbers[from]
        numbers[from] = numbers[to]
        numbers[to] = temp
    end

    def heapify(numbers)
        count = numbers.length()
        lastParentNode = (count - 2) / 2

        for start in lastParentNode.downto(0)
            siftDown(numbers, start, count - 1)
            start -= 1 
        end

        if DEMO
            puts "--- heapify ends ---"
        end
    end

    def siftDown(numbers, start, rightBound)      
        cursor = start
        printBinaryHeap(numbers, cursor, rightBound)

        def calculateLhsChildIndex(cursor)
            return cursor * 2 + 1
        end

        def calculateRhsChildIndex(cursor)
            return cursor * 2 + 2
        end            

        while calculateLhsChildIndex(cursor) <= rightBound
            lhsChildIndex = calculateLhsChildIndex(cursor)
            rhsChildIndex = calculateRhsChildIndex(cursor)

            lhsNumber = numbers[lhsChildIndex]
            biggerChildIndex = lhsChildIndex

            if rhsChildIndex <= rightBound
                rhsNumber = numbers[rhsChildIndex]
                if lhsNumber < rhsNumber
                    biggerChildIndex = rhsChildIndex
                end
            end

            if numbers[cursor] < numbers[biggerChildIndex]
                swap(numbers, cursor, biggerChildIndex)
                cursor = biggerChildIndex
            else
                break
            end
            printBinaryHeap(numbers, cursor, rightBound)
        end
        printBinaryHeap(numbers, cursor, rightBound)
    end

    def printBinaryHeap(numbers, nodeIndex = -1, rightBound = -1)
        if DEMO == false
            return
        end
        perLineWidth = (numbers.length() * 4).to_i
        linesCount = Math.log2(numbers.length()).ceil()
        xPrinterCount = 1
        cursor = 0
        spacing = 3
        for y in (0..linesCount)
            line = perLineWidth.times.map { " " }
            spacing = spacing == 3 ? 4 : 3
            printIndex = (perLineWidth / 2) - (spacing * xPrinterCount) / 2
            for x in (0..xPrinterCount - 1)
                if cursor >= numbers.length
                    break
                end
                if nodeIndex != -1 && cursor == nodeIndex
                    line[printIndex] = "%s%s%s" % [Colors::RED, numbers[cursor].to_s, Colors::STOP]
                elsif rightBound != -1 && cursor > rightBound
                    line[printIndex] = "%s%s%s" % [Colors::BLUE, numbers[cursor].to_s, Colors::STOP]
                else
                    line[printIndex] = numbers[cursor].to_s
                end
                cursor += 1
                printIndex += spacing
            end
            print line.join()
            xPrinterCount *= 2           
            print "\n"            
        end
    end

    heapify(numbers)
    rightBound = numbers.length() - 1

    while rightBound > 0
        swap(numbers, 0, rightBound)   
        rightBound -= 1
        siftDown(numbers, 0, rightBound)     
    end

    return numbers
end

numbersCount = 14
maximalNumber = 10
numbers = numbersCount.times.map { Random.rand(maximalNumber) }
print numbers
print "\n---\n"

start = Time.now
sortedNumbers = heapsort(numbers)
finish = Time.now
heapSortTime = start - finish

start = Time.now
referenceSortedNumbers = numbers.sort()
finish = Time.now
referenceSortTime = start - finish

print "Reference sort: "
print referenceSortedNumbers
print "\n"
print "Reference sort time: %f\n" % referenceSortTime
print "Heap sort:      "
print sortedNumbers
print "\n"
if DEMO == false
    print "Heap sort time:      %f\n" % heapSortTime
else
    print "Disable DEMO for performance measure\n"
end

if sortedNumbers != referenceSortedNumbers
    puts "Validation failed"
    exit 1
else
    puts "Validation success"
    exit 0
end

Without visualization, this algorithm is not easy to understand, so the first thing I recommend is to write a function that will print the current view of the binary tree.

Links

https://gitlab.com/demensdeum/algorithms/-/blob/master/sortAlgorithms/heapsort/heapsort.rb

References

http://rosettacode.org/wiki/Sorting_algorithms/Heapsort
https://www.youtube.com/watch?v=LbB357_RwlY
https://habr.com/ru/company/otus/blog/460087/
https://ru.wikipedia.org/wiki/Пирамидальная_сортировка
https://neerc.ifmo.ru/wiki/index.php?title=Сортировка_кучей
https://wiki5.ru/wiki/Heapsort
https://wiki.c2.com/?HeapSort
https://ru.wikipedia.org/wiki/Дерево (структура данных)
https://ru.wikipedia.org/wiki/Куча (структура данных)
https://www.youtube.com/watch?v=2DmK_H7IdTo
https://www.youtube.com/watch?v=kU4KBD4NFtw
https://www.youtube.com/watch?v=DU1uG5310x0
https://www.youtube.com/watch?v=BzQGPA_v-vc
https://www.geeksforgeeks.org/array-representation-of-binary-heap/
https://habr.com/ru/post/112222/
https://www.cs.usfca.edu/~galles/visualization/BST.html
https://www.youtube.com/watch?v=EQzqHWtsKq4
https://medium.com/@dimko1/%D0%B0%D0%BB%D0%B3%D0%BE%D1%80%D0%B8%D1%82%D0%BC%D1%8B-%D1%81%D0%BE%D1%80%D1%82%D0%B8%D1%80%D0%BE%D0%B2%D0%BA%D0%B8-heapsort-796ba965018b
https://ru.wikibrief.org/wiki/Heapsort
https://www.youtube.com/watch?v=GUUpmrTnNbw