Tree sort

Tree sort – binary search tree sort. Time complexity – O(n²). In such a tree, each node has numbers less than the node on the left, more than the node on the right, when coming from the root and printing the values ​​from left to right, we get a sorted list of numbers. Surprising huh?

Consider the binary search tree schema:

Derrick Coetzee (public domain)

Try to manually read the numbers starting from the penultimate left node of the lower left corner, for each node on the left – a node – on the right.

It will turn out like this:

  1. The penultimate node at the bottom left is 3.
  2. She has a left branch – 1.
  3. Take this number (1)
  4. Next, take the vertex 3 (1, 3) itself
  5. To the right is branch 6, but it contains branches. Therefore, we read it in the same way.
  6. Left branch of node 6 number 4 (1, 3, 4)
  7. Node 6 itself (1, 3, 4, 6)
  8. Right 7 (1, 3, 4, 6, 7)
  9. Go up to the root node – 8 (1,3, 4 ,6, 7, 8)
  10. Print everything on the right by analogy
  11. Get the final list – 1, 3, 4, 6, 7, 8, 10, 13, 14

To implement the algorithm in code, you need two functions:

  1. Building a binary search tree
  2. Printing the binary search tree in the correct order

They assemble a binary search tree in the same way as they read it, a number is attached to each node on the left or right, depending on whether it is less or more.

Lua example:

Node = {value = nil, lhs = nil, rhs = nil}

function Node:new(value, lhs, rhs)
    output = {}
    setmetatable(output, self)
    self.__index = self  
    output.value = value
    output.lhs = lhs
    output.rhs = rhs
    output.counter = 1
    return output  
end

function Node:Increment()
    self.counter = self.counter + 1
end

function Node:Insert(value)
    if self.lhs ~= nil and self.lhs.value > value then
        self.lhs:Insert(value)
        return
    end

    if self.rhs ~= nil and self.rhs.value < value then
        self.rhs:Insert(value)
        return
    end

    if self.value == value then
        self:Increment()
        return
    elseif self.value > value then
        if self.lhs == nil then
            self.lhs = Node:new(value, nil, nil)
        else
            self.lhs:Insert(value)
        end
        return
    else
        if self.rhs == nil then
            self.rhs = Node:new(value, nil, nil)
        else
            self.rhs:Insert(value)
        end
        return
    end
end

function Node:InOrder(output)
    if self.lhs ~= nil then
       output = self.lhs:InOrder(output)
    end
    output = self:printSelf(output)
    if self.rhs ~= nil then
        output = self.rhs:InOrder(output)
    end
    return output
end

function Node:printSelf(output)
    for i=0,self.counter-1 do
        output = output .. tostring(self.value) .. " "
    end
    return output
end

function PrintArray(numbers)
    output = ""
    for i=0,#numbers do
        output = output .. tostring(numbers[i]) .. " "
    end    
    print(output)
end

function Treesort(numbers)
    rootNode = Node:new(numbers[0], nil, nil)
    for i=1,#numbers do
        rootNode:Insert(numbers[i])
    end
    print(rootNode:InOrder(""))
end


numbersCount = 10
maxNumber = 9

numbers = {}

for i=0,numbersCount-1 do
    numbers[i] = math.random(0, maxNumber)
end

PrintArray(numbers)
Treesort(numbers)

An important nuance is that for numbers that are equal to the vertex, a lot of interesting mechanisms for hooking to the node have been invented, but I just added a counter to the vertex class, when printing, the numbers are returned by the counter.

Links

https://gitlab.com/demensdeum /algorithms/-/tree/master/sortAlgorithms/treesort

References

TreeSort Algorithm Explained and Implemented with Examples in Java | Sorting Algorithms | Geekific – YouTube

Tree sort – YouTube

Convert Sorted Array to Binary Search Tree (LeetCode 108. Algorithm Explained) – YouTube

Sorting algorithms/Tree sort on a linked list – Rosetta Code

Tree Sort – GeeksforGeeks

Tree sort – Wikipedia

How to handle duplicates in Binary Search Tree? – GeeksforGeeks

Tree Sort | GeeksforGeeks – YouTube

Bucket Sort

Bucket Sort – bucket sorting. The algorithm is similar to sorting by counting, with the difference that the numbers are collected into “buckets”-ranges, then the buckets are sorted using any other, sufficiently productive, sorting algorithm, and the final chord is the expansion of the “buckets” one by one, resulting in a sorted list.

The time complexity of the algorithm is O(nk). The algorithm runs in linear time for data that obeys a uniform distribution. To put it simply, the elements must be in a certain range, without “splashes”, for example, numbers from 0.0 to 1.0. If among such numbers there are 4 or 999, then such a series, according to the laws of the yard, is no longer considered “even”.

Implementation example in Julia:

function bucketSort(numbers, bucketsCount)
    buckets = Vector{Vector{Int}}()
    
    for i in 0:bucketsCount - 1
        bucket = Vector{Int}()
        push!(buckets, bucket)
    end

    maxNumber = maximum(numbers)

    for i in 0:length(numbers) - 1
        bucketIndex = 1 + Int(floor(bucketsCount * numbers[1 + i] / (maxNumber + 1)))
        push!(buckets[bucketIndex], numbers[1 + i])
    end

    for i in 0:length(buckets) - 1
        bucketIndex = 1 + i
        buckets[bucketIndex] = sort(buckets[bucketIndex])
    end

    flat = [(buckets...)...]
    print(flat, "\n")

end

numbersCount = 10
maxNumber = 10
numbers = rand(1:maxNumber, numbersCount)
print(numbers,"\n")
bucketsCount = 10
bucketSort(numbers, bucketsCount)

The performance of the algorithm is also affected by the number of buckets, for more numbers it is better to take a larger number of buckets (Algorithms in a nutshell by George T. Heineman)

Links

https://gitlab.com/demensdeum/algorithms/-/tree/master/sortAlgorithms/bucketSort

References

https://www.youtube.com/watch?v=VuXbEb5ywrU
https://www.youtube.com/watch?v=ELrhrrCjDOA
https://medium.com/karuna-sehgal/an-introduction-to-bucket-sort-62aa5325d124
https://www.geeksforgeeks.org/bucket-sort-2/
https://ru.wikipedia.org/wiki/%D0%91%D0%BB%D0%BE%D1%87%D0%BD%D0%B0%D1%8F_%D1%81%D0%BE%D1%80%D1%82%D0%B8%D1%80%D0%BE%D0%B2%D0%BA%D0%B0
https://www.youtube.com/watch?v=LPrF9yEKTks
https://en.wikipedia.org/wiki/Bucket_sort
https://julialang.org/
https://www.oreilly.com/library/view/algorithms-in-a/9780596516246/ch04s08.html

Radix Sort

Radix Sort – radix sort. The algorithm is similar to counting sort in that there is no comparison of elements, instead elements are *character-by-character* grouped into *buckets* (buckets), the bucket is selected by the index of the current number-character. Time complexity – O(nd).

Works like this:

  • The input will be the numbers 6, 12, 44, 9
  • Let’s create 10 buckets of lists (0-9) into which we will add/sort numbers bit by bit.

Further:

  1. Run a loop with counter i up to the maximum number of characters in the number
  2. At index i from right to left we get one character for each number, if there is no character, then we consider it to be zero
  3. The character is converted to a number
  4. Select a bucket by index – number, put the whole number there
  5. After finishing iterating over numbers, convert all buckets back to a list of numbers
  6. Get numbers sorted by digit
  7. Repeat until all digits run out

Radix Sort example in Scala:

import scala.collection.mutable.ListBuffer
import scala.util.Random.nextInt

object RadixSort {
    def main(args: Array[String]) = {
        var maxNumber = 200
        var numbersCount = 30
        var maxLength = maxNumber.toString.length() - 1

        var referenceNumbers = LazyList.continually(nextInt(maxNumber + 1)).take(numbersCount).toList
        var numbers = referenceNumbers
        
        var buckets = List.fill(10)(ListBuffer[Int]())

        for( i <- 0 to maxLength) { numbers.foreach( number => {
                    var numberString = number.toString
                    if (numberString.length() > i) {
                        var index = numberString.length() - i - 1
                        var character = numberString.charAt(index).toString
                        var characterInteger = character.toInt  
                        buckets.apply(characterInteger) += number
                    }
                    else {
                        buckets.apply(0) += number
                    }
                }
            )
            numbers = buckets.flatten
            buckets.foreach(x => x.clear())
        }
        println(referenceNumbers)
        println(numbers)
        println(s"Validation result: ${numbers == referenceNumbers.sorted}")
    }
}

The algorithm also has a version for parallel execution, for example on the GPU; there is also a variant of bit sort, which is probably very interesting and truly breathtaking!

Links

https://gitlab.com/demensdeum/algorithms/-/blob/master/sortAlgorithms/radixSort/radixSort.scala

Sources

https://ru.wikipedia.org/wiki/%D0%9F%D0%BE%D1%80%D0%B0%D0%B7%D1%80% D1%8F%D0%B4%D0%BD%D0%B0%D1%8F_%D1%81%D0%BE%D1%80%D1%82%D0%B8%D1%80%D0%BE%D0% B2%D0%BA%D0%B0
https://www.geeksforgeeks.org/radix-sort/
https://www.youtube.com/watch?v=toAlAJKojos
https://github.com/gyatskov/radix-sort

Heapsort

Heapsort – heap sort. Time complexity – O(n log n), fast eh? I would call this sorting – sorting of falling stones. It seems to me that the easiest way to explain it is visually.

The input is a list of numbers, for example:
5, 0, 7, 2, 3, 9, 4

From left to right, a data structure is made – a binary tree, or as I call it – a pyramid. Pyramid elements can have a maximum of two child elements, with only one top element.

Let’s make a binary tree:
⠀⠀5
⠀0⠀7
2 3 9 4

If you look at the pyramid for a long time, you can see that these are just numbers from the array, going one after another, the number of elements in each floor is multiplied by two.

Then the fun begins, we sort the pyramid from bottom to top, using the falling pebbles method (heapify). Sorting could be started from the last floor (2 3 9 4 ), but it makes no sense because there is no floor below where one could fall.

Therefore, we start dropping elements from the penultimate floor (0 7)
⠀⠀5
⠀0⠀7
2 3 9 4

The first element to fall is selected on the right, in our case it is 7, then we look at what is under it, and below it are 9 and 4, nine is more than four, so also nine is more than seven! We drop 7 on 9, and raise 9 to place 7.
⠀⠀5
⠀0⠀9
2 3 7 4

Further, we understand that the seven has nowhere to fall below, go to the number 0, which is located on the penultimate floor on the left:
⠀⠀5
0⠀9
2 3 7 4

We look at what is under it – 2 and 3, two is less than three, three is greater than zero, so we change zero and three in places:
⠀⠀5
⠀3⠀9
2 0 7 4

When you get to the end of the floor, go to the floor above and drop everything there if you can.
The result is a data structure – a heap (heap), namely max heap, because at the top is the largest element:
⠀⠀9
⠀3⠀7
2 0 5 4

If you return it to an array representation, you get a list:
[9, 3, 7, 2, 0, 5, 4]

From this we can conclude that by swapping the first and last element, we will get the first number in the final sorted position, namely 9 should be at the end of the sorted list, swap:
[4, 3, 7, 2, 0, 5, 9]

Let’s look at the binary tree:
⠀⠀4
⠀3⠀7
2 0 5 9

The result is a situation in which the lower part of the tree is sorted, you just need to drop 4 to the correct position, repeat the algorithm, but do not take into account the already sorted numbers, namely 9:
⠀⠀4
⠀3⠀7
2 0 5 9

⠀⠀7
⠀3⠀4
2 0 5 9

⠀⠀7
⠀3⠀5
2 0 4 9

It turned out that we, having dropped 4, raised the next largest number after 9 – 7. Swap the last unsorted number (4) and the largest number (7)
⠀⠀4
⠀3⠀5
2 0 7 9

It turned out that now we have two numbers in the correct final position:
4, 3, 5, 2, 0, 7, 9

Next, we repeat the sorting algorithm, ignoring those already sorted, as a result we get heap:
⠀⠀0
⠀2⠀3
4 5 7 9

Or as a list:
0, 2, 3, 4, 5, 7, 9

Implementation

The algorithm is usually divided into three functions:

  1. Heap creation
  2. Sifting algorithm (heapify)
  3. Replacing the last unsorted element and the first one

A heap is created by traversing the penultimate row of the binary tree using the heapify function, from right to left to the end of the array. Then the first replacement of numbers is made in the cycle, after which the first element falls / remains in place, as a result of which the largest element falls into first place, the cycle repeats with a decrease in participants by one, because after each pass, the sorted numbers remain at the end of the list.

Heapsort example in Ruby:

DEMO = true

module Colors
    BLUE = "\033[94m"
    RED = "\033[31m"
    STOP = "\033[0m"
end

def heapsort(rawNumbers)
    numbers = rawNumbers.dup

    def swap(numbers, from, to)
        temp = numbers[from]
        numbers[from] = numbers[to]
        numbers[to] = temp
    end

    def heapify(numbers)
        count = numbers.length()
        lastParentNode = (count - 2) / 2

        for start in lastParentNode.downto(0)
            siftDown(numbers, start, count - 1)
            start -= 1 
        end

        if DEMO
            puts "--- heapify ends ---"
        end
    end

    def siftDown(numbers, start, rightBound)      
        cursor = start
        printBinaryHeap(numbers, cursor, rightBound)

        def calculateLhsChildIndex(cursor)
            return cursor * 2 + 1
        end

        def calculateRhsChildIndex(cursor)
            return cursor * 2 + 2
        end            

        while calculateLhsChildIndex(cursor) <= rightBound
            lhsChildIndex = calculateLhsChildIndex(cursor)
            rhsChildIndex = calculateRhsChildIndex(cursor)

            lhsNumber = numbers[lhsChildIndex]
            biggerChildIndex = lhsChildIndex

            if rhsChildIndex <= rightBound
                rhsNumber = numbers[rhsChildIndex]
                if lhsNumber < rhsNumber
                    biggerChildIndex = rhsChildIndex
                end
            end

            if numbers[cursor] < numbers[biggerChildIndex]
                swap(numbers, cursor, biggerChildIndex)
                cursor = biggerChildIndex
            else
                break
            end
            printBinaryHeap(numbers, cursor, rightBound)
        end
        printBinaryHeap(numbers, cursor, rightBound)
    end

    def printBinaryHeap(numbers, nodeIndex = -1, rightBound = -1)
        if DEMO == false
            return
        end
        perLineWidth = (numbers.length() * 4).to_i
        linesCount = Math.log2(numbers.length()).ceil()
        xPrinterCount = 1
        cursor = 0
        spacing = 3
        for y in (0..linesCount)
            line = perLineWidth.times.map { " " }
            spacing = spacing == 3 ? 4 : 3
            printIndex = (perLineWidth / 2) - (spacing * xPrinterCount) / 2
            for x in (0..xPrinterCount - 1)
                if cursor >= numbers.length
                    break
                end
                if nodeIndex != -1 && cursor == nodeIndex
                    line[printIndex] = "%s%s%s" % [Colors::RED, numbers[cursor].to_s, Colors::STOP]
                elsif rightBound != -1 && cursor > rightBound
                    line[printIndex] = "%s%s%s" % [Colors::BLUE, numbers[cursor].to_s, Colors::STOP]
                else
                    line[printIndex] = numbers[cursor].to_s
                end
                cursor += 1
                printIndex += spacing
            end
            print line.join()
            xPrinterCount *= 2           
            print "\n"            
        end
    end

    heapify(numbers)
    rightBound = numbers.length() - 1

    while rightBound > 0
        swap(numbers, 0, rightBound)   
        rightBound -= 1
        siftDown(numbers, 0, rightBound)     
    end

    return numbers
end

numbersCount = 14
maximalNumber = 10
numbers = numbersCount.times.map { Random.rand(maximalNumber) }
print numbers
print "\n---\n"

start = Time.now
sortedNumbers = heapsort(numbers)
finish = Time.now
heapSortTime = start - finish

start = Time.now
referenceSortedNumbers = numbers.sort()
finish = Time.now
referenceSortTime = start - finish

print "Reference sort: "
print referenceSortedNumbers
print "\n"
print "Reference sort time: %f\n" % referenceSortTime
print "Heap sort:      "
print sortedNumbers
print "\n"
if DEMO == false
    print "Heap sort time:      %f\n" % heapSortTime
else
    print "Disable DEMO for performance measure\n"
end

if sortedNumbers != referenceSortedNumbers
    puts "Validation failed"
    exit 1
else
    puts "Validation success"
    exit 0
end

Without visualization, this algorithm is not easy to understand, so the first thing I recommend is to write a function that will print the current view of the binary tree.

Links

https://gitlab.com/demensdeum/algorithms/-/blob/master/sortAlgorithms/heapsort/heapsort.rb

References

http://rosettacode.org/wiki/Sorting_algorithms/Heapsort
https://www.youtube.com/watch?v=LbB357_RwlY
https://habr.com/ru/company/otus/blog/460087/
https://ru.wikipedia.org/wiki/Пирамидальная_сортировка
https://neerc.ifmo.ru/wiki/index.php?title=Сортировка_кучей
https://wiki5.ru/wiki/Heapsort
https://wiki.c2.com/?HeapSort
https://ru.wikipedia.org/wiki/Дерево (структура данных)
https://ru.wikipedia.org/wiki/Куча (структура данных)
https://www.youtube.com/watch?v=2DmK_H7IdTo
https://www.youtube.com/watch?v=kU4KBD4NFtw
https://www.youtube.com/watch?v=DU1uG5310x0
https://www.youtube.com/watch?v=BzQGPA_v-vc
https://www.geeksforgeeks.org/array-representation-of-binary-heap/
https://habr.com/ru/post/112222/
https://www.cs.usfca.edu/~galles/visualization/BST.html
https://www.youtube.com/watch?v=EQzqHWtsKq4
https://medium.com/@dimko1/%D0%B0%D0%BB%D0%B3%D0%BE%D1%80%D0%B8%D1%82%D0%BC%D1%8B-%D1%81%D0%BE%D1%80%D1%82%D0%B8%D1%80%D0%BE%D0%B2%D0%BA%D0%B8-heapsort-796ba965018b
https://ru.wikibrief.org/wiki/Heapsort
https://www.youtube.com/watch?v=GUUpmrTnNbw

Quicksort

Quicksort is a divide-and-conquer sorting algorithm. Recursively, in parts, parse the array of numbers, setting the numbers in the smaller and larger order from the selected pivot element, insert the pivot element itself into the hot-point between them. After a few recursive iterations, you’ll end up with a sorted list. Time complexity O(n2).

Scheme:

  1. We start with the fact that we get a list of elements outside, sorting boundaries. In the first step, the sort boundaries will be from start to finish.
  2. Check that the boundaries of the beginning and end do not intersect, if this happens, then it’s time to finish
  3. Select some element from the list, call it pivot
  4. Shift to the right to the end to the last index, so as not to interfere
  5. Create a counter of *smaller numbers* while equal to zero
  6. Loop through the list from left to right, up to and including the last index where the anchor element is located
  7. Each element is compared to the pivot
  8. If it is less than the reference, then we swap it in places according to the index of the counter of smaller numbers. Increment the counter of smaller numbers.
  9. When the cycle reaches the anchor element, we stop, swap the anchor element with the element by the counter of smaller numbers.
  10. We run the algorithm separately for the left smaller part of the list, and separately for the right most part of the list.
  11. As a result, all recursive iterations will start to stop due to the check in paragraph 2
  12. Get sorted list

Quicksort was invented by the scientist Charles Antony Richard Hoare at Moscow State University, having learned Russian, he studied computer translation, as well as probability theory at the Kolmogorov school. In 1960, due to a political crisis, he left the Soviet Union.

An example implementation in Rust:

extern crate rand;

use rand::Rng;

fn swap(numbers: &mut [i64], from: usize, to: usize) {
    let temp = numbers[from];
    numbers[from] = numbers[to];
    numbers[to] = temp;
}

fn quicksort(numbers: &mut [i64], left: usize, right: usize) {
    if left >= right {
        return
    }
    let length = right - left;
    if length <= 1 {
        return
    }
    let pivot_index = left + (length / 2);
    let pivot = numbers[pivot_index];

    let last_index = right - 1;
    swap(numbers, pivot_index, last_index);

    let mut less_insert_index = left;

    for i in left..last_index {
        if numbers[i] < pivot {
            swap(numbers, i, less_insert_index);
            less_insert_index += 1;
        }
    }
    swap(numbers, last_index, less_insert_index);
    quicksort(numbers, left, less_insert_index);
    quicksort(numbers, less_insert_index + 1, right);
}

fn main() {
    let mut numbers = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0];
    let mut reference_numbers = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0];

    let mut rng = rand::thread_rng();
    for i in 0..numbers.len() {
        numbers[i] = rng.gen_range(-10..10);
        reference_numbers[i] = numbers[i];
    }

    reference_numbers.sort();

  println!("Numbers           {:?}", numbers);
  let length = numbers.len();
  quicksort(&mut numbers, 0, length);
  println!("Numbers           {:?}", numbers);
  println!("Reference numbers {:?}", reference_numbers);

  if numbers != reference_numbers {
    println!("Validation failed");
    std::process::exit(1);
  }
  else {
    println!("Validation success!");
    std::process::exit(0);
  }
}

If you don't understand anything, I suggest watching a video by Rob Edwards from the San Diego State University https://www.youtube.com/watch?v=ZHVk2blR45Q it most simply, step by step, shows the essence and implementation of the algorithm.

Links

https://gitlab.com/demensdeum/algorithms/-/tree/master/sortAlgorithms/quickSort

References

https://www.youtube.com/watch?v=4s-aG6yGGLU
https://www.youtube.com/watch?v=ywWBy6J5gz8
https://www.youtube.com/watch?v=Hoixgm4-P4M
https://ru.wikipedia.org/wiki/Быстрая_сортировка
https://www.youtube.com/watch?v=Hoixgm4-P4M
https://www.youtube.com/watch?v=XE4VP_8Y0BU
https://www.youtube.com/watch?v=MZaf_9IZCrc
https://www.youtube.com/watch?v=ZHVk2blR45Q
http://rosettacode.org/wiki/Sorting_algorithms/Quicksort
https://www.youtube.com/watch?v=4s-aG6yGGLU
https://www.youtube.com/watch?v=dQw4w9WgXcQ
https://www.youtube.com/watch?v=maibrCbZWKw
https://www.geeksforgeeks.org/quick-sort/
https://www.youtube.com/watch?v=uXBnyYuwPe8

Binary Insertion Sort

Binary Insertion Sort is a variant of insertion sort in which the insertion position is determined using a binary search. The time complexity of the algorithm is O(n2)

The algorithm works like this:

  1. Loop from zero to end of list
  2. In the loop, a number is selected for sorting, the number is stored in a separate variable
  3. Binary search looks for the index to insert this number against the numbers on the left
  4. After finding the index, the numbers on the left are shifted one position to the right, starting at the insertion index. The process will erase the number to be sorted.
  5. The previously stored number is inserted at the insertion index
  6. At the end of the loop, the entire list will be sorted

During a binary search, it is possible that the number will not be found, but the index is not returned. Due to the peculiarities of the binary search, the number closest to the desired one will be found, then to return the index it will be necessary to compare it with the desired one, if the desired one is less, then the desired one should be at the index on the left, and if it is greater or equal, then on the right.

Source code in Go:

package main

import (
	"fmt"
	"math/rand"
	"time"
)

const numbersCount = 20
const maximalNumber = 100

func binarySearch(numbers []int, item int, low int, high int) int {
	for high > low {
		center := (low + high) / 2
		if numbers[center] < item { low = center + 1 } else if numbers[center] > item {
			high = center - 1
		} else {
			return center
		}
	}

	if numbers[low] < item {
		return low + 1
	} else {
		return low
	}
}

func main() {
	rand.Seed(time.Now().Unix())
	var numbers [numbersCount]int
	for i := 0; i < numbersCount; i++ {
		numbers[i] = rand.Intn(maximalNumber)
	}
	fmt.Println(numbers)

	for i := 1; i < len(numbers); i++ { searchAreaLastIndex := i - 1 insertNumber := numbers[i] insertIndex := binarySearch(numbers[:], insertNumber, 0, searchAreaLastIndex) for x := searchAreaLastIndex; x >= insertIndex; x-- {
			numbers[x+1] = numbers[x]
		}
		numbers[insertIndex] = insertNumber
	}
	fmt.Println(numbers)
}

Links

https://gitlab.com/demensdeum/algorithms/-/blob/master/sortAlgorithms/binaryInsertionSort/binaryInsertionSort.go

References

https://www.geeksforgeeks.org/binary-insertion-sort/
https://www.youtube.com/watch?v=-OVB5pOZJug

Shell Sort

Shell Sort is a variant of sorting by inserts with preliminary combing of an array of numbers.

We need to remember how insertion sort works:

  1. The loop starts from zero to the end of the loop, thus the array is divided into two parts
  2. For the left side, the second loop is started, comparing elements from right to left, the smaller element on the right is omitted until there is a smaller element on the left
  3. At the end of both cycles, we get a sorted list

Once in a while, computer scientist Donald Shell wondered how to improve the insertion sort algorithm. He came up with the idea to preliminarily go through the array with two cycles, but at a certain distance, gradually reducing the “comb” until it turns into a regular insertion sort algorithm. Everything is really so simple, no pitfalls, we add another one to the two cycles from above, in which we gradually reduce the size of the “comb”. The only thing that will need to be done is to check the distance when comparing so that it does not go beyond the array.

A really interesting topic is the choice of sequence for changing the length of the comparison at each iteration of the first loop. It is interesting for the reason that the performance of the algorithm depends on it.

A table of known variants and time complexity can be viewed here:
https://en.wikipedia.org/wiki/Shellsort#Gap_sequences

Different people were engaged in calculating the ideal distance, so, apparently, this topic was interesting to them. Couldn’t they just fire up Ruby, call the fastest sort() algorithm?

In general, these strange people wrote dissertations on the topic of calculating the distance / gap of the “comb” for the Shell algorithm. I just used the results of their work and checked 5 types of sequences, Hibbard, Knuth-Pratt, Ciura, Sedgewick.

from typing import List
import time
import random
from functools import reduce
import math

DEMO_MODE = False

if input("Demo Mode Y/N? ").upper() == "Y":
    DEMO_MODE = True

class Colors:
    BLUE = '\033[94m'
    RED = '\033[31m'
    END = '\033[0m'

def swap(list, lhs, rhs):
    list[lhs], list[rhs] = list[rhs], list[lhs]
    return list

def colorPrintoutStep(numbers: List[int], lhs: int, rhs: int):
    for index, number in enumerate(numbers):
        if index == lhs:
            print(f"{Colors.BLUE}", end = "")
        elif index == rhs:
            print(f"{Colors.RED}", end = "")
        print(f"{number},", end = "")
        if index == lhs or index == rhs:
            print(f"{Colors.END}", end = "")
        if index == lhs or index == rhs:
            print(f"{Colors.END}", end = "")
    print("\n")
    input(">")

def ShellSortLoop(numbers: List[int], distanceSequence: List[int]):
    distanceSequenceIterator = reversed(distanceSequence)
    while distance:= next(distanceSequenceIterator, None):
        for sortArea in range(0, len(numbers)):
            for rhs in reversed(range(distance, sortArea + 1)):
                lhs = rhs - distance
                if DEMO_MODE:
                    print(f"Distance: {distance}")
                    colorPrintoutStep(numbers, lhs, rhs)
                if numbers[lhs] > numbers[rhs]:
                    swap(numbers, lhs, rhs)
                else:
                    break

def ShellSort(numbers: List[int]):
    global ShellSequence
    ShellSortLoop(numbers, ShellSequence)

def HibbardSort(numbers: List[int]):
    global HibbardSequence
    ShellSortLoop(numbers, HibbardSequence)

def ShellPlusKnuttPrattSort(numbers: List[int]):
    global KnuttPrattSequence
    ShellSortLoop(numbers, KnuttPrattSequence)

def ShellPlusCiuraSort(numbers: List[int]):
    global CiuraSequence
    ShellSortLoop(numbers, CiuraSequence)

def ShellPlusSedgewickSort(numbers: List[int]):
    global SedgewickSequence
    ShellSortLoop(numbers, SedgewickSequence)

def insertionSort(numbers: List[int]):
    global insertionSortDistanceSequence
    ShellSortLoop(numbers, insertionSortDistanceSequence)

def defaultSort(numbers: List[int]):
    numbers.sort()

def measureExecution(inputNumbers: List[int], algorithmName: str, algorithm):
    if DEMO_MODE:
        print(f"{algorithmName} started")
    numbers = inputNumbers.copy()
    startTime = time.perf_counter()
    algorithm(numbers)
    endTime = time.perf_counter()
    print(f"{algorithmName} performance: {endTime - startTime}")

def sortedNumbersAsString(inputNumbers: List[int], algorithm) -> str:
    numbers = inputNumbers.copy()
    algorithm(numbers)
    return str(numbers)

if DEMO_MODE:
    maximalNumber = 10
    numbersCount = 10
else:
    maximalNumber = 10
    numbersCount = random.randint(10000, 20000)

randomNumbers = [random.randrange(1, maximalNumber) for i in range(numbersCount)]

ShellSequenceGenerator = lambda n: reduce(lambda x, _: x + [int(x[-1]/2)], range(int(math.log(numbersCount, 2))), [int(numbersCount / 2)])
ShellSequence = ShellSequenceGenerator(randomNumbers)
ShellSequence.reverse()
ShellSequence.pop()

HibbardSequence = [
    0, 1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095,
    8191, 16383, 32767, 65535, 131071, 262143, 524287, 1048575,
    2097151, 4194303, 8388607, 16777215, 33554431, 67108863, 134217727,
    268435455, 536870911, 1073741823, 2147483647, 4294967295, 8589934591
]

KnuttPrattSequence = [
    1, 4, 13, 40, 121, 364, 1093, 3280, 9841, 29524, 88573, 265720, 
    797161, 2391484, 7174453, 21523360, 64570081, 193710244, 581130733, 
    1743392200, 5230176601, 15690529804, 47071589413
]

CiuraSequence = [
            1, 4, 10, 23, 57, 132, 301, 701, 1750, 4376, 
            10941, 27353, 68383, 170958, 427396, 1068491, 
            2671228, 6678071, 16695178, 41737946, 104344866, 
            260862166, 652155416, 1630388541
]

SedgewickSequence = [
            1, 5, 19, 41, 109, 209, 505, 929, 2161, 3905,
            8929, 16001, 36289, 64769, 146305, 260609, 587521, 
            1045505, 2354689, 4188161, 9427969, 16764929, 37730305, 
            67084289, 150958081, 268386305, 603906049, 1073643521, 
            2415771649, 4294770689, 9663381505, 17179475969
]

insertionSortDistanceSequence = [1]

algorithms = {
    "Default Python Sort": defaultSort,
    "Shell Sort": ShellSort,
    "Shell + Hibbard" : HibbardSort,
    "Shell + Prat, Knutt": ShellPlusKnuttPrattSort,
    "Shell + Ciura Sort": ShellPlusCiuraSort,
    "Shell + Sedgewick Sort": ShellPlusSedgewickSort,
    "Insertion Sort": insertionSort
}

for name, algorithm in algorithms.items():
    measureExecution(randomNumbers, name, algorithm)

reference = sortedNumbersAsString(randomNumbers, defaultSort)

for name, algorithm in algorithms.items():
    if sortedNumbersAsString(randomNumbers, algorithm) != reference:
        print("Sorting validation failed")
        exit(1)

print("Sorting validation success")
exit(0)
In my implementation, for a random set, the fastest numbers are the Sedgwick and Hibbard gaps.

mypy

I would also like to mention the static typing analyzer for Python 3 – mypy. Helps to cope with the problems inherent in languages ​​with dynamic typing, namely, it eliminates the possibility of sticking something where it is not necessary.
As experienced programmers say, “static typing is not needed when you have a team of professionals”, someday we will all become professionals, we will write code in complete unity and understanding with machines, but for now you can use similar utilities and languages ​​with static typing.

Links

https://gitlab.com/demensdeum/algorithms/-/tree/master/sortAlgorithms/shellSort
http://mypy-lang.org/

References

https://dl.acm.org/doi/10.1145/368370.368387
https://en.wikipedia.org/wiki/Shellsort
http://rosettacode.org/wiki/Sorting_algorithms/Shell_sort
https://ru.wikipedia.org/wiki/Сортировка_Шелла
https://neerc.ifmo.ru/wiki/index.php?title=Сортировка_Шелла
https://twitter.com/gvanrossum/status/700741601966985216

Double Selection Sort

Double Selection Sort – a subspecies of sorting by selection, it seems like it should be accelerated twice. The vanilla algorithm double loops through the list of numbers, finds the minimum number, and swaps it with the current digit pointed to by the loop at the level above. Double selection sort, on the other hand, looks for the minimum and maximum number, then it replaces the two digits indicated by the loop at the level above – two numbers on the left and on the right. All this orgy ends when the cursors of numbers to replace meet in the middle of the list, as a result, sorted numbers are obtained to the left and right of the visual center. The time complexity of the algorithm is similar to Selection Sort – O(n2), but supposedly there is a 30% speedup.

Corner case

Already at this stage, you can imagine the moment of collision, when the number of the left cursor (the minimum number) will point to the maximum number in the list, then the minimum number is permuted, the permutation of the maximum number immediately breaks down. Therefore, all implementations of the algorithm contain a check of such cases, replacement of indices with correct ones. In my implementation, one check was enough:

if (leftCursor == maximalNumberIndex) {
    maximalNumberIndex = minimalNumberIndex;
}

Cito implementation

Cito is a lib language, a translator language. You can write on it for C, C++, C#, Java, JavaScript, Python, Swift, TypeScript, OpenCL C, while knowing absolutely nothing about these languages. The source code in the Cito language is translated into the source code in the supported languages, then you can use it as a library, or directly by correcting the generated code by hand. A kind of Write once – translate to anything.
Double Selection Sort – Cito:

public class DoubleSelectionSort
{
    public static int[] sort(int[]# numbers, int length)
    {
        int[]# sortedNumbers = new int[length];
        for (int i = 0; i < length; i++) {
            sortedNumbers[i] = numbers[i];
        }
        for (int leftCursor = 0; leftCursor < length / 2; leftCursor++) {
            int minimalNumberIndex = leftCursor;
            int minimalNumber = sortedNumbers[leftCursor];

            int rightCursor = length - (leftCursor + 1);
            int maximalNumberIndex = rightCursor;
            int maximalNumber = sortedNumbers[maximalNumberIndex];

            for (int cursor = leftCursor; cursor <= rightCursor; cursor++) { int cursorNumber = sortedNumbers[cursor]; if (minimalNumber > cursorNumber) {
                    minimalNumber = cursorNumber;
                    minimalNumberIndex = cursor;
                }
                if (maximalNumber < cursorNumber) {
                    maximalNumber = cursorNumber;
                    maximalNumberIndex = cursor;
                }
            }

            if (leftCursor == maximalNumberIndex) {
                maximalNumberIndex = minimalNumberIndex;
            }

            int fromNumber = sortedNumbers[leftCursor];
            int toNumber = sortedNumbers[minimalNumberIndex];
            sortedNumbers[minimalNumberIndex] = fromNumber;
            sortedNumbers[leftCursor] = toNumber;

            fromNumber = sortedNumbers[rightCursor];
            toNumber = sortedNumbers[maximalNumberIndex];
            sortedNumbers[maximalNumberIndex] = fromNumber;
            sortedNumbers[rightCursor] = toNumber;
        }
        return sortedNumbers;
    }
} 

Links

https://gitlab.com/demensdeum/algorithms/-/tree/master/sortAlgorithms/doubleSelectionSort
https://github.com/pfusik/cito

References

https://www.researchgate.net/publication/330084245_Improved_Double_Selection_Sort_using_Algorithm
http://algolab.valemak.com/selection-double
https://www.geeksforgeeks.org/sorting-algorithm-slightly-improves-selection-sort/