{"id":3311,"date":"2022-07-27T00:46:04","date_gmt":"2022-07-26T21:46:04","guid":{"rendered":"https:\/\/demensdeum.com\/blog\/?p=3311"},"modified":"2024-12-16T22:32:18","modified_gmt":"2024-12-16T19:32:18","slug":"shell-sort","status":"publish","type":"post","link":"https:\/\/demensdeum.com\/blog\/2022\/07\/27\/shell-sort\/","title":{"rendered":"Shell Sort"},"content":{"rendered":"<p>Shell Sort is a variant of insertion sorting with preliminary combing of the array of numbers.<\/p>\n<p>We need to remember how insertion sort works:<\/p>\n<p>1. The loop starts from zero to the end of the loop, thus the array is divided into two parts<br \/>2. For the left part, the second cycle is started, comparing elements from right to left, the smaller element on the right is omitted until a smaller element on the left is found<br \/>3. At the end of both cycles, we get a sorted list<\/p>\n<p>Once upon a time, computer scientist Donald Shell was puzzled by how to improve the insertion sort algorithm. He came up with the idea of \u200b\u200balso going through the array with two cycles, but at a certain distance, gradually reducing the &#8220;comb&#8221; until it turns into a regular insertion sort algorithm. Everything is really that simple, no pitfalls, we add another cycle to the two cycles on top, in which we gradually reduce the size of the &#8220;comb&#8221;. The only thing you will need to do is check the distance when comparing, so that it does not go beyond the array.<\/p>\n<p>A really interesting topic is the choice of the sequence of changing the comparison length at each iteration of the first cycle. It is interesting because the performance of the algorithm depends on it.<\/p>\n<p>A table of known variants and time complexity can be found here: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Shellsort#Gap_sequences\" target=\"_blank\" rel=\"noopener\">https:\/\/en.wikipedia.org\/wiki\/Shellsort#Gap_sequences<\/a><\/p>\n<p>Different people were involved in calculating the ideal distance, apparently they were so interested in the topic. Couldn&#8217;t they just run Ruby and call the fastest sort() algorithm?<\/p>\n<p>In general, these strange people wrote dissertations on the topic of calculating the distance\/gap of the &#8220;comb&#8221; for the Shell algorithm. I simply used the results of their work and checked 5 types of sequences, Hibbard, Knuth-Pratt, Chiura, Sedgewick.<\/p>\n<div class=\"hcb_wrap\">\n<div class=\"hcb_wrap\">\n<pre class=\"prism line-numbers lang-unknown\" data-lang=\"unknown\"><code>import time\nimport random\nfrom functools import reduce\nimport math\n\nDEMO_MODE = False\n\nif input(\"Demo Mode Y\/N? \").upper() == \"Y\":\n    DEMO_MODE = True\n\nclass Colors:\n    BLUE = '\\033[94m'\n    RED = '\\033[31m'\n    END = '\\033[0m'\n\ndef swap(list, lhs, rhs):\n    list[lhs], list[rhs] = list[rhs], list[lhs]\n    return list\n\ndef colorPrintoutStep(numbers: List[int], lhs: int, rhs: int):\n    for index, number in enumerate(numbers):\n        if index == lhs:\n            print(f\"{Colors.BLUE}\", end = \"\")\n        elif index == rhs:\n            print(f\"{Colors.RED}\", end = \"\")\n        print(f\"{number},\", end = \"\")\n        if index == lhs or index == rhs:\n            print(f\"{Colors.END}\", end = \"\")\n        if index == lhs or index == rhs:\n            print(f\"{Colors.END}\", end = \"\")\n    print(\"\\n\")\n    input(\"&gt;\")\n\ndef ShellSortLoop(numbers: List[int], distanceSequence: List[int]):\n    distanceSequenceIterator = reversed(distanceSequence)\n    while distance:= next(distanceSequenceIterator, None):\n        for sortArea in range(0, len(numbers)):\n            for rhs in reversed(range(distance, sortArea + 1)):\n                lhs = rhs - distance\n                if DEMO_MODE:\n                    print(f\"Distance: {distance}\")\n                    colorPrintoutStep(numbers, lhs, rhs)\n                if numbers[lhs] &gt; numbers[rhs]:\n                    swap(numbers, lhs, rhs)\n                else:\n                    break\n\ndef ShellSort(numbers: List[int]):\n    global ShellSequence\n    ShellSortLoop(numbers, ShellSequence)\n\ndef HibbardSort(numbers: List[int]):\n    global HibbardSequence\n    ShellSortLoop(numbers, HibbardSequence)\n\ndef ShellPlusKnuttPrattSort(numbers: List[int]):\n    global KnuttPrattSequence\n    ShellSortLoop(numbers, KnuttPrattSequence)\n\ndef ShellPlusCiuraSort(numbers: List[int]):\n    global CiuraSequence\n    ShellSortLoop(numbers, CiuraSequence)\n\ndef ShellPlusSedgewickSort(numbers: List[int]):\n    global SedgewickSequence\n    ShellSortLoop(numbers, SedgewickSequence)\n\ndef insertionSort(numbers: List[int]):\n    global insertionSortDistanceSequence\n    ShellSortLoop(numbers, insertionSortDistanceSequence)\n\ndef defaultSort(numbers: List[int]):\n    numbers.sort()\n\ndef measureExecution(inputNumbers: List[int], algorithmName: str, algorithm):\n    if DEMO_MODE:\n        print(f\"{algorithmName} started\")\n    numbers = inputNumbers.copy()\n    startTime = time.perf_counter()\n    algorithm(numbers)\n    endTime = time.perf_counter()\n    print(f\"{algorithmName} performance: {endTime - startTime}\")\n\ndef sortedNumbersAsString(inputNumbers: List[int], algorithm) -&gt; str:\n    numbers = inputNumbers.copy()\n    algorithm(numbers)\n    return str(numbers)\n\nif DEMO_MODE:\n    maximalNumber = 10\n    numbersCount = 10\nelse:\n    maximalNumber = 10\n    numbersCount = random.randint(10000, 20000)\n\nrandomNumbers = [random.randrange(1, maximalNumber) for i in range(numbersCount)]\n\nShellSequenceGenerator = lambda n: reduce(lambda x, _: x + [int(x[-1]\/2)], range(int(math.log(numbersCount, 2))), [int(numbersCount \/ 2)])\nShellSequence = ShellSequenceGenerator(randomNumbers)\nShellSequence.reverse()\nShellSequence.pop()\n\nHibbardSequence = [\n    0, 1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095,\n    8191, 16383, 32767, 65535, 131071, 262143, 524287, 1048575,\n    2097151, 4194303, 8388607, 16777215, 33554431, 67108863, 134217727,\n    268435455, 536870911, 1073741823, 2147483647, 4294967295, 8589934591\n]\n\nKnuttPrattSequence = [\n    1, 4, 13, 40, 121, 364, 1093, 3280, 9841, 29524, 88573, 265720, \n    797161, 2391484, 7174453, 21523360, 64570081, 193710244, 581130733, \n    1743392200, 5230176601, 15690529804, 47071589413\n]\n\nCiuraSequence = [\n            1, 4, 10, 23, 57, 132, 301, 701, 1750, 4376, \n            10941, 27353, 68383, 170958, 427396, 1068491, \n            2671228, 6678071, 16695178, 41737946, 104344866, \n            260862166, 652155416, 1630388541\n]\n\nSedgewickSequence = [\n            1, 5, 19, 41, 109, 209, 505, 929, 2161, 3905,\n            8929, 16001, 36289, 64769, 146305, 260609, 587521, \n            1045505, 2354689, 4188161, 9427969, 16764929, 37730305, \n            67084289, 150958081, 268386305, 603906049, 1073643521, \n            2415771649, 4294770689, 9663381505, 17179475969\n]\n\ninsertionSortDistanceSequence = [1]\n\nalgorithms = {\n    \"Default Python Sort\": defaultSort,\n    \"Shell Sort\": ShellSort,\n    \"Shell + Hibbard\" : HibbardSort,\n    \"Shell + Prat, Knutt\": ShellPlusKnuttPrattSort,\n    \"Shell + Ciura Sort\": ShellPlusCiuraSort,\n    \"Shell + Sedgewick Sort\": ShellPlusSedgewickSort,\n    \"Insertion Sort\": insertionSort\n}\n\nfor name, algorithm in algorithms.items():\n    measureExecution(randomNumbers, name, algorithm)\n\nreference = sortedNumbersAsString(randomNumbers, defaultSort)\n\nfor name, algorithm in algorithms.items():\n    if sortedNumbersAsString(randomNumbers, algorithm) != reference:\n        print(\"Sorting validation failed\")\n        exit(1)\n\nprint(\"Sorting validation success\")\nexit(0)\n<\/code><\/pre>\n<\/div>\n<\/div>\n<p>In my implementation, for a random set of numbers, the fastest gaps are Sedgewick and Hibbard.<\/p>\n<h3>mypy<\/h3>\n<p>I would also like to mention the static typing analyzer for Python 3 &#8211; mypy. It helps to cope with the problems inherent in languages \u200b\u200bwith dynamic typing, namely, it eliminates the possibility of slipping something where it shouldn&#8217;t.<\/p>\n<p>As experienced programmers say, &#8220;static typing is not needed when you have a team of professionals&#8221;, someday we will all become professionals, we will write code in complete unity and understanding with machines, but for now you can use such utilities and languages \u200b\u200bwith static typing.<\/p>\n<h3>Links<\/h3>\n<p><a href=\"https:\/\/gitlab.com\/demensdeum\/algorithms\/-\/tree\/master\/sortAlgorithms\/shellSort\" target=\"_blank\" rel=\"noopener\">https:\/\/gitlab.com\/demensdeum \/algorithms\/-\/tree\/master\/sortAlgorithms\/shellSort<\/a><br \/><a href=\"http:\/\/mypy-lang.org\/\" target=\"_blank\" rel=\"noopener\">http:\/\/mypy-lang.org\/<\/a><\/p>\n<h3>Sources<\/h3>\n<p><a href=\"https:\/\/dl.acm.org\/doi\/10.1145\/368370.368387\" target=\"_blank\" rel=\"noopener\">https:\/\/dl.acm.org\/doi\/10.1145\/368370.368387 <\/a><br \/><a href=\"https:\/\/en.wikipedia.org\/wiki\/Shellsort\" target=\"_blank\" rel=\"noopener\">https:\/\/en.wikipedia.org\/wiki\/Shellsort<\/a><br \/>\n<a href=\"http:\/\/rosettacode.org\/wiki\/Sorting_algorithms\/Shell_sort\" target=\"_blank\" rel=\"noopener\">http:\/\/rosettacode.org\/wiki\/Sorting_algorithms\/Shell_sort<\/a><br \/>\n<a href=\"https:\/\/ru.wikipedia.org\/wiki\/\u0421\u043e\u0440\u0442\u0438\u0440\u043e\u0432\u043a\u0430_\u0428\u0435\u043b\u043b\u0430\" target=\"_blank\" rel=\"noopener\">https:\/\/ru.wikipedia.org\/wiki\/\u0421\u043e\u0440\u0442\u0438\u0440\u043e\u0432\u043a\u0430_\u0428\u0435\u043b\u043b\u0430<\/a><br \/>\n<a href=\"https:\/\/neerc.ifmo.ru\/wiki\/index.php?title=\u0421\u043e\u0440\u0442\u0438\u0440\u043e\u0432\u043a\u0430_\u0428\u0435\u043b\u043b\u0430\" target=\"_blank\" rel=\"noopener\">https:\/\/neerc.ifmo.ru\/wiki\/index.php?title=\u0421\u043e\u0440\u0442\u0438\u0440\u043e\u0432\u043a\u0430_\u0428\u0435\u043b\u043b\u0430<\/a><br \/>\n<a href=\"https:\/\/twitter.com\/gvanrossum\/status\/700741601966985216\" target=\"_blank\" rel=\"noopener\">https:\/\/twitter.com\/gvanrossum\/status\/700741601966985216<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Shell Sort is a variant of insertion sorting with preliminary combing of the array of numbers. We need to remember how insertion sort works: 1. The loop starts from zero to the end of the loop, thus the array is divided into two parts2. For the left part, the second cycle is started, comparing elements<a class=\"more-link\" href=\"https:\/\/demensdeum.com\/blog\/2022\/07\/27\/shell-sort\/\">Continue reading <span class=\"screen-reader-text\">&#8220;Shell Sort&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[61,52],"tags":[131,197,190],"class_list":["post-3311","post","type-post","status-publish","format-standard","hentry","category-techie","category-tutorials","tag-algorithms","tag-shell-sorting","tag-sorting","entry"],"translation":{"provider":"WPGlobus","version":"3.0.2","language":"en","enabled_languages":["en","ru","zh","de","fr","ja","pt"],"languages":{"en":{"title":true,"content":true,"excerpt":false},"ru":{"title":true,"content":true,"excerpt":false},"zh":{"title":true,"content":true,"excerpt":false},"de":{"title":true,"content":true,"excerpt":false},"fr":{"title":true,"content":true,"excerpt":false},"ja":{"title":true,"content":true,"excerpt":false},"pt":{"title":true,"content":true,"excerpt":false}}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/demensdeum.com\/blog\/wp-json\/wp\/v2\/posts\/3311","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/demensdeum.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/demensdeum.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/demensdeum.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/demensdeum.com\/blog\/wp-json\/wp\/v2\/comments?post=3311"}],"version-history":[{"count":15,"href":"https:\/\/demensdeum.com\/blog\/wp-json\/wp\/v2\/posts\/3311\/revisions"}],"predecessor-version":[{"id":3884,"href":"https:\/\/demensdeum.com\/blog\/wp-json\/wp\/v2\/posts\/3311\/revisions\/3884"}],"wp:attachment":[{"href":"https:\/\/demensdeum.com\/blog\/wp-json\/wp\/v2\/media?parent=3311"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/demensdeum.com\/blog\/wp-json\/wp\/v2\/categories?post=3311"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/demensdeum.com\/blog\/wp-json\/wp\/v2\/tags?post=3311"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}