Mateuš Latusek -- hacks -- improved heap sort

Quick sort algorithm and derivatives need a lot of element exchange operations. While this is not so important when dealing with arrays, it becomes a little bit of a problem with linked lists, especially the single-linked ones.

Another advantage is that heap sort does not involve recurrence which improves memory usage.

Working with the c-toolkit library, I chose the heap sort algorithm for sorting linked lists. They were single linked, for memory efficiency again. It turned out that the original procedure can be improved to make less comparisons, and with minimum extra code.

Classic heap sort

The classic heap sort algorithm takes one element off the original list, and looks for its new position comparing elements one by one:

C implementation of the algorithm may look like this (i is the current element we are inserting that is -- blue on the graph):

void heapsort (Object *list, Compare comp_fun)

{

  Object i, *prev, next;

  if (*list == NULL)

    return;

  i = (*list) -> next;

  (*list) -> next = NULL;

  while (i != NULL)

  {

    next = i -> next;

    prev = list;

    while ((*prev != NULL) && (comp_fun (i -> data, (*prev) -> data) == 0))

      prev = &((*prev) -> next);

    i -> next = *prev;

    *prev = i;

    i = next;

  }

}

The correction

It is obvious, that the procedure can be sped up by skipping over a number of elements rather than checking them one by one. The corrected algorithm jumps over step = nodes/div elements (where nodes is the number of elements in the sorted list and div is constant). If step becomes small enough (lesser than slimit) the program switches back to the classic heap sort.

void heapsort2 (Object *list, Compare comp_fun)

{

  Object i, *j, *prev, next;

  unsigned int nodes, step, n;

  if (*list == NULL)

    return;

  nodes = 1;

  i = (*list) -> next;

  (*list) -> next = NULL;

  while (i != NULL)

  {

    next = i -> next;

    prev = list;

    step = nodes;

loop:

    step /= div;

    if (step < slimit)

      while ((*prev != NULL) && (comp_fun (i -> data, (*prev) -> data) == 0))

        prev = &((*prev) -> next);

    else

      if ((*prev != NULL) && (comp_fun (i -> data, (*prev) -> data) == 0))

      {

        j = prev;

        do

        {

          j = &((*j) -> next);

          prev = j;

          for (n = step - 1; ((*j) != NULL) && (n > 0); n --)

            j = &((*j) -> next);

        }

        while ((*j != NULL) && (comp_fun (i -> data, (*j) -> data) == 0));

        goto loop;

      }

    i -> next = *prev;

    *prev = i;

    nodes ++;

    i = next;

  }

}

The above code introduces one more enhancement. Suppose step is still a large number and the matching element is found before any step has been made. The first version of the program would divide step by div, compare, divide and so on, until the condition step < slimit is true.