Engineering an even faster qsortFaster Sudoku solver in CWhy is my Bubblesort implementation faster than my Quicksort code?Merge sort could work 10 times fasterReverse engineering Darkstone game archivesConstructing an odd-even mergesort sorting networkFaster QuickSortGeneric Pairing Heap PerformanceHow to sort array by divisor sum faster?Making a faster parallel quicksortSplit linked list into odd and even
What's the point of deactivating Num Lock on login screens?
What defenses are there against being summoned by the Gate spell?
Why can't we play rap on piano?
infared filters v nd
What does "Puller Prush Person" mean?
Revoked SSL certificate
dbcc cleantable batch size explanation
If human space travel is limited by the G force vulnerability, is there a way to counter G forces?
Add text to same line using sed
LaTeX: Why are digits allowed in environments, but forbidden in commands?
meaning of に in 本当に?
Is it possible for a square root function,f(x), to map to a finite number of integers for all x in domain of f?
Unable to deploy metadata from Partner Developer scratch org because of extra fields
How much RAM could one put in a typical 80386 setup?
How much of data wrangling is a data scientist's job?
How does quantile regression compare to logistic regression with the variable split at the quantile?
When a company launches a new product do they "come out" with a new product or do they "come up" with a new product?
How does one intimidate enemies without having the capacity for violence?
High voltage LED indicator 40-1000 VDC without additional power supply
Was any UN Security Council vote triple-vetoed?
What doth I be?
Why does Kotter return in Welcome Back Kotter?
Why "Having chlorophyll without photosynthesis is actually very dangerous" and "like living with a bomb"?
Rock identification in KY
Engineering an even faster qsort
Faster Sudoku solver in CWhy is my Bubblesort implementation faster than my Quicksort code?Merge sort could work 10 times fasterReverse engineering Darkstone game archivesConstructing an odd-even mergesort sorting networkFaster QuickSortGeneric Pairing Heap PerformanceHow to sort array by divisor sum faster?Making a faster parallel quicksortSplit linked list into odd and even
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;
$begingroup$
I understand that C++
STL template sort
runs faster than qsort
in C because qsort
needs to call the comparator multiple times and C++
makes the code specific to that type. However, C
's pre-processor can hack that, too. So I build a test using qsort
.
#include <stdlib.h> /* EXIT */
#include <stdio.h> /* printf fputc stdout sprintf */
#include <string.h> /* strcmp */
#include <errno.h> /* errno */
#include <time.h> /* clock */
#include <ctype.h> /* toupper */
#include <math.h> /* pow */
/* Linked with a qsort called qsort_ that acts as a control to test qsort.
With C89, no inline, one has to optimise to get anything. */
/*#define QSORT_*/
static int int_cmp(const int *const a, const int *const b)
return (*a > *b) - (*b > *a);
#define SORT_NAME Int
#define SORT_TYPE int
#define SORT_COMPARATOR &int_cmp
#include "Sort.h"
static int int_void_cmp(const void *a, const void *b)
return int_cmp(a, b);
static void int_fill(int *const a)
*a = rand();
struct Foo
int no;
char name[64];
;
static const size_t foo_name_size = sizeof ((struct Foo *)0)->name;
static int foo_no_cmp(const struct Foo *const a, const struct Foo *const b)
return (a->no > b->no) - (b->no > a->no);
#define SORT_NAME FooNo
#define SORT_TYPE struct Foo
#define SORT_COMPARATOR &foo_no_cmp
#include "Sort.h"
static int foo_name_cmp(const struct Foo *const a, const struct Foo *const b)
return strcmp(a->name, b->name);
#define SORT_NAME FooName
#define SORT_TYPE struct Foo
#define SORT_COMPARATOR &foo_name_cmp
#include "Sort.h"
static int foo_void_no_cmp(const void *a, const void *b)
return foo_no_cmp(a, b);
static int foo_void_name_cmp(const void *a, const void *b)
return foo_name_cmp(a, b);
static void foo_fill(struct Foo *const a)
const size_t num_chars = 4 + rand() / (RAND_MAX / (foo_name_size - 5) + 1);
size_t i;
int_fill(&a->no);
for(i = 0; i < num_chars; i++)
a->name[i] = (i ? 'a' : 'A') + rand() / (RAND_MAX + 1.0) * 26;
a->name[i] = '';
/* The control -- not going to include this. C89 */
#ifdef QSORT_ /* <-- my */
void
qsort_(void *a, size_t n, size_t es, int (*cmp)(const void *, const void *));
#endif /* my --> */
typedef int (*TestFn)(const size_t, double *const);
/* This is messy. */
#define CAT_(x, y) x ## y
#define CAT(x, y) CAT_(x, y)
#define TEST_FN(NAME, TYPE, FILL, TEST)
static int CAT(NAME, _test)(const size_t size, double *const p_ms_time)
TYPE *a = malloc(sizeof *a * size);
size_t i;
clock_t run;
if(!a) return 0;
for(i = 0; i < size; i++) (FILL)(a + i);
run = -clock();
(TEST);
run += clock();
*p_ms_time = (double)run / (CLOCKS_PER_SEC / 1000.0);
free(a);
return 1;
TEST_FN(int_q, int, int_fill, qsort(a, size, sizeof *a, &int_void_cmp))
TEST_FN(int_s, int, int_fill, IntSort(a, size))
TEST_FN(foo_no_q, struct Foo, foo_fill,
qsort(a, size, sizeof *a, &foo_void_no_cmp))
TEST_FN(foo_no_s, struct Foo, foo_fill, FooNoSort(a, size))
TEST_FN(foo_name_q, struct Foo, foo_fill,
qsort(a, size, sizeof *a, &foo_void_name_cmp))
TEST_FN(foo_name_s, struct Foo, foo_fill, FooNameSort(a, size))
#ifdef QSORT_ /* <-- my */
TEST_FN(int_m, int, int_fill, qsort_(a, size, sizeof *a, &int_void_cmp))
TEST_FN(foo_no_m, struct Foo, foo_fill,
qsort_(a, size, sizeof *a, &foo_void_no_cmp))
TEST_FN(foo_name_m, struct Foo, foo_fill,
qsort_(a, size, sizeof *a, &foo_void_name_cmp))
#endif /* my --> */
static const char *const gnu_name = "sort.gnu";
static const char *const graph_name = "sort.eps";
static const unsigned replicas = 5;
static const size_t max_elements = 100000;
int main(void)
struct TimeData const char *const name; TestFn fn; time_data[] =
"qsort int", &int_q_test ,
"IntSort", &int_s_test ,
"qsort foo::no", &foo_no_q_test ,
"FooNoSort", &foo_no_s_test ,
"qsort foo::name", &foo_name_q_test ,
"FooNameSort", &foo_name_s_test
#ifdef QSORT_ /* <-- my */
,
"my qsort_ int", &int_m_test ,
"my qsort_ foo::no", &foo_no_m_test ,
"my qsort_ foo::name", &foo_name_m_test
#endif /* my --> */
, *td;
const size_t time_data_size = sizeof time_data / sizeof *time_data;
unsigned td_i;
FILE *fp = 0, *gnu = 0;
size_t elements, x, r;
char fn_name[32] = "not a file";
unsigned seed = (unsigned)clock();
const char *err = 0;
srand(seed), rand(), printf("Seed %u.n", seed);
do /* Try. */
if(!(gnu = fopen(gnu_name, "w"))) err = gnu_name; break;
fprintf(gnu, "set term postscript eps enhanced colorn"
"set output "%s"n"
"set gridn"
"set xlabel "array elements"n"
"set ylabel "processor time, t (ms)"n"
"set yrange [0:]n"
"# seed %un"
"n"
"plot", graph_name, seed);
for(td_i = 0; td = time_data + td_i, td_i < time_data_size; td_i++) !(fp = fopen(fn_name, "w"))) err = fn_name; break;
fprintf(stderr, "Created/overwrote, "%s," to store data on %s "
"sort.n", fn_name, td->name);
/* Do several experiments, increasing number of elements. */
fprintf(fp, "# %s: elements msn", td->name);
for(elements = 1, x = 2; elements <= max_elements;
elements = (unsigned)pow((double)x++, 3.5))
int n = 0;
double dt_ms = 0.0, mean_ms = 0.0, delta_ms, ssdm = 0.0;
fprintf(stderr, "%s: sorting %lu elements distributed uniformly"
" at random.n", td->name, (unsigned long)elements);
/* citeWelford1962Note */
for(r = 0; r < replicas; r++)
fprintf(stderr, "#");
if(!td->fn(elements, &dt_ms)) err = "experiment"; break;
n++;
delta_ms = dt_ms - mean_ms;
mean_ms += delta_ms / n;
ssdm += delta_ms * (dt_ms - mean_ms);
fprintf(stderr, " done.n");
if(r != replicas) break;
fprintf(fp, "%lut%ft%fn", (unsigned long)elements, mean_ms,
sqrt(ssdm / (n - 1)));
/* close the file */
if(fclose(fp)) fp = 0; err = fn_name; break;
fp = 0;
/* write the graph */
fprintf(gnu,
"%st"%s" using 1:2:3 with errorlines lw 3 title "%s"",
td_i ? ", \n" : " ", fn_name, td->name);
fprintf(gnu, "n");
while(0); if(err) perror(err); /* Catch. */
/* Finally. */
if(fp && fclose(fp)) perror(fn_name);
if(gnu && fclose(gnu)) perror(gnu_name);
return err ? EXIT_FAILURE : EXIT_SUCCESS;
This is code taken from Bentley McIlroy 1993 and Berkeley; I've modified it to take parameters in the form of pre-processor #define
s. It must be run with optimisations on since I'm using C89
: doesn't support inline
. I've gotten rid of the warnings, but I'm not sure this compiles to the same code.
/*
* Copyright (c) 1995 NeXT Computer, Inc. All Rights Reserved
*
* Copyright (c) 1992, 1993
* The Regents of the University of California. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. All advertising materials mentioning features or use of this software
* must display the following acknowledgement:
* This product includes software developed by the University of
* California, Berkeley and its contributors.
* 4. Neither the name of the University nor the names of its contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*
* @(#)qsort.c 8.1 (Berkeley) 6/4/93
*/
/* Modified by Neil to make it parameterised.
The parameters are preprocessor macros and all are all undefined at the end of
the file for convenience.
@param SORT_NAME
A unique name associated with <S> that satisfies C naming conventions when
mangled.
@param SORT_TYPE
This type becomes <S>; required.
@param SORT_COMPARATOR
A function satisfying <PS>SortComparator.
@std C89
*/
/* Check defines. */
#ifndef SORT_NAME
#error Sort name SORT_NAME undefined.
#endif
#ifndef SORT_TYPE
#error Sort type SORT_TYPE undefined.
#endif
#ifndef SORT_COMPARATOR
#error Sort function SORT_COMPARATOR undefined.
#endif
/* Generics using the preprocessor;
url http://stackoverflow.com/questions/16522341/pseudo-generics-in-c . */
#ifdef CAT
#undef CAT
#endif
#ifdef CAT_
#undef CAT_
#endif
#ifdef PCAT
#undef PCAT
#endif
#ifdef PCAT_
#undef PCAT_
#endif
#ifdef S
#undef S
#endif
#ifdef S_
#undef S_
#endif
#ifdef PS_
#undef PS_
#endif
#define CAT_(x, y) x ## y
#define CAT(x, y) CAT_(x, y)
#define PCAT_(x, y) x ## _ ## y
#define PCAT(x, y) PCAT_(x, y)
#define S_(thing) CAT(SORT_NAME, thing)
#define PS_(thing) PCAT(sort, PCAT(SORT_NAME, thing)) /* Private. */
/* Troubles with this line? check to ensure that SORT_TYPE is a valid type,
whose definition is placed above #include "Sort.h". */
typedef SORT_TYPE PS_(Type);
#define S PS_(Type)
/** Must establish a partial order amongst <S> by returning
a < b, negative; a = b, 0; a > b, positive. */
typedef int (*PS_(SortComparator))(const S *const a, const S *const b);
/* Check that MAP_COMPARATOR is a function implementing
<PS>SortComparator. */
static const PS_(SortComparator) PS_(cmp) = (SORT_COMPARATOR);
#include <sys/types.h>
#ifndef SORT_H /* <-- h Idempotent. */
#define SORT_H
#define sort_min(a, b) (a) < (b) ? a : b
enum SortSwapType WORD, WORDS, BYTE ;
#endif /* h --> */
/*
* Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".
*
* (How does that work with the copyright?)
*/
static void
PS_(swapfunc)(S *a, S *b, const size_t n, const enum SortSwapType swaptype)
switch(swaptype)
case WORDS:
case WORD:
long i = sizeof *a * n / sizeof(long);
long *pi = (long *)(void *)a;
long *pj = (long *)(void *)b;
do
long t = *pi;
*pi++ = *pj;
*pj++ = t;
while (--i > 0);
break;
case BYTE:
long i = sizeof *a * n;
char *pi = (char *)a;
char *pj = (char *)b;
do
char t = *pi;
*pi++ = *pj;
*pj++ = t;
while (--i > 0);
break;
static void
PS_(swap)(S *a, S *b, const enum SortSwapType swaptype)
if(swaptype == WORD)
long t = *(long *)(void *)(a);
*(long *)(void *)(a) = *(long *)(void *)(b);
*(long *)(void *)(b) = t;
else
PS_(swapfunc)(a, b, (size_t)1, swaptype);
static S *
PS_(med3)(S *a, S *b, S *c)
return PS_(cmp)(a, b) < 0 ?
(PS_(cmp)(b, c) < 0 ? b : (PS_(cmp)(a, c) < 0 ? c : a ))
: (PS_(cmp)(b, c) > 0 ? b : (PS_(cmp)(a, c) < 0 ? a : c ));
static void
S_(Sort)(S *a, size_t n)
sizeof *a % sizeof(long) + (0*n) ? BYTE
: sizeof *a == sizeof(long) ? WORD : WORDS;
swap_cnt = 0;
if (n < 7)
for (pm = a + 1; pm < a + n; pm++)
for (pl = pm; pl > a && PS_(cmp)(pl - 1, pl) > 0; pl--)
PS_(swap)(pl, pl - 1, swaptype);
return;
pm = a + (n / 2);
if (n > 7)
pl = a;
pn = a + (n - 1);
if (n > 40)
d = n / 8;
pl = PS_(med3)(pl, pl + d, pl + 2 * d);
pm = PS_(med3)(pm - d, pm, pm + d);
pn = PS_(med3)(pn - 2 * d, pn - d, pn);
pm = PS_(med3)(pl, pm, pn);
PS_(swap)(a, pm, swaptype);
pa = pb = a + 1;
pc = pd = a + (n - 1);
for (;;)
while (pb <= pc && (r = PS_(cmp)(pb, a)) <= 0)
if (r == 0)
swap_cnt = 1;
PS_(swap)(pa, pb, swaptype);
pa++;
pb++;
while (pb <= pc && (r = PS_(cmp)(pc, a)) >= 0)
if (r == 0)
swap_cnt = 1;
PS_(swap)(pc, pd, swaptype);
pd--;
pc--;
if (pb > pc)
break;
PS_(swap)(pb, pc, swaptype);
swap_cnt = 1;
pb++;
pc--;
if (swap_cnt == 0) /* Switch to insertion sort */
for (pm = a + 1; pm < a + n; pm++)
for (pl = pm; pl > a && PS_(cmp)(pl - 1, pl) > 0; pl--)
PS_(swap)(pl, pl - 1, swaptype);
return;
pn = a + n;
vec = sort_min(pa - a, pb - pa);
if(vec > 0) PS_(swapfunc)(a, pb - vec, vec, swaptype);
vec = sort_min((size_t)(pd - pc), (size_t)(pn - pd - 1));
if(vec > 0) PS_(swapfunc)(pb, pn - vec, vec, swaptype);
if ((size_t)(vec = pb - pa) > 1)
S_(Sort)(a, vec);
if ((vec = pd - pc) > 1)
/* Iterate rather than recurse to save stack space */
a = pn - vec;
n = vec;
goto loop;
/* Un-define all macros. Undocumented; allows nestled inclusion so long as:
CAT_, CAT, PCAT, PCAT_ conform, and S, is not used. */
#ifdef SORT_SUBTYPE /* <-- sub */
#undef SORT_SUBTYPE
#else /* sub --><-- !sub */
#undef CAT
#undef CAT_
#undef PCAT
#undef PCAT_
#endif /* !sub --> */
#undef S
#undef S_
#undef PS_
#undef SORT_NAME
#undef SORT_TYPE
#undef SORT_COMPARATOR
On my computer, it outputs, with optimisations on, via
gnuplot
. It is faster, but not by much.
I'm not sure about the changes I've made to qsort.c
. Specifically, why swaptype = (a-(char*)0 | es) % W ? 2 : es > W ? 1 : 0
(Bentley1993,) where es
is the size. Why is it called every time? Shouldn't it be a | es
without the subtracting null? Isn't a
aligned properly and it might be es
and taken out of the loop?
c sorting quick-sort c89
$endgroup$
add a comment |
$begingroup$
I understand that C++
STL template sort
runs faster than qsort
in C because qsort
needs to call the comparator multiple times and C++
makes the code specific to that type. However, C
's pre-processor can hack that, too. So I build a test using qsort
.
#include <stdlib.h> /* EXIT */
#include <stdio.h> /* printf fputc stdout sprintf */
#include <string.h> /* strcmp */
#include <errno.h> /* errno */
#include <time.h> /* clock */
#include <ctype.h> /* toupper */
#include <math.h> /* pow */
/* Linked with a qsort called qsort_ that acts as a control to test qsort.
With C89, no inline, one has to optimise to get anything. */
/*#define QSORT_*/
static int int_cmp(const int *const a, const int *const b)
return (*a > *b) - (*b > *a);
#define SORT_NAME Int
#define SORT_TYPE int
#define SORT_COMPARATOR &int_cmp
#include "Sort.h"
static int int_void_cmp(const void *a, const void *b)
return int_cmp(a, b);
static void int_fill(int *const a)
*a = rand();
struct Foo
int no;
char name[64];
;
static const size_t foo_name_size = sizeof ((struct Foo *)0)->name;
static int foo_no_cmp(const struct Foo *const a, const struct Foo *const b)
return (a->no > b->no) - (b->no > a->no);
#define SORT_NAME FooNo
#define SORT_TYPE struct Foo
#define SORT_COMPARATOR &foo_no_cmp
#include "Sort.h"
static int foo_name_cmp(const struct Foo *const a, const struct Foo *const b)
return strcmp(a->name, b->name);
#define SORT_NAME FooName
#define SORT_TYPE struct Foo
#define SORT_COMPARATOR &foo_name_cmp
#include "Sort.h"
static int foo_void_no_cmp(const void *a, const void *b)
return foo_no_cmp(a, b);
static int foo_void_name_cmp(const void *a, const void *b)
return foo_name_cmp(a, b);
static void foo_fill(struct Foo *const a)
const size_t num_chars = 4 + rand() / (RAND_MAX / (foo_name_size - 5) + 1);
size_t i;
int_fill(&a->no);
for(i = 0; i < num_chars; i++)
a->name[i] = (i ? 'a' : 'A') + rand() / (RAND_MAX + 1.0) * 26;
a->name[i] = '';
/* The control -- not going to include this. C89 */
#ifdef QSORT_ /* <-- my */
void
qsort_(void *a, size_t n, size_t es, int (*cmp)(const void *, const void *));
#endif /* my --> */
typedef int (*TestFn)(const size_t, double *const);
/* This is messy. */
#define CAT_(x, y) x ## y
#define CAT(x, y) CAT_(x, y)
#define TEST_FN(NAME, TYPE, FILL, TEST)
static int CAT(NAME, _test)(const size_t size, double *const p_ms_time)
TYPE *a = malloc(sizeof *a * size);
size_t i;
clock_t run;
if(!a) return 0;
for(i = 0; i < size; i++) (FILL)(a + i);
run = -clock();
(TEST);
run += clock();
*p_ms_time = (double)run / (CLOCKS_PER_SEC / 1000.0);
free(a);
return 1;
TEST_FN(int_q, int, int_fill, qsort(a, size, sizeof *a, &int_void_cmp))
TEST_FN(int_s, int, int_fill, IntSort(a, size))
TEST_FN(foo_no_q, struct Foo, foo_fill,
qsort(a, size, sizeof *a, &foo_void_no_cmp))
TEST_FN(foo_no_s, struct Foo, foo_fill, FooNoSort(a, size))
TEST_FN(foo_name_q, struct Foo, foo_fill,
qsort(a, size, sizeof *a, &foo_void_name_cmp))
TEST_FN(foo_name_s, struct Foo, foo_fill, FooNameSort(a, size))
#ifdef QSORT_ /* <-- my */
TEST_FN(int_m, int, int_fill, qsort_(a, size, sizeof *a, &int_void_cmp))
TEST_FN(foo_no_m, struct Foo, foo_fill,
qsort_(a, size, sizeof *a, &foo_void_no_cmp))
TEST_FN(foo_name_m, struct Foo, foo_fill,
qsort_(a, size, sizeof *a, &foo_void_name_cmp))
#endif /* my --> */
static const char *const gnu_name = "sort.gnu";
static const char *const graph_name = "sort.eps";
static const unsigned replicas = 5;
static const size_t max_elements = 100000;
int main(void)
struct TimeData const char *const name; TestFn fn; time_data[] =
"qsort int", &int_q_test ,
"IntSort", &int_s_test ,
"qsort foo::no", &foo_no_q_test ,
"FooNoSort", &foo_no_s_test ,
"qsort foo::name", &foo_name_q_test ,
"FooNameSort", &foo_name_s_test
#ifdef QSORT_ /* <-- my */
,
"my qsort_ int", &int_m_test ,
"my qsort_ foo::no", &foo_no_m_test ,
"my qsort_ foo::name", &foo_name_m_test
#endif /* my --> */
, *td;
const size_t time_data_size = sizeof time_data / sizeof *time_data;
unsigned td_i;
FILE *fp = 0, *gnu = 0;
size_t elements, x, r;
char fn_name[32] = "not a file";
unsigned seed = (unsigned)clock();
const char *err = 0;
srand(seed), rand(), printf("Seed %u.n", seed);
do /* Try. */
if(!(gnu = fopen(gnu_name, "w"))) err = gnu_name; break;
fprintf(gnu, "set term postscript eps enhanced colorn"
"set output "%s"n"
"set gridn"
"set xlabel "array elements"n"
"set ylabel "processor time, t (ms)"n"
"set yrange [0:]n"
"# seed %un"
"n"
"plot", graph_name, seed);
for(td_i = 0; td = time_data + td_i, td_i < time_data_size; td_i++) !(fp = fopen(fn_name, "w"))) err = fn_name; break;
fprintf(stderr, "Created/overwrote, "%s," to store data on %s "
"sort.n", fn_name, td->name);
/* Do several experiments, increasing number of elements. */
fprintf(fp, "# %s: elements msn", td->name);
for(elements = 1, x = 2; elements <= max_elements;
elements = (unsigned)pow((double)x++, 3.5))
int n = 0;
double dt_ms = 0.0, mean_ms = 0.0, delta_ms, ssdm = 0.0;
fprintf(stderr, "%s: sorting %lu elements distributed uniformly"
" at random.n", td->name, (unsigned long)elements);
/* citeWelford1962Note */
for(r = 0; r < replicas; r++)
fprintf(stderr, "#");
if(!td->fn(elements, &dt_ms)) err = "experiment"; break;
n++;
delta_ms = dt_ms - mean_ms;
mean_ms += delta_ms / n;
ssdm += delta_ms * (dt_ms - mean_ms);
fprintf(stderr, " done.n");
if(r != replicas) break;
fprintf(fp, "%lut%ft%fn", (unsigned long)elements, mean_ms,
sqrt(ssdm / (n - 1)));
/* close the file */
if(fclose(fp)) fp = 0; err = fn_name; break;
fp = 0;
/* write the graph */
fprintf(gnu,
"%st"%s" using 1:2:3 with errorlines lw 3 title "%s"",
td_i ? ", \n" : " ", fn_name, td->name);
fprintf(gnu, "n");
while(0); if(err) perror(err); /* Catch. */
/* Finally. */
if(fp && fclose(fp)) perror(fn_name);
if(gnu && fclose(gnu)) perror(gnu_name);
return err ? EXIT_FAILURE : EXIT_SUCCESS;
This is code taken from Bentley McIlroy 1993 and Berkeley; I've modified it to take parameters in the form of pre-processor #define
s. It must be run with optimisations on since I'm using C89
: doesn't support inline
. I've gotten rid of the warnings, but I'm not sure this compiles to the same code.
/*
* Copyright (c) 1995 NeXT Computer, Inc. All Rights Reserved
*
* Copyright (c) 1992, 1993
* The Regents of the University of California. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. All advertising materials mentioning features or use of this software
* must display the following acknowledgement:
* This product includes software developed by the University of
* California, Berkeley and its contributors.
* 4. Neither the name of the University nor the names of its contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*
* @(#)qsort.c 8.1 (Berkeley) 6/4/93
*/
/* Modified by Neil to make it parameterised.
The parameters are preprocessor macros and all are all undefined at the end of
the file for convenience.
@param SORT_NAME
A unique name associated with <S> that satisfies C naming conventions when
mangled.
@param SORT_TYPE
This type becomes <S>; required.
@param SORT_COMPARATOR
A function satisfying <PS>SortComparator.
@std C89
*/
/* Check defines. */
#ifndef SORT_NAME
#error Sort name SORT_NAME undefined.
#endif
#ifndef SORT_TYPE
#error Sort type SORT_TYPE undefined.
#endif
#ifndef SORT_COMPARATOR
#error Sort function SORT_COMPARATOR undefined.
#endif
/* Generics using the preprocessor;
url http://stackoverflow.com/questions/16522341/pseudo-generics-in-c . */
#ifdef CAT
#undef CAT
#endif
#ifdef CAT_
#undef CAT_
#endif
#ifdef PCAT
#undef PCAT
#endif
#ifdef PCAT_
#undef PCAT_
#endif
#ifdef S
#undef S
#endif
#ifdef S_
#undef S_
#endif
#ifdef PS_
#undef PS_
#endif
#define CAT_(x, y) x ## y
#define CAT(x, y) CAT_(x, y)
#define PCAT_(x, y) x ## _ ## y
#define PCAT(x, y) PCAT_(x, y)
#define S_(thing) CAT(SORT_NAME, thing)
#define PS_(thing) PCAT(sort, PCAT(SORT_NAME, thing)) /* Private. */
/* Troubles with this line? check to ensure that SORT_TYPE is a valid type,
whose definition is placed above #include "Sort.h". */
typedef SORT_TYPE PS_(Type);
#define S PS_(Type)
/** Must establish a partial order amongst <S> by returning
a < b, negative; a = b, 0; a > b, positive. */
typedef int (*PS_(SortComparator))(const S *const a, const S *const b);
/* Check that MAP_COMPARATOR is a function implementing
<PS>SortComparator. */
static const PS_(SortComparator) PS_(cmp) = (SORT_COMPARATOR);
#include <sys/types.h>
#ifndef SORT_H /* <-- h Idempotent. */
#define SORT_H
#define sort_min(a, b) (a) < (b) ? a : b
enum SortSwapType WORD, WORDS, BYTE ;
#endif /* h --> */
/*
* Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".
*
* (How does that work with the copyright?)
*/
static void
PS_(swapfunc)(S *a, S *b, const size_t n, const enum SortSwapType swaptype)
switch(swaptype)
case WORDS:
case WORD:
long i = sizeof *a * n / sizeof(long);
long *pi = (long *)(void *)a;
long *pj = (long *)(void *)b;
do
long t = *pi;
*pi++ = *pj;
*pj++ = t;
while (--i > 0);
break;
case BYTE:
long i = sizeof *a * n;
char *pi = (char *)a;
char *pj = (char *)b;
do
char t = *pi;
*pi++ = *pj;
*pj++ = t;
while (--i > 0);
break;
static void
PS_(swap)(S *a, S *b, const enum SortSwapType swaptype)
if(swaptype == WORD)
long t = *(long *)(void *)(a);
*(long *)(void *)(a) = *(long *)(void *)(b);
*(long *)(void *)(b) = t;
else
PS_(swapfunc)(a, b, (size_t)1, swaptype);
static S *
PS_(med3)(S *a, S *b, S *c)
return PS_(cmp)(a, b) < 0 ?
(PS_(cmp)(b, c) < 0 ? b : (PS_(cmp)(a, c) < 0 ? c : a ))
: (PS_(cmp)(b, c) > 0 ? b : (PS_(cmp)(a, c) < 0 ? a : c ));
static void
S_(Sort)(S *a, size_t n)
sizeof *a % sizeof(long) + (0*n) ? BYTE
: sizeof *a == sizeof(long) ? WORD : WORDS;
swap_cnt = 0;
if (n < 7)
for (pm = a + 1; pm < a + n; pm++)
for (pl = pm; pl > a && PS_(cmp)(pl - 1, pl) > 0; pl--)
PS_(swap)(pl, pl - 1, swaptype);
return;
pm = a + (n / 2);
if (n > 7)
pl = a;
pn = a + (n - 1);
if (n > 40)
d = n / 8;
pl = PS_(med3)(pl, pl + d, pl + 2 * d);
pm = PS_(med3)(pm - d, pm, pm + d);
pn = PS_(med3)(pn - 2 * d, pn - d, pn);
pm = PS_(med3)(pl, pm, pn);
PS_(swap)(a, pm, swaptype);
pa = pb = a + 1;
pc = pd = a + (n - 1);
for (;;)
while (pb <= pc && (r = PS_(cmp)(pb, a)) <= 0)
if (r == 0)
swap_cnt = 1;
PS_(swap)(pa, pb, swaptype);
pa++;
pb++;
while (pb <= pc && (r = PS_(cmp)(pc, a)) >= 0)
if (r == 0)
swap_cnt = 1;
PS_(swap)(pc, pd, swaptype);
pd--;
pc--;
if (pb > pc)
break;
PS_(swap)(pb, pc, swaptype);
swap_cnt = 1;
pb++;
pc--;
if (swap_cnt == 0) /* Switch to insertion sort */
for (pm = a + 1; pm < a + n; pm++)
for (pl = pm; pl > a && PS_(cmp)(pl - 1, pl) > 0; pl--)
PS_(swap)(pl, pl - 1, swaptype);
return;
pn = a + n;
vec = sort_min(pa - a, pb - pa);
if(vec > 0) PS_(swapfunc)(a, pb - vec, vec, swaptype);
vec = sort_min((size_t)(pd - pc), (size_t)(pn - pd - 1));
if(vec > 0) PS_(swapfunc)(pb, pn - vec, vec, swaptype);
if ((size_t)(vec = pb - pa) > 1)
S_(Sort)(a, vec);
if ((vec = pd - pc) > 1)
/* Iterate rather than recurse to save stack space */
a = pn - vec;
n = vec;
goto loop;
/* Un-define all macros. Undocumented; allows nestled inclusion so long as:
CAT_, CAT, PCAT, PCAT_ conform, and S, is not used. */
#ifdef SORT_SUBTYPE /* <-- sub */
#undef SORT_SUBTYPE
#else /* sub --><-- !sub */
#undef CAT
#undef CAT_
#undef PCAT
#undef PCAT_
#endif /* !sub --> */
#undef S
#undef S_
#undef PS_
#undef SORT_NAME
#undef SORT_TYPE
#undef SORT_COMPARATOR
On my computer, it outputs, with optimisations on, via
gnuplot
. It is faster, but not by much.
I'm not sure about the changes I've made to qsort.c
. Specifically, why swaptype = (a-(char*)0 | es) % W ? 2 : es > W ? 1 : 0
(Bentley1993,) where es
is the size. Why is it called every time? Shouldn't it be a | es
without the subtracting null? Isn't a
aligned properly and it might be es
and taken out of the loop?
c sorting quick-sort c89
$endgroup$
add a comment |
$begingroup$
I understand that C++
STL template sort
runs faster than qsort
in C because qsort
needs to call the comparator multiple times and C++
makes the code specific to that type. However, C
's pre-processor can hack that, too. So I build a test using qsort
.
#include <stdlib.h> /* EXIT */
#include <stdio.h> /* printf fputc stdout sprintf */
#include <string.h> /* strcmp */
#include <errno.h> /* errno */
#include <time.h> /* clock */
#include <ctype.h> /* toupper */
#include <math.h> /* pow */
/* Linked with a qsort called qsort_ that acts as a control to test qsort.
With C89, no inline, one has to optimise to get anything. */
/*#define QSORT_*/
static int int_cmp(const int *const a, const int *const b)
return (*a > *b) - (*b > *a);
#define SORT_NAME Int
#define SORT_TYPE int
#define SORT_COMPARATOR &int_cmp
#include "Sort.h"
static int int_void_cmp(const void *a, const void *b)
return int_cmp(a, b);
static void int_fill(int *const a)
*a = rand();
struct Foo
int no;
char name[64];
;
static const size_t foo_name_size = sizeof ((struct Foo *)0)->name;
static int foo_no_cmp(const struct Foo *const a, const struct Foo *const b)
return (a->no > b->no) - (b->no > a->no);
#define SORT_NAME FooNo
#define SORT_TYPE struct Foo
#define SORT_COMPARATOR &foo_no_cmp
#include "Sort.h"
static int foo_name_cmp(const struct Foo *const a, const struct Foo *const b)
return strcmp(a->name, b->name);
#define SORT_NAME FooName
#define SORT_TYPE struct Foo
#define SORT_COMPARATOR &foo_name_cmp
#include "Sort.h"
static int foo_void_no_cmp(const void *a, const void *b)
return foo_no_cmp(a, b);
static int foo_void_name_cmp(const void *a, const void *b)
return foo_name_cmp(a, b);
static void foo_fill(struct Foo *const a)
const size_t num_chars = 4 + rand() / (RAND_MAX / (foo_name_size - 5) + 1);
size_t i;
int_fill(&a->no);
for(i = 0; i < num_chars; i++)
a->name[i] = (i ? 'a' : 'A') + rand() / (RAND_MAX + 1.0) * 26;
a->name[i] = '';
/* The control -- not going to include this. C89 */
#ifdef QSORT_ /* <-- my */
void
qsort_(void *a, size_t n, size_t es, int (*cmp)(const void *, const void *));
#endif /* my --> */
typedef int (*TestFn)(const size_t, double *const);
/* This is messy. */
#define CAT_(x, y) x ## y
#define CAT(x, y) CAT_(x, y)
#define TEST_FN(NAME, TYPE, FILL, TEST)
static int CAT(NAME, _test)(const size_t size, double *const p_ms_time)
TYPE *a = malloc(sizeof *a * size);
size_t i;
clock_t run;
if(!a) return 0;
for(i = 0; i < size; i++) (FILL)(a + i);
run = -clock();
(TEST);
run += clock();
*p_ms_time = (double)run / (CLOCKS_PER_SEC / 1000.0);
free(a);
return 1;
TEST_FN(int_q, int, int_fill, qsort(a, size, sizeof *a, &int_void_cmp))
TEST_FN(int_s, int, int_fill, IntSort(a, size))
TEST_FN(foo_no_q, struct Foo, foo_fill,
qsort(a, size, sizeof *a, &foo_void_no_cmp))
TEST_FN(foo_no_s, struct Foo, foo_fill, FooNoSort(a, size))
TEST_FN(foo_name_q, struct Foo, foo_fill,
qsort(a, size, sizeof *a, &foo_void_name_cmp))
TEST_FN(foo_name_s, struct Foo, foo_fill, FooNameSort(a, size))
#ifdef QSORT_ /* <-- my */
TEST_FN(int_m, int, int_fill, qsort_(a, size, sizeof *a, &int_void_cmp))
TEST_FN(foo_no_m, struct Foo, foo_fill,
qsort_(a, size, sizeof *a, &foo_void_no_cmp))
TEST_FN(foo_name_m, struct Foo, foo_fill,
qsort_(a, size, sizeof *a, &foo_void_name_cmp))
#endif /* my --> */
static const char *const gnu_name = "sort.gnu";
static const char *const graph_name = "sort.eps";
static const unsigned replicas = 5;
static const size_t max_elements = 100000;
int main(void)
struct TimeData const char *const name; TestFn fn; time_data[] =
"qsort int", &int_q_test ,
"IntSort", &int_s_test ,
"qsort foo::no", &foo_no_q_test ,
"FooNoSort", &foo_no_s_test ,
"qsort foo::name", &foo_name_q_test ,
"FooNameSort", &foo_name_s_test
#ifdef QSORT_ /* <-- my */
,
"my qsort_ int", &int_m_test ,
"my qsort_ foo::no", &foo_no_m_test ,
"my qsort_ foo::name", &foo_name_m_test
#endif /* my --> */
, *td;
const size_t time_data_size = sizeof time_data / sizeof *time_data;
unsigned td_i;
FILE *fp = 0, *gnu = 0;
size_t elements, x, r;
char fn_name[32] = "not a file";
unsigned seed = (unsigned)clock();
const char *err = 0;
srand(seed), rand(), printf("Seed %u.n", seed);
do /* Try. */
if(!(gnu = fopen(gnu_name, "w"))) err = gnu_name; break;
fprintf(gnu, "set term postscript eps enhanced colorn"
"set output "%s"n"
"set gridn"
"set xlabel "array elements"n"
"set ylabel "processor time, t (ms)"n"
"set yrange [0:]n"
"# seed %un"
"n"
"plot", graph_name, seed);
for(td_i = 0; td = time_data + td_i, td_i < time_data_size; td_i++) !(fp = fopen(fn_name, "w"))) err = fn_name; break;
fprintf(stderr, "Created/overwrote, "%s," to store data on %s "
"sort.n", fn_name, td->name);
/* Do several experiments, increasing number of elements. */
fprintf(fp, "# %s: elements msn", td->name);
for(elements = 1, x = 2; elements <= max_elements;
elements = (unsigned)pow((double)x++, 3.5))
int n = 0;
double dt_ms = 0.0, mean_ms = 0.0, delta_ms, ssdm = 0.0;
fprintf(stderr, "%s: sorting %lu elements distributed uniformly"
" at random.n", td->name, (unsigned long)elements);
/* citeWelford1962Note */
for(r = 0; r < replicas; r++)
fprintf(stderr, "#");
if(!td->fn(elements, &dt_ms)) err = "experiment"; break;
n++;
delta_ms = dt_ms - mean_ms;
mean_ms += delta_ms / n;
ssdm += delta_ms * (dt_ms - mean_ms);
fprintf(stderr, " done.n");
if(r != replicas) break;
fprintf(fp, "%lut%ft%fn", (unsigned long)elements, mean_ms,
sqrt(ssdm / (n - 1)));
/* close the file */
if(fclose(fp)) fp = 0; err = fn_name; break;
fp = 0;
/* write the graph */
fprintf(gnu,
"%st"%s" using 1:2:3 with errorlines lw 3 title "%s"",
td_i ? ", \n" : " ", fn_name, td->name);
fprintf(gnu, "n");
while(0); if(err) perror(err); /* Catch. */
/* Finally. */
if(fp && fclose(fp)) perror(fn_name);
if(gnu && fclose(gnu)) perror(gnu_name);
return err ? EXIT_FAILURE : EXIT_SUCCESS;
This is code taken from Bentley McIlroy 1993 and Berkeley; I've modified it to take parameters in the form of pre-processor #define
s. It must be run with optimisations on since I'm using C89
: doesn't support inline
. I've gotten rid of the warnings, but I'm not sure this compiles to the same code.
/*
* Copyright (c) 1995 NeXT Computer, Inc. All Rights Reserved
*
* Copyright (c) 1992, 1993
* The Regents of the University of California. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. All advertising materials mentioning features or use of this software
* must display the following acknowledgement:
* This product includes software developed by the University of
* California, Berkeley and its contributors.
* 4. Neither the name of the University nor the names of its contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*
* @(#)qsort.c 8.1 (Berkeley) 6/4/93
*/
/* Modified by Neil to make it parameterised.
The parameters are preprocessor macros and all are all undefined at the end of
the file for convenience.
@param SORT_NAME
A unique name associated with <S> that satisfies C naming conventions when
mangled.
@param SORT_TYPE
This type becomes <S>; required.
@param SORT_COMPARATOR
A function satisfying <PS>SortComparator.
@std C89
*/
/* Check defines. */
#ifndef SORT_NAME
#error Sort name SORT_NAME undefined.
#endif
#ifndef SORT_TYPE
#error Sort type SORT_TYPE undefined.
#endif
#ifndef SORT_COMPARATOR
#error Sort function SORT_COMPARATOR undefined.
#endif
/* Generics using the preprocessor;
url http://stackoverflow.com/questions/16522341/pseudo-generics-in-c . */
#ifdef CAT
#undef CAT
#endif
#ifdef CAT_
#undef CAT_
#endif
#ifdef PCAT
#undef PCAT
#endif
#ifdef PCAT_
#undef PCAT_
#endif
#ifdef S
#undef S
#endif
#ifdef S_
#undef S_
#endif
#ifdef PS_
#undef PS_
#endif
#define CAT_(x, y) x ## y
#define CAT(x, y) CAT_(x, y)
#define PCAT_(x, y) x ## _ ## y
#define PCAT(x, y) PCAT_(x, y)
#define S_(thing) CAT(SORT_NAME, thing)
#define PS_(thing) PCAT(sort, PCAT(SORT_NAME, thing)) /* Private. */
/* Troubles with this line? check to ensure that SORT_TYPE is a valid type,
whose definition is placed above #include "Sort.h". */
typedef SORT_TYPE PS_(Type);
#define S PS_(Type)
/** Must establish a partial order amongst <S> by returning
a < b, negative; a = b, 0; a > b, positive. */
typedef int (*PS_(SortComparator))(const S *const a, const S *const b);
/* Check that MAP_COMPARATOR is a function implementing
<PS>SortComparator. */
static const PS_(SortComparator) PS_(cmp) = (SORT_COMPARATOR);
#include <sys/types.h>
#ifndef SORT_H /* <-- h Idempotent. */
#define SORT_H
#define sort_min(a, b) (a) < (b) ? a : b
enum SortSwapType WORD, WORDS, BYTE ;
#endif /* h --> */
/*
* Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".
*
* (How does that work with the copyright?)
*/
static void
PS_(swapfunc)(S *a, S *b, const size_t n, const enum SortSwapType swaptype)
switch(swaptype)
case WORDS:
case WORD:
long i = sizeof *a * n / sizeof(long);
long *pi = (long *)(void *)a;
long *pj = (long *)(void *)b;
do
long t = *pi;
*pi++ = *pj;
*pj++ = t;
while (--i > 0);
break;
case BYTE:
long i = sizeof *a * n;
char *pi = (char *)a;
char *pj = (char *)b;
do
char t = *pi;
*pi++ = *pj;
*pj++ = t;
while (--i > 0);
break;
static void
PS_(swap)(S *a, S *b, const enum SortSwapType swaptype)
if(swaptype == WORD)
long t = *(long *)(void *)(a);
*(long *)(void *)(a) = *(long *)(void *)(b);
*(long *)(void *)(b) = t;
else
PS_(swapfunc)(a, b, (size_t)1, swaptype);
static S *
PS_(med3)(S *a, S *b, S *c)
return PS_(cmp)(a, b) < 0 ?
(PS_(cmp)(b, c) < 0 ? b : (PS_(cmp)(a, c) < 0 ? c : a ))
: (PS_(cmp)(b, c) > 0 ? b : (PS_(cmp)(a, c) < 0 ? a : c ));
static void
S_(Sort)(S *a, size_t n)
sizeof *a % sizeof(long) + (0*n) ? BYTE
: sizeof *a == sizeof(long) ? WORD : WORDS;
swap_cnt = 0;
if (n < 7)
for (pm = a + 1; pm < a + n; pm++)
for (pl = pm; pl > a && PS_(cmp)(pl - 1, pl) > 0; pl--)
PS_(swap)(pl, pl - 1, swaptype);
return;
pm = a + (n / 2);
if (n > 7)
pl = a;
pn = a + (n - 1);
if (n > 40)
d = n / 8;
pl = PS_(med3)(pl, pl + d, pl + 2 * d);
pm = PS_(med3)(pm - d, pm, pm + d);
pn = PS_(med3)(pn - 2 * d, pn - d, pn);
pm = PS_(med3)(pl, pm, pn);
PS_(swap)(a, pm, swaptype);
pa = pb = a + 1;
pc = pd = a + (n - 1);
for (;;)
while (pb <= pc && (r = PS_(cmp)(pb, a)) <= 0)
if (r == 0)
swap_cnt = 1;
PS_(swap)(pa, pb, swaptype);
pa++;
pb++;
while (pb <= pc && (r = PS_(cmp)(pc, a)) >= 0)
if (r == 0)
swap_cnt = 1;
PS_(swap)(pc, pd, swaptype);
pd--;
pc--;
if (pb > pc)
break;
PS_(swap)(pb, pc, swaptype);
swap_cnt = 1;
pb++;
pc--;
if (swap_cnt == 0) /* Switch to insertion sort */
for (pm = a + 1; pm < a + n; pm++)
for (pl = pm; pl > a && PS_(cmp)(pl - 1, pl) > 0; pl--)
PS_(swap)(pl, pl - 1, swaptype);
return;
pn = a + n;
vec = sort_min(pa - a, pb - pa);
if(vec > 0) PS_(swapfunc)(a, pb - vec, vec, swaptype);
vec = sort_min((size_t)(pd - pc), (size_t)(pn - pd - 1));
if(vec > 0) PS_(swapfunc)(pb, pn - vec, vec, swaptype);
if ((size_t)(vec = pb - pa) > 1)
S_(Sort)(a, vec);
if ((vec = pd - pc) > 1)
/* Iterate rather than recurse to save stack space */
a = pn - vec;
n = vec;
goto loop;
/* Un-define all macros. Undocumented; allows nestled inclusion so long as:
CAT_, CAT, PCAT, PCAT_ conform, and S, is not used. */
#ifdef SORT_SUBTYPE /* <-- sub */
#undef SORT_SUBTYPE
#else /* sub --><-- !sub */
#undef CAT
#undef CAT_
#undef PCAT
#undef PCAT_
#endif /* !sub --> */
#undef S
#undef S_
#undef PS_
#undef SORT_NAME
#undef SORT_TYPE
#undef SORT_COMPARATOR
On my computer, it outputs, with optimisations on, via
gnuplot
. It is faster, but not by much.
I'm not sure about the changes I've made to qsort.c
. Specifically, why swaptype = (a-(char*)0 | es) % W ? 2 : es > W ? 1 : 0
(Bentley1993,) where es
is the size. Why is it called every time? Shouldn't it be a | es
without the subtracting null? Isn't a
aligned properly and it might be es
and taken out of the loop?
c sorting quick-sort c89
$endgroup$
I understand that C++
STL template sort
runs faster than qsort
in C because qsort
needs to call the comparator multiple times and C++
makes the code specific to that type. However, C
's pre-processor can hack that, too. So I build a test using qsort
.
#include <stdlib.h> /* EXIT */
#include <stdio.h> /* printf fputc stdout sprintf */
#include <string.h> /* strcmp */
#include <errno.h> /* errno */
#include <time.h> /* clock */
#include <ctype.h> /* toupper */
#include <math.h> /* pow */
/* Linked with a qsort called qsort_ that acts as a control to test qsort.
With C89, no inline, one has to optimise to get anything. */
/*#define QSORT_*/
static int int_cmp(const int *const a, const int *const b)
return (*a > *b) - (*b > *a);
#define SORT_NAME Int
#define SORT_TYPE int
#define SORT_COMPARATOR &int_cmp
#include "Sort.h"
static int int_void_cmp(const void *a, const void *b)
return int_cmp(a, b);
static void int_fill(int *const a)
*a = rand();
struct Foo
int no;
char name[64];
;
static const size_t foo_name_size = sizeof ((struct Foo *)0)->name;
static int foo_no_cmp(const struct Foo *const a, const struct Foo *const b)
return (a->no > b->no) - (b->no > a->no);
#define SORT_NAME FooNo
#define SORT_TYPE struct Foo
#define SORT_COMPARATOR &foo_no_cmp
#include "Sort.h"
static int foo_name_cmp(const struct Foo *const a, const struct Foo *const b)
return strcmp(a->name, b->name);
#define SORT_NAME FooName
#define SORT_TYPE struct Foo
#define SORT_COMPARATOR &foo_name_cmp
#include "Sort.h"
static int foo_void_no_cmp(const void *a, const void *b)
return foo_no_cmp(a, b);
static int foo_void_name_cmp(const void *a, const void *b)
return foo_name_cmp(a, b);
static void foo_fill(struct Foo *const a)
const size_t num_chars = 4 + rand() / (RAND_MAX / (foo_name_size - 5) + 1);
size_t i;
int_fill(&a->no);
for(i = 0; i < num_chars; i++)
a->name[i] = (i ? 'a' : 'A') + rand() / (RAND_MAX + 1.0) * 26;
a->name[i] = '';
/* The control -- not going to include this. C89 */
#ifdef QSORT_ /* <-- my */
void
qsort_(void *a, size_t n, size_t es, int (*cmp)(const void *, const void *));
#endif /* my --> */
typedef int (*TestFn)(const size_t, double *const);
/* This is messy. */
#define CAT_(x, y) x ## y
#define CAT(x, y) CAT_(x, y)
#define TEST_FN(NAME, TYPE, FILL, TEST)
static int CAT(NAME, _test)(const size_t size, double *const p_ms_time)
TYPE *a = malloc(sizeof *a * size);
size_t i;
clock_t run;
if(!a) return 0;
for(i = 0; i < size; i++) (FILL)(a + i);
run = -clock();
(TEST);
run += clock();
*p_ms_time = (double)run / (CLOCKS_PER_SEC / 1000.0);
free(a);
return 1;
TEST_FN(int_q, int, int_fill, qsort(a, size, sizeof *a, &int_void_cmp))
TEST_FN(int_s, int, int_fill, IntSort(a, size))
TEST_FN(foo_no_q, struct Foo, foo_fill,
qsort(a, size, sizeof *a, &foo_void_no_cmp))
TEST_FN(foo_no_s, struct Foo, foo_fill, FooNoSort(a, size))
TEST_FN(foo_name_q, struct Foo, foo_fill,
qsort(a, size, sizeof *a, &foo_void_name_cmp))
TEST_FN(foo_name_s, struct Foo, foo_fill, FooNameSort(a, size))
#ifdef QSORT_ /* <-- my */
TEST_FN(int_m, int, int_fill, qsort_(a, size, sizeof *a, &int_void_cmp))
TEST_FN(foo_no_m, struct Foo, foo_fill,
qsort_(a, size, sizeof *a, &foo_void_no_cmp))
TEST_FN(foo_name_m, struct Foo, foo_fill,
qsort_(a, size, sizeof *a, &foo_void_name_cmp))
#endif /* my --> */
static const char *const gnu_name = "sort.gnu";
static const char *const graph_name = "sort.eps";
static const unsigned replicas = 5;
static const size_t max_elements = 100000;
int main(void)
struct TimeData const char *const name; TestFn fn; time_data[] =
"qsort int", &int_q_test ,
"IntSort", &int_s_test ,
"qsort foo::no", &foo_no_q_test ,
"FooNoSort", &foo_no_s_test ,
"qsort foo::name", &foo_name_q_test ,
"FooNameSort", &foo_name_s_test
#ifdef QSORT_ /* <-- my */
,
"my qsort_ int", &int_m_test ,
"my qsort_ foo::no", &foo_no_m_test ,
"my qsort_ foo::name", &foo_name_m_test
#endif /* my --> */
, *td;
const size_t time_data_size = sizeof time_data / sizeof *time_data;
unsigned td_i;
FILE *fp = 0, *gnu = 0;
size_t elements, x, r;
char fn_name[32] = "not a file";
unsigned seed = (unsigned)clock();
const char *err = 0;
srand(seed), rand(), printf("Seed %u.n", seed);
do /* Try. */
if(!(gnu = fopen(gnu_name, "w"))) err = gnu_name; break;
fprintf(gnu, "set term postscript eps enhanced colorn"
"set output "%s"n"
"set gridn"
"set xlabel "array elements"n"
"set ylabel "processor time, t (ms)"n"
"set yrange [0:]n"
"# seed %un"
"n"
"plot", graph_name, seed);
for(td_i = 0; td = time_data + td_i, td_i < time_data_size; td_i++) !(fp = fopen(fn_name, "w"))) err = fn_name; break;
fprintf(stderr, "Created/overwrote, "%s," to store data on %s "
"sort.n", fn_name, td->name);
/* Do several experiments, increasing number of elements. */
fprintf(fp, "# %s: elements msn", td->name);
for(elements = 1, x = 2; elements <= max_elements;
elements = (unsigned)pow((double)x++, 3.5))
int n = 0;
double dt_ms = 0.0, mean_ms = 0.0, delta_ms, ssdm = 0.0;
fprintf(stderr, "%s: sorting %lu elements distributed uniformly"
" at random.n", td->name, (unsigned long)elements);
/* citeWelford1962Note */
for(r = 0; r < replicas; r++)
fprintf(stderr, "#");
if(!td->fn(elements, &dt_ms)) err = "experiment"; break;
n++;
delta_ms = dt_ms - mean_ms;
mean_ms += delta_ms / n;
ssdm += delta_ms * (dt_ms - mean_ms);
fprintf(stderr, " done.n");
if(r != replicas) break;
fprintf(fp, "%lut%ft%fn", (unsigned long)elements, mean_ms,
sqrt(ssdm / (n - 1)));
/* close the file */
if(fclose(fp)) fp = 0; err = fn_name; break;
fp = 0;
/* write the graph */
fprintf(gnu,
"%st"%s" using 1:2:3 with errorlines lw 3 title "%s"",
td_i ? ", \n" : " ", fn_name, td->name);
fprintf(gnu, "n");
while(0); if(err) perror(err); /* Catch. */
/* Finally. */
if(fp && fclose(fp)) perror(fn_name);
if(gnu && fclose(gnu)) perror(gnu_name);
return err ? EXIT_FAILURE : EXIT_SUCCESS;
This is code taken from Bentley McIlroy 1993 and Berkeley; I've modified it to take parameters in the form of pre-processor #define
s. It must be run with optimisations on since I'm using C89
: doesn't support inline
. I've gotten rid of the warnings, but I'm not sure this compiles to the same code.
/*
* Copyright (c) 1995 NeXT Computer, Inc. All Rights Reserved
*
* Copyright (c) 1992, 1993
* The Regents of the University of California. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. All advertising materials mentioning features or use of this software
* must display the following acknowledgement:
* This product includes software developed by the University of
* California, Berkeley and its contributors.
* 4. Neither the name of the University nor the names of its contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*
* @(#)qsort.c 8.1 (Berkeley) 6/4/93
*/
/* Modified by Neil to make it parameterised.
The parameters are preprocessor macros and all are all undefined at the end of
the file for convenience.
@param SORT_NAME
A unique name associated with <S> that satisfies C naming conventions when
mangled.
@param SORT_TYPE
This type becomes <S>; required.
@param SORT_COMPARATOR
A function satisfying <PS>SortComparator.
@std C89
*/
/* Check defines. */
#ifndef SORT_NAME
#error Sort name SORT_NAME undefined.
#endif
#ifndef SORT_TYPE
#error Sort type SORT_TYPE undefined.
#endif
#ifndef SORT_COMPARATOR
#error Sort function SORT_COMPARATOR undefined.
#endif
/* Generics using the preprocessor;
url http://stackoverflow.com/questions/16522341/pseudo-generics-in-c . */
#ifdef CAT
#undef CAT
#endif
#ifdef CAT_
#undef CAT_
#endif
#ifdef PCAT
#undef PCAT
#endif
#ifdef PCAT_
#undef PCAT_
#endif
#ifdef S
#undef S
#endif
#ifdef S_
#undef S_
#endif
#ifdef PS_
#undef PS_
#endif
#define CAT_(x, y) x ## y
#define CAT(x, y) CAT_(x, y)
#define PCAT_(x, y) x ## _ ## y
#define PCAT(x, y) PCAT_(x, y)
#define S_(thing) CAT(SORT_NAME, thing)
#define PS_(thing) PCAT(sort, PCAT(SORT_NAME, thing)) /* Private. */
/* Troubles with this line? check to ensure that SORT_TYPE is a valid type,
whose definition is placed above #include "Sort.h". */
typedef SORT_TYPE PS_(Type);
#define S PS_(Type)
/** Must establish a partial order amongst <S> by returning
a < b, negative; a = b, 0; a > b, positive. */
typedef int (*PS_(SortComparator))(const S *const a, const S *const b);
/* Check that MAP_COMPARATOR is a function implementing
<PS>SortComparator. */
static const PS_(SortComparator) PS_(cmp) = (SORT_COMPARATOR);
#include <sys/types.h>
#ifndef SORT_H /* <-- h Idempotent. */
#define SORT_H
#define sort_min(a, b) (a) < (b) ? a : b
enum SortSwapType WORD, WORDS, BYTE ;
#endif /* h --> */
/*
* Qsort routine from Bentley & McIlroy's "Engineering a Sort Function".
*
* (How does that work with the copyright?)
*/
static void
PS_(swapfunc)(S *a, S *b, const size_t n, const enum SortSwapType swaptype)
switch(swaptype)
case WORDS:
case WORD:
long i = sizeof *a * n / sizeof(long);
long *pi = (long *)(void *)a;
long *pj = (long *)(void *)b;
do
long t = *pi;
*pi++ = *pj;
*pj++ = t;
while (--i > 0);
break;
case BYTE:
long i = sizeof *a * n;
char *pi = (char *)a;
char *pj = (char *)b;
do
char t = *pi;
*pi++ = *pj;
*pj++ = t;
while (--i > 0);
break;
static void
PS_(swap)(S *a, S *b, const enum SortSwapType swaptype)
if(swaptype == WORD)
long t = *(long *)(void *)(a);
*(long *)(void *)(a) = *(long *)(void *)(b);
*(long *)(void *)(b) = t;
else
PS_(swapfunc)(a, b, (size_t)1, swaptype);
static S *
PS_(med3)(S *a, S *b, S *c)
return PS_(cmp)(a, b) < 0 ?
(PS_(cmp)(b, c) < 0 ? b : (PS_(cmp)(a, c) < 0 ? c : a ))
: (PS_(cmp)(b, c) > 0 ? b : (PS_(cmp)(a, c) < 0 ? a : c ));
static void
S_(Sort)(S *a, size_t n)
sizeof *a % sizeof(long) + (0*n) ? BYTE
: sizeof *a == sizeof(long) ? WORD : WORDS;
swap_cnt = 0;
if (n < 7)
for (pm = a + 1; pm < a + n; pm++)
for (pl = pm; pl > a && PS_(cmp)(pl - 1, pl) > 0; pl--)
PS_(swap)(pl, pl - 1, swaptype);
return;
pm = a + (n / 2);
if (n > 7)
pl = a;
pn = a + (n - 1);
if (n > 40)
d = n / 8;
pl = PS_(med3)(pl, pl + d, pl + 2 * d);
pm = PS_(med3)(pm - d, pm, pm + d);
pn = PS_(med3)(pn - 2 * d, pn - d, pn);
pm = PS_(med3)(pl, pm, pn);
PS_(swap)(a, pm, swaptype);
pa = pb = a + 1;
pc = pd = a + (n - 1);
for (;;)
while (pb <= pc && (r = PS_(cmp)(pb, a)) <= 0)
if (r == 0)
swap_cnt = 1;
PS_(swap)(pa, pb, swaptype);
pa++;
pb++;
while (pb <= pc && (r = PS_(cmp)(pc, a)) >= 0)
if (r == 0)
swap_cnt = 1;
PS_(swap)(pc, pd, swaptype);
pd--;
pc--;
if (pb > pc)
break;
PS_(swap)(pb, pc, swaptype);
swap_cnt = 1;
pb++;
pc--;
if (swap_cnt == 0) /* Switch to insertion sort */
for (pm = a + 1; pm < a + n; pm++)
for (pl = pm; pl > a && PS_(cmp)(pl - 1, pl) > 0; pl--)
PS_(swap)(pl, pl - 1, swaptype);
return;
pn = a + n;
vec = sort_min(pa - a, pb - pa);
if(vec > 0) PS_(swapfunc)(a, pb - vec, vec, swaptype);
vec = sort_min((size_t)(pd - pc), (size_t)(pn - pd - 1));
if(vec > 0) PS_(swapfunc)(pb, pn - vec, vec, swaptype);
if ((size_t)(vec = pb - pa) > 1)
S_(Sort)(a, vec);
if ((vec = pd - pc) > 1)
/* Iterate rather than recurse to save stack space */
a = pn - vec;
n = vec;
goto loop;
/* Un-define all macros. Undocumented; allows nestled inclusion so long as:
CAT_, CAT, PCAT, PCAT_ conform, and S, is not used. */
#ifdef SORT_SUBTYPE /* <-- sub */
#undef SORT_SUBTYPE
#else /* sub --><-- !sub */
#undef CAT
#undef CAT_
#undef PCAT
#undef PCAT_
#endif /* !sub --> */
#undef S
#undef S_
#undef PS_
#undef SORT_NAME
#undef SORT_TYPE
#undef SORT_COMPARATOR
On my computer, it outputs, with optimisations on, via
gnuplot
. It is faster, but not by much.
I'm not sure about the changes I've made to qsort.c
. Specifically, why swaptype = (a-(char*)0 | es) % W ? 2 : es > W ? 1 : 0
(Bentley1993,) where es
is the size. Why is it called every time? Shouldn't it be a | es
without the subtracting null? Isn't a
aligned properly and it might be es
and taken out of the loop?
c sorting quick-sort c89
c sorting quick-sort c89
asked 6 mins ago
Neil EdelmanNeil Edelman
287110
287110
add a comment |
add a comment |
0
active
oldest
votes
Your Answer
StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f216961%2fengineering-an-even-faster-qsort%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
0
active
oldest
votes
0
active
oldest
votes
active
oldest
votes
active
oldest
votes
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f216961%2fengineering-an-even-faster-qsort%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown