Skip to content

Tune merge_sort insertion threshold #12092

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
zkamsler opened this issue Feb 7, 2014 · 1 comment
Closed

Tune merge_sort insertion threshold #12092

zkamsler opened this issue Feb 7, 2014 · 1 comment

Comments

@zkamsler
Copy link
Contributor

zkamsler commented Feb 7, 2014

In the course of optimizing sorting short vectors in #12029, I noticed that the maximum run length that merge sort uses has an impact on performance. Increasing the length to 32 improved performance when sorting u64s, but sorting larger types incurred the cost of the additional writes that insertion sort does.

There is definitely room for tuning this parameter, possibly varying it with the size of the type being sorted. It should probably be based on more than what happens to run fast on my specific computer.

Below are benchmarks:
in-place insertion (32 elem max run):

test vec::bench::sort_big_random_large    ... bench:   1512893 ns/iter (+/- 154790) = 211 MB/s
test vec::bench::sort_big_random_medium   ... bench:      8542 ns/iter (+/- 1369) = 374 MB/s
test vec::bench::sort_big_random_small    ... bench:       246 ns/iter (+/- 147) = 650 MB/s
test vec::bench::sort_big_sorted          ... bench:    536906 ns/iter (+/- 61414) = 595 MB/s
test vec::bench::sort_random_large        ... bench:    685387 ns/iter (+/- 86290) = 116 MB/s
test vec::bench::sort_random_medium       ... bench:      4277 ns/iter (+/- 915) = 187 MB/s
test vec::bench::sort_random_small        ... bench:       183 ns/iter (+/- 50) = 218 MB/s
test vec::bench::sort_sorted              ... bench:    364977 ns/iter (+/- 63217) = 219 MB/s

in-place insertion (16 elem max run)

test vec::bench::sort_big_random_large    ... bench:   1447282 ns/iter (+/- 247736) = 220 MB/s
test vec::bench::sort_big_random_medium   ... bench:      8063 ns/iter (+/- 1136) = 396 MB/s
test vec::bench::sort_big_random_small    ... bench:       242 ns/iter (+/- 56) = 661 MB/s
test vec::bench::sort_big_sorted          ... bench:    580873 ns/iter (+/- 347205) = 550 MB/s
test vec::bench::sort_random_large        ... bench:    693611 ns/iter (+/- 130341) = 115 MB/s
test vec::bench::sort_random_medium       ... bench:      4461 ns/iter (+/- 1039) = 179 MB/s
test vec::bench::sort_random_small        ... bench:       180 ns/iter (+/- 36) = 222 MB/s
test vec::bench::sort_sorted              ... bench:    382591 ns/iter (+/- 80074) = 209 MB/s

in-place insertion (8 elem max run):

test vec::bench::sort_big_random_large    ... bench:   1453949 ns/iter (+/- 225381) = 219 MB/s
test vec::bench::sort_big_random_medium   ... bench:      7975 ns/iter (+/- 1034) = 401 MB/s
test vec::bench::sort_big_random_small    ... bench:       241 ns/iter (+/- 42) = 663 MB/s
test vec::bench::sort_big_sorted          ... bench:    607379 ns/iter (+/- 78551) = 526 MB/s
test vec::bench::sort_random_large        ... bench:    791019 ns/iter (+/- 178413) = 101 MB/s
test vec::bench::sort_random_medium       ... bench:      4426 ns/iter (+/- 519) = 180 MB/s
test vec::bench::sort_random_small        ... bench:       180 ns/iter (+/- 26) = 222 MB/s
test vec::bench::sort_sorted              ... bench:    419979 ns/iter (+/- 50808) = 190 MB/s

out-of-place (32 run insertion sort)

test vec::bench::sort_big_random_large    ... bench:   1454360 ns/iter (+/- 276385) = 219 MB/s
test vec::bench::sort_big_random_medium   ... bench:      8259 ns/iter (+/- 1190) = 387 MB/s
test vec::bench::sort_big_random_small    ... bench:       242 ns/iter (+/- 72) = 661 MB/s
test vec::bench::sort_big_sorted          ... bench:    526373 ns/iter (+/- 168484) = 607 MB/s
test vec::bench::sort_random_large        ... bench:    692873 ns/iter (+/- 140006) = 115 MB/s
test vec::bench::sort_random_medium       ... bench:      4342 ns/iter (+/- 677) = 184 MB/s
test vec::bench::sort_random_small        ... bench:       177 ns/iter (+/- 35) = 225 MB/s
test vec::bench::sort_sorted              ... bench:    380630 ns/iter (+/- 66918) = 210 MB/s

out-of-place (16 run insertion sort)

test vec::bench::sort_big_random_large    ... bench:   1393579 ns/iter (+/- 224143) = 229 MB/s
test vec::bench::sort_big_random_medium   ... bench:      7802 ns/iter (+/- 1219) = 410 MB/s
test vec::bench::sort_big_random_small    ... bench:       245 ns/iter (+/- 39) = 653 MB/s
test vec::bench::sort_big_sorted          ... bench:    584058 ns/iter (+/- 143346) = 547 MB/s
test vec::bench::sort_random_large        ... bench:    702402 ns/iter (+/- 178147) = 113 MB/s
test vec::bench::sort_random_medium       ... bench:      4500 ns/iter (+/- 900) = 177 MB/s
test vec::bench::sort_random_small        ... bench:       183 ns/iter (+/- 36) = 218 MB/s
test vec::bench::sort_sorted              ... bench:    413594 ns/iter (+/- 65994) = 193 MB/s

out-of-place (8 run insertion sort)

test vec::bench::sort_big_random_large    ... bench:   1405837 ns/iter (+/- 285931) = 227 MB/s
test vec::bench::sort_big_random_medium   ... bench:      8434 ns/iter (+/- 2950) = 379 MB/s
test vec::bench::sort_big_random_small    ... bench:       251 ns/iter (+/- 60) = 637 MB/s
test vec::bench::sort_big_sorted          ... bench:    663738 ns/iter (+/- 109683) = 481 MB/s
test vec::bench::sort_random_large        ... bench:    752714 ns/iter (+/- 135182) = 106 MB/s
test vec::bench::sort_random_medium       ... bench:      5793 ns/iter (+/- 3496) = 138 MB/s
test vec::bench::sort_random_small        ... bench:       187 ns/iter (+/- 60) = 213 MB/s
test vec::bench::sort_sorted              ... bench:    430830 ns/iter (+/- 113507) = 185 MB/s
@gereeter
Copy link
Contributor

This is done with the merging of #12029 and should be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants