[Pdx-pm] string comparison vs hash
Marvin Humphrey
marvin at rectangular.com
Wed May 30 05:52:54 PDT 2007
On May 29, 2007, at 11:58 PM, chromatic wrote:
> Still, if the benchmark's fast enough to say "I didn't run enough
> iterations to get a reliable count", I start to suspect that seek time
> and transfer rates will suddenly start to matter a lot more than the
> difference between indexed and keyed aggregate access.
>
> Accurate benchmarking is Not Easy.
Amen to that.
Below you'll find the output from a benchmarking program I wrote to
test KinoSearch indexing speed. I intentionally ran it cold, so that
the first iteration wouldn't benefit from OS caching. And indeed, it
came up considerably slower: 2.46 seconds as opposed to the
"truncated mean" of 1.41 seconds.
There's also another outlier of 1.89 seconds at the 8th iter. Even
though I quit almost everything before running this app, OS X is
still a noisy operating system and every once in a while it hiccups.
The use of a "truncated mean"
<http://en.wikipedia.org/wiki/Truncated_mean> protects the stats from
these glitches by discarding the outermost scores. The same technique is
commonly used in judged sports: the highest and lowest scores are tossed
out, then the rest are averaged.
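For anyone who wants to see the arithmetic, here is a minimal sketch of a truncated mean in Python. It is not the code from the benchmarking script; the function name `truncated_mean` and its `discard_each_end` parameter are made up for illustration, and dropping two scores from each end is an assumption inferred from the "6 kept, 4 discarded" line in the report. The timings are the ten values from the run in this message.

```python
def truncated_mean(samples, discard_each_end):
    """Average the samples after dropping the lowest and highest values.

    discard_each_end is a hypothetical parameter: how many scores to
    throw away from each end of the sorted list.
    """
    if discard_each_end < 0 or 2 * discard_each_end >= len(samples):
        raise ValueError("must keep at least one sample")
    ordered = sorted(samples)
    kept = ordered[discard_each_end:len(ordered) - discard_each_end]
    return sum(kept) / len(kept)


# Timings in seconds, copied from the ten iterations in the run.
secs = [2.46, 1.40, 1.39, 1.41, 1.40, 1.41, 1.41, 1.89, 1.42, 1.39]

# Assumption: "6 kept, 4 discarded" means two dropped from each end,
# which removes both the cold first run (2.46) and the hiccup (1.89).
print(f"Truncated mean: {truncated_mean(secs, 2):.2f} secs")
```

With two scores discarded per end, the six values that remain all fall between 1.40 and 1.42, so neither the cold first iteration nor the 8th-iteration hiccup influences the result.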
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
slothbear:~/projects/ks_variants/ks_sortfix/perl marvin$ perl -I../devel/benchmarks/indexers/ -Mblib ../devel/benchmarks/indexers/kinosearch_indexer.plx --docs=1000 --reps=10
------------------------------------------------------------
1 Secs: 2.46 Docs: 1000
2 Secs: 1.40 Docs: 1000
3 Secs: 1.39 Docs: 1000
4 Secs: 1.41 Docs: 1000
5 Secs: 1.40 Docs: 1000
6 Secs: 1.41 Docs: 1000
7 Secs: 1.41 Docs: 1000
8 Secs: 1.89 Docs: 1000
9 Secs: 1.42 Docs: 1000
10 Secs: 1.39 Docs: 1000
------------------------------------------------------------
KinoSearch 0.20_03
Perl 5.8.6
Thread support: yes
Darwin 8.9.0 Power Macintosh
Mean: 1.46 secs
Truncated mean (6 kept, 4 discarded): 1.41 secs
------------------------------------------------------------
slothbear:~/projects/ks_variants/ks_sortfix/perl marvin$