[Pdx-pm] Software development and The Rules
jkeroes at eli.net
Thu Sep 26 19:07:32 CDT 2002
I have benchmarks, but I can't coax B::TerseSize to give me the size of
memory in use. (Help, anyone?) Nevertheless, I can make some guesses
about how much memory is in use:
What did we decide was the array memory usage before? 30 MB or
something? This solution uses 122K + Perl overhead, maybe 130K
all told. (!) That's a few orders of magnitude better.
I don't think I'd want to consider testing this on all of the
telephone numbers in the US, let alone the world. If we allow
for all possible numbers, it's 1000x the size of the 7 digit set,
30 GB. Yikes! Do you have 30 GB of RAM? Even is we wanted to
use swap, our computer would need quite a few GB or RAM. Yikes!
By comparison, this solution would use 122 MB of memory,
something possible on most of our computers.
I've tested seven digit numbers only, with two datasets, first
filling our million numbers 1/3 full and then 2/3 full. I'm
surprised to see better performance on the denser set. I would
have assumed that with fewer elements we'd have fewer operations
and therefore more speed. Wrong.
As we can see from the benchmarks, there was no speed gain.
Either I screwed up or Perl is great (which we already know).
I'll may have to eat my words. If someone figures out the
solution, we can discuss why I thought this would have been
'hash' is a hash-based solution. I wanted to use an array,
but the hash guarantees unique numbers. 'mine' is the
Benchmark: timing 100000 iterations of object initialization:
hash 333K: 32 wallclock secs (28.19 usr + 1.54 sys = 29.73 CPU) @ 3363.61/s (n=100000)
hash 666K: 29 wallclock secs (23.41 usr + 1.29 sys = 24.70 CPU) @ 4048.58/s (n=100000)
mine 333K: 28 wallclock secs (22.73 usr + 1.29 sys = 24.02 CPU) @ 4163.20/s (n=100000)
mine 666K: 27 wallclock secs (23.11 usr + 1.64 sys = 24.75 CPU) @ 4040.40/s (n=100000)
Benchmark: timing 100000 iterations of sorting:
hash 333K: 48 wallclock secs (44.18 usr + 0.00 sys = 44.18 CPU) @ 2263.47/s (n=100000)
hash 666K: 48 wallclock secs (44.09 usr + 0.00 sys = 44.09 CPU) @ 2268.09/s (n=100000)
mine 333K: 48 wallclock secs (43.95 usr + 0.00 sys = 43.95 CPU) @ 2275.31/s (n=100000)
mine 666K: 48 wallclock secs (44.00 usr + 0.00 sys = 44.00 CPU) @ 2272.73/s (n=100000)
PS I'll post the Programming Pearls implementation tomorrow if nobody gets it first.
More information about the Pdx-pm-list