[mplspm]: In a sorted list [better]
Ken Williams
ken at mathforum.org
Tue Mar 26 16:44:39 CST 2002
Huh, funny.
Well, it might have something to do with the hash keys. Numeric hash
keys (especially sequential ones) notoriously trip up the perl hashing
algorithm, sometimes making everything go into a single bucket. I'm not
sure whether this has been patched in recent perls or not. Maybe try
with random words from /usr/dict/words ?
The other thing is that perl maintains a global table of hash keys so
that it doesn't have to re-hash a key that it's seen elsewhere in a
different context.
But come to thing of it, maybe neither of these things affects this
benchmark. The Conventional Wisdom has been that slicing is faster for
this, but what does it know?
-Ken
On Wednesday, March 27, 2002, at 04:46 AM, John J. Trammell wrote:
> On Tue, Mar 26, 2002 at 10:16:27AM +1100, Ken Williams wrote:
>>
>> Since you only care about the keys and not the values, it's generally
>> quicker to construct the hash like so:
>>
>> my $hash = sub
>> {
>> my ($x,$a) = @_;
>> my %h;
>> @h{@$a} = ();
>> return exists $h{$x};
>> };
>>
>
> The results of the actual experiment (code below) surprised me. Is
> there something wrong in my test maybe?
>
> #!/usr/bin/perl -w
> use strict;
> use Benchmark;
>
> sub __map
> {
> my ($a) = @_;
> my %h = map { $_, 0 } @$a;
> return \%h;
> }
>
> sub __slice
> {
> my ($a) = @_;
> my %h;
> @h{ @$a } = ();
> return \%h;
> }
>
> foreach my $size ( 10, 100, 1000, 10_000 )
> {
> warn "$size-element array\n";
> my @x = ( 1 .. $size );
> timethese( 1_000_000, {
> map => q[__map(\@x)],
> slice => q[__slice(\@x)],
> });
> }
>
> __END__
>
> 10-element array
> Benchmark: timing 1000000 iterations of map, slice...
> map: 3 wallclock secs ( 3.77 usr + 0.00 sys = 3.77 CPU) @
> 265251.99/s (n=1000000)
> slice: 3 wallclock secs ( 3.94 usr + 0.00 sys = 3.94 CPU) @
> 253807.11/s (n=1000000)
> 100-element array
> Benchmark: timing 1000000 iterations of map, slice...
> map: 3 wallclock secs ( 3.81 usr + 0.00 sys = 3.81 CPU) @
> 262467.19/s (n=1000000)
> slice: 3 wallclock secs ( 3.94 usr + 0.00 sys = 3.94 CPU) @
> 253807.11/s (n=1000000)
> 1000-element array
> Benchmark: timing 1000000 iterations of map, slice...
> map: 3 wallclock secs ( 3.87 usr + 0.00 sys = 3.87 CPU) @
> 258397.93/s (n=1000000)
> slice: 3 wallclock secs ( 3.97 usr + 0.00 sys = 3.97 CPU) @
> 251889.17/s (n=1000000)
> 10000-element array
> Benchmark: timing 1000000 iterations of map, slice...
> map: 3 wallclock secs ( 3.89 usr + 0.00 sys = 3.89 CPU) @
> 257069.41/s (n=1000000)
> slice: 3 wallclock secs ( 4.06 usr + 0.00 sys = 4.06 CPU) @
> 246305.42/s (n=1000000)
>
>
>> The real question is whether the original list would be better stored
>> in a hash in the first place, avoiding that conversion each time. The
>> answer is usually yes. In that case, it's *way* quicker.
>
> Agreed.
>
> --
> I would like to see an anime where Love conquers all, then goes mad
> with the
> power and sets up a repressive totalitarian regime that ruthlessly
> crushes
> and oppresses all non-love. -- Frank Raymond
> Michaels
>
>
> --------------------------------------------------
> Minneapolis Perl Mongers mailing list
>
> To unsubscribe, send mail to majordomo at pm.org
> with "unsubscribe mpls" in the body of the message.
>
-Ken
--------------------------------------------------
Minneapolis Perl Mongers mailing list
To unsubscribe, send mail to majordomo at pm.org
with "unsubscribe mpls" in the body of the message.
More information about the Mpls-pm
mailing list