[mplspm]: In a sorted list [better]
John J. Trammell
trammell at trammell.dyndns.org
Fri Mar 22 10:59:12 CST 2002
On Fri, Mar 22, 2002 at 09:40:54AM -0600, Jim Anderson wrote:
> And with 2 million strings, building a hash is a non-trivial amount of
> time.
You said it:
#!/usr/bin/perl -w
use strict;
use Benchmark;
$| = 1;
my $binary = sub
{
use POSIX qw[ floor ];
my ($x,$a) = @_;
my ($p,$q) = (0,scalar(@$a)-1);
LOOP:
{
return 1 if $a->[$p] eq $x;
return 1 if $a->[$q] eq $x;
my $mid = floor( ($p+$q)/2 );
return 0 if ($mid == $p) || ($mid == $q);
for ($a->[$mid])
{
$x eq $_ && do { return 1; };
$x lt $_ && do { ($p,$q) = ($p,$mid); last; };
$x gt $_ && do { ($p,$q) = ($mid,$q); last; };
}
redo LOOP;
}
};
my $hash = sub
{
my ($x,$a) = @_;
my %h = map { $_, 0 } @$a;
return exists $h{$x};
};
{
print "benchmark for size 26\n";
my @a = ('a' .. 'z');
my $x = $a[rand(@a)];
my $y = "__foo__";
timethese(50000,
{
binary => sub { $binary->($x,\@a); $binary->($y,\@a); },
hash => sub { $hash->($x,\@a); $hash->($y,\@a); },
});
}
{
print "benchmark for size @{[ 26*26 ]}\n";
my @a = ('aa' .. 'zz');
my $x = $a[rand(@a)];
my $y = "__foo__";
timethese(50000,
{
binary => sub { $binary->($x,\@a); $binary->($y,\@a); },
hash => sub { $hash->($x,\@a); $hash->($y,\@a); },
});
}
__END__
[ ~ ] perl bar.pl
benchmark for size 26
Benchmark: timing 50000 iterations of binary, hash...
binary: 4 wallclock secs ( 3.68 usr + 0.00 sys = 3.68 CPU) @ 13586.96/s (n=50000)
hash: 7 wallclock secs ( 7.21 usr + 0.01 sys = 7.22 CPU) @ 6925.21/s (n=50000)
benchmark for size 676
Benchmark: timing 50000 iterations of binary, hash...
binary: 7 wallclock secs ( 6.86 usr + 0.00 sys = 6.86 CPU) @ 7288.63/s (n=50000)
hash: 187 wallclock secs (184.91 usr + 0.06 sys = 184.97 CPU) @ 270.31/s (n=50000)
[ ~ ] ls -l
Admittedly this benchmark isn't perfect, but I think it points
in the direction of the binary search solution scaling better.
--------------------------------------------------
Minneapolis Perl Mongers mailing list
To unsubscribe, send mail to majordomo at pm.org
with "unsubscribe mpls" in the body of the message.
More information about the Mpls-pm
mailing list