Directory munging [was: Re: Phoenix.pm: Three snippets]
Tim Ayers
tayers at bridge.com
Sun May 6 22:22:46 CDT 2001
>>>>> "E" == Eden Li <eden.li at asu.edu> writes:
E> Yes, quite a big one actually:
Well, that really depends on the situation. Let me say, I agree with
you. 'map' in void context is generally a bad idea because map's
purpose is to collect up the results of each iteration into a list and
if you aren't going to use the list, why bother. But your original
blanket statement "Uh oh... Never use map{} in void context." just
irked me.
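As a small illustration of the difference (a sketch, not from the original thread; the word list and variable names are made up), here is 'map' used for its return list, then 'map' in void context used only for a side effect, then the same side effect written with 'for':

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @words = qw(alpha beta gamma);    # hypothetical sample data

# List context: the values map returns are collected and kept.
my @lengths = map { length($_) } @words;

# Void context: the block runs only for its side effect on $total_map;
# whatever list map would return is thrown away.
my $total_map = 0;
map { $total_map += length($_) } @words;

# The same side effect with a plain loop, which has no list to discard.
my $total_for = 0;
for (@words) {
    $total_for += length($_);
}

print "lengths: @lengths\n";
print "totals:  $total_map $total_for\n";
```

Both totals come out the same; the only question is whether building a result list that nobody reads costs anything.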
Back to practical matters, in general you should not use 'map' in void
context, but I do have a bone to pick with your benchmark. I think it
is unfairly biased against map and does not measure what we are
interested in.
E> perl -MBenchmark -e "timethese (10000, {'map' => sub { map
E> {$n=$_}(0..10000) }, 'for' => sub { for (0..10000) {$n=$_}}})"
The map case builds the list 0..10000 from scratch on each of the
10,000 iterations. That's a serious disadvantage from the start. What
we are really trying to measure is the performance hit from map
collecting the return value of each call into a list.
I think a more legitimate version of your test would be
perl -MBenchmark -e "@l=(0..10000); timethese (10000, {'map' => sub { map {$n=$_} @l }, 'for' => sub { for (@l) {$n=$_}}})"
Notice that I have 'for' loop over the same list. Since perl 5.005 (I
think) 'for' has a special case that recognizes (0..10000) as a
numeric iteration and it doesn't actually loop a pointer through a
list. It knows it should decompose to something more like
for ($_=0; $_<=10000; $_++) {}
This will potentially run a lot differently than the list pointer way.
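To make the equivalence concrete, here is a minimal sketch (the variable names and the small bound are illustrative, not from the benchmark). It only checks that the range form and a hand-written counting loop visit the same values; it says nothing about how perl implements either one internally. Note that the range includes both endpoints, so the counting loop needs '<='.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $last = 10;    # small stand-in for the 10000 used in the benchmark

# Range form: visits 0, 1, ..., $last (both ends included).
my @from_range;
for (0 .. $last) {
    push @from_range, $_;
}

# Hand-written counting loop covering the same values; note the <=.
my @from_counter;
for (my $i = 0; $i <= $last; $i++) {
    push @from_counter, $i;
}

print scalar(@from_range), " values from the range form\n";
print "both loops visit the same values\n"
    if "@from_range" eq "@from_counter";
```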
But I think a different benchmark is even more interesting.
#!/usr/bin/perl -w
use strict;
use Benchmark;

for (100, 1000, 10000) {
    my @l=(0..$_);
    my $n;
    timethese (10000,
               {"map$_" => sub { map {$n=$_} @l },
                "for$_" => sub { for (@l) {$n=$_}}}
              );
}
$ perl loop.pl
Benchmark: timing 10000 iterations of for100, map100...
for100: 1 wallclock secs ( 0.66 usr + 0.00 sys = 0.66 CPU) @ 15058.82/s (n=10000)
map100: 1 wallclock secs ( 0.91 usr + 0.00 sys = 0.91 CPU) @ 11034.48/s (n=10000)
'for' wins, but not overwhelmingly.
Benchmark: timing 10000 iterations of for1000, map1000...
for1000: 6 wallclock secs ( 6.34 usr + 0.01 sys = 6.34 CPU) @ 1576.35/s (n=10000)
map1000: 9 wallclock secs ( 8.98 usr + 0.00 sys = 8.98 CPU) @ 1114.01/s (n=10000)
'for' wins by roughly the same ratio as in the 100-element case.
Benchmark: timing 10000 iterations of for10000, map10000...
for10000: 65 wallclock secs (63.65 usr + 0.02 sys = 63.67 CPU) @ 157.06/s (n=10000)
map10000: 113 wallclock secs (112.56 usr + 0.03 sys = 112.59 CPU) @ 88.81/s (n=10000)
Okay. Now we are getting closer to Eden's numbers. And it obviously
has to do with collecting up a big list for no reason.
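For anyone who wants the ratios spelled out, this short sketch divides the 'for' rate by the 'map' rate for each list size. The rates are copied from the Benchmark output above; the hash layout and variable names are just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Iterations per second, copied from the Benchmark output above.
my %rate = (
    100   => { 'for' => 15058.82, 'map' => 11034.48 },
    1000  => { 'for' => 1576.35,  'map' => 1114.01  },
    10000 => { 'for' => 157.06,   'map' => 88.81    },
);

# How many times faster 'for' ran than 'map' at each list size.
my %ratio;
for my $size (sort { $a <=> $b } keys %rate) {
    $ratio{$size} = $rate{$size}{'for'} / $rate{$size}{'map'};
    printf "%5d elements: for ran %.2f times as fast as map\n",
        $size, $ratio{$size};
}
```

The first two ratios are close to each other, and the gap widens noticeably at 10000 elements, which matches the reading given above.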
E> Also, as I mentioned before... it's just the wrong construct
E> for plain ol' looping.
Agreed. But I feel your benchmark did not prove anything. Blanket
statements without explanation, followed by misleading benchmarks, set
off my alarms. I hope my benchmarks prove your point legitimately. So
everyone, don't use map in void context. ;-) Hopefully I've explained
why a little bit.
I'll end with an almost relevant Larry quote.
It really doesn't bother me if people want to use grep or map in a
void context. It didn't bother me before there was a for modifier,
and now that there is one, it still doesn't bother me. I'm just not
very easy to bother.
-- Larry Wall in <199911012346.PAA25557 at kiev.wall.org>
Hope you have a very nice day, :-)
Tim Ayers (tayers at bridge.com), who has now probably made himself out
to be a pedantic SOB amongst his new acquaintances. :-/