[Pdx-pm] oh, gross (object method in regex)
Eric Wilhelm
scratchcomputing at gmail.com
Sun Mar 12 18:21:58 PST 2006
# from Randall Hansen
# on Sunday 12 March 2006 04:20 pm:
(snippets from your next e-mail for correctness)
Note that what you called temp was actually the one using the scalar
deref construct.
(renamed "temp" to "scalar reference")
> sub sref {
...
> return grep /${ \$Foo->foo }/ => @search;
(renamed "eric correct" to "array ref")
> sub aref {
...
> return grep /@{[ $Foo->foo ]}/ => @search;
(renamed "deref" to "real temp")
> sub rtmp {
...
> my $foo = $Foo->foo;
> return grep /$foo/ => @search;
And just the important numbers here:
> rtmp: 2 secs @ 68k/s
> aref: 4 secs @ 34k/s
> sref: 2 secs @ 46k/s
>david's method of assigning to a temporary variable works,
>and is what i've done before, but seemed ugly and wasteful because i
>only used it once.
Not only is it important to benchmark, it is really important to
benchmark correctly :-)
>so the reference/dereference syntax avoids the temporary variable, is
> faster[1], and explicit enough so that people who understand the
> rest of my code will get it.
Let's be clear what the four forms in your benchmark are.
The original "eric" sub is going to yield incorrect results because the
backslash turns it into /@{[\($Foo->foo)];}/ when you run deparse on it
(that's a list of one reference to a scalar once it gets captured in
the [] array reference and flattened by the @{} cyclops.) So, best to
just throw that away and pretend we never saw it, since incorrect
behavior is irrelevant whether it is fast or slow.
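A minimal sketch of why that form misbehaves. The Foo package here is a
made-up stand-in (the real class never appears in this thread), and it
assumes foo() simply returns the string "thing":

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real class: foo() returns a pattern string.
package Foo;
sub new { return bless {}, shift }
sub foo { return "thing" }

package main;

my $Foo    = Foo->new;
my @search = ("a thing", "deal", "stuff");

# \$Foo->foo parses as \($Foo->foo), so the anonymous array holds a scalar
# *reference*. Joining the flattened array shows the text that would be
# interpolated into the pattern: something like SCALAR(0x55d0c8a3e2f0).
my $interpolated = join ' ', @{[ \$Foo->foo ]};

my @broken  = grep /@{[ \$Foo->foo ]}/ => @search;   # stray backslash
my @working = grep /@{[ $Foo->foo ]}/  => @search;   # no backslash

print "pattern text: $interpolated\n";
print "broken:  ", scalar(@broken),  " match(es)\n";
print "working: ", scalar(@working), " match(es)\n";
```

The broken form compiles and runs without complaint, which is what makes
it easy to benchmark by accident.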
The "eric_correct" sub is an array dereference construct, as hinted at
by my above renaming to "aref".
The one you called "temp" is actually the scalar dereference construct.
I would expect that this is faster than the array dereference by at
least a little because the code is following a "one value" path through
perl rather than a list path.
Finally, the one you called "deref" is using a temp variable ("rtmp"
above.)
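For reference, the three working forms can be put side by side in one
runnable file. This is only a sketch: the Foo package and the contents
of @search are assumptions of mine, filled in where the quoted snippets
have "...".

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real class.
package Foo;
sub new { return bless {}, shift }
sub foo { return "thing" }

package main;

my $Foo    = Foo->new;
my @search = ("a thing", "deal", "stuff", "something");

# scalar dereference: reference the return value, then ${ } it
sub sref { return grep /${ \$Foo->foo }/ => @search }

# array dereference: wrap the return value in [ ], then @{ } it
sub aref { return grep /@{[ $Foo->foo ]}/ => @search }

# temp variable: call the method once, interpolate a plain scalar
sub rtmp {
    my $foo = $Foo->foo;
    return grep /$foo/ => @search;
}

print "sref: ", join(", ", sref()), "\n";
print "aref: ", join(", ", aref()), "\n";
print "rtmp: ", join(", ", rtmp()), "\n";
```

All three should return the same list; only the route the pattern takes
to get there differs.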
Note that the temp variable is about 3/2 the speed of the scalar
dereference and twice the speed of the array dereference.
Why is a temp variable faster? Feel free to play with B::Concise and
post the pertinent snippets of the optree here when you find them. The
lazy approach is to find something to blame and move on (Schwern called
this the "User Model" if you remember his talk on design.)
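If you want a starting point for the optree digging, something like the
following should work. It is a sketch only: the rtmp body is a
simplified stand-in with a hard-coded pattern, and I have not tried to
predict what the dump will look like on your perl.

```perl
use strict;
use warnings;
use B::Concise ();

my @search = ("a thing", "deal", "stuff");

# Hypothetical sub under inspection; swap in the sref or aref body to compare.
sub rtmp {
    my $foo = "thing";
    return grep /$foo/ => @search;
}

# Send the dump to a string instead of STDOUT, then walk the sub's optree
# in execution order (-exec).
my $optree = '';
B::Concise::walk_output(\$optree);
B::Concise::compile('-exec', \&rtmp)->();

print $optree;
```

From the command line, perl -MO=Concise,-exec,rtmp yourscript.pl should
give a similar dump without touching the script itself.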
I was going to choose the garbage collector as my straw man. Seems that
pass-by-value to a nearby lexical would at least be easier to keep
track of than an anonymous reference inside a regex.
But, hey!
$ perl -e 'my $obj = "main";
sub foo {warn "hey\n"; "thing"};
print grep(/${\($obj->foo)}/, "a thing", "deal", "stuff");'
hey
hey
hey
a thing
The temp variable spares you from calling the method once per element,
so if you increase the size of @search, the numbers for the two
dereference forms are going to get a lot worse. Did you guess that
would happen? I sure didn't!
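The same effect can be counted rather than eyeballed from warn output.
A small sketch, again with a made-up Foo package, this time one that
tallies its own calls in a package variable:

```perl
use strict;
use warnings;

# Hypothetical stand-in that counts how often foo() is called.
package Foo;
our $Calls = 0;
sub new { return bless {}, shift }
sub foo { $Calls++; return "thing" }

package main;

my $Foo    = Foo->new;
my @search = ("a thing", "deal", "stuff");

# Inline dereference: the pattern is re-interpolated for every element,
# so the method runs once per element of @search.
$Foo::Calls = 0;
my @inline       = grep /${\($Foo->foo)}/ => @search;
my $inline_calls = $Foo::Calls;

# Temp variable: the method runs once, before grep starts.
$Foo::Calls = 0;
my $foo        = $Foo->foo;
my @temp       = grep /$foo/ => @search;
my $temp_calls = $Foo::Calls;

print "inline: $inline_calls call(s), temp: $temp_calls call(s)\n";
```

With three elements in @search the inline form makes three calls and
the temp form makes one, so the gap grows with the length of the list.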
--Eric
--
Turns out the optimal technique is to put it in reverse and gun it.
--Steven Squyres (on challenges in interplanetary robot navigation)
---------------------------------------------------
http://scratchcomputing.com
---------------------------------------------------