SPUG: Christmas Benchmarks!

Tim Maher/CONSULTIX tim at consultix-inc.com
Sat Dec 25 18:48:13 CST 1999


Merry Christmas, SPUG-sters!

Hope everybody is having a wonderful holiday!

My favorite activity is Perl Programming, so I've spent the day trying
to get to the bottom of a vexing "RE Efficiency Question."

Specifically, in Jeffrey Friedl's "Mastering Regular Expressions" book,
he identifies many common programming practices (/i, $&, etc.) that
reportedly incur a performance penalty (although he doesn't report any
quantitative data).  However, he did his research in 1995-1996 for his
1997 book, and anybody who's been reading the Changes files distributed
with Perl since 1998 knows there have been many fixes applied in recent
years to address the performance problems Friedl identified.

People still quote Friedl on the imprudence of using the $& variable,
as happened at our own meeting last week. But after a full Christmas
Day of trying, I can tell you it's no longer easy to find code fragments
that do the same job with substantially different run times, even when
using tests like these with huge target strings (to maximize the
"string copying penalty"):

$start=(times)[0];		# user CPU time so far
while (lots) {
	/match/; $copy=$&;
	# vs.
	/(match)/; $copy=$1;
	/another_match/;
			# because $& is supposed to slow down all matches,
			#  but $1 just the particular one
	/yet_another_match/;
}
$delta=(times)[0] - $start;	# user CPU time spent in the loop

(Friedl (1997) warned that Benchmark.pm itself uses $&, and is therefore
incapable of detecting any speed difference between using and avoiding
it, so I've used my own timing routines; but as far as I can tell,
Benchmark.pm is "clean" now anyway!)

AND

while (lots) {
	/match/; $after=$';
	# vs.
	/match(.*)/s; $after=$1;	# /s lets .* cross newlines, as $' does
	/another_match/;
	/yet_another_match/;
	. . .
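
For anyone who wants to try this at home, here is a minimal runnable
sketch of the first comparison, using (times)[0] as above. The target
string, the repetition count, and the helper names time_loop and
percent_slower are all made up for illustration. The $& variant is
hidden in a string eval so that it isn't compiled until after the
capture variant has been timed; the assumption is that the interpreter
only starts paying the $& penalty once it has compiled code that
mentions $&. Running each variant in its own perl process would be a
more conservative way to keep them apart.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical test data: a large target string, so that any copying
# of the string on each match would dominate the run time.
my $target = 'match' . ('x' x 100_000);
my $reps   = 2_000;     # arbitrary; raise it for steadier numbers

# Time $reps calls of $code using user CPU time, (times)[0].
sub time_loop {
    my ($code, $reps) = @_;
    my $start = (times)[0];
    $code->() for 1 .. $reps;
    return (times)[0] - $start;
}

# Percentage by which $slow exceeds $fast (0 if $fast is zero).
sub percent_slower {
    my ($fast, $slow) = @_;
    return $fast > 0 ? 100 * ($slow - $fast) / $fast : 0;
}

# Capture variant: compiled and timed first, before any code
# mentioning $& has been compiled.
my $copy;
my $with_capture = sub {
    $target =~ /(match)/ and $copy = $1;
    $target =~ /another_match/;
    $target =~ /yet_another_match/;
};
my $t_capture = time_loop($with_capture, $reps);

# $& variant: kept in a string so it is compiled only now.
my $with_ampersand = eval q{
    sub {
        $target =~ /match/ and $copy = $&;
        $target =~ /another_match/;
        $target =~ /yet_another_match/;
    }
} or die $@;
my $t_ampersand = time_loop($with_ampersand, $reps);

printf "capture: %.2fs  \$&: %.2fs  difference: %.1f%%\n",
    $t_capture, $t_ampersand, percent_slower($t_capture, $t_ampersand);
```

The $' comparison can be timed the same way by swapping in the two
bodies from the second fragment. The numbers depend heavily on the
Perl version and the size of the target string, so no sample output
is shown.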

I've exchanged some Email with Jeffrey today (he spends Christmas like
I do 8-} ) and he tells me that his (considerable) knowledge of Perl RE
efficiency peaked in 1996,  and he hasn't kept up with the latest Perl
improvements.  He also said he has heard "rumors" about the /i penalty
being removed, etc., but that he doesn't really know what's happened.

I'm supposed to be an expert on RE Efficiency, at least by the next
time my "Advanced Pattern Matching" class rolls around in January, so
I'd be grateful if somebody could show me some good benchmark programs
that actually show a real difference (>15%) in the speed of different
RE coding techniques, using the latest Perl version (5.00503).
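
As a starting point, the kind of program I have in mind might look like
the sketch below, which pits /i against spelled-out character classes.
The character-class alternative, the target string, and the labels
slash_i and char_class are chosen purely for illustration, and it
assumes a Benchmark.pm that provides cmpthese (older releases may only
have timethese). The table cmpthese prints gives each technique's rate
and the percentage difference between them, so a >15% gap, if there is
one, shows up directly.

```perl
#!/usr/bin/perl -w
use strict;
use Benchmark qw(cmpthese);

# Hypothetical test data: a long target with the only match at the
# very end, so every attempt has to scan the whole string.
my $target = ('x' x 100_000) . 'MaTcH';

# Two ways to match a word without regard to case.
my %technique = (
    slash_i    => sub { $target =~ /match/i },
    char_class => sub { $target =~ /[Mm][Aa][Tt][Cc][Hh]/ },
);

# Run each technique for at least 1 CPU second, then print a table
# of rates and the percentage by which each differs from the other.
cmpthese(-1, \%technique);
```

Whether either technique comes out ahead will depend on the Perl
version, which is exactly the question.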

Thanks, and Merry Christmas!

*==================================================================*
| Tim Maher, PhD  Consultix &        (206) 781-UNIX/8649           |
|  Pacific Software Gurus, Inc       Email: tim at consultix-inc.com  |
|  UNIX/Linux & Perl Training        http://www.consultix-inc.com  |
| Classes: 12/13 Perl; 12/17 Adv. Pattern Matching; 1/18 Int. Perl |
*==================================================================*

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    POST TO: spug-list at pm.org        PROBLEMS: owner-spug-list at pm.org
 Seattle Perl Users Group (SPUG) Home Page: http://www.halcyon.com/spug/
 SUBSCRIBE/UNSUBSCRIBE: Replace ACTION below by subscribe or unsubscribe
        Email to majordomo at pm.org: ACTION spug-list your_address




