[Melbourne-pm] An old one but an important one

Scott Penrose scottp at dd.com.au
Fri Jan 18 15:57:02 PST 2008


Also should have said - Text::Balanced has been fixed on CPAN.

Scooter

On 19/01/2008, at 10:51 AM, Scott Penrose wrote:

> Hey Dudes,
>
> I was fixing a performance problem with our portal. It was taking 2 or
> 3 minutes to render an HTML form. I narrowed it down to the fact that
> this particular form had a 500K XML file to parse, and it did it 10
> times. As if the duplicate parsing and size weren't bad enough, there
> was also a hidden problem.
>
> I wrote command-line code that did EXACTLY the same thing (so I
> thought), and it ran in under 1 second. So what was going on?
>
> After a full day of debugging I narrowed it down to one module,
> Filter::Simple. Only it wasn't that: it was Text::Balanced.
>
> And after even more work I found the real problem - "$&" after a
> regular expression.
>
> Yep - it is a known killer of regular expression performance, but here
> are the actual differences, first with $& in use and then without it:
>
> (env53) vmwmi1:~/simple# time perl test_direct.pl
> 	Fake loop start - size 488604
> 	Fake loop end - for 21581
> 	real	0m26.586s
> 	user	0m9.305s
> 	sys	0m17.269s
>
> (env53) vmwmi1:~/simple# time perl test_direct.pl
> 	Fake loop start - size 488604
> 	Fake loop end - for 21581
> 	real	0m0.131s
> 	user	0m0.112s
> 	sys	0m0.012s
>
> Yep, even including compile time and reading the file from disk, the
> wall-clock time goes from 0.131s to 26.586s - that is about 200 times
> slower!
>
> So... take heed !
>
> 	Don't use $&, $` or $'
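For anyone wanting a drop-in replacement, here is a minimal sketch of two ways to get at the same text without mentioning $&, $` or $' anywhere in the program. The sub names (match_with_capture, match_parts_with_offsets) are made up for illustration and are not from Text::Balanced or any other module.

```perl
use strict;
use warnings;

# Option 1: wrap the whole pattern in a capture group and read $1.
# Returns the matched text, or undef when there is no match.
sub match_with_capture {
    my ($text, $re) = @_;
    return $text =~ /($re)/ ? $1 : undef;
}

# Option 2: use the offset arrays @- and @+ together with substr.
# $-[0] is where the whole match starts, $+[0] is where it ends.
# Returns (prematch, match, postmatch), or an empty list on no match.
sub match_parts_with_offsets {
    my ($text, $re) = @_;
    return unless $text =~ $re;
    my $pre   = substr($text, 0, $-[0]);
    my $match = substr($text, $-[0], $+[0] - $-[0]);
    my $post  = substr($text, $+[0]);
    return ($pre, $match, $post);
}

my ($pre, $match, $post) =
    match_parts_with_offsets('size 488604 bytes', qr/\d+/);
printf "pre=[%s] match=[%s] post=[%s]\n", $pre, $match, $post;
# prints: pre=[size ] match=[488604] post=[ bytes]
```

If you are on Perl 5.10 or later, the /p modifier with ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} is another option that limits the cost to the regexes that ask for it.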
>
> But more importantly - check the modules you use on CPAN.
>
> Scooter
> P.S. I have removed all uses in our code, and it still has a problem,
> so I suspect there is yet another CPAN module using one.
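One rough way to hunt for the remaining culprit is to scan the source of every file that has been loaded (everything in %INC) for the three variables. This is only a heuristic sketch: the sub names are hypothetical, it is a plain text match rather than a parse, and it will report false positives (POD that mentions the variables, a string that happens to end in a dollar sign before a quote, and so on).

```perl
use strict;
use warnings;

# Text pattern for the three costly variables. The backslash keeps the
# dollar literal, so this scanner does not use the variables itself.
my $costly = qr/\$[&`']/;

# Return the line numbers in $source (a string of Perl code) that
# appear to mention one of the variables. Full-line comments are skipped.
sub suspicious_lines {
    my ($source) = @_;
    my @hits;
    my $n = 0;
    for my $line (split /\n/, $source) {
        $n++;
        next if $line =~ /^\s*#/;
        push @hits, $n if $line =~ $costly;
    }
    return @hits;
}

# Check every file loaded so far via use/require.
# Returns a hash ref mapping file path to a list of line numbers.
sub scan_loaded_modules {
    my %report;
    for my $module (sort keys %INC) {
        my $path = $INC{$module};
        next unless defined $path && !ref $path && -f $path;
        open my $fh, '<', $path or next;
        my $source = do { local $/; <$fh> };
        close $fh;
        my @lines = suspicious_lines($source);
        $report{$path} = \@lines if @lines;
    }
    return \%report;
}

my $report = scan_loaded_modules();
print "$_: lines @{ $report->{$_} }\n" for sort keys %$report;
```

To use it on a real application, load the same modules the application loads before calling scan_loaded_modules(), then check each reported line by hand.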


