[Melbourne-pm] An old one but an important one
Scott Penrose
scottp at dd.com.au
Fri Jan 18 15:51:18 PST 2008
Hey Dudes,
I was fixing a performance problem with our portal. It was taking 2 or
3 minutes to render an HTML form. I narrowed it down to the fact that
this particular form had a 500K XML file to parse, and it parsed it 10
times. As if the duplicate parsing and the file size weren't bad
enough, there was also a hidden problem.
I wrote command line code that did EXACTLY the same thing (so I
thought), and it ran in under 1 second. So what was going on?
After a full day of debugging I narrowed it down to one module,
Filter::Simple. Only it wasn't that, it was Text::Balanced.
And after even more work I found the real problem - "$&" after a
regular expression.
Yep - it is a known killer of regular expression performance, but here
are the actual differences, with "$&" in play and then without it:
(env53) vmwmi1:~/simple# time perl test_direct.pl
Fake loop start - size 488604
Fake loop end - for 21581
real 0m26.586s
user 0m9.305s
sys 0m17.269s
(env53) vmwmi1:~/simple# time perl test_direct.pl
Fake loop start - size 488604
Fake loop end - for 21581
real 0m0.131s
user 0m0.112s
sys 0m0.012s
Yep, even including compile time and reading the file from disk, the
real time goes from 0.131s to 26.586s - that is about 200 times slower !
So... take heed !
Don't use $&, $` or $'
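If you still need the matched text (or the bits before and after it), here is a minimal sketch of two ways to get them without touching those variables. The sample string and the variable names are made up for illustration, they are not from our portal code.

```perl
use strict;
use warnings;

# Made-up sample data, purely for illustration.
my $text = "name=Scooter; lang=Perl";

# Option 1: capture explicitly. Only this regex pays for the copy.
my ($whole) = $text =~ /(lang=\w+)/;

# Option 2: rebuild prefix, match and suffix from the offset arrays
# @- and @+, which hold the start and end positions of the last match.
my ($pre, $match, $post);
if ($text =~ /lang=\w+/) {
    $pre   = substr($text, 0, $-[0]);
    $match = substr($text, $-[0], $+[0] - $-[0]);
    $post  = substr($text, $+[0]);
}

print "whole=[$whole] pre=[$pre] match=[$match] post=[$post]\n";
```

On Perl 5.10 or later, the /p modifier together with ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} is another option, and it only costs you on the regexes that ask for it.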
But more importantly - check the modules you use on CPAN.
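One rough way to do that check is to load the modules you suspect and then grep the files Perl actually pulled in. The sketch below is only a heuristic: find_match_vars is a hypothetical helper name, and because the scan is purely textual it will also flag comments, strings and POD, so treat the hits as leads and not as proof.

```perl
use strict;
use warnings;

# Hypothetical helper: return "file:line" for every line in a file
# that mentions one of the three costly match variables.
sub find_match_vars {
    my ($path) = @_;
    open my $fh, '<', $path or return;
    my @hits;
    while (my $line = <$fh>) {
        # $. is the current line number of the file being read.
        push @hits, $path . ':' . $. if $line =~ /\$[&`']/;
    }
    close $fh;
    return @hits;
}

# "use" the modules under suspicion above this point, then walk %INC,
# which maps each loaded module to the file it was loaded from.
for my $module (sort keys %INC) {
    my $path = $INC{$module};
    next unless defined $path && !ref $path && -f $path;
    print "$_\n" for find_match_vars($path);
}
```

This only sees modules that are already loaded when the loop runs, so anything pulled in later via require will be missed.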
Scooter
P.S. I have removed all uses in our code, and still it has a problem,
so I suspect there is yet another CPAN module using one.