[Chicago-talk] Performance, and using a hash in a regex

Andy Lester andy at petdance.com
Thu Jul 13 17:03:59 PDT 2006

> If this is a one-off, or a nightly thing where speed doesn't matter, and
> the file listing is sufficiently small (<1000?), and you're comfortable
> with one solution as opposed to the other, I'd say take your pick, and
> Andy's point is more than valid.

Even more to the point, it may not matter in the context of the program.

If going thru the regexes to find the filename takes, say, 100  
milliseconds, and processing each file takes 5 seconds (or 5000  
milliseconds), then each file takes 5100 milliseconds.

Now, you go poking at the regex matching.  Usually when you go
optimizing you might get 10-30% if you're lucky, but say that in this
case you can speed up the regex matching by 90%!  Matching drops from
100ms to 10ms, so now each file will take 5010 milliseconds.

So because you found a faster way to match filenames, you sped up  
from 5100ms to 5010ms.  That's an improvement of 1.8% of your total  
run time.

What if you'd profiled your code first, and found a way to improve
the file processing time by only 10%?  Processing drops from 5000ms
to 4500ms, so now you're going to take 4600ms instead of 5100ms, an
improvement of 9.8%.
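
If you want to play with the numbers yourself, here is a minimal sketch of
the arithmetic above.  The 100ms and 5000ms figures are the made-up ones
from this example, and the sub names total_after and percent_gain are just
something I picked for illustration, not from any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical per-file costs in milliseconds, from the example above.
my $match_ms   = 100;     # finding the filename via the regexes
my $process_ms = 5000;    # processing the file itself

# Per-file total after cutting each phase by the given percentage.
# total_after(90, 0) means "matching is 90% faster, processing unchanged".
sub total_after {
    my ($match_cut_pct, $process_cut_pct) = @_;
    return $match_ms   * (100 - $match_cut_pct)   / 100
         + $process_ms * (100 - $process_cut_pct) / 100;
}

# Improvement of the new per-file total, as a percentage of the old total.
sub percent_gain {
    my ($new_total) = @_;
    my $old_total = $match_ms + $process_ms;
    return 100 * ($old_total - $new_total) / $old_total;
}

my $faster_regex   = total_after(90, 0);
my $faster_process = total_after(0, 10);

printf "regex 90%% faster:      %.0f ms per file, %.1f%% overall\n",
    $faster_regex, percent_gain($faster_regex);
printf "processing 10%% faster: %.0f ms per file, %.1f%% overall\n",
    $faster_process, percent_gain($faster_process);
```

Change the two costs at the top to whatever your own measurements say and
the same comparison falls out: the phase that dominates the run time is
the one where even a small speedup shows up in the total.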

Always remember the three rules of optimization:

1) Don't.
2) (for experts) Don't yet.
3) Profile first. 

(The original site of Schwern's slides is non-responsive)

How do you measure?  You use a profiler like Devel::DProf or  


Andy Lester => andy at petdance.com => www.petdance.com => AIM:petdance
