[Neworleans-pm] split vs. match

David B. John djohn at archdiocese-no.org
Tue Oct 28 05:51:11 PDT 2008


On Tue, 2008-10-28 at 01:54 -0400, Donnie Cameron wrote:

> David,
> 
> The split function is not going to make things any faster. In fact,
> without resorting to the use of another language, I can't think of a
> faster way of doing it than you have suggested. Even if you were to
> split on something like a quote followed by a space (/" /) and then
> reattach the quote to the end of each resulting element (work that is
> vastly simpler than regex matching), the process would end up being
> slower than regex matching because the regex matching happens in
> compiled C code while the nominally cheaper work runs in Perl. I'm
> convinced also that even if you were to use the index function, your
> Perl code would still be slower than the regex-based solution you
> described.
> 
> In the past, I have tried a number of tricks to try to beat simple
> regex matching for this type of work and I've seldom been able to beat
> the regex matching. (When I write "this type of work", I am of course
> excluding regular Apache-like log files and other files that are
> designed to be easy and fast to parse. I'm talking about more
> thoughtless file designs, such as the one you described.) 
> 
> You could roll out your own C extension, but that's just ridiculous
> because the hardware to process the slower and more general Perl regex
> would be less expensive than your time. 
> 
> I don't know how you timed the split function, but I suspect that it
> was much faster because its regex was probably much simpler. If you
> try the split function with a more complicated regex, I'm sure you'll
> find that split isn't so fast any more. 
> 
> You do need the /g at the end, of course.
> 
> --Donnie
> 
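
For anyone who wants to check this on their own data, here is a minimal
sketch that times the two approaches with the core Benchmark module. The
record layout is not shown in this thread, so the line format below
(space-separated key="value" pairs), the field names, and the sub names
by_match and by_split are all assumptions made up for illustration. The
numbers will vary with the data and the Perl version, so treat the output
as a rough guide only.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical record: 20 space-separated key="value" pairs.
# The real file format is not given in the thread.
my $line = join ' ', map { qq{field$_="value $_"} } 1 .. 20;

# Regex match in list context. The /g modifier makes the match
# return every key="value" pair on the line, not just the first.
sub by_match {
    my @fields = $line =~ /(\w+="[^"]*")/g;
    return \@fields;
}

# Split on a quote followed by a space, then reattach the quote
# that split consumed. The last element keeps its own quote.
sub by_split {
    my @fields = split /" /, $line;
    $_ .= '"' for @fields[0 .. $#fields - 1];
    return \@fields;
}

# Sanity check: both approaches must produce the same fields.
die "results differ\n"
    unless join('|', @{ by_match() }) eq join('|', @{ by_split() });

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, { match => \&by_match, split => \&by_split });
```

Swapping in a real line from the file for $line, and the real pattern for
the one in by_match, should give a fairer comparison than the toy data.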

Thanks Donnie.  I can live with that.  :)

David

