[Mpls-pm] Interesting RegEx Problem

Robert Fischer rfischer at corradiation.net
Mon Sep 19 10:39:27 PDT 2005


List::Util::first is a great shortcut for implementing the
sequential-search algorithm, but I'm looking for something better than
sequential-search.

I need to rephrase the problem, though, because I realize I left a major
aspect out of the phrasing:  Given a *collection of* strings and a
collection of regular expression strings, where it is known that each
string is matched by precisely one regular expression, how do you most
efficiently develop the mapping?

Given that arrangement, I'm currently looking at lexically sorting the
strings and keeping a most-recently-used (MRU) list of the patterns, so
that the patterns which matched most recently are tried first.  That
implementation assumes that strings which are lexically close are liable
to match the same (or similar) regular expressions.
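
A minimal sketch of the sorted-strings-plus-MRU idea, under stated
assumptions: the sub name map_strings_to_patterns and the sample patterns
are hypothetical, and "MRU" is taken to mean move-to-front on every hit.
It is still a sequential search in the worst case; it only pays off if
neighbouring strings really do tend to share a pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref mapping each string to the source
# text of the one pattern that matches it. Patterns that matched most
# recently are tried first (move-to-front).
sub map_strings_to_patterns {
    my ($strings, $patterns) = @_;

    # Compile each pattern once, keeping its source text alongside it.
    my @mru = map { +{ source => $_, re => qr/$_/ } } @$patterns;

    my %pattern_for;
    for my $string (sort @$strings) {    # lexical sort groups similar strings
        for my $i (0 .. $#mru) {
            next unless $string =~ $mru[$i]{re};
            $pattern_for{$string} = $mru[$i]{source};
            # Move the pattern that just matched to the front of the list.
            unshift @mru, splice(@mru, $i, 1) if $i > 0;
            last;
        }
    }
    return \%pattern_for;
}

# Usage with made-up data:
my @patterns = ('^\d{3}$', '^[a-z]+$', '^[A-Z]{2}\d+$');
my @strings  = qw(123 abc AB12 456 xyz);
my $map      = map_strings_to_patterns(\@strings, \@patterns);
print "$_ => $map->{$_}\n" for sort keys %$map;
```

Returning the pattern's source text rather than the compiled regex is a
design choice for this sketch; an index into the original pattern list
would work equally well.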

As a tangential note: is there a concept of a "distance" between regular
expressions which can be reasonably implemented?  If so, has anyone
implemented it yet?  String distance on the pattern text certainly doesn't
work, because \d{3} and [1-90][1-90][1-90] are functionally identical
(they match the same strings), but have a drastic edit distance.
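
To make that concrete, here is a rough sketch that compares the two
patterns both ways. The subs edit_distance and same_matches are
hypothetical helpers written for this illustration, and the sample
strings are made up. Agreement on a handful of samples is only an
empirical check, not a proof that two patterns are equivalent.

```perl
use strict;
use warnings;
use List::Util 'min';

# Plain Levenshtein distance between two strings (here: pattern sources).
sub edit_distance {
    my ($s, $t) = @_;
    my @prev = (0 .. length($t));
    for my $i (1 .. length($s)) {
        my @cur = ($i);
        for my $j (1 .. length($t)) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j] + 1,           # deletion
                           $cur[$j - 1] + 1,        # insertion
                           $prev[$j - 1] + $cost);  # substitution
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# True if both patterns accept or reject every sample in the same way
# when each is anchored to the whole string.
sub same_matches {
    my ($p, $q, @samples) = @_;
    for my $sample (@samples) {
        my $m1 = $sample =~ /\A(?:$p)\z/ ? 1 : 0;
        my $m2 = $sample =~ /\A(?:$q)\z/ ? 1 : 0;
        return 0 if $m1 != $m2;
    }
    return 1;
}

my ($p, $q) = ('\d{3}', '[1-90][1-90][1-90]');
my @samples = qw(000 123 999 12 1234 abc 1a3);

print "edit distance of the pattern text: ", edit_distance($p, $q), "\n";
print "same behaviour on the samples: ",
    (same_matches($p, $q, @samples) ? "yes" : "no"), "\n";
```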

~~ Robert Fischer.
rfischer at corradiation.net
651-398-8010

> On Mon, 19 Sep 2005, Joshua ben Jore wrote:
>
>> I've been hoping someone would mention List::Util::first.
>>
>> use List::Util 'first';
>> $matching_expression = first { $text =~ $_ } @candidate_expressions;
>
> Problem with that is that you get a regex out at the end, which you then
> need to hack on to make useful.
>
> List::Util has useful stuff in it, and if I was doing the list
> manipulations it covers frequently in a piece of code I'd probably use it,
> but it seems a shame to require the module just to get a for loop with a
> break in it.
>
> I also don't find the above syntax all that clear. Especially if the use
> and the call to first() are separated by a lot of code. The name 'first'
> is fairly self-explanatory, I suppose, but it's not a standard perl
> function, which might fox the unwary.
>
> All comes down to philosophy and house style, in the end. And
> documentation. There is, after all, more than one way to do it.
>
> Ian
>
> -
> ---------------------------------------------------------------------------
>
> The soul would have no rainbows if the eyes held no tears.
>
> Ian Malpass
> <ian at indecorous.com>
> _______________________________________________
> Mpls-pm mailing list
> Mpls-pm at pm.org
> http://mail.pm.org/mailman/listinfo/mpls-pm
>




