[Purdue-pm] Regex Unification

Bradley Andersen bradley.d.andersen at gmail.com
Mon May 16 10:31:28 PDT 2011


Really?!

Ugh.  I will readily acknowledge that I have to squint very hard to
understand your regexes, Mark, but I have called out Larry Wall for writing
less terse stuff than this.

:)



On Mon, May 16, 2011 at 11:31 AM, Mark Senn <mark at ecn.purdue.edu> wrote:

>  > I have this bank of regexes in my code.
>  >
>  >             $requests2{ $request }->{ $href->{ accession_id } }
>  >                 ->{ library } =~ s/\W/\-/g ;
>  >             $requests2{ $request }->{ $href->{ accession_id } }
>  >                 ->{ library } =~ s/_/\-/g ;
>  >             $requests2{ $request }->{ $href->{ accession_id } }
>  >                 ->{ library } =~ s/-+/-/g ;
>  >             $requests2{ $request }->{ $href->{ accession_id } }
>  >                 ->{ library } =~ s/-$//g ;
>  >
>  > I'm thinking that I can simplify this a lot.
>  >
>  > Change the \W and _ regexes to \W+ and _+ and you reduce the need for
>  > the s/-+/-/, so we can reduce it, I think, to
>  >
>  >             $requests2{ $request }->{ $href->{ accession_id } }
>  >                 ->{ library } =~ s/[\W_-]+/\-/g ;
>  >
>  > I'd have to bash that regex before I'm comfortable putting it into
>  > production.
>  >
>  > But the last part, getting rid of dashes at the end of a string, can I
>  > roll that into the bigger regex? I'm not seeing how right now.
>
> I prefer the second of the three solutions below.
>
> #!/usr/local/bin/perl
>
> $s = '!@#$%^&*(){}--__--}{)(*&^%$#@!abc--';
> $_ = $s . "\n" . $s . "\n";
> s/\W/\-/g;
> s/_/\-/g;
> s/-+/-/g;
> s/-$//g;
> print "$_\n";
>
> $s = '!@#$%^&*(){}--__--}{)(*&^%$#@!abc--';
> $_ = $s . "\n" . $s . "\n";
> s/[\W_]/-/g;  # Change nonword or '_' characters to '-' everywhere.
>              # This will change any newlines to '-'.
> s/-+/-/g;     # Change two or more consecutive '-' characters
>              # to one '-' everywhere.
> s/-$//;       # Delete any '-' at the end of the string.
> print "$_\n";
>
> $s = '!@#$%^&*(){}--__--}{)(*&^%$#@!abc--';
> $_ = $s . "\n" . $s . "\n";
> s/[\W_-]+/-/g;  # Change consecutive nonword, '_', or '-'  characters
>                # to '-' everywhere.
>                # This will change any newlines to '-'.
> s/-$//;         # Delete any '-' at the end of the string.
> print "$_\n";
>
> -mark
>
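On the original question of rolling the trailing-dash removal into the
bigger regex: one way that might work is an optional end-of-string
capture plus the /e modifier, so the replacement can tell whether the
run it just matched was the last thing in the string. This is only a
sketch, not something anyone in the thread proposed. The sub names
clean_library and clean_library_two_step are made up for illustration,
and the two-step version is kept alongside for comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): collapse each run of
# nonword, '_', or '-' characters to a single '-', but delete the run
# entirely when it reaches the end of the string.
sub clean_library {
    my ($text) = @_;
    # (\z)? matches only when the run ends at the end of the string,
    # so $1 is defined for a trailing run and undefined otherwise.
    $text =~ s/[\W_-]+(\z)?/defined($1) ? '' : '-'/ge;
    return $text;
}

# The two-step form from the thread, for comparison.
sub clean_library_two_step {
    my ($text) = @_;
    $text =~ s/[\W_-]+/-/g;    # Collapse runs to one '-'.
    $text =~ s/-$//;           # Delete any '-' at the end.
    return $text;
}

# Print both results side by side for a few sample strings.
for my $sample ('foo bar__baz--', 'a!!b', 'abc', "x y\n") {
    printf "%s | %s\n",
        clean_library($sample), clean_library_two_step($sample);
}
```

Whether the one-liner is an improvement is debatable. The two-step
version is easier to read, and the difference in speed is unlikely to
matter for short strings like library names.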