Regex stumper?

Kari Chisholm karic at lclark.edu
Thu Jul 27 15:39:08 CDT 2000


Thanks.  Your one-liner is very helpful.

The reason I led off with the \s instead of a more symmetrical \b is to
be able to match these bare emails (and convert them to mailto href's),
but to leave untouched any mailto's that are already hand-coded in the
text.  

The problem, then, is that the leading \s requires whitespace in front of
the address in the text I'm evaluating.  If it's 

	$text = "karic\@lclark.edu is my email address.";

then it doesn't match with the leading \s.  \b would solve the problem,
but then it converts this:

	Email me <a href=mailto:karic@lclark.edu>here</a>.

into this

	Email me <a href=mailto:<a href="mailto:karic@lclark.edu">karic@lclark.edu</a>>here</a>.

And that's no good.

Any suggestions?
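
One possible direction, offered as a sketch rather than a tested fix: keep
the \b-style match but put a negative lookbehind in front of the address, so
that an address preceded by a colon (as in mailto:), a double quote, or
another address character is skipped.  The sub name linkify and the exact
set of characters in the lookbehind are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap bare addresses in mailto links, leaving alone
# any address that directly follows "mailto:" in a hand-coded href.
sub linkify {
    my ($text) = @_;

    # (?<![\w\-:"]) is a zero-width negative lookbehind: the address may not
    # be preceded by a word character, a hyphen, a colon, or a double quote.
    # That rules out mailto:addr and mailto:"addr", and it also stops the
    # match from starting in the middle of a skipped address.
    $text =~ s#(?<![\w\-:"])([\w\-]+\@[\w\-]+\.[\w\-]+)\b#<a href="mailto:$1">$1</a>#g;

    return $text;
}

# An address at the very start of the string is now converted.
print linkify("karic\@lclark.edu is my email address."), "\n";

# A hand-coded mailto is left untouched.
print linkify("Email me <a href=mailto:karic\@lclark.edu>here</a>."), "\n";
```

Because the lookbehind is zero-width, it works at the start of the string
where \s cannot.  It only checks the one character before the address,
though, so an existing link whose visible text is itself an address
(<a ...>karic@lclark.edu</a>) would still get wrapped a second time.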

-kari.



Ben Marcotte wrote:
> 
> I think you need \b instead of \B.  \B is a zero-width assertion that matches
> everywhere _except_ at a word boundary.  There is a boundary right after "edu",
> so the final [\w\-]+ has to back off and give up the "u"; \B then succeeds
> between the "d" and the "u", and $1 ends at "ed".  Here's the code I would
> suggest:
> 
> $text = "Email me at karic\@lclark.edu, or at wwwadmin\@lclark.edu, dude.";
> $text =~ s#\b([\w\-]+\@[\w\-]+\.[\w\-]+)\b#<a href="mailto:$1">$1</a>#g;
> print $text;
> 
> Which includes the following changes:
> 1) Changed the leading \s to a more symmetrical \b.
> 2) Put a g (global replace) at the end, so you don't need a while loop.
> 3) Used # instead of / in the s/// expression so that we don't need to escape
> the / in </a> (not a big deal here but handy for bigger HTML chunks).
> 4) Put quotes around the mailto:.  Not a perl suggestion but a good HTML one.
> 5) Made it look like a highly efficient, but completely unreadable spew of line
> noise, but that's half the fun of perl regexes!
> 
> --- Kari Chisholm <karic at lclark.edu> wrote:
> >
> > Alright, this may be a moron question, but I can't figure it out.  I'm
> > trying to take a long string of text like this:
> >
> >       Email me at karic@lclark.edu, or at wwwadmin@lclark.edu, dude.
> >
> > And, make it output this:
> >
> >       Email me at <a href=mailto:karic@lclark.edu>karic@lclark.edu</a>, or
> >       at <a href=mailto:wwwadmin@lclark.edu>wwwadmin@lclark.edu</a>, dude.
> >
> > My code-snippet is this:
> >
> > $text = "Email me at karic\@lclark.edu, or at wwwadmin\@lclark.edu,
> > dude.";
> >
> > while ($text =~ /\s([\w\-]+\@[\w\-]+\.[\w\-]+)\B/)
> >       {
> >       $link = $1;
> >       print "$link\n";
> >       while ($link !~ /\w$/)
> >               {
> >               $link = substr ($link,0,length($link)-1);
> >               print "$link\n";
> >               }
> >       $text =~ s/$link/<a href=mailto:$link>$link<\/a>/;
> >       }
> >
> > print $text;
> >
> > The problem is that the output looks like this:
> >
> >       Email me at <a href=mailto:karic@lclark.ed>karic@lclark.ed</a>u,
> >       or at <a href=mailto:wwwadmin@lclark.ed>wwwadmin@lclark.ed</a>u, dude.
> >
> > How come the final [\w\-] doesn't match the u in edu?
> >
> > I'm stumped.
> >
> > -kari.
> >
> > --
> > Kari Chisholm
> > Creative Director for New Media
> > Lewis & Clark College
> > Portland, Oregon
> > http://www.lclark.edu
> > TIMTOWTDI
> 
> =====
> ---------------------------------------
> Ben Marcotte <ben_a_marcotte at yahoo.com>
> ---------------------------------------
> 

-- 
Kari Chisholm
Creative Director for New Media
Lewis & Clark College
Portland, Oregon
http://www.lclark.edu
TIMTOWTDI


