Regex stumper?

Ben Marcotte ben_a_marcotte at yahoo.com
Thu Jul 27 14:06:49 CDT 2000


I think you need \b instead of \B.  \B matches everywhere _except_ at a word
boundary.  Both are zero-width, so \B doesn't consume a character; it just
refuses to let the match end between the "u" and the ",".  The regex engine
then backtracks the final [\w\-]+ from "edu" to "ed", where the next character
is still a word character, and the "u" is left outside $1.  Here's the code I
would suggest:

$text = "Email me at karic\@lclark.edu, or at wwwadmin\@lclark.edu, dude.";
$text =~ s#\b([\w\-]+\@[\w\-]+\.[\w\-]+)\b#<a href="mailto:$1">$1</a>#g;
print $text;

Which includes the following changes:
1) Changed the leading \s to a more symmetrical \b.
2) Put a g (global replace) at the end, so you don't need a while loop.
3) Used # instead of / in the s/// expression so that we don't need to escape
the / in </a> (not a big deal here but handy for bigger HTML chunks).
4) Put quotes around the mailto:.  Not a perl suggestion but a good HTML one.
5) Made it look like a highly efficient, but completely unreadable spew of line
noise, but that's half the fun of perl regexes!
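
To see the difference between \B and \b directly, here is a minimal sketch that
runs both patterns against the same string and prints what each one captures.
The variable names $with_B and $with_b are only for illustration; the patterns
are the ones from the original snippet and from the suggestion above.

```perl
use strict;
use warnings;

my $text = 'Email me at karic@lclark.edu, or at wwwadmin@lclark.edu, dude.';

# Original pattern: \B demands a NON-boundary right after the capture.
# Between "u" and "," there is a boundary, so the engine backtracks the
# last [\w\-]+ to "ed", where the next character ("u") is a word character.
my ($with_B) = $text =~ /\s([\w\-]+\@[\w\-]+\.[\w\-]+)\B/;

# Suggested pattern: \b demands a boundary, which is exactly what sits
# between "u" and ",", so the whole address is captured.
my ($with_b) = $text =~ /\b([\w\-]+\@[\w\-]+\.[\w\-]+)\b/;

print "with \\B: $with_B\n";
print "with \\b: $with_b\n";
```

The first line should show the address cut off at "ed", the second the full
address ending in "edu".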


--- Kari Chisholm <karic at lclark.edu> wrote:
> 
> Alright, this may be a moron question, but I can't figure it out.  I'm
> trying to take a long string of text like this:
> 
> 	Email me at karic@lclark.edu, or at wwwadmin@lclark.edu, dude.
> 
> And, make it output this:
> 
> 	Email me at <a href=mailto:karic@lclark.edu>karic@lclark.edu</a>, or
> 	at <a href=mailto:wwwadmin@lclark.edu>wwwadmin@lclark.edu</a>, dude.
> 
> My code-snippet is this:
> 
> $text = "Email me at karic\@lclark.edu, or at wwwadmin\@lclark.edu, dude.";
> 
> while ($text =~ /\s([\w\-]+\@[\w\-]+\.[\w\-]+)\B/)
> 	{
> 	$link = $1;
> 	print "$link\n";
> 	while ($link !~ /\w$/)
> 		{
> 		$link = substr ($link,0,length($link)-1);
> 		print "$link\n";
> 		}
> 	$text =~ s/$link/<a href=mailto:$link>$link<\/a>/;
> 	}
> 
> print $text;
> 
> The problem is that the output looks like this:
> 
> 	Email me at <a href=mailto:karic@lclark.ed>karic@lclark.ed</a>u,
> 	or at <a href=mailto:wwwadmin@lclark.ed>wwwadmin@lclark.ed</a>u, dude.
> 
> How come the final [\w\-] doesn't match the u in edu?
> 
> I'm stumped.  
> 
> -kari.
> 
> -- 
> Kari Chisholm
> Creative Director for New Media
> Lewis & Clark College
> Portland, Oregon
> http://www.lclark.edu
> TIMTOWTDI


=====
---------------------------------------
Ben Marcotte <ben_a_marcotte at yahoo.com>
---------------------------------------




More information about the Pdx-pm-list mailing list