LPM: Regexps for email.

Joe Hourcle oneiros at dcr.net
Wed Jan 26 20:40:05 CST 2000

On Wed, 26 Jan 2000, Mike Andrews wrote:

> Not that this really answers the question, but I'm a little wary of using
> mailto: URLs *anywhere* anymore.  The instant you put an email address on
> a web page as-is is the same instant you get added to a pile of spammers'
> lists.  Most of them use web-crawling bots to harvest addresses.  We've
> got an amusing Apache mod_rewrite + Perl script combo that defeats most of
> them here...
> Just something to think about.
> http://www.turnstep.com/Spambot/avoidance.html has some interesting
> suggestions for getting around it -- keeping the email address out of the
> HTML source while still having the page look and work the same.  JavaScript,
> creative use of tables, putting the address into a .gif, and so on...

One interesting approach I've seen is URI encoding e-mail addresses. Web
browsers will decode them, but the encoded form won't match the patterns
harvesters use to find e-mail addresses (unless the harvesters get smarter).

	<A HREF="mailto:oneiros@dcr.net">oneiros@dcr.net</A>

won't match if it's encoded as:
	<A HREF="mailto:oneiros%40dcr.net">oneiros&#64;dcr.net</A>

(After all, if the spammers can send us crap that's URI encoded to make it
harder to track down, we might as well do the same to them.)
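
For anyone who wants to generate these links rather than hand-edit them, here
is a minimal sketch in Python. The helper name encode_mailto, the example.com
address, and the NAIVE_EMAIL pattern are all assumptions made for illustration;
real harvesters may use smarter patterns, or decode the page first, in which
case this does nothing.

```python
import re
from html import escape

def encode_mailto(address):
    """Build an <A> tag with no literal '@' anywhere in the HTML source.

    Hypothetical helper: the href percent-encodes '@' as %40, and the
    visible text uses the numeric character reference &#64;.
    """
    href = address.replace("@", "%40")
    text = escape(address).replace("@", "&#64;")
    return f'<A HREF="mailto:{href}">{text}</A>'

# A deliberately naive pattern of the sort a simple harvester might use.
# This is an assumption, not a description of any particular spambot.
NAIVE_EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

plain = '<A HREF="mailto:user@example.com">user@example.com</A>'
encoded = encode_mailto("user@example.com")

print(encoded)
print(NAIVE_EMAIL.findall(plain))    # the plain link exposes the address
print(NAIVE_EMAIL.findall(encoded))  # the encoded link has no '@' to match
```

The browser turns %40 and &#64; back into '@' when it renders the page or
follows the link, so the visitor sees and clicks a normal address; only the
raw source differs.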

Joe Hourcle
