[Melbourne-pm] OT: Re: FW: Bamboozled by perl

Toby Corkindale toby.corkindale at strategicdata.com.au
Sun Oct 4 21:37:23 PDT 2009


Sam Watkins wrote:
> On Mon, Oct 05, 2009 at 11:52:25AM +1100, Toby Corkindale wrote:
>> Sam Watkins wrote:
>>>> text processing is where it really shines.
>>> # perl invocation to extract email addresses from text, 4 all ur spamming needs
>>> perl -ne 'print "$1\n" while /(\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b)/ig'
>> Which fails to match some email addresses.
>> You may want to use these CPAN modules, which follow the appropriate RFC:
> 
> It doesn't fail to match any email addresses that are actually used by anyone.

I have some friends whose email addresses would not be matched. Yes, 
they may have intentionally picked email addresses that are uncommon, 
but hey, it's in the standard, so why shouldn't they?

> The RFC-based regexps on email addresses are brain-damaged in the extreme, no
> one uses comments inside emails and all that crap.  One should follow what is
> actually done, not the RFC.

I don't think that "meeting the standard requirements" can be equated 
with "brain damaged"; in fact, I would say it's good software engineering, 
but apparently that's just a difference of opinion between us.

As you say, you were just giving an example, so let's drop it here.

Cheers,
Toby

