[oak perl] Regular Expressions
David Fetter
david at fetter.org
Thu Mar 11 13:25:35 CST 2004
On Thu, Mar 11, 2004 at 08:52:52AM -0800, Belden Lyman wrote:
> On Wed, 2004-03-10 at 17:51, David Fetter wrote:
> > On Wed, Mar 10, 2004 at 05:10:46PM -0800, Belden Lyman wrote:
> >
> > Belden, hats off for the ingenious use of regex, but...I don't
> > quite get this approach. Why try to cram it all into one regex?
> >
>
> To prove to myself that it can all be done in one regex.
Cool :)
> > Here's how I'd do a thing like this. I suppose I lose obfuscation
> > points, but it's easy to use, understand, modify, maintain, &c.,
> > and it's bumpin' fast.
>
> Sure, I wouldn't use anything like the above (err, the snipped?)
heh
> in production code, exactly for the reasons you mentioned. It was
> an exercise, not much more.
Roight.
> > #!/usr/bin/perl -wl
> > use strict;
> > use warnings;
> > use Getopt::Long;
> >
> > my $file = '/usr/dict/words';
> > my $length = 2;
> > my $result = GetOptions(
> >     "length=i" => \$length
> >   , "file=s"   => \$file
> > );
> >
> > open F, "<$file" or die "Couldn't open $file: $!\n";
> > while(<F>) {
> >     chomp;
>     next unless /^\w+$/; # ignore contractions
> >     next unless length == $length; # Quickly removes most things we don't want.
> >     next if $_ eq uc($_); # No shouting.
> >     next unless /[aeiouwy]/io; # cwm is a word.
> >     # more simple tests, if needed.
> >     print;
> > }
> > close F;
>
> Benchmarking certainly upholds your claim of bumpin' fast!
:)
I suspect it might be even faster if you put the length test first.
The algorithmic principle behind mine is that the cheap tests which
discard the most go first. Think of
an increasingly fine-meshed set of filters on a water intake. First,
something that excludes furniture, then something that excludes
bottles & cans, then bits of paper, then sand, then volatile organics,
sulfur...
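
Here is a minimal sketch of that ordering, with the length test moved
to the front as suggested above. The keep_word() helper and the sample
word list are made up for illustration and are not part of the script
quoted above; whether this ordering is actually faster on your
dictionary is something to benchmark, not assume.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the quoted script): returns true if
# $word survives every filter. The coarse, cheap test runs first so
# that most words are rejected before any regex is tried.
sub keep_word {
    my ($word, $length) = @_;
    return 0 unless length($word) == $length;   # cheap integer compare
    return 0 unless $word =~ /^\w+$/;           # ignore contractions
    return 0 if $word eq uc $word;              # no shouting
    return 0 unless $word =~ /[aeiouwy]/i;      # cwm is a word
    return 1;
}

# Made-up sample input standing in for the dictionary file.
my @sample = qw(at TV cwm don't zz ox);
print "$_\n" for grep { keep_word($_, 2) } @sample;
```

With a length of 2 this keeps "at" and "ox": "TV" is dropped as
shouting, "zz" has no vowel, and the rest fail the length test.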
Cheers,
D
--
David Fetter david at fetter.org http://fetter.org/
phone: +1 510 893 6100 mobile: +1 415 235 3778
Remember to vote!