[Omaha.pm] A regex "best fit" finder?

Christopher Cashell topher-pm at zyp.org
Thu Sep 29 16:03:11 PDT 2011


2011/9/29 Dan Linder <dan at linder.org>:
> Yeah, my simple list was very simple compared to the actual list of files.
>  I envision the final solution would be a number of match strings strung
> together with "|"...

Do you have a complete list of hostnames that you want to match against?

If your list is that big, but you do have the full list, you could use
Regexp::Trie (or the slightly slower, but potentially more flexible
Regexp::Assemble).  Both are available from CPAN.

Using one of those, you can read in the list of items from an external
names file, and programmatically add them to your regex.

Basically, you'd do something like:

----------------------------------------------------------------------
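use strict;
use warnings;
use feature 'say';    # say() needs Perl 5.10 or later
use Regexp::Trie;

# Load the hostnames, one per line, into the trie.
open my $regex_fh, '<', './hostnames'
  or die "Couldn't open ./hostnames: $!";
my $rt = Regexp::Trie->new;

while (<$regex_fh>) {
  chomp;
  next unless length;    # skip blank lines
  $rt->add($_);
}
close $regex_fh;
my $super_regex = $rt->regexp;

# Print every input line (STDIN or files named on the command line)
# that contains one of the hostnames.
while (<>) {
  chomp;
  say if /$super_regex/;
}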
----------------------------------------------------------------------
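
If you'd rather not pull in a CPAN module, here is a rough core-Perl sketch of the "strung together with |" approach you described. The helper name build_alternation and the sample hostnames are made up for illustration. It escapes each name with quotemeta and sorts longest-first so that a longer name is tried before any shorter name that is a prefix of it. Unlike the modules above, it does not merge common prefixes.

```perl
use strict;
use warnings;

# Hypothetical helper: build one alternation regex from literal names.
sub build_alternation {
    # Drop empty strings, then put longer names first so that
    # "web10" is tried before "web1".
    my @names = sort { length($b) <=> length($a) or $a cmp $b }
                grep { defined && length } @_;

    # With no names, return a pattern that can never match,
    # rather than an empty pattern that matches everything.
    return qr/(?!)/ unless @names;

    # quotemeta escapes regex metacharacters such as "." in hostnames.
    my $alt = join '|', map { quotemeta } @names;
    return qr/(?:$alt)/;
}

# Sample data, made up for this example.
my $re = build_alternation(qw(web1 web10 db.example.com));
print "matched: $1\n" if "ping web10 ok" =~ /($re)/;   # prints "matched: web10"
```

The resulting $re can be dropped into the matching loop in place of $super_regex. For a very large list I would still expect the trie-based modules to produce a more compact pattern.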

> Thanks, I'll keep looking.
> Dan

-- 
Christopher

