[Omaha.pm] A regex "best fit" finder?

Sterling Hanenkamp sterling at hanenkamp.com
Thu Sep 29 14:07:25 PDT 2011


Nothing like that comes to mind. If the regex has to match only the names
in the predefined list, most of your examples wouldn't do that: they also
match names that aren't in the list. If it's just meant to be a heuristic to
help you throw something out, it would depend on the heuristic.
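
To show how loose the character-class pattern from the question is, here is
a minimal sketch. The three extra names in @made_up are hypothetical, picked
only because they are not in the example list, and the ^ and $ anchors are
my addition so that the pattern has to cover the whole name.

```perl
use strict;
use warnings;

# The eight server names from the example list in the question.
my @servers = qw(
    OMAWWW001 OMAWWW002 OMADNS001
    ORDWWW001 ORDWWW002 ORDWWW003
    ORDDNS001 ORDDNS002
);
my %in_list = map { $_ => 1 } @servers;

# The character-class regex from the question, anchored at both ends
# (the anchors are an assumption, not part of the original pattern).
my $loose = qr/^O[MR][AD][WD][WN][WS]00[123]$/;

# Hypothetical names that are NOT in the list.
my @made_up = qw(OMAWWW003 OMDDNS001 ORAWNS002);

for my $name (@servers, @made_up) {
    my $verdict = $name =~ $loose    ? 'matches'  : 'no match';
    my $listed  = $in_list{$name}    ? 'in list'  : 'NOT in list';
    print "$name: $verdict, $listed\n";
}
```

The pattern has five two-way character classes and one three-way class, so
it accepts 2*2*2*2*2*3 = 96 different names, while the list holds only eight.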

Personally, I'd probably use a tied hash or maybe MongoDB or something
similar, fill that with the list and then hit the database for a
verification. The database "hash table" could be reloaded every week.

This is old school, but it works quite well for this simple task:

# Common code (used by both the loader and the checker)
use DB_File;   # also exports the O_* flags and $DB_HASH
tie %valid, 'DB_File', 'valid_things.db', O_RDWR|O_CREAT, 0644, $DB_HASH
    or die "Cannot open valid_things.db: $!";

# Load the latest list (assuming the input is one valid string per line)
%valid = ();
while (<>) { chomp; $valid{$_} = 1 }

# Check for valid strings
if ($valid{ $unvalidated_input }) { print "YES!\n" }
else { print "NO!\n" }
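
For anyone who wants to try this end to end, here is a self-contained sketch
of the same idea. The helper names open_valid, reload_valid and is_valid are
made up for illustration, and the database is written to a temporary
directory instead of a fixed valid_things.db so the demo cleans up after
itself. It assumes DB_File is available in your Perl build.

```perl
use strict;
use warnings;
use DB_File;                     # also exports the O_* flags and $DB_HASH
use File::Temp qw(tempdir);

# Hypothetical helper: tie a hash to an on-disk Berkeley DB hash file
# and hand back a reference to it.
sub open_valid {
    my ($path) = @_;
    tie my %valid, 'DB_File', $path, O_RDWR|O_CREAT, 0644, $DB_HASH
        or die "Cannot open $path: $!";
    return \%valid;
}

# Hypothetical helper: replace the stored list with a fresh one
# (the weekly reload).
sub reload_valid {
    my ($valid, @names) = @_;
    %$valid = ();
    $valid->{$_} = 1 for @names;
    return scalar @names;
}

# Hypothetical helper: exact-match lookup, returns 1 or 0.
sub is_valid {
    my ($valid, $name) = @_;
    return exists $valid->{$name} ? 1 : 0;
}

# Demo with the server names from the question.
my $dir   = tempdir(CLEANUP => 1);
my $valid = open_valid("$dir/valid_things.db");
reload_valid($valid, qw(
    OMAWWW001 OMAWWW002 OMADNS001
    ORDWWW001 ORDWWW002 ORDWWW003
    ORDDNS001 ORDDNS002
));

for my $name (qw(ORDWWW003 OMAWWW003)) {
    print "$name: ", (is_valid($valid, $name) ? "YES!" : "NO!"), "\n";
}
```

Unlike a compact regex, the lookup is exact: only names that were loaded
come back as valid, and the on-disk file can grow to many thousands of
entries without anything having to fit into a single pattern string.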


2011/9/29 Dan Linder <dan at linder.org>

> I have a list of server names that I want to create a regex match against.
>  It could be done by hand, but the list changes (adds, removes) on a weekly
> basis.
>
> Does anyone know of a program that can take a list of matches and create a
> regular expression that will match them?
>
> Example:
> OMAWWW001
> OMAWWW002
> OMADNS001
> ORDWWW001
> ORDWWW002
> ORDWWW003
> ORDDNS001
> ORDDNS002
>
> I guess the "shortest" match would be /O.......[123]/ but it's kinda
> 'loose'.
>
> I *think* what I'd like is something like this:
> /O[MR][AD][WD][WN][WS]00[123]/
> (But a smarter regex tool might find something tighter...)
>
> What I *don't* want is: /OMAWWW001|OMAWWW002|...|ORDDNS002/
> I don't have enough space in my tool for a 10K long string! :)
>
> Any thoughts?
>
> Thanks,
> DanL
>
> --
> ***************** ************* *********** ******* ***** *** **
> "Quis custodiet ipsos custodes?"
>     (Who can watch the watchmen?)
>     -- from the Satires of Juvenal
> "I do not fear computers, I fear the lack of them."
>     -- Isaac Asimov (Author)
> ** *** ***** ******* *********** ************* *****************



-- 
Andrew Sterling Hanenkamp
sterling at hanenkamp.com
785.370.4454