SPUG: Searching for multiple strings
greg heil
gheil at uswest.net
Thu Jul 6 13:42:43 CDT 2000
Andy
The problem you outlined is known as the "exact
set matching" problem. Actually it is slightly simpler
- as you do not need to know the location of all
matches, just the existence. However, the ESM is well
enough solved, Eg p52 of:
"Algorithms on Strings, Trees, and Sequences:
Computer Science and Computational Biology" '97 by
Dan Gusfield, http://wwwcsif.cs.ucdavis.edu/~gusfield/
It is solved there to O(n+m+k) with n the size of the
text, m the sum of the sizes of the patterns and k the
number of hits of P in T.
It sounds like your Perl solutions are having a hard
time achieving a reduction from O(nm+cm) to O(nm) - ie
effectively factoring out the compilation time for the
search automata.
If your problem is important enough you might want
to implement a better algorithm directly. It is not
clear to me how the designers of Perl could be given
syntactic clues signaling you want to solve an ESM;-)
Thus you may need to resort to a language that exposes
more of the hardwares capabilities.
--
greg heil
mailto:gheil at acm.org
http://www.scn.org/tl/anvil
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
POST TO: spug-list at pm.org PROBLEMS: owner-spug-list at pm.org
Seattle Perl Users Group (SPUG) Home Page: http://www.halcyon.com/spug/
Subscriptions; Email to majordomo at pm.org: ACTION LIST EMAIL
Replace ACTION by subscribe or unsubscribe, EMAIL by your Email-address
For full traffic, use spug-list for LIST ; otherwise use spug-list-digest
More information about the spug-list
mailing list