<html><head><style type="text/css"><!-- DIV {margin:0px;} --></style></head><body><div style="font-family:arial,helvetica,sans-serif;font-size:10pt"><div>try Text::Soundex</div><div style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"><br><div style="font-family: arial,helvetica,sans-serif; font-size: 10pt;"><font size="2" face="Tahoma"><hr size="1"><b><span style="font-weight: bold;">From:</span></b> Richard Reina &lt;richard@rushlogistics.com&gt;<br><b><span style="font-weight: bold;">To:</span></b> chicago-talk@pm.org<br><b><span style="font-weight: bold;">Sent:</span></b> Wed, February 2, 2011 7:55:36 AM<br><b><span style="font-weight: bold;">Subject:</span></b> [Chicago-talk] Regular expression discussion.<br></font><br>
Tired of shoveling snow? Well, sit right down and let's have a regex discussion. I have a Perl script that at the moment just uses grep to look through text files that have been converted from PDF by pdf2text to see what sort of documents they are. What I am finding, however, is that a lot of searches fail by just a few characters. <br>For example, if I am looking for "This first document is a contract between", the text string in the file might look like this: <br>"This tirst document is a coniract betweeo", and the grep search fails. However, as you can see, these two statements are 93% alike. Is there a way with Perl regular expressions to match strings that are, say, 90, 95, or 98% alike?<br><br>Any ideas would be greatly appreciated.<br><br>Stay Warm!<br>-- <br>Richard Reina<br>Rush Logistics, Inc.<br>Watch our 3 minute movie: <br><span><a target="_blank"
href="http://www.rushlogistics.com/movie">http://www.rushlogistics.com/movie</a></span><br></div></div>
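<div><br>For what it's worth, one way to put a number on "alike" is edit distance: count the single-character insertions, deletions, and substitutions needed to turn one string into the other, then divide by the length of the longer string. Below is a minimal pure-Perl sketch under that assumption. The names edit_distance and similarity are made up for illustration and do not come from any module, and it compares two whole strings rather than scanning a file. CPAN modules such as Text::Levenshtein or String::Approx cover similar ground and are probably a better starting point for real use.</div>
<pre>
```perl
use strict;
use warnings;
use List::Util qw(min max);

# Levenshtein edit distance: minimum number of single-character
# insertions, deletions, or substitutions to turn $s into $t.
sub edit_distance {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;
    my @prev = (0 .. scalar @t);    # row for the empty prefix of $s
    for my $i (1 .. scalar @s) {
        my @cur = ($i);
        for my $j (1 .. scalar @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            push @cur, min(
                $prev[$j] + 1,            # deletion
                $cur[$j - 1] + 1,         # insertion
                $prev[$j - 1] + $cost,    # substitution or match
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Similarity as a fraction of the longer string (1 means identical).
# This definition is an assumption; other normalizations exist.
sub similarity {
    my ($s, $t) = @_;
    my $longest = max(length $s, length $t);
    return 1 if $longest == 0;
    return 1 - edit_distance($s, $t) / $longest;
}

my $want = "This first document is a contract between";
my $got  = "This tirst document is a coniract betweeo";

printf "%.1f%%\n", 100 * similarity($want, $got);    # prints 92.7%
print "close enough\n" if similarity($want, $got) >= 0.90;
```
</pre>
<div>The two strings from the example are 41 characters long and differ by 3 substitutions, which is where the roughly 93% figure comes from. The 0.90 cutoff is only a placeholder for whichever threshold (90, 95, or 98%) suits the documents.</div>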
</div></body></html>