[mplspm]: finding plagiarism

Ken Williams ken at mathforum.org
Wed Mar 13 10:29:26 CST 2002


On Tuesday, March 12, 2002, at 07:04 PM, Thomas Eibner wrote:

> On Tue, Mar 12, 2002 at 07:43:54PM -0500, Dan Oelke wrote:
> [..]
>> Any other ideas are appreciated.  Is there one of the search/matching
>> modules that might work better than others?  If I can't find something
>> I'll probably write it and put it out as my first real module of
>> something I can actually release.
>
> I wonder if Ken's AI::Categorize would be of any help, at least for
> sorting out parts of the documents to go through?

That may be.  It might be handy as the first pass of a two-pass system:
categorize first to narrow down which documents to compare in detail.

My first inclination would be to use something like Algorithm::Diff's
LCS (Longest Common Subsequence) function.  You could compute the LCS of
the current document against each document in a corpus of existing ones;
an unusually long common subsequence would point to likely copying.
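
A minimal sketch of that idea, assuming Algorithm::Diff is installed
from CPAN and comparing at the word level.  The tokenize and
overlap_score helpers, the sample corpus, and the 0.5 threshold are all
made up for illustration; only LCS itself comes from the module.

```perl
use strict;
use warnings;
use Algorithm::Diff qw(LCS);

# Hypothetical helper: split text into lowercased word tokens.
sub tokenize {
    my ($text) = @_;
    my @words = map { lc $_ } ($text =~ /(\w+)/g);
    return \@words;
}

# Hypothetical helper: fraction of the candidate's words that show up,
# in the same order, in an existing document
# (LCS length divided by candidate length).
sub overlap_score {
    my ($candidate, $existing) = @_;
    my $cand = tokenize($candidate);
    return 0 unless @$cand;
    my @lcs = LCS($cand, tokenize($existing));
    return scalar(@lcs) / scalar(@$cand);
}

# Made-up corpus, keyed by document name.
my %corpus = (
    essay_a => 'A quick brown fox leaps over a lazy dog',
    essay_b => 'Perl modules are distributed through CPAN',
);
my $current = 'The quick brown fox jumps over the lazy dog';

# Arbitrary cutoff for illustration; tune it on real data.
my $threshold = 0.5;

for my $name (sort keys %corpus) {
    my $score = overlap_score($current, $corpus{$name});
    printf "%-8s %.2f%s\n", $name, $score,
        $score >= $threshold ? '  <- worth a closer look' : '';
}
```

Dividing by the candidate's length keeps scores comparable across
documents of different sizes.  Because the tokenizer drops case and
punctuation, trivial edits of that kind do not lower the score.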

Plus, the module is by Ned Konz (perl at bike-nomad.com), so that's cool.


  -Ken



--------------------------------------------------
Minneapolis Perl Mongers mailing list

To unsubscribe, send mail to majordomo at pm.org
with "unsubscribe mpls" in the body of the message.


