[mplspm]: finding plagiarism
Ken Williams
ken at mathforum.org
Wed Mar 13 10:29:26 CST 2002
On Tuesday, March 12, 2002, at 07:04 PM, Thomas Eibner wrote:
> On Tue, Mar 12, 2002 at 07:43:54PM -0500, Dan Oelke wrote:
> [..]
>> Any other ideas are appreciated. Is there one of the search/matching
>> modules that might work better than others? If I can't find something
>> I'll probably write it and put it out as my first real module of
>> something I can actually release.
>
> I wonder if Ken's AI::Categorize would be of any help, at least for
> sorting out parts of the documents to go through?
That may be. It might be a handy part of a two-pass system or something.
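My first inclination would be to use something like Algorithm::Diff's
LCS (Longest Common Subsequence) function. You could find the LCS of
the current document against each document in a corpus of existing
ones. An LCS that covers a large share of the current document would
flag that corpus document as a likely source.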
Plus, the module is by Ned Konz (perl at bike-nomad.com), so that's cool.
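
A minimal sketch of the LCS idea, assuming documents are compared word by word. Only `LCS` comes from Algorithm::Diff; the `tokenize` and `lcs_overlap` helpers, the sample corpus, and the choice to score by the fraction of shared words are hypothetical and here for illustration only.

```perl
use strict;
use warnings;
use Algorithm::Diff qw(LCS);

# Split text into lowercase word tokens so the comparison ignores
# case, punctuation, and line wrapping. (Hypothetical helper.)
sub tokenize {
    my ($text) = @_;
    return [ map { lc $_ } $text =~ /(\w+)/g ];
}

# Fraction of the candidate's words that also appear, in the same
# order, in the source document. (Hypothetical scoring rule.)
sub lcs_overlap {
    my ($candidate, $source) = @_;
    my $cand_words = tokenize($candidate);
    my $src_words  = tokenize($source);
    return 0 unless @$cand_words;

    # In list context, LCS returns the elements of the longest
    # common subsequence of the two arrays.
    my @common = LCS($cand_words, $src_words);
    return scalar(@common) / scalar(@$cand_words);
}

# Made-up corpus: document name => text.
my %corpus = (
    essay_a => "The quick brown fox jumps over the lazy dog near the river bank",
    essay_b => "Perl modules are distributed through the CPAN archive network",
);
my $submission = "A quick brown fox jumps over a lazy dog";

# Print one overlap score per corpus document.
for my $name (sort keys %corpus) {
    printf "%-8s %.2f\n", $name, lcs_overlap($submission, $corpus{$name});
}
```

A high score only says that many words occur in the same order in both documents, so it is a reason to look closer rather than proof of copying. On a large corpus, comparing against every document this way could get slow, which is where a first categorization pass might help.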
-Ken
--------------------------------------------------
Minneapolis Perl Mongers mailing list
To unsubscribe, send mail to majordomo at pm.org
with "unsubscribe mpls" in the body of the message.