[DFW.pm] example deduplication code and full disclosure

Tommy Butler dfwpm at internetalias.net
Tue Dec 24 10:41:43 PST 2013


Full disclosure: I'm not competing in the contest, because John and I
are hosting it and wrote the code that generates the random dataset.

However, I wrote some working example code that I'd like to share, to
give a gentle push to anyone who is having trouble getting started.

Feel free to steal/fork/laugh at the code as much as you like.  The code
isn't extensively commented, but it is very readable.  It's also simple
and concise, and it makes use of CPAN modules, some of which use XS code
to get performance gains -- which is within the rules for the
"traditional Perl solution" competition category.
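
For readers who want a starting point, here is one common way to
approach deduplication. It is an illustration only, not a description of
what dupfind does: bucket files by size, then compare content digests
within each bucket. It assumes only core modules (File::Find and
Digest::MD5, which is normally XS-backed). The find_duplicates name is
made up for this example.

```perl
use strict;
use warnings;
use File::Find ();
use Digest::MD5 ();

# Hypothetical helper: returns a list of array refs, each holding the
# paths of regular files under @dirs whose contents hash identically.
sub find_duplicates {
    my (@dirs) = @_;
    return unless @dirs;

    # Pass 1: bucket regular files by size. A file with a unique size
    # cannot have a duplicate, so it never needs to be read.
    my %by_size;
    File::Find::find(
        {
            no_chdir => 1,    # keep $_ as the full path
            wanted   => sub {
                return if -l $_;        # skip symlinks
                return unless -f $_;    # regular files only
                push @{ $by_size{ -s $_ } }, $_;
            },
        },
        @dirs
    );

    # Pass 2: within each same-size bucket, group by MD5 digest.
    my @groups;
    for my $paths ( grep { @$_ > 1 } values %by_size ) {
        my %by_digest;
        for my $path (@$paths) {
            open my $fh, '<', $path or next;    # skip unreadable files
            binmode $fh;
            my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
            close $fh;
            push @{ $by_digest{$digest} }, $path;
        }
        push @groups, grep { @$_ > 1 } values %by_digest;
    }
    return @groups;
}

# Usage: perl dedup_sketch.pl /some/dir [/another/dir ...]
if (@ARGV) {
    print join( "\n", @$_ ), "\n\n" for find_duplicates(@ARGV);
}
```

Note that this sketch reports hard links to the same file as duplicates
of each other, since they have identical size and content.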

One caveat: my code purposely does not solve all the problems.  In
particular, it doesn't handle hard links.  That's up to you to solve.
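
As a hint rather than a full solution, one way to spot hard links is to
compare the device and inode numbers that stat returns: two paths with
the same pair refer to the same underlying file. The sketch below
assumes a POSIX-style filesystem where inode numbers are meaningful, and
the unique_inodes name is invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: given a list of paths, return only the first path
# seen for each distinct (device, inode) pair, so that hard links to the
# same file are collapsed into one entry.
sub unique_inodes {
    my (@paths) = @_;
    my ( %seen, @unique );
    for my $path (@paths) {
        my ( $dev, $ino ) = stat $path;
        next unless defined $ino;    # stat failed; skip this path
        push @unique, $path unless $seen{"$dev:$ino"}++;
    }
    return @unique;
}
```

Running candidate paths through a filter like this before comparing
contents keeps hard links from being counted as separate copies.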

https://github.com/tommybutler/dupfind

--Tommy Butler