[sf-perl] shuffling large numbers of image files
kvale at phy.ucsf.edu
Mon Oct 16 10:24:42 PDT 2006
On Oct 16, 2006, at 9:57 AM, Rich Morin wrote:
> Even ignoring the file names, the image files can be matched up by
> their content. For example, I can create an MD5 checksum for each
> file, look for matching checksums, and then (as a safety net) do a
> bit-for-bit comparison of putative duplicates.
> However, a report of all duplicate files might well swamp the user
> in data. It would be better to identify and present duplicate (or
> evolving) folders and let the user determine which one(s) to save.
> Although the identification part is a bit tricky, I'm sure that I
> can handle that part. The hard part, however, is deciding exactly
> what information to present and how to present it. Suggestions?
> Also, any other ideas on approaches and/or tools are solicited.
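For reference, the checksum-then-compare approach described in the quoted message might look roughly like this in Perl. This is only a sketch under some assumptions: the subroutine names md5_of and find_duplicates are made up for illustration, every regular file under the given directories is treated as a candidate, and only the core modules Digest::MD5, File::Find, and File::Compare are used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5;
use File::Compare qw(compare);
use File::Find qw(find);

# Return the hex MD5 digest of a file's contents.
# (md5_of is a hypothetical helper name.)
sub md5_of {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# Walk the given directories and return a list of duplicate groups.
# Each group is an array ref of two or more paths whose contents are
# byte-for-byte identical. (find_duplicates is a hypothetical name.)
sub find_duplicates {
    my (@dirs) = @_;

    # Step 1: bucket every regular file by its MD5 digest.
    my %by_digest;
    find({
        no_chdir => 1,    # keep $_ as the full path
        wanted   => sub {
            return unless -f $_;
            push @{ $by_digest{ md5_of($_) } }, $_;
        },
    }, @dirs);

    # Step 2: within each bucket, confirm matches with a full
    # comparison, as a safety net against digest collisions.
    my @groups;
    for my $digest (sort keys %by_digest) {
        my @remaining = sort @{ $by_digest{$digest} };
        next if @remaining < 2;
        while (@remaining) {
            my $first = shift @remaining;
            my (@same, @different);
            for my $other (@remaining) {
                # compare() returns 0 when the files are identical.
                if (compare($first, $other) == 0) {
                    push @same, $other;
                }
                else {
                    push @different, $other;
                }
            }
            push @groups, [ $first, @same ] if @same;
            @remaining = @different;
        }
    }
    return @groups;
}

# Print each group when directories are given on the command line.
if (@ARGV) {
    for my $group (find_duplicates(@ARGV)) {
        print join("\n  ", @$group), "\n";
    }
}
```

Saved under some name such as dupes.pl (also hypothetical), it could be run as "perl dupes.pl ~/Pictures /Volumes/Backup". Rolling the per-file groups up into per-folder summaries would be a separate step on top of this.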
Although not a Perl solution, the open source program imgSeek is
fairly good at detecting both duplicate and similar images and can
also do simple photo organization.
For a Perl duplicate image detector, Whatpix
(http://whatpix.sourceforge.net/) is a simple app that works fine.