[sf-perl] shuffling large numbers of image files

Mark Kvale kvale at phy.ucsf.edu
Mon Oct 16 10:24:42 PDT 2006

On Oct 16, 2006, at 9:57 AM, Rich Morin wrote:

> Even ignoring the file names, the image files can be matched up by
> their content.  For example, I can create an MD5 checksum for each
> file, look for matching checksums, and then (as a safety net) do a
> bit-for-bit comparison of putative duplicates.
> However, a report of all duplicate files might well swamp the user
> in data.  It would be better to identify and present duplicate (or
> evolving) folders and let the user determine which one(s) to save.
> Although the identification part is a bit tricky, I'm sure that I
> can handle that part.  The hard part, however, is deciding exactly
> what information to present and how to present it.  Suggestions?
> Also, any other ideas on approaches and/or tools are solicited.

Although not a Perl solution, the open source program imgSeek is
fairly good at detecting both duplicate and similar images and can
also do simple photo organization.


For a Perl duplicate image detector, Whatpix
(http://whatpix.sourceforge.net/) is a simple app that works fine.
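
For anyone who wants to roll their own, the checksum-then-compare
approach Rich describes in the quoted message might look roughly like
the sketch below. It is only an illustration, not how Whatpix or imgSeek
work internally. The helper names md5_of and find_duplicates are made up
for this example; Digest::MD5, File::Find and File::Compare are core
modules. It assumes the files are small enough to be read more than once
and makes no attempt at the folder-level grouping.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5;
use File::Compare;   # compare() returns 0 when two files are identical
use File::Find;

# Hypothetical helper: hex MD5 digest of a file's contents.
sub md5_of {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# Hypothetical helper: walk the given directories and return a list of
# array refs, each holding the paths of files with identical contents.
sub find_duplicates {
    my @dirs = @_;
    my %by_digest;

    # Pass 1: bucket every plain file by its checksum.
    find({ no_chdir => 1, wanted => sub {
        return unless -f $_;
        push @{ $by_digest{ md5_of($_) } }, $_;
    } }, @dirs);

    # Pass 2 (the safety net): within each bucket, keep only files
    # that also match byte for byte.
    my @groups;
    for my $digest (sort keys %by_digest) {
        my @candidates = sort @{ $by_digest{$digest} };
        while (@candidates) {
            my $first = shift @candidates;
            my (@same, @rest);
            for my $other (@candidates) {
                if (compare($first, $other) == 0) { push @same, $other }
                else                              { push @rest, $other }
            }
            push @groups, [ $first, @same ] if @same;
            @candidates = @rest;
        }
    }
    return @groups;
}

# Usage: perl dupes.pl DIR [DIR ...]
# Prints each group of duplicates, first path flush left, rest indented.
if (@ARGV) {
    print join("\n  ", @$_), "\n" for find_duplicates(@ARGV);
}
```

Bucketing by checksum first means the expensive byte-for-byte
comparison only runs on files that already look identical.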

