<html>

  <head>


    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <font face="Helvetica, Arial, sans-serif">For your deduplication

      hackathon code entry, the output of your Perl app should be as

      follows:<br>

    </font>

    <ol>

      <li><font face="Helvetica, Arial, sans-serif">Each group of

          duplicates should be printed on a single line, with its

          filenames sorted and delimited by a tab character.</font></li>

      <li><font face="Helvetica, Arial, sans-serif">The lines of output

          should be sorted.</font></li>

      <li><font face="Helvetica, Arial, sans-serif">The sort you should

          use for both the lines of output and the file name groupings

          themselves is: </font><tt><big>sort { $a cmp $b }</big></tt></li>

      <li><font face="Helvetica, Arial, sans-serif">

          Any output before a delimiter of 30 dashes on its own line is

          ignored, as is any output after a second line of 30 dashes.

          Only the lines between the two delimiters are treated as

          results.  The delimiter lines are optional if your output

          consists solely of the sorted results.  Otherwise, use the

          space before the first delimiter for status messages or a

          status indicator (progress bar, etc.), and optionally follow

          the second delimiter with a summary of what your code

          encountered.  See the example at the bottom of this message.<br>

        </font></li>

    </ol>
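    <font face="Helvetica, Arial, sans-serif">The four rules above can be
      sketched in Perl roughly as follows.  This is a minimal illustration,
      not the reference implementation: the <tt>format_report</tt> name and
      the sample <tt>@groups</tt> data are assumptions made up for the
      example.<br>
    </font>
    <pre>
```perl
use strict;
use warnings;

# Hypothetical input: each inner array ref holds the paths of one group of
# identical files, in whatever order the scan happened to find them.
my @groups = (
    [ './foo', './baz', './bar' ],
    [ './.git/refs/remotes/origin/master', './.git/refs/heads/master' ],
    [ './.git/logs/refs/heads/master', './.git/logs/HEAD' ],
);

# format_report is an illustrative name, not part of dupfind.
sub format_report {
    my @dupe_groups = @_;

    # Rules 1 and 3: sort the names inside each group, then join with tabs.
    my @lines = map { join "\t", sort { $a cmp $b } @$_ } @dupe_groups;

    # Rules 2 and 3: sort the finished lines with the same comparison.
    @lines = sort { $a cmp $b } @lines;

    # Rule 4: wrap the results in two delimiter lines of 30 dashes.
    my $delimiter = '-' x 30;

    return join "\n", $delimiter, @lines, $delimiter, '';
}

print "** DISPLAYING OUTPUT\n";    # status text before the first delimiter
print format_report(@groups);
```
    </pre>
    <font face="Helvetica, Arial, sans-serif">Anything printed before the
      first delimiter or after the second one, such as the status line
      here, falls outside the compared results.<br>
      <br>
    </font>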

    <font face="Helvetica, Arial, sans-serif">Your code may output

      whatever it wants by default, so long as there is a way to invoke it

      that produces output conforming to the spec outlined above.<br>

      <br>

      An example is shown below, both as text and in the screenshot

      that follows.  This output was generated by the code found on

      GitHub at <a href="https://github.com/tommybutler/dupfind">https://github.com/tommybutler/dupfind</a><br>

    </font><br>

    <font face="Helvetica, Arial, sans-serif"><font face="Helvetica,

        Arial, sans-serif">In just a few minutes I will post, on GitHub

            at the same URL, the correct output for the reference data

            that is currently on the contest server under /dedup.

            </font><b><i><font

            face="Helvetica, Arial, sans-serif">Please take time to

            compare your code output to the output of the "reference

            design" code on github. If your output is not identical,

            then you will be disqualified for producing incorrect

            results.  </font></i></b><font face="Helvetica, Arial,

        sans-serif">If you believe the reference design is incorrect,

        then please submit a bug report and/or a patch!!</font><br>

      <br>

      --Tommy Butler<br>

    </font>


    <hr size="2" width="100%">

    <blockquote><big><tt>$ ./dupfind --format robot --dir .<br>

          ** SCANNING ALL FILES<br>

          ** CHECKSUMMING SIZE DUPLICATES<br>

          ** DISPLAYING OUTPUT<br>

          ------------------------------<br>

          ./.git/logs/HEAD    ./.git/logs/refs/heads/master<br>

          ./.git/refs/heads/master    ./.git/refs/remotes/origin/master<br>

          ./bar    ./baz    ./foo<br>

          ------------------------------<br>

          ** TOTAL SCANNED: 86<br>

          ** TOTAL DUPES:   4<br>

          ** SCAN TIME:     0.00824308 wallclock secs ( 0.00 usr +  0.01

          sys =  0.01 CPU)<br>

          ** DELETION TIME: 0</tt></big><br>

    </blockquote>
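    <font face="Helvetica, Arial, sans-serif">To compare your output with
      the reference output, one approach is to keep only the lines between
      the two delimiters and compare those.  The sketch below is only an
      illustration: <tt>extract_results</tt> is a hypothetical helper, the
      sample lines are a shortened copy of the example above, and
      <tt>@reference</tt> is a stand-in for the real reference output.  It
      assumes that output with no delimiter lines consists entirely of
      results.<br>
    </font>
    <pre>
```perl
use strict;
use warnings;

# extract_results is an illustrative name, not part of dupfind.
sub extract_results {
    my @lines = @_;
    my $delimiter = '-' x 30;

    # Positions of the delimiter lines, if any.
    my @marks = grep { $lines[$_] eq $delimiter } 0 .. $#lines;

    # Assumption: with no delimiters, every line is a result line.
    return @lines unless @marks;

    # Keep what follows the first delimiter, up to the second if present.
    my $first = $marks[0] + 1;
    my $last  = @marks > 1 ? $marks[1] - 1 : $#lines;

    return @lines[ $first .. $last ];
}

# Shortened sample of the example output, one element per line.
my @sample = (
    '** SCANNING ALL FILES',
    '-' x 30,
    "./bar\t./baz\t./foo",
    '-' x 30,
    '** TOTAL DUPES:   4',
);

my @mine      = extract_results(@sample);
my @reference = ("./bar\t./baz\t./foo");    # hypothetical reference output

my $same = join("\n", @mine) eq join("\n", @reference);
print $same ? "identical\n" : "DIFFERENT\n";
```
    </pre>
    <font face="Helvetica, Arial, sans-serif">In practice the two line
      lists would come from your program's captured output and from the
      reference output file, read one line at a time with the trailing
      newline removed.<br>
    </font>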

    <hr size="2" width="100%"><br>

    <img src="cid:part2.06000203.08030409@internetalias.net" alt=""><br>

  </body>

</html>