<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<font face="Helvetica, Arial, sans-serif">For your deduplication
hackathon code entry, the output of your Perl app should be as
follows:<br>
</font>
<ol>
<li><font face="Helvetica, Arial, sans-serif">Each group of
duplicates should be printed on a single line, with the
filenames sorted and delimited by a tab character.</font></li>
<li><font face="Helvetica, Arial, sans-serif">The lines of output
should be sorted.</font></li>
<li><font face="Helvetica, Arial, sans-serif">The sort you should
use for both the lines of output and the filenames within each
group is: </font><tt><big>sort { $a cmp $b }</big></tt></li>
<li><font face="Helvetica, Arial, sans-serif">
Any output before a delimiter of 30 dashes on its own line
will be ignored. Any output after a second line consisting
of 30 dashes is also ignored. These delimiter lines are
optional if your output consists solely of the sorted
results and nothing else. Otherwise, use the space before
the first delimiter for status messages or a status
indicator (progress bar, etc.), and optionally follow the
second delimiter with a summary of what your code
encountered. See the example at the bottom of this message.<br>
</font></li>
</ol>
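<font face="Helvetica, Arial, sans-serif">As a rough sketch of the
rules above (this is not taken from dupfind), the output could be
produced along these lines. The sub name <tt>format_groups</tt>, the
sample file names, and the status text are made up for illustration.<br>
</font>
<pre>
```perl
use strict;
use warnings;

# Hypothetical helper (not part of dupfind): turn groups of duplicate
# file names into the output lines described by the rules above.
sub format_groups {
    my @groups = @_;    # each element is an array ref of file names

    # Sort the names inside each group, then join them with tabs.
    my @lines = map { join "\t", sort { $a cmp $b } @$_ } @groups;

    # Sort the finished lines with the same comparison.
    my @sorted = sort { $a cmp $b } @lines;
    return @sorted;
}

# Made-up sample data, deliberately out of order.
my @groups = (
    [ './foo', './baz', './bar' ],
    [ './b/copy.txt', './a/orig.txt' ],
);

my $delimiter = '-' x 30;    # 30 dashes on a line of their own

print "** status messages may go here\n";
print "$delimiter\n";
print "$_\n" for format_groups(@groups);
print "$delimiter\n";
print "** summary lines may go here\n";
```
</pre>
<font face="Helvetica, Arial, sans-serif">Sorting inside each group
first, and only then sorting the joined lines, applies the same
<tt>sort { $a cmp $b }</tt> at both levels.<br>
</font>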
<font face="Helvetica, Arial, sans-serif">Your code can otherwise
output whatever it wants, so long as there is some way to call it
that produces output according to the spec outlined above.<br>
<br>
An example is provided in the lines below, and in the screenshot
that follows. This output is generated by the code found on
GitHub at <a href="https://github.com/tommybutler/dupfind">https://github.com/tommybutler/dupfind</a><br>
</font><br>
<font face="Helvetica, Arial, sans-serif"><font face="Helvetica,
Arial, sans-serif"><font face="Helvetica, Arial, sans-serif"><font
face="Helvetica, Arial, sans-serif">In just a few minutes I
will post, on GitHub at the same URL, the correct output
for the reference data that is currently on the contest
server under /dedup. </font></font></font><b><i><font
face="Helvetica, Arial, sans-serif">Please take the time to
compare your code's output to the output of the "reference
design" code on GitHub. If your output is not identical,
you will be disqualified for producing incorrect
results. </font></i></b><font face="Helvetica, Arial,
sans-serif">If you believe the reference design is incorrect,
then please submit a bug report and/or a patch!!</font><br>
<br>
--Tommy Butler<br>
</font>
<hr size="2" width="100%">
<blockquote><big><tt>$ ./dupfind --format robot --dir .<br>
** SCANNING ALL FILES<br>
** CHECKSUMMING SIZE DUPLICATES<br>
** DISPLAYING OUTPUT<br>
------------------------------<br>
./.git/logs/HEAD ./.git/logs/refs/heads/master<br>
./.git/refs/heads/master ./.git/refs/remotes/origin/master<br>
./bar ./baz ./foo<br>
------------------------------<br>
** TOTAL SCANNED: 86<br>
** TOTAL DUPES: 4<br>
** SCAN TIME: 0.00824308 wallclock secs ( 0.00 usr + 0.01
sys = 0.01 CPU)<br>
** DELETION TIME: 0</tt></big><br>
</blockquote>
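<font face="Helvetica, Arial, sans-serif">To check your own output
against the reference output, something like the following could
work. It is only a sketch: the sub names <tt>extract_results</tt> and
<tt>same_results</tt> are hypothetical, and treating a lone delimiter
line as "results run to the end of the output" is an assumption,
since the rules do not cover that case.<br>
</font>
<pre>
```perl
use strict;
use warnings;

# Hypothetical helper: return only the result lines from a list of
# already-chomped output lines.
sub extract_results {
    my @lines     = @_;
    my $delimiter = '-' x 30;

    # Indexes of every line that is exactly 30 dashes.
    my @marks = grep { $lines[$_] eq $delimiter } 0 .. $#lines;

    # No delimiters: the output is taken to be nothing but results.
    return @lines unless @marks;

    my $first = $marks[0] + 1;
    # Assumption: with only one delimiter, results run to the end.
    my $last = @marks > 1 ? $marks[1] - 1 : $#lines;
    return @lines[ $first .. $last ];
}

# Hypothetical helper: true if two outputs carry identical results.
sub same_results {
    my ($mine, $reference) = @_;    # array refs of chomped lines
    my @got      = extract_results(@$mine);
    my @expected = extract_results(@$reference);
    return 0 unless @got == @expected;
    return join("\n", @got) eq join("\n", @expected);
}

# Sample data modelled on the example output above.
my @reference = (
    '** SCANNING ALL FILES',
    '-' x 30,
    "./bar\t./baz\t./foo",
    '-' x 30,
    '** TOTAL DUPES: 4',
);
my @mine = ("./bar\t./baz\t./foo");    # bare results, no delimiters

print same_results(\@mine, \@reference) ? "identical\n" : "DIFFERENT\n";
```
</pre>
<font face="Helvetica, Arial, sans-serif">In practice you would read
both outputs from files or pipes and chomp each line before passing
them in.<br>
</font>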
<hr size="2" width="100%"><br>
<img src="cid:part2.06000203.08030409@internetalias.net" alt="Screenshot of the example dupfind output"><br>
</body>
</html>