<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix"><font face="Helvetica, Arial,

        sans-serif">On 12/30/2013 04:28 PM, Tom Metro wrote:<br>

      </font></div>

    <blockquote cite="mid:52C1F3A8.3020007@gmail.com" type="cite"><font

        face="Helvetica, Arial, sans-serif">Tommy Butler wrote:

        <br>

      </font>

      <blockquote type="cite"><font face="Helvetica, Arial, sans-serif">...other

          hard links should be considered, as already

          <br>

          stated in the rules, "files already deduped".

          <br>

        </font>

        <font face="Helvetica, Arial, sans-serif"><br>

                SCENARIO:

          <br>

        </font>

        <font face="Helvetica, Arial, sans-serif"><br>

          The three files below have identical content:

          <br>

          /foo/bar/baz.txt -> ( inode 12345 )

          <br>

          /foo/car/daz.txt -> ( inode 12345 )

          <br>

          /foo/far/gaz.txt -> ( inode 67890 )

          <br>

        </font>

        <font face="Helvetica, Arial, sans-serif"><br>

        </font>

        <font face="Helvetica, Arial, sans-serif"><br>

                OUTCOME:

          <br>

        </font>

        <font face="Helvetica, Arial, sans-serif"><br>

          /foo/far/gaz.txt should be reported as a duplicate of

          /foo/bar/baz.txt

          <br>

          because /foo/bar/baz.txt comes before /foo/car/daz.txt in a

          sort and

          <br>

          because /foo/car/daz.txt is a hard link.

          <br>

        </font></blockquote>

      <font face="Helvetica, Arial, sans-serif"><br>

        So then the output might look like:

        <br>

        /foo/bar/baz.txt /foo/far/gaz.txt

        <br>

      </font></blockquote>

    <font face="Helvetica, Arial, sans-serif">YES! :)<br>

      <br>

    </font>

    <blockquote cite="mid:52C1F3A8.3020007@gmail.com" type="cite"><font

        face="Helvetica, Arial, sans-serif">

        while /foo/car/daz.txt is simply eliminated from consideration

        and not output at all?

        <br>

      </font></blockquote>

    <font face="Helvetica, Arial, sans-serif">Yep.<br>

      <br>

    </font>
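    <font face="Helvetica, Arial, sans-serif">For anyone who wants
      that rule spelled out, here is a minimal sketch, in Python purely
      for illustration.  It is not a reference implementation.  The
      names report_duplicates and file_digest are made up for this
      example, and comparing content by SHA-256 digest and ordering
      paths with a plain string sort are my assumptions; the rules do
      not prescribe either.<br>
      <br>
    </font>
    <pre>
```python
import hashlib
import os
from collections import defaultdict


def file_digest(path):
    """Return a SHA-256 hex digest of the file's content (an assumed choice)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def report_duplicates(paths):
    """Return sorted lists of duplicate paths, one list per content group.

    Paths sharing a (device, inode) pair are hard links to one file, so
    only the first of them in sort order is considered; the others are
    left out of the report entirely.
    """
    first_name_for_inode = {}
    for path in sorted(paths):
        info = os.stat(path)
        # setdefault keeps the earliest path seen for each inode.
        first_name_for_inode.setdefault((info.st_dev, info.st_ino), path)

    # Group the surviving names by content.
    by_content = defaultdict(list)
    for path in first_name_for_inode.values():
        by_content[file_digest(path)].append(path)

    # Only content shared by two or more distinct inodes is a duplicate.
    groups = [sorted(group) for group in by_content.values() if len(group) > 1]
    return sorted(groups)
```
    </pre>
    <font face="Helvetica, Arial, sans-serif">Given the three files
      from the scenario, this should return a single group pairing
      /foo/bar/baz.txt with /foo/far/gaz.txt, and /foo/car/daz.txt
      never appears.<br>
      <br>
    </font>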

    <blockquote cite="mid:52C1F3A8.3020007@gmail.com" type="cite"><font

        face="Helvetica, Arial, sans-serif">

        The problem with this approach, if you are striving for a useful

        tool and not just a programming exercise, is that you don't know

        which of the aliases is the name most familiar to the user who

        will be reviewing the report.

        <br>

      </font></blockquote>

    <font face="Helvetica, Arial, sans-serif">And just when I've about

      finalized the output spec and created an output file to put up on

      the git repo for diffing ... this.  <span class="moz-smiley-s1"><span>

          :-) </span></span><br>

      <br>

      You are right, insofar as we are working to develop a useful tool

      and not create throw-away code to use once for a competition.  I

      won't pick nits over the likelihood of a real-world scenario where

      hard links exist in Joe User's music collection.  However, for the

      sake of simplicity, we're not going to require contestants to go

      this extra mile at this time.  Everyone is free to implement an

      output format that reports hard link groupings, and to do so for

      unredeemable "bonus" points.  Anyone who, like me, wants a

      useful tool when they are done with their code should strive

      to make it as robust and feature-rich as possible without

      sacrificing too much performance.  After all, there is a prize

      category for code that provides the most comprehensive feature

      set.<br>

      <br>

      As promised recently, a finalized "expected" output format

      (against which the product of each contestant's code will be

      diff'd) is forthcoming.  It seems like a glaring oversight that

      this wasn't part of the original rules specification.  I don't

      fault myself too much for this, given that the output

      format is so simple and that, up to this point, everyone has been

      expected to code against the problem and not the output.

      Watch for that email later this evening.<br>

      <br>

      Thanks Tom, and thanks all!<br>

      <br>

      --Tommy Butler<br>

    </font>

  </body>

</html>