[tpm] OID sorting

Wed May 6 10:15:39 PDT 2009

On Wed, May 06, 2009 at 12:06:51PM -0400, Fulko Hew wrote:
> I just thought I'd follow up on the discussion I started last week.
> (Thanks to all who participated!)
> 
> Attached is my current version of the test harness based on
> Abram's work, and most if not all supplied routines and
> suggestions (even your's Uri, now that I had the time to
> look!)
> 
> My conclusions:
> 
> 1/ Abram's 'nounpacksort' was the most efficient.
> 2/ Uri's rehash of my original code wins as the most readable.
>    (I know it had ugly 'C' constructs, but I couldn't think of a better
>    more Perl'ish way (see bonus points below))
> 3/ My original code wasn't all that bad :-)
> 4/ Adding the tuple idea yields noticeable improvements.
> 
> Uri also gets bonus points for showing me that I could do this
> kind of in-place substitution... on a list:
> 
>    s/ //g foreach @oids;
> 
> Uri also gets extra bonus points for teaching me to use
> regex's with embedded code.  I always knew it could be done
> but never spent the time to learn it, nor recognize when
> it would be useful.  Now (dammit), thanks to Uri, I know.
> 
>     s/(\d+)/sprintf('%8s', $1)/ge foreach @oids;
> 
> And oh yes... printing the 'number as a string' _is_ faster
> than printing a 'number as a number'!
> 
> Here's the benchmark results:
> 
>                      s/iter    Improvement
> abram_whilecmp         7.05             --
> indy_slow_sort         6.89             2%
> henrys_sort            5.52            28%
> fulko_sort_uri         1.79           294%
> fulko_sort_orig        1.77           299%
> abram_swartzpacksort   1.55           356%
> abram_packsort         1.41           400%
> fulko_sort             1.14           519%
> abram_nounpacksort    0.905           679%
> 
> Oh... another thing I learned...
> 
> When running benchmarks... turn off Centrino auto-CPU-speed-mechanism!
> I was having a hell of a time trying to figure out
> why my results were always all over the map.  :-(
> 
> Finally, for those who want to see the (current) winners,
> without having to look at the attachment, they are:
> 
> For Readability:
> 
> sub fulko_sort_uri {
>     my @oids = @_;
> 
>     s/(\d+)/sprintf('%8s', $1)/ge foreach @oids;
>     @oids = sort @oids;
>     s/ //g foreach @oids;
>     return @oids;
> }
> 
> For Fastest:
> 
> sub abram_nounpacksort {
>     my @data = map { [ split(/\./, $_) ] } @_;
>     my @sorted = map { $_->[1] }
>         sort { $a->[0] cmp $b->[0] }
>             map { [pack("N*",@$_),$_] } @data;
>     @sorted = map { join(".",@$_) } @sorted;
>     return @sorted;
> }

Since the numbers are the only significant content of the oid,
I wonder whether it would be cheaper to recreate it at the end.
Sorting with no sort function is faster than having to do
2 array subscripts for each compare. (Maybe that was what
abram_packsort did?)

sub jmm {
    return
        map { join( '.', unpack('N*', $_) ) }
        sort
        map { pack('N*', split(/\./, $_) ) }
        @_;
}

Instead of a 2-element array, I sometimes just use a single
string, consisting of a leading portion which is directly
sortable, if necessary a marker, and finally the original
string.  The final map just removes the leading string (and
the marker).

> _______________________________________________
> toronto-pm mailing list
> toronto-pm at pm.org
> http://mail.pm.org/mailman/listinfo/toronto-pm