SPUG: SUMMARY: sorting hierarchical section numbers
Michael R. Wolf
MichaelRWolf at att.net
Thu Aug 17 10:58:17 PDT 2006
I've appended a test file incorporating some of the suggestions I got.
Thanks to all responders.
See results embedded in code above the __DATA__ tag.
As a test of the test, I tried passing the code in as a code reference and
as a string because PBP mentioned it was better as a string. That was
contrary to my previous practice, so I thought I'd try it.
Summary:
1. For a short, already ordered list, pack and split were equal. I haven't
gotten to test this on long lists, reversed lists, and shuffled lists.
2. The subroutine call overhead is substantial for this code/data
combination, creating a 20% rate decrease. (See previous disclaimer on
reversed, shuffled, and large lists.)
CODE STARTS HERE:...
#! /usr/bin/perl
use warnings;
use strict;
# use Smart::Comments;
use Benchmark qw(:all);
my @sections = <DATA>;
chomp @sections;
foreach (@sections) {
s/\s//g;
}
# ================
# Sort a "section number" string (i.e. dot-separated list of numbers).
sub by_section_number_splitting {
# Turn the section string into a list of numbers...
my @a = split /[.]/, $a;
my @b = split /[.]/, $b;
# While there's still something in both lists...
while (@a && @b) {
# Grab the first thing in each list...
my ($lhs, $rhs) = (shift @a, shift @b);
# and compare them numerically.
my $rc = $lhs <=> $rhs;
# If the pair is unequal, we have our result...
return $rc if $rc;
}
# If a list (or both) became empty, the shortest one compares
smallest...
return @a <=> @b;
}
sub by_section_number_packing {
# Create a string, packing each number into an unsigned character...
my $lhs = pack 'C*', split /[.]/, $a;
my $rhs = pack 'C*', split /[.]/, $b;
### lhs: $a, $lhs, length($lhs)
### rhs: $b, $rhs, length($rhs)
# then compare them as strings...
return $lhs cmp $rhs;
}
my @sorted_by_splitting = sort by_section_number_splitting @sections;
my @sorted_by_packing = sort by_section_number_packing @sections;
### original : @sections
### split sort : @sorted_by_splitting
### pack sort : @sorted_by_packing
my $results
= timethese(0,
{
# The two sort algorithms as anonymous sub references...
'split' => sub { sort by_section_number_splitting
@sections },
'pack' => sub { sort by_section_number_packing
@sections },
# The two sort algorithms as strings to be eval'd...
'split_str' => q{ sort by_section_number_splitting
@sections },
'pack_str' => q{ sort by_section_number_packing
@sections },
}
);
cmpthese($results);
exit;
# ================================================================
=begin comment
Benchmark: running pack, pack_str, split, split_str for at least 3 CPU
seconds...
pack: 4 wallclock secs ( 3.51 usr + 0.00 sys = 3.51 CPU) @
3889417.00/s (n=13636296)
pack_str: 4 wallclock secs ( 3.16 usr + 0.00 sys = 3.16 CPU) @
4982422.25/s (n=15764384)
split: 2 wallclock secs ( 3.28 usr + -0.02 sys = 3.26 CPU) @
3942718.67/s (n=12837492)
split_str: 3 wallclock secs ( 3.17 usr + 0.00 sys = 3.17 CPU) @
4966724.95/s (n=15764385)
Rate pack split split_str pack_str
pack 3889417/s -- -1% -22% -22%
split 3942719/s 1% -- -21% -21%
split_str 4966725/s 28% 26% -- -0%
pack_str 4982422/s 28% 26% 0% --
=end comment
=cut
__DATA__
1.1
1.11
1.2
1.21
1.3
1.3.1
1.3.1.1
1.3.1.2
1.3.1.3
1.3.1.20
1.3.11.2
1.3.2
1.30.1.2
2
2.1
3.
3.1
11
11.1
11.2
11.3
11.3.1
11.3.1.1
11.3.1.2
11.3.2
--
Michael R. Wolf
All mammals learn by playing!
MichaelRWolf at att.net
> -----Original Message-----
> From: spug-list-bounces+michaelrwolf=att.net at pm.org [mailto:spug-list-
> bounces+michaelrwolf=att.net at pm.org] On Behalf Of Michael R. Wolf
> Sent: Thursday, August 10, 2006 10:14 PM
> To: 'SPUG Members'
> Subject: SPUG: sorting hierarchical section numbers
>
> I want to sort strings that represent outline numbers.
>
> 1
> 1.1
> 1.2
> 1.3
> 1.3.1
> 1.3.2
> 1.3.2.1
> 1.2
>
> I can't find a module. Could you point me in the correct direction so
> that
> I can read the fine manual? I can't seem to find the right search term;
> these are too general 'hierarchy', 'outline', 'number', 'sort'.
>
> --
> Michael R. Wolf
> All mammals learn by playing!
> MichaelRWolf at att.net
>
> _____________________________________________________________
> Seattle Perl Users Group Mailing List
> POST TO: spug-list at pm.org
> SUBSCRIPTION: http://mail.pm.org/mailman/listinfo/spug-list
> MEETINGS: 3rd Tuesdays
> WEB PAGE: http://seattleperl.org/
More information about the spug-list
mailing list