[pm-h] Performance discussion after the meeting.

G. Wade Johnson gwadej at anomaly.org
Mon Oct 13 09:53:03 PDT 2014


On Mon, 13 Oct 2014 15:43:09 +0000 (GMT)
Robert Stone <drzigman at drzigman.com> wrote:

> Greetings,
> 
> This is a very fascinating example of making use of Benchmark to
> empirically discover the most efficient method of performing an
> operation.  Thank you for putting it together!
> 
> This is the first time I've seen Benchmark used
> (https://metacpan.org/pod/Benchmark) and I think it's very powerful
> stuff.  A few questions come to mind for the list...
> 
> 1. Who wants to do a presentation on Benchmark? :)
> 2. Who wants to do a presentation on Devel::NYTProf? :) :)
> 
> In all seriousness though, this looks great.  Thanks again for taking
> the time to point out that while push/shift is a workable solution, I
> can squeak out a few more cycles with bit shifting.  Moreover, the
> bit shifting solution is likely what I will use in the XS version of
> the module.

I've done presentations on both.

Remember that the results of Benchmark (and to a lesser extent,
NYTProf) are actually not very precise. They give a pretty good guess
if the ratios are high (>25%). The author of Benchmark has repeated
pointed out that improvements of ~7% are actually noise. The module
cannot really distinguish changes that small.

G. Wade

> Best Regards,
> Robert Stone
> 
> ----- Original Message -----
> From: "G. Wade Johnson via Houston" <houston at pm.org>
> To: "Houston Perl Mongers" <houston at pm.org>
> Sent: Friday, October 10, 2014 10:37:18 PM
> Subject: [pm-h] Performance discussion after the meeting.
> 
> After the meeting last night, I decided to do some investigation of
> the discussion we had about the word rotation code. I used Benchmark
> to get a rough idea of the speed of the different approaches we
> talked about.
> 
> The four approaches I tested were:
>   push    - Robert's original approach
>   reanon  - recreate anonymous array each time to save memory creep
>   slice   - this was the array slice assignment we looked at last
>   twiddle - bit twiddling of the integer as we suggested for C
> 
> Each test runs the actual code 10_000 times so that the subroutine
> overhead does not swamp the actual work. The loop overhead is probably
> not much of an issue.
> 
> The program is between the lines:
> ----------------------------------------------------------
> use strict;
> use warnings;
> use 5.010;
> 
> use Benchmark qw/cmpthese/;
> use Devel::Size qw/total_size/;
> 
> my @word = ("\x12", "\x34", "\x56", "\x78");
> my $int_word = 0x12345678;
> my $iters = 10_000;
> 
> my %sizes;
> 
> print "Orig size: ", total_size( \@word ), "\n";
> 
> cmpthese( -2,
>     {
>         slice  => sub {
>             my $w = [ @word ];
>             foreach ( 0 .. $iters ) {
>                 @{$w}[0..3] = @{$w}[1,2,3,0];
>             };
>             $sizes{slice} = total_size( $w );
>         }, 
>         push   => sub {
>             my $w = [ @word ];
>             foreach ( 0 .. $iters ) {
>                 push @{$w}, shift @{$w};
>             };
>             $sizes{push} = total_size( $w );
>         }, 
>         reanon => sub {
>             my $w = [ @word ];
>             foreach ( 0 .. $iters ) {
>                 $w = [ @{$w}[1,2,3,0] ];
>             };
>             $sizes{reanon} = total_size( $w );
>         }, 
>         twiddle => sub {
>             my $w = $int_word;
>             foreach ( 0 .. $iters ) {
>                 $w = ($w >> 1 | $w << 3);
>             }
>         }, 
>     }
> );
> 
> foreach my $t ( sort keys %sizes )
> {
>     print "$t: $sizes{$t}\n"; 
> }
> ----------------------------------------------------------
> 
> The output is below:
> Orig size: 320
>          Rate   slice  reanon    push twiddle
> slice   127/s      --    -14%    -77%    -83%
> reanon  148/s     17%      --    -74%    -81%
> push    560/s    341%    278%      --    -27%
> twiddle 763/s    501%    415%     36%      --
> push: 400
> reanon: 320
> slice: 320
> -----------------------------------------------------------
> 
> As you can see, Robert's solution (push) is about three times as fast
> as either of the other two. (It was 5-6 times as fast on another
> machine.) The memory issue I brought up has apparently been corrected.
> Although the array does grow in size, it's only 80 bytes, so that
> would not be a real issue.
> 
> Interestingly, the bit twiddling response is only another 36% faster
> in Perl. Some of that is due to other changes in the sub. In C, I
> suspect it would be a whole lot faster.
> 
> So, it looks like Robert had the best solution after all.
> 
> G. Wade


-- 
Contrary to popular opinion, the plural of 'anecdote' is not 'fact'.


More information about the Houston mailing list