[Omaha.pm] Unexpected internal vs shell command speed result...

Dan Linder dan at linder.org
Fri Feb 18 09:07:23 PST 2011


My code currently calls the UNIX "du" command to get the size of a directory
structure:
        $size = `/usr/bin/du -sk $DATA_DIR | cut -f1`;

Knowing that shelling out is expensive in CPU time and generally not portable
across platforms, I am looking into replacing it with a pure Perl implementation:
        find( sub { -f and ( $size += -s _ ) }, $DATA_DIR );
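
For anyone wanting to try the one-liner outside the benchmark, here is a minimal self-contained sketch of it. The names dir_size_bytes and dir_size_kb are made up for illustration. Note that -s reports the apparent size in bytes, whereas "du -sk" reports allocated disk blocks in kilobytes (directories included), so the two numbers will not match exactly even after the conversion.

```perl
use strict;
use warnings;
use File::Find;

# Sum the apparent size, in bytes, of every plain file under $dir.
# (Hypothetical helper name; the callback mirrors the one-liner above.)
sub dir_size_bytes {
    my ($dir) = @_;
    my $size = 0;
    # -f stats the current entry; "_" reuses that stat result for -s.
    find( sub { $size += -s _ if -f }, $dir );
    return $size;
}

# Same total expressed in kilobytes, rounded up. This is only a rough
# analogue of "du -sk", which counts allocated blocks, not file lengths.
sub dir_size_kb {
    my ($dir) = @_;
    return int( ( dir_size_bytes($dir) + 1023 ) / 1024 );
}
```

A call such as dir_size_kb("/tmp") can then be compared by eye against the du output.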

Wanting to be able to brag about the speed increase, I timed them with the
Benchmark routines, and got a shock when I tested against my /tmp directory:
           Rate Internal Shell_du
Internal 11.6/s       --     -99%
Shell_du 1538/s   13123%       --

WOW!  The shell to du was roughly 130 TIMES faster than the internal find code
(1538/s versus 11.6/s).
 (FYI, the /tmp/ directory has 349MB across 6400 files.)
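
A note on reading the table: cmpthese prints each cell as "row rate relative to column rate, minus 100%", so a percentage converts to a plain speed ratio as 1 + pct/100. A small sketch of that arithmetic follows; pct_to_ratio is a made-up helper name, not part of Benchmark.

```perl
use strict;
use warnings;

# Convert a cmpthese percentage ("row is N% faster than column")
# into a speed ratio. Hypothetical helper for illustration only.
sub pct_to_ratio {
    my ($pct) = @_;
    return 1 + $pct / 100;
}

# Percentages taken from the two result tables in this message.
printf "Shell_du vs Internal on /tmp:           %.1fx\n", pct_to_ratio(13123);
printf "Internal vs Shell_du on the small tree: %.1fx\n", pct_to_ratio(213);
```

That works out to about 132x for the /tmp run and about 3.1x for the small tree, which agrees with dividing the rates directly (1538/11.6 and 5208/1664).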

As a test, I created a very small directory structure (12 files, 2
sub-directories, 120KB) and the results over 10,000 iterations are the opposite:
           Rate Shell_du Internal
Shell_du 1664/s       --     -68%
Internal 5208/s     213%       --

This time the internal code was faster...

My test system is CentOS 5.5 64-bit (2GB RAM, with most free RAM used for
caching), with Perl 5.8.8, and the /tmp filesystem is ext3.

This bit of code isn't time critical and the actual data that will be
processed is closer to the 120K test case, so I may continue and remove the
shell/du line, but I'd like to know how this got so slow!
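
One way to narrow this down, offered as a sketch rather than a diagnosis, is to benchmark the File::Find walk with an empty callback against the same walk with the -f / -s tests. If the two rates are close, the traversal itself dominates; if not, the per-file stat calls do. The helper name compare_walks and the labels Walk_only and Walk_stat are assumptions made up for this example.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Find;

# Hypothetical helper: compare a bare directory walk with a walk that
# also tests and sizes every entry, to see which part dominates.
sub compare_walks {
    my ( $dir, $count, $style ) = @_;
    $style = 'auto' unless defined $style;
    return cmpthese( $count, {
        # File::Find traversal only; the callback does nothing.
        'Walk_only' => sub { find( sub { }, $dir ) },
        # Traversal plus the -f / -s tests from the original callback.
        'Walk_stat' => sub {
            my $size = 0;
            find( sub { $size += -s _ if -f }, $dir );
        },
    }, $style );
}
```

Calling compare_walks("/tmp", 50) prints the usual cmpthese table for the two variants.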

Dan

Just in case I made a blunder, here's the test code:
#!/usr/bin/perl -w
use strict;
use Benchmark qw(:all);
use File::Find;

my $foo               = 0;
my $count             = shift || 2000;
my $DATA_DIR          = shift || "/tmp";

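# Shell out to du: total allocated disk usage in kilobytes (-k),
# directories included.
sub shell_du {
        my $size = `/usr/bin/du -sk $DATA_DIR | cut -f1`;
        chomp $size;
        return $size;
}

# Walk the tree with File::Find: sum of plain-file sizes in bytes.
# Note the units differ from shell_du (bytes vs. kilobytes).
sub internal_du {
        my $size = 0;
        find( sub { -f and ( $size += -s _ ) }, $DATA_DIR );
        return $size;
}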

cmpthese ($count, {
        'Shell_du' => sub { $foo = shell_du();    },
        'Internal' => sub { $foo = internal_du(); },
});
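
A possible middle ground, sketched here under the assumption that du lives at /usr/bin/du as in the code above: the list form of a piped open runs du directly, without /bin/sh and without cut, and the first field is parsed in Perl. The name du_kb is made up for illustration. This still depends on an external du, so it addresses the shell overhead but not the portability concern.

```perl
use strict;
use warnings;

# Run du directly (no shell, no cut) and return its kilobyte total.
# Hypothetical helper; returns undef if du produced no usable output.
sub du_kb {
    my ($dir) = @_;
    # List-form piped open: the arguments go straight to exec, so $dir
    # is never interpreted by a shell.
    open( my $fh, '-|', '/usr/bin/du', '-sk', $dir )
        or die "cannot run /usr/bin/du: $!";
    my $line = <$fh>;
    close($fh) or warn "du exited with status $?\n";
    return unless defined $line;
    # du prints "<kilobytes><TAB><path>"; keep the leading number.
    my ($kb) = $line =~ /^(\d+)/;
    return $kb;
}
```

It could be added to the cmpthese hash as a third entry to see how much of the Shell_du time is the shell and the extra cut process.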

-- 
***************** ************* *********** ******* ***** *** **
"Quis custodiet ipsos custodes?"
    (Who can watch the watchmen?)
    -- from the Satires of Juvenal
"I do not fear computers, I fear the lack of them."
    -- Isaac Asimov (Author)
** *** ***** ******* *********** ************* *****************