SPUG: selecting/removing lines meeting a criteria : -pi -e vs -i -e / List::SkipList

Aaron W. West tallpeak at hotmail.com
Wed Jun 9 03:10:44 CDT 2004


1) Just a little hint, obvious though this may be to many.

-i can be used without -p.

perldoc perlrun gives info on using -pi, but not on -i by itself. The usage is easy; just add the while and print yourself.

To remove all filenames containing /Temporary Internet Files/ from a list of filenames I saved in ~/allfiles:

$ perl -i -e 'while(<>){print unless /Temporary Internet Files/}' ~/allfiles ; wc allfiles allfiles.bak
 283986  498823 17066061 allfiles
 382211 1087456 28422428 allfiles.bak

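So when I last generated that list (with find / > ~/allfiles), I had about 98,000 Temporary Internet Files (382211 - 283986 = 98225 lines; the word counts differ by over half a million only because those paths contain spaces). Phew... maybe I read too much.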

2) List::SkipList

If anyone wants a fast & efficient way to keep a list sorted, you might try List::SkipList. It's a way of creating a sorted list in memory. I can generate a million-element list in RAM in about a minute on my Athlon 2400 XP laptop, under ActiveState Perl, with List::SkipList 0.70. That's about 15 times slower than GNU sort, which is still often the best way to sort large lists, and perhaps no more efficient than the sort built into Perl. But the ability to maintain a list in sorted order as you insert elements is sometimes desirable. (You might do just as well with BerkeleyDB, however, since it's fast, efficient C code, and since disk writes can be cached quite well for a small database.)

If you'd like to test for yourself, you can use the following program.

I put a long comment in there at the place where I had a problem on Cygwin Perl 5.8.2 (64-bit ints), in case anyone else hits it. The program works fine on ActiveState 5.8.2, and on a Cygwin 5.8.4 (32-bit ints) that I compiled recently.

#!perl -w
$|=1;
use strict;
use Time::HiRes qw(time);
use List::SkipList;
#$list->insert( 'key1', 'value' );
#$list->insert( 'key2', 'another value' );
#$value = $list->find('key2');
#$list->delete('key1');
my $n = 1000.0/64;
#print "If the algorithm is of complexity O(n**1.5) then factor should be about 8 (4**1.5)\n";
my $last = 0;
my $factor = 0;
my $list;
my $svptr = 0;
my @svarray;
while ($n <= 1024001.0)
{
#    print "deallocating list..";
# The following line caused perl to silently bomb-out 
# for me without the push @svarray 
# in Cygwin Perl 5.8.2 64-bit using AutoLoader 5.60
# after a list of 4061 elements or more has been created,
# and is apparently being deallocated, whether running
# List::SkipList 0.64, 0.65, or 0.70.
# It may be something wrong with my Perl installation.
# A debugger trace showed the program ended after a line with this statement,
# in line 96 of an unknown module, which exists in line 96 of AutoLoader:
# *$sub = sub {}; 
# But it works fine with Cygwin Perl 5.8.4 (compiled myself, 32-bit ints) 
# or ActiveState 5.8.2, with exactly the same AutoLoader, according to diff -b
    $list = 0; 
#    print "creating list..";
    $list = List::SkipList->new( max_level => 8 );
#    push @svarray, $list; # 
    my @a = ();
    for (0..$n) {$a[$_] = sprintf("%d",rand()*1e11)};
    my $t0 = time;
    for (0..$n) {$list->insert($a[$_],1)};
    my $tend = time;
    my $elapsed = $tend - $t0;
    $factor = $elapsed / $last if $last;
    printf "n=%6d elapsed:%7.3f, factor=%5.2f\n",
        $n, $elapsed, $factor;
    $n *= 4.0;
    $last = $elapsed;
}
print "n=$n\n";