Lists

Eugene Tsyrklevich eugene at securityarchitects.com
Fri May 26 12:43:04 CDT 2000


~sdpm~
> my @lines = (<FILE>)[-3..-1];
> 
> I'm not sure if this is less a memory hog than Eugene's response, but it
> also assumes that the file is the standard
> append-the-last-one-to-the-end format. Check out
> http://www.globalspin.com/list_test.vep to see it in action.

it is just as bad :-)

the (<FILE>)[-3..-1] expression reads the contents of the whole file into memory, takes the last 3 lines and throws away the rest.

check this out:

bash-2.03$ ls -l /usr/share/dict/words
-r--r--r--  2 root  bin  2486893 Mar 14 14:26 /usr/share/dict/words




bash-2.03$ perl bench.pl
USER       PID %CPU %MEM   VSZ   RSS TT   STAT STARTED       TIME COMMAND
eugene    9198  0.0  1.1   336  1428 p1  SN+   10:20AM    0:00.06 perl bench.pl

USER       PID %CPU %MEM   VSZ   RSS TT   STAT STARTED       TIME COMMAND
eugene    9198  0.0 10.5 12732 13720 p1  SN+   10:20AM    0:02.66 perl bench.pl

                               ^^^^^ (ouch!)
Elapsed time: 2.845168

estimated memory used (rss2-rss): 13720 - 1428 -> ~12 Megs




bash-2.03$ perl bench2.pl
USER       PID %CPU %MEM   VSZ   RSS TT   STAT STARTED       TIME COMMAND
eugene    8870  0.0  1.1   336  1428 p1  SN+   10:20AM    0:00.06 perl bench2.p

USER       PID %CPU %MEM   VSZ   RSS TT   STAT STARTED       TIME COMMAND
eugene    8870  0.0  1.1   400  1452 p1  SN+   10:20AM    0:01.96 perl bench2.p

Elapsed time: 2.072702
              ^^^^^^^^ second version is far more memory efficient and even takes about 27% less time

estimated memory used (rss2-rss): 1452 - 1428 -> 24 KB ;-)




bash-2.03$ cat bench.pl
#!/usr/bin/perl -wl

use Time::HiRes qw(gettimeofday tv_interval);


@ARGV = '/usr/share/dict/words';

my $t0 = [gettimeofday];

print qx/ps aux -p $$/;

# bench.pl contains this line
my @last_3_lines = (<>)[-3..-1];

=comment
while bench2.pl uses
(still very slow but at least it's memory efficient)

push @last_3_lines, scalar <>, scalar <>, scalar <>;

while(<>) {
        shift @last_3_lines;
        push @last_3_lines, $_;
}
=cut

print qx/ps aux -p $$/;

my $elapsed = tv_interval($t0, [gettimeofday]);

print "Elapsed time: $elapsed\n";
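
for reference, here is a minimal standalone sketch of the bench2.pl idea: keep a sliding window of the last few lines while streaming through the file. The sub name last_lines_streaming is made up for illustration, and unlike the loop above it does not assume the file has at least 3 lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Keep only the most recent $n lines while streaming through the handle,
# so memory use is proportional to $n rather than to the file size.
# (last_lines_streaming is a hypothetical name, not part of bench2.pl.)
sub last_lines_streaming {
    my ($fh, $n) = @_;
    my @window;
    while (my $line = <$fh>) {
        push @window, $line;
        shift @window if @window > $n;   # drop the oldest line
    }
    return @window;
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "cannot open $ARGV[0]: $!";
    print last_lines_streaming($fh, 3);
    close $fh;
}
```

this still reads every line of the file, so it only fixes the memory problem, not the I/O.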



if you want both speed and memory efficiency, you would need to use the algorithm tail is using: map the file into memory (less I/O, more speed) and then count \n characters backwards from the end of the file:

        if ((start = mmap(NULL, (size_t)size, PROT_READ, MAP_PRIVATE,
            fileno(fp), (off_t)0)) == (caddr_t)-1)
                return (1);
        p = start + size - 1;

        if (style == RBYTES && off < size)
                size = off;

        /* Last char is special, ignore whether newline or not. */
        for (llen = 1; --size; ++llen)
                if (*--p == '\n') {
                        WR(p + 1, llen);
                        llen = 0;
                        if (style == RLINES && !--off) {
                                ++p;
                                break;
                        }
                }
        if (llen)
                WR(p, llen);
        if (munmap(start, (size_t)sbp->st_size))
                ierr();
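
a rough Perl sketch of the same "start at the end and count newlines" idea follows. Note that it swaps mmap for plain seek/read over fixed-size chunks, so it is an approximation of the approach and not a port of tail's code. The sub name last_lines_from_end and the 4096-byte default chunk size are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the last $n lines of $path by reading chunks backwards from the
# end of the file and stopping once enough newlines have been seen.
# (hypothetical helper; uses seek/read instead of mmap)
sub last_lines_from_end {
    my ($path, $n, $chunk) = @_;
    $chunk ||= 4096;                     # assumed default chunk size
    return () if $n <= 0;

    open my $fh, '<:raw', $path or die "cannot open $path: $!";
    my $pos = (-s $fh) || 0;             # start at the end of the file
    my $buf = '';

    while ($pos > 0) {
        my $want = $pos < $chunk ? $pos : $chunk;
        $pos -= $want;
        seek $fh, $pos, 0 or die "seek failed: $!";
        my $got = read $fh, my $piece, $want;
        die "read failed: $!" unless defined $got && $got == $want;
        $buf = $piece . $buf;

        # More than $n newlines means the last $n lines are complete in
        # $buf (one newline may just terminate the final line).
        last if ($buf =~ tr/\n//) > $n;
    }
    close $fh;

    my @lines = split /^/, $buf;         # split into lines, keeping "\n"
    splice(@lines, 0, @lines - $n) if @lines > $n;
    return @lines;
}

# Only runs when a file name is given on the command line.
print last_lines_from_end($ARGV[0], 3) if @ARGV;
```

for a big file this touches only the last few kilobytes, so both the run time and the memory use stay small no matter how large the file is.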



cheers.