Lists
Eugene Tsyrklevich
eugene at securityarchitects.com
Fri May 26 12:43:04 CDT 2000
> my @lines = (<FILE>)[-3..-1];
>
> I'm not sure if this is less a memory hog than Eugene's response, but it
> also assumes that the file is the standard
> append-the-last-one-to-the-end format. Check out
> http://www.globalspin.com/list_test.vep to see it in action.
It is just as bad :-)
The (<FILE>)[-3..-1] expression reads the contents of the whole file into memory, takes the last 3 lines, and throws away the rest.
Check this out:
bash-2.03$ ls -l /usr/share/dict/words
-r--r--r-- 2 root bin 2486893 Mar 14 14:26 /usr/share/dict/words
bash-2.03$ perl bench.pl
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
eugene 9198 0.0 1.1 336 1428 p1 SN+ 10:20AM 0:00.06 perl bench.pl
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
eugene 9198 0.0 10.5 12732 13720 p1 SN+ 10:20AM 0:02.66 perl bench.pl
^^^^^ (ouch!)
Elapsed time: 2.845168
estimated memory used (rss2-rss): 13720 - 1428 -> ~12 Megs
bash-2.03$ perl bench2.pl
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
eugene 8870 0.0 1.1 336 1428 p1 SN+ 10:20AM 0:00.06 perl bench2.p
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
eugene 8870 0.0 1.1 400 1452 p1 SN+ 10:20AM 0:01.96 perl bench2.p
Elapsed time: 2.072702
^^^^^^^^ the second version is far more memory efficient and even finishes in about 27% less time (2.07s vs 2.85s)
estimated memory used (rss2-rss): 1452 - 1428 -> 24 KB ;-)
bash-2.03$ cat bench.pl
#!/usr/bin/perl -wl
use Time::HiRes qw(gettimeofday tv_interval);
@ARGV = '/usr/share/dict/words';
my $t0 = [gettimeofday];
print qx/ps aux -p $$/;
# bench.pl contains this line
my @last_3_lines = (<>)[-3..-1];
=comment
while bench2.pl uses
(still very slow but at least it's memory efficient)
push @last_3_lines, scalar <>, scalar <>, scalar <>;
while (<>) {
    shift @last_3_lines;
    push @last_3_lines, $_;
}
=cut
print qx/ps aux -p $$/;
my $elapsed = tv_interval($t0, [gettimeofday]);
print "Elapsed time: $elapsed\n";
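The sliding-window loop that bench2.pl uses can be generalized to any number of lines. The following is only a minimal sketch: the helper name last_n_lines and the in-memory test data are assumptions made for illustration, not part of the original scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return the last $n
# lines read from filehandle $fh while keeping at most $n lines in memory.
sub last_n_lines {
    my ($fh, $n) = @_;
    my @window;
    while (my $line = <$fh>) {
        push @window, $line;
        shift @window if @window > $n;    # drop the oldest line
    }
    return @window;
}

# Usage: an in-memory filehandle stands in for a real file here.
my $data = join '', map { "line $_\n" } 1 .. 10;
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
print last_n_lines($fh, 3);               # prints lines 8, 9 and 10
close $fh;
```

Like bench2.pl, this still reads every line of the file, so it saves memory but not I/O.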
If you want both speed and memory efficiency, you need the algorithm tail uses: map the file into memory (less I/O, more speed), then count \n characters backwards from the end:
if ((start = mmap(NULL, (size_t)size, PROT_READ, MAP_PRIVATE,
    fileno(fp), (off_t)0)) == (caddr_t)-1)
        return (1);
p = start + size - 1;
if (style == RBYTES && off < size)
        size = off;
/* Last char is special, ignore whether newline or not. */
for (llen = 1; --size; ++llen)
        if (*--p == '\n') {
                WR(p + 1, llen);
                llen = 0;
                if (style == RLINES && !--off) {
                        ++p;
                        break;
                }
        }
if (llen)
        WR(p, llen);
if (munmap(start, (size_t)sbp->st_size))
        ierr();
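A similar effect is possible in plain Perl without mmap, by seeking to the end of the file and reading fixed-size blocks backwards until enough newlines have been seen. This is a rough sketch of that idea, not tail's actual implementation: the name tail_lines, the optional block-size argument, and the 4096-byte default are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper (not from the original post): return the last $n
# lines of the file at $path, reading blocks backwards from the end.
sub tail_lines {
    my ($path, $n, $block) = @_;
    $block ||= 4096;                      # arbitrary default block size
    return () if $n <= 0;

    open my $fh, '<:raw', $path or die "cannot open $path: $!";
    my $pos = -s $fh;                     # start at the end of the file
    my $buf = '';

    while ($pos > 0) {
        my $len = $pos < $block ? $pos : $block;
        $pos -= $len;
        seek($fh, $pos, SEEK_SET) or die "seek failed: $!";
        my $chunk;
        my $got = read($fh, $chunk, $len);
        die "read failed: $!" unless defined $got && $got == $len;
        $buf = $chunk . $buf;
        # More than $n newlines means the first wanted line is complete.
        last if ($buf =~ tr/\n//) > $n;
    }
    close $fh;

    my @lines = split /^/, $buf;          # one element per line, newline kept
    splice @lines, 0, @lines - $n if @lines > $n;
    return @lines;
}

# Usage: perl tail_sketch.pl /usr/share/dict/words
print tail_lines($ARGV[0], 3) if @ARGV;
```

Only the tail end of the file is read, so both the running time and the memory used depend on the length of the last few lines rather than on the size of the file.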
cheers.