Last night's meeting

Thu Jun 20 18:55:31 CDT 2002

... was a moderate success; four of us migrated to a room and had a 
darned good discussion ranging from improving the performance of a script 
that has to parse 0.5 GB of DNA data to how arrays are represented internally.

According to a guy sitting by the fountain, there were other people 
trying to find us, but their information about the location didn't seem 
to go further than the fountain.  Perhaps the documentation was too lengthy :-)

We agreed to meet monthly - we'll put the date up for review next time 
to avoid any conflicts like WEAV - and have a lecture followed by open 
season in the same location.  Possible lecture topics include 
Object-Oriented Perl.  Let us know what you'd like.

Unresolved questions from last night:

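(1) Yes, turning warnings on reveals the problem: when a hash is followed 
by a scalar in a list assignment, the hash swallows every remaining 
element, including the one meant for the scalar, which is left undefined.  
With an odd number of elements, the last key has no corresponding value 
and perl warns: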

% perl -Mstrict -wle 'foo(1..5); sub foo{ my (%hash, $scalar) = @_ }'
Odd number of elements in hash assignment at -e line 1.
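
For what it's worth, here is a minimal sketch of two common ways around 
this: put the scalar first so the hash comes last, or pass the hash by 
reference so it takes up exactly one slot in @_.  The sub names 
(scalar_first, hash_by_ref) and the sample data are made up for 
illustration; nothing like this was shown at the meeting.

```perl
use strict;
use warnings;

# Hypothetical example: the scalar comes first, so the hash (last in the
# list) is free to take whatever elements are left over.
sub scalar_first {
    my ($scalar, %hash) = @_;
    return ($scalar, \%hash);
}

# Hypothetical example: the hash is passed by reference, so it occupies
# a single element of @_ and the scalar after it is not swallowed.
sub hash_by_ref {
    my ($hash_ref, $scalar) = @_;
    return ($scalar, $hash_ref);
}

my ($s1, $h1) = scalar_first(5, a => 1, b => 2);
my ($s2, $h2) = hash_by_ref({ a => 1, b => 2 }, 5);

print "scalar_first: scalar=$s1, keys=", join(',', sort keys %$h1), "\n";
print "hash_by_ref:  scalar=$s2, keys=", join(',', sort keys %$h2), "\n";
```

Both calls should run cleanly under -w, with the scalar holding 5 and the 
hash holding the keys a and b.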

(2) Could the internal representation of an array be different (and 
affect performance) if the same array was constructed by different 
means (push vs list assignment, that kind of thing)?

Best reference I know of is http://gisle.aas.no/perl/illguts/.  Look 
for the section "AV".

Given the structure there, it would appear that no matter how a given 
array was constructed, it ought to have the same internal 
representation (modulo some possible but insignificant differences in a 
couple of pointers).
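
If you want to look for yourself, one way is the Devel::Peek module that 
ships with perl.  The sketch below builds the same five-element array by 
list assignment and by push, then dumps both; the variable names are made 
up for illustration.  The dumps go to STDERR, and the exact fields and 
numbers depend on your perl version, so treat this as something to 
experiment with rather than a definitive answer.

```perl
use strict;
use warnings;
use Devel::Peek;    # core module; Dump() writes to STDERR

# Same contents, built two different ways (illustrative names).
my @assigned = (1 .. 5);

my @pushed;
push @pushed, $_ for 1 .. 5;

# Dump a reference to each array; the AV it points to is shown nested
# inside the output, with fields such as FILL and MAX to compare.
print STDERR "--- built by list assignment ---\n";
Dump(\@assigned);

print STDERR "--- built by push ---\n";
Dump(\@pushed);
```

Comparing the two AV sections side by side should show whether anything 
beyond the allocated size and the addresses differs on your build.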

(3) Is there any performance difference between iterating through an 
array by index number and by element?  Answer: Yes, the more natural 
way is also faster:

% perl -MBenchmark=cmpthese -le '@x = 1..1000; cmpthese(1000, { for => sub { for(my $i=0; $i <= $#x; $i++){ $x[$i]++ } }, foreach => sub { for (@x) { $_++ } } })'
Benchmark: timing 1000 iterations of for, foreach...
       for:  9 wallclock secs ( 9.51 usr +  0.00 sys =  9.51 CPU) @ 105.15/s (n=1000)
   foreach:  2 wallclock secs ( 3.45 usr +  0.00 sys =  3.45 CPU) @ 289.86/s (n=1000)

          Rate     for foreach
for     105/s      --    -64%
foreach 290/s    176%      --

--
Peter Scott
Pacific Systems Design Technologies
http://www.perldebugged.com/