SPUG: "Programming Challenge"

Wed Dec 17 13:14:13 CST 2003

On Wed, Dec 17, 2003 at 02:23:59AM -0800, Ross Wolin wrote:
> At this evening's SPUG meeting, Tim made reference to a "programming 
> challenge" to be posted to the list within 24 hours.   I'm not sure I 
> would go as far as to name this a challenge, but I am interested in 
> getting a better understanding of what's going on.

> 
>    #Method 1

>    #Method 2

>    #Method 3 (same as #1, but buffered writes)

>    #Method 4
> 
>    #Now interleave the low and high bytes and write to STDOUT
>    for (my $x=0; $x <= $#LSB; $x++) {
>       print $LSB[$x],$MSB[$x];
>    }
> 
> 
> (Of course checking beforehand that $#LSB == $#MSB)   I thought this 
> method would have more overhead since now I have 4M scalars in an array 
> instead of a 4M string of bytes.... but of course I was wrong.  =)   
> This approach took 1-2 **seconds** to write 8M, which was roughly the 
> same speed as the C program I wrote to try to speed things up.
> 
> 
> My question is: did I do something horribly wrong in the scalar/binary 
> string implementations that made it go so slow, or is this just God 
> (Larry) smacking me around for trying to use a byte/character as the 
> basic unit of operation rather than a string?   And also, is there a 
> better way to perform this operation than the array version I settled on 
> (I can either put the bytes from the analyzer into an array or a 
> scalar/binary string, it doesn't matter to me.)

Methods 1 and 3 probably take so long because of the regular expressions.
Method 2 writes byte by byte with lots of syswrite() calls, each of which is
a separate unbuffered system call, and that adds up. Method 4 uses print
instead, which buffers its output and so issues fewer, larger writes, and
thus is faster.
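
To make the buffering point concrete, here is a minimal sketch. It is not the
code from the original post: the sample data, the temporary files, and the
sub names write_unbuffered and write_buffered are all made up for
illustration. Both subs hand over one byte at a time, but only the first
turns every byte into its own write to the OS.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data: 4096 bytes cycling through 0..255.
my $data = join '', map { chr($_ % 256) } 0 .. 4095;

# One syswrite() per byte: each call bypasses perl's buffer and goes to the OS.
sub write_unbuffered {
    my ($fh, $bytes) = @_;
    for my $i (0 .. length($bytes) - 1) {
        defined syswrite($fh, substr($bytes, $i, 1)) or die "syswrite: $!";
    }
}

# One print per byte: perl collects the bytes in its output buffer and
# flushes them to the OS in larger chunks.
sub write_buffered {
    my ($fh, $bytes) = @_;
    for my $i (0 .. length($bytes) - 1) {
        print {$fh} substr($bytes, $i, 1) or die "print: $!";
    }
}

# Temporary files stand in for STDOUT so the two handles never mix.
my ($fh1, $file1) = tempfile(UNLINK => 1);
my ($fh2, $file2) = tempfile(UNLINK => 1);
binmode $fh1;
binmode $fh2;

write_unbuffered($fh1, $data);
write_buffered($fh2, $data);

close $fh1 or die "close: $!";
close $fh2 or die "close: $!";
```

Both files end up with the same contents. Wrapping each call with
Time::HiRes, or running the script under strace on a system that has it,
should show the difference in the number of writes.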

Here are some suggestions: see if there are any modules on CPAN that let
you use memory-mapped files, and go that way instead of normal file I/O. Or
build the data up into one long string and print it in a single call,
instead of writing a few bytes at a time.
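
A rough sketch of the long-string approach, assuming the bytes are already
in @LSB and @MSB as in Method 4. The sample data here is made up, and
interleave is a hypothetical helper name, not something from the original
post.

```perl
use strict;
use warnings;

# Hypothetical stand-in data: one byte per array element, as in Method 4.
my @LSB = map { chr($_) } 0x61 .. 0x64;    # a b c d
my @MSB = map { chr($_) } 0x41 .. 0x44;    # A B C D

die "LSB/MSB length mismatch\n" unless @LSB == @MSB;

# Build the whole interleaved output in memory as a single string.
sub interleave {
    my ($lsb, $msb) = @_;
    return join '', map { $lsb->[$_] . $msb->[$_] } 0 .. $#{$lsb};
}

binmode STDOUT;                     # the real data is binary, not text
print interleave(\@LSB, \@MSB);     # one print call for the whole output
```

With the sample data above this writes aAbBcCdD. The trade-off is memory:
the full output string exists alongside the two arrays, which should be
acceptable for 8M of data but may not be for much larger captures.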

-- 
Shawn Wagner
shawnw at speakeasy.org