SPUG: "Programming Challenge"

Thu Dec 18 01:00:56 CST 2003

On Wed, 17 Dec 2003 at 17:46 -0800, David Dyck <david.dyck at fluke.com> wrote:

> On Wed, 17 Dec 2003 at 17:12 -0800, Yitzchak Scott-Thoennes <sthoenna at efn.o...:
>
> > Avoid loops at all costs (or keep them in C where they belong):
> >
> > use Encode;
> > print encode("UCS-2LE", $LSB) ^ encode("UCS-2BE", $MSB);
>
> That's clever, and I'll try to remember it, but
> this loop version is a tad faster in my testcases.
>
> for (my $i=0; $i < length $LSB; $i++) {
>     print substr($LSB,$i,1).substr($MSB,$i,1)
> }

After thinking about what encode was doing, I used pack and unpack
to do the same thing.  I needed to process the data in blocks because I
was running out of RAM (and swapping slowed it down anyway).
In my tests, the following code is twice as fast as either of the above.

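# Process 0x200 bytes of each input per pass to keep memory use down.
sub block_size { 0x200 };
for (my $i=0; $i < length $LSB; $i+= block_size ) {
    # "v" emits each $LSB byte followed by a zero byte; "n" emits a zero
    # byte followed by each $MSB byte.  OR-ing the two interleaves them.
    syswrite STDOUT,      pack( "v*", unpack "C*", substr($LSB, $i, block_size ) )
                        | pack( "n*", unpack "C*", substr($MSB, $i, block_size ) );
}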
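
To make the pack trick easier to see, here is a minimal sketch on a tiny
input.  The sub name interleave_pack and the sample strings are made up for
illustration and are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: interleave the bytes of two equal-length strings.
sub interleave_pack {
    my ($lsb, $msb) = @_;

    # "v" = 16-bit little-endian: each byte is followed by "\0".
    my $low  = pack "v*", unpack "C*", $lsb;    # "ace" -> "a\0c\0e\0"

    # "n" = 16-bit big-endian: each byte is preceded by "\0".
    my $high = pack "n*", unpack "C*", $msb;    # "bdf" -> "\0b\0d\0f"

    # String-wise OR fills every "\0" gap with the other string's byte.
    return $low | $high;
}

print interleave_pack("ace", "bdf"), "\n";      # prints "abcdef"
```

Because every position holds a zero byte in exactly one of the two packed
strings, | and ^ give the same result here.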

Then I realized that my benchmark of the encode solution may have been
swapping as well, so I chunked it up too, and that version was the fastest.

use Encode;
sub block_size { 0x2000 };
for (my $i=0; $i < length $LSB; $i+= block_size ) {
    syswrite STDOUT,   encode("UCS-2LE", substr($LSB, $i, block_size ))
                     ^ encode("UCS-2BE", substr($MSB, $i, block_size ));
}
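
A quick way to convince yourself that the Encode version and the substr loop
produce the same bytes is to compare them over every byte value.  This is only
a sketch: the sub names and the test data are invented here, and it assumes
the inputs are plain byte strings of equal length.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Reference version: the substr loop, collecting into a string.
sub interleave_loop {
    my ($lsb, $msb) = @_;
    my $out = '';
    for my $i (0 .. length($lsb) - 1) {
        $out .= substr($lsb, $i, 1) . substr($msb, $i, 1);
    }
    return $out;
}

# Encode version: each byte becomes one 16-bit unit, then XOR merges them.
sub interleave_encode {
    my ($lsb, $msb) = @_;
    return encode("UCS-2LE", $lsb) ^ encode("UCS-2BE", $msb);
}

# Hypothetical test data covering all 256 byte values.
my $LSB = join '', map { chr } 0 .. 255;
my $MSB = scalar reverse $LSB;

print interleave_loop($LSB, $MSB) eq interleave_encode($LSB, $MSB)
    ? "same\n"
    : "different\n";
```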

I'm not sure how the original $LSB and $MSB were being read in, but
it might have been good to read them in similar-sized chunks to use less RAM.
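
Reading in chunks could look something like the sketch below.  Since I don't
know how the data actually arrives, everything here is an assumption: the sub
name interleave_streams, the idea of two separate input filehandles, and the
requirement that both inputs have the same length.

```perl
use strict;
use warnings;

# Hypothetical helper: read both inputs one block at a time and write the
# interleaved bytes, so only about one block of each input is held in memory.
sub interleave_streams {
    my ($lsb_fh, $msb_fh, $out_fh, $block_size) = @_;
    binmode $_ for $lsb_fh, $msb_fh, $out_fh;

    while (1) {
        my $got_l = read $lsb_fh, my $lsb, $block_size;
        my $got_m = read $msb_fh, my $msb, $block_size;
        die "read failed: $!" unless defined $got_l && defined $got_m;
        last if $got_l == 0 && $got_m == 0;
        die "inputs differ in length\n" if $got_l != $got_m;

        # Same pack trick as above, applied to the current block only.
        print {$out_fh} pack("v*", unpack "C*", $lsb)
                      | pack("n*", unpack "C*", $msb);
    }
    return;
}

# Small demo using in-memory filehandles instead of real files.
my ($l, $m) = ("ace", "bdf");
open my $lf, '<', \$l or die "open: $!";
open my $mf, '<', \$m or die "open: $!";
interleave_streams($lf, $mf, \*STDOUT, 2);
print "\n";
```

With real data the two in-memory handles would be replaced by handles opened
on the actual input files, and the block size by something like the 0x2000
used above.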

It was educational to look into the Encode module, as
it looks like the .xs (C) implementation of encode is a specialized version
of pack, even using the "v" and "n" characters as endian flags.

My thanks to Yitzchak and Ross for this learning opportunity,
 David