SPUG: binary data with the high bit set

jlb jlb at io.com
Mon Mar 7 09:56:34 PST 2005


I'm seeing some really odd behavior when trying to print binary data to a 
file.  It works fine on one of my systems, but on another I'm running 
into problems. The one where it works runs Perl 5.6.1, and the one where 
I have problems runs Perl 5.8.0.

I can write one byte of data fine up to 0x7e, but once I reach 0x80, even 
though the data in memory appears to be only one byte, and unpacking it 
also shows it's only one byte, if I write it to the file, two bytes are 
actually written.

From 0x80 to 0xbf, an additional byte of 0xc2 is prepended.

From 0xc0 to 0xff, the byte is actually wrong (it's 64 lower than 
it should be), and the byte 0xc3 is prepended (for 0xc0, 0xc380 is 
written).

I'm guessing this is Unicode related. In case it's important, I'm opening 
the file with sysopen, but I also tried it with regular open and it 
didn't seem to make a difference.
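
As a sanity check on the Unicode guess, the byte pattern described above is what you would get if each byte value were treated as a character code point and encoded as UTF-8. The following is only a sketch to compare against the observed output, not a diagnosis of what the 5.8.0 system is doing; utf8_hex is a hypothetical helper name, and it assumes the core Encode module is available (it ships with 5.8).

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: hex dump of the UTF-8 encoding of one code point.
sub utf8_hex {
    my ($code) = @_;
    return unpack("H*", encode("UTF-8", chr($code)));
}

# Values at the edges of the ranges described in the post.
for my $code (0x7e, 0x80, 0xbf, 0xc0, 0xff) {
    printf "0x%02x -> %s\n", $code, utf8_hex($code);
}
# prints:
# 0x7e -> 7e
# 0x80 -> c280
# 0xbf -> c2bf
# 0xc0 -> c380
# 0xff -> c3bf
```

The 0xc2 prefix for 0x80 to 0xbf and the 0xc3 prefix with a second byte 64 lower for 0xc0 to 0xff both match the two-byte UTF-8 form, which is consistent with the output handle encoding characters as UTF-8.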

I looked at the Unicode docs and tried "use bytes", but it didn't seem to 
affect the behavior.

Here is a short bit of code that does it on my 5.8 system.

---

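# Append one byte to "test" and check that the file grew by exactly one byte.
open(OUT, ">>test") or die "can't open test: $!";
my $size = -s "test";    # size before writing

# Build a single byte with value 0xff and show it in hex.
my $byte = pack("C", 255);
print "data: ".  unpack("H*", $byte) ."\n";

print OUT $byte;
close OUT;

# Compare the new size with the expected size (old size + 1).
my $dsize = -s "test";
if ($dsize != ++$size) {
   print "size should be $size but it's $dsize\n";
}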

---

Any hints as to how I can avoid this?
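
One thing that may be worth trying, assuming the extra bytes come from an encoding layer applied to the output handle (an assumption, not something confirmed here), is to call binmode on the handle before printing. This is a minimal sketch; append_raw and the file name test_raw.bin are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: append raw bytes to a file and return how many
# bytes the file grew by.
sub append_raw {
    my ($path, $data) = @_;
    my $before = (-s $path) || 0;   # 0 if the file is missing or empty
    open(my $out, ">>", $path) or die "can't open $path: $!";
    binmode($out);                  # ask for bytes, with no text or encoding translation
    print {$out} $data or die "can't write to $path: $!";
    close($out) or die "can't close $path: $!";
    return (-s $path) - $before;
}

my $added = append_raw("test_raw.bin", pack("C", 255));
print "bytes added: $added\n";
```

If the handle really is being treated as UTF-8, the count should come back as 1 with binmode in place and 2 without it.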

