Do you expect the web log to be encoded as UTF-8?  If not, you may need
to specify the correct encoding when you open it.  If your LANG
environment variable specifies that the default system text encoding is
utf8, then Perl expects strings to be UTF-8 encoded.

The gotcha here is that, unlike single-byte and fixed-width multibyte
character encodings, UTF-8 uses a variable-width scheme.  This makes
UTF-8 compatible with ASCII, because all 1-byte ASCII characters are
valid UTF-8 characters.  Bytes with the high bit set are a different
matter.  In order for UTF-8 to be able to encode all 95,221 characters
which are included in the Unicode 3.2 repertoire, the other 95,093
characters beyond the 128 in the ASCII character set are represented by
multiple-byte sequences, and each sequence has to follow a strict
lead-byte/continuation-byte pattern.  This means that an arbitrary
stream of bytes which contains high-order bytes is more likely to be
invalid UTF-8 than valid.
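
To make that concrete, here is a minimal sketch of a validity check.
The helper name is_valid_utf8 and the sample strings are made up for
illustration; the only thing taken from Perl itself is decode() from
the Encode module, called with the FB_CROAK check so that it dies on
malformed input.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: true if $bytes is a well-formed UTF-8 sequence.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() with a CHECK value may modify its argument
    return eval { decode('UTF-8', $copy, Encode::FB_CROAK()); 1 } ? 1 : 0;
}

my %sample = (
    ascii  => "GET /index.html",         # plain ASCII, also valid UTF-8
    latin1 => "caf\xE9 au lait",         # Latin-1 e-acute: a lone high-order byte
    utf8   => "caf\xC3\xA9 au lait",     # the same character as a 2-byte sequence
);

for my $name (sort keys %sample) {
    printf "%-7s %s\n", $name, is_valid_utf8($sample{$name}) ? 'valid' : 'invalid';
}
```

The Latin-1 string is the interesting case: 0xE9 announces a
multi-byte sequence, but the space that follows is not a continuation
byte, so the string is rejected.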

If your source file is single-byte encoded and not UTF-8 encoded, then
you can use binmode() to specify byte-oriented input, or a non-default
encoding scheme (the Encode module is required for the latter).  You
can also set the LANG environment variable to a locale which is not a
UTF-8 locale before running your script.
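
A rough sketch of the first two options follows.  It writes its own
one-line sample log to a temporary file so that it runs as-is; the
sample line and the choice of ISO-8859-1 are assumptions, so substitute
your real log file and whatever encoding it actually uses.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Made-up sample log line containing a Latin-1 byte (0xE9).
my ($out, $path) = tempfile();
binmode($out);
print {$out} "127.0.0.1 - - \"GET /caf\xE9 HTTP/1.0\" 200 512\n";
close($out) or die "close failed: $!";

# Option 1: byte-oriented input, no decoding at all.
open(my $raw, '<', $path) or die "Cannot open $path: $!";
binmode($raw);
my $raw_line   = <$raw>;
my @raw_fields = split(' ', $raw_line);
close($raw);

# Option 2: name the encoding explicitly (ISO-8859-1 is an assumption).
open(my $txt, '<:encoding(iso-8859-1)', $path) or die "Cannot open $path: $!";
my $txt_line   = <$txt>;
my @txt_fields = split(' ', $txt_line);
close($txt);

print scalar(@raw_fields), " fields (raw), ",
      scalar(@txt_fields), " fields (decoded)\n";
unlink $path;
```

Either way split() never sees data that claims to be UTF-8 but is not.
The third option needs no code change: run the script with LANG set to
a non-UTF-8 locale such as C.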

It seems odd that a web log file would not be vanilla ASCII, though.


Basically I'm just parsing a web log.  But this is the line that is
showing the error:

my @temp=split(' ',$line);

And it is very weird.  Why should it break when it is just splitting a
string?

Any thoughts?


According to perldoc perldiag:

Malformed UTF-8 character (%s)

    Perl detected something that didn't comply with UTF-8 encoding
    rules.

    One possible cause is that you read in data that you thought to be
    UTF-8 but it wasn't (it was for example legacy 8-bit data).  Another
    possibility is careless use of utf8::upgrade().

Can you provide any other information about the application you're
having trouble with?

