[Pdx-pm] UTF-8, Perl and XML: Is MSXML superior? :)

Chris Dawson cdawson at webiphany.com
Mon Jul 14 18:31:24 CDT 2003


I am trying to parse the file referenced below using Perl and any of the 
"standard" XML parsers.  By standard I mean XML::Parser or XML::XPath.  
I don't want to use XML::LibXML (relying on libxml2 isn't portable for 
me) and definitely don't want to use Win32::OLE with MSXML!  This file 
parses fine in IE but not in Mozilla or anything else I try on Linux.  
xmllint says it is bad as well.  The Japanese characters all look fine 
to me.  Jcode.pm tells me this is a UTF-8 encoding, which is what XML 
parsers supposedly support.
http://63.105.19.181/testing/japanese-out.xml
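
One check that might narrow this down, as a minimal sketch only: assuming 
the problem is a byte sequence that is not strictly valid UTF-8 (I have 
not confirmed that), the core Encode module in Perl 5.8 can report where 
decoding first fails.  The helper name first_bad_utf8_offset and the 
script name in the usage comment are made up for illustration.

```perl
use strict;
use warnings;
use Encode ();

# Return the byte offset of the first sequence that is not strict UTF-8,
# or -1 if the whole buffer decodes cleanly.
sub first_bad_utf8_offset {
    my ($bytes) = @_;
    my $total = length $bytes;
    my $rest  = $bytes;    # decode() with FB_QUIET consumes this copy in place
    # On a malformed sequence, FB_QUIET stops decoding and leaves the
    # unprocessed remainder (starting at the bad byte) in $rest.
    Encode::decode('UTF-8', $rest, Encode::FB_QUIET);
    return length($rest) ? $total - length($rest) : -1;
}

# Usage (hypothetical local copy): perl check_utf8.pl japanese-out.xml
if (@ARGV) {
    my $path = shift @ARGV;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $data = do { local $/; <$fh> };    # slurp the raw bytes
    close $fh;
    my $off = first_bad_utf8_offset($data);
    print $off < 0
        ? "valid UTF-8\n"
        : "first invalid byte at offset $off\n";
}
```

If this reports an offset, dumping the bytes around it should show 
whether the file really is UTF-8 throughout.  It only checks the byte 
encoding; it says nothing about whether the XML is otherwise well-formed.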

Any clues?  How would I reformat this?  Stick everything in a CDATA 
section?

Thanks,
Chris



