Joshua ben Jore
twists at gmail.com
Sat Feb 10 20:51:31 PST 2007
On 2/8/07, Michael R. Wolf <MichaelRWolf at att.net> wrote:
> I've got some almost_XML code. That is, it is not well-formed. Almost well
> formed, but "almost" is "not". It appears to be line-oriented enough that a
> simple-minded line processing could clean it up, but I don't want to rely on
> simple-minded if there's a TagSoup::Parser that I could use to clean it up.
XML::LibXML has an HTML parsing mode that lets it handle badly formed
input. I've even used it to scrape web sites. It works well.
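
A minimal sketch of that approach follows. It is not code from the original scrape: the sample input and variable names are made up for illustration. It assumes a reasonably recent XML::LibXML that provides load_html; older versions expose the same HTML parser through parse_html_string.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample input: unquoted attribute values and an <item>
# that is never closed, so a strict XML parser would reject it.
my $almost_xml = <<'END';
<list>
<item id=1>first
<item id=2>second</item>
</list>
END

# Parse with libxml2's HTML parser instead of the XML parser.
# recover => 1 tells it to keep going after errors; the suppress_*
# options keep its complaints about unknown tags off STDERR.
my $doc = XML::LibXML->load_html(
    string            => $almost_xml,
    recover           => 1,
    suppress_errors   => 1,
    suppress_warnings => 1,
);

# Serialize only the repaired subtree, now with balanced tags
# and quoted attributes.
for my $node ($doc->findnodes('//list')) {
    print $node->toString, "\n";
}
```

The HTML parser wraps the result in html and body elements and, as far as I know, lowercases element names, so this suits cleanup and scraping better than a faithful round trip. It also has to guess where unclosed elements end, so check the nesting it produces before relying on it.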