[Chicago-talk] HTML Parsing
Shlomi Fish
shlomif at iglu.org.il
Sat Dec 25 02:30:49 PST 2010
Hi Darren,
On Friday 24 December 2010 19:39:34 Young, Darren wrote:
> I have HTML that contains a table that I need to extract fields from. In
> the end I want to take this data and shove it in a MySQL table but CSV in
> the interim would suffice. HTML::Parser and HTML::TreeBuilder appear like
> they can do this but does anyone know any "simpler" modules for this? It's
> been a long while since I tried this type of thing.
>
Well, http://search.cpan.org/dist/HTML-TableExtract/ has a good reputation and
good reviews on CPAN (and quite a few open bugs, which indicates that people
have actually tried to use it).
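As a rough sketch of how the table-to-CSV step might look with it (the column
names "Name" and "Age", the sample HTML and the helper table_to_csv are made up
for illustration, and Text::CSV is my own assumption for the CSV part, not
something your setup necessarily has):

```perl
use strict;
use warnings;
use HTML::TableExtract;
use Text::CSV;

# Hypothetical sample input. In your case this would be the string you
# get back from LWP ($res->content).
my $sample_html = <<'HTML';
<html><body>
<table>
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>Alice</td><td>30</td></tr>
  <tr><td>Bob</td><td>25</td></tr>
</table>
</body></html>
HTML

# Hypothetical helper: find the table(s) whose header row contains the
# given column names and return their data rows as CSV text.
sub table_to_csv {
    my ($html, @headers) = @_;

    my $te = HTML::TableExtract->new( headers => \@headers );
    $te->parse($html);

    my $csv = Text::CSV->new({ binary => 1 })
        or die "Cannot create Text::CSV object";

    my $out = '';
    for my $ts ($te->tables) {
        for my $row ($ts->rows) {
            # Empty cells may come back as undef; turn them into ''.
            my @fields = map { defined $_ ? $_ : '' } @$row;
            $csv->combine(@fields)
                or die "Could not build a CSV line";
            $out .= $csv->string . "\n";
        }
    }
    return $out;
}

print table_to_csv($sample_html, 'Name', 'Age');
```

Once the rows are plain arrays like this, pushing them into MySQL via DBI
instead of CSV should be a small change.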
If that fails, you should try
http://search.cpan.org/dist/HTML-TreeBuilder-LibXML/ . While it is not "simpler"
than plain HTML::TreeBuilder, it is more powerful and also gives you XPath and
other nice features.
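For the XPath route, a minimal sketch could look something like this (again,
the sample HTML and the helper name table_rows are only assumptions for the
sake of the example, and the XPath expressions will need adjusting to your
real markup):

```perl
use strict;
use warnings;
use HTML::TreeBuilder::LibXML;

# Hypothetical sample input standing in for the real page.
my $xpath_sample_html = <<'HTML';
<html><body>
<table>
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>Alice</td><td>30</td></tr>
  <tr><td>Bob</td><td>25</td></tr>
</table>
</body></html>
HTML

# Hypothetical helper: return one array reference of cell texts per
# table row that has <td> cells.
sub table_rows {
    my ($html) = @_;

    my $tree = HTML::TreeBuilder::LibXML->new;
    $tree->parse($html);
    $tree->eof;

    my @rows;
    for my $tr ($tree->findnodes('//table//tr')) {
        my @cells = map { $_->as_text } $tr->findnodes('./td');
        next unless @cells;    # skips the header row, which only has <th>
        push @rows, \@cells;
    }
    return @rows;
}

for my $row (table_rows($xpath_sample_html)) {
    print join(' | ', @$row), "\n";
}
```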
> Oh, content is coming from LWP's $res->content.
>
I think both modules should be able to handle that fine.
Regards,
Shlomi Fish
--
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
UNIX Fortune Cookies - http://www.shlomifish.org/humour/fortunes/
Chuck Norris can make the statement "This statement is false" a true one.
Please reply to list if it's a mailing list post - http://shlom.in/reply .