[Chicago-talk] parsing HTML
Randal L. Schwartz
merlyn at stonehenge.com
Fri Feb 23 22:23:41 PST 2007
>>>>> "Jonathan" == Jonathan Rockway <jon at jrock.us> writes:
Jonathan> I'm thinking you want HTML::TreeBuilder::XPath. (XPath is like SQL for
Jonathan> trees.)
I have an upcoming Linux Magazine article that shows how to use that, but
also compares it with XML::LibXML in HTML mode. XML::LibXML comes out an
*order of magnitude* faster at parsing a moderate page.
So, if H::TB::X (HTML::TreeBuilder::XPath) is fast enough for you, go ahead,
but I'll be using XML::LibXML instead.
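
For anyone who wants to try the comparison on their own pages, here is a
minimal sketch. It is not the code from the article. The sample HTML, the
sub names, and the 'bench' command-line switch are made up for illustration,
and it assumes both CPAN modules are installed, with an XML::LibXML recent
enough to provide load_html. Timings will depend on your page and machine.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use HTML::TreeBuilder::XPath;
use Benchmark qw(cmpthese);

# Hypothetical sample page; substitute the HTML you actually care about.
my $html = <<'HTML';
<html><body>
<p><a href="http://www.perl.org/">Perl</a>
<a href="http://www.cpan.org/">CPAN</a></p>
</body></html>
HTML

# XML::LibXML in HTML mode: recover from sloppy markup, keep warnings quiet.
sub links_via_libxml {
    my ($page) = @_;
    my $doc = XML::LibXML->load_html(
        string          => $page,
        recover         => 1,
        suppress_errors => 1,
    );
    return map { $_->value } $doc->findnodes('//a/@href');
}

# HTML::TreeBuilder::XPath: same XPath expression against an HTML::Element tree.
sub links_via_treebuilder {
    my ($page) = @_;
    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse($page);
    $tree->eof;
    my @links = $tree->findvalues('//a/@href');
    $tree->delete;    # free the tree explicitly
    return @links;
}

# Run as "perl script.pl bench" to time both for about a CPU second each.
if (@ARGV && $ARGV[0] eq 'bench') {
    cmpthese(-1, {
        libxml      => sub { my @l = links_via_libxml($html) },
        treebuilder => sub { my @l = links_via_treebuilder($html) },
    });
}
```

Both subs take the same XPath expression, so switching parsers later is
mostly a matter of changing how the document gets built.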
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn at stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!