[Pdx-pm] HTML::Parser help

John Labovitz johnl at johnlabovitz.com
Fri Mar 4 13:35:55 PST 2005


On Mar 4, 2005, at 1:14 PM, Thomas J Keller wrote:

> I find I have to parse an html file to extract some data. I installed 
> HTML::Parser today, but  I'm having trouble understanding how to write 
> the subs that get me what I want. Does anyone know of a good tutorial, 
> or some well commented examples?

I recommend using HTML::Tree (and its associated classes, 
HTML::TreeBuilder and HTML::Element), not HTML::Parser directly.

Once you've got HTML::Tree installed, look at the HTML::Element 
documentation, specifically the look_down() method, which finds 
elements by tag name and attribute values.  That's the one I tend to 
use most.  extract_links() is also handy for pulling the URLs out of a 
document.
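
Here is a rough sketch of how those two methods might be used together. 
The sample HTML, the class names ("name", "score"), and the variable 
names are made up for illustration; your file will look different, and 
you would probably load it with HTML::TreeBuilder->new_from_file() 
rather than from a string.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample document, standing in for the file you need to parse.
my $html = <<'END_HTML';
<html><body>
<table>
<tr><td class="name">Alice</td><td class="score">42</td></tr>
<tr><td class="name">Bob</td><td class="score">17</td></tr>
</table>
<p><a href="http://example.com/a">first</a> <a href="/b">second</a></p>
</body></html>
END_HTML

# Build the element tree from the string.
my $tree = HTML::TreeBuilder->new_from_content($html);

# In list context, look_down() returns every element matching all criteria.
my @names = map { $_->as_text }
            $tree->look_down(_tag => 'td', class => 'name');

# extract_links() returns an array ref of [ $url, $element, $attr, $tag ];
# passing 'a' limits it to links found in <a> elements.
my @urls = map { $_->[0] } @{ $tree->extract_links('a') };

print "name: $_\n" for @names;
print "link: $_\n" for @urls;

# Trees hold circular references, so free them explicitly when done.
$tree->delete;
```

The nice part is that there are no callback subs to write: you describe 
the elements you want and get HTML::Element objects back, then call 
as_text() or attr() on them.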


--
John Labovitz
Macintosh support, research, and software development
John Labovitz Consulting, LLC
johnl at johnlabovitz.com |  +1 503.949.3492 | 
www.johnlabovitz.com/consulting


