[Pdx-pm] HTML::Parser help

Thomas J Keller kellert at ohsu.edu
Fri Mar 4 13:14:02 PST 2005


Greetings all.

The HTML::Parser module provides methods for, literally, parsing HTML. 
It can handle HTML text from a string or file and can separate out the 
syntactic structures and data. You shouldn't use HTML::Parser directly, 
however, since its interface hasn't been designed to make your life 
easy when you parse HTML. It's merely a base class from which you can 
build your own parser to deal with HTML in any way you want.
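
A minimal sketch of the subclassing approach described above, assuming the version 2 style callback methods (start, end, text) that HTML::Parser calls on a subclass when new() is given no handlers. The package name LinkExtractor, the hash keys, and the sample HTML string are made up for illustration.

```perl
package LinkExtractor;    # hypothetical subclass name, for illustration only
use strict;
use warnings;
use base 'HTML::Parser';

# Called for each start tag. $tag is lowercased; $attr is a hashref
# mapping attribute names to values.
sub start {
    my ($self, $tag, $attr, $attrseq, $origtext) = @_;
    push @{ $self->{links} }, $attr->{href}
        if $tag eq 'a' && defined $attr->{href};
    $self->{in_title} = 1 if $tag eq 'title';
}

# Called for each end tag.
sub end {
    my ($self, $tag, $origtext) = @_;
    $self->{in_title} = 0 if $tag eq 'title';
}

# Called for text between tags. Text may arrive in several chunks,
# so append rather than assign.
sub text {
    my ($self, $text) = @_;
    $self->{title} .= $text if $self->{in_title};
}

package main;

# Sample input (made up); parse_file($path) could be used for a file instead.
my $html = '<html><head><title>Demo</title></head>'
         . '<body><a href="http://www.ohsu.edu/">OHSU</a></body></html>';

my $p = LinkExtractor->new;
$p->parse($html);
$p->eof;    # signal end of document so buffered text is flushed

print "Title: $p->{title}\n";
print "Link: $_\n" for @{ $p->{links} || [] };
```

Version 3 of the module also accepts handlers passed to new(), so subclassing is one route rather than the only one.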

I've been away from Perl for a couple of months (grant due). But now 
I'm back to tasks that are way more fun.
I find I have to parse an HTML file to extract some data. I installed 
HTML::Parser today, but I'm having trouble understanding how to write 
the subs that get me what I want. Does anyone know of a good tutorial, 
or some well-commented examples?

muchas gracias,
Tom K.

Thomas J. Keller, Ph.D.
Director, MMI Core Facility
Oregon Health & Science University
3181 SW Sam Jackson Park Rd.
Portland, OR, USA,   97239

http://www.ohsu.edu/research/core

