[Pdx-pm] HTML::Parser help
Thomas J Keller
kellert at ohsu.edu
Fri Mar 4 13:14:02 PST 2005
Greetings all.
The HTML::Parser module provides methods for, literally, parsing HTML.
It can handle HTML text from a string or file and can separate out the
syntactic structures and data. You shouldn't use HTML::Parser directly,
however, since its interface hasn't been designed to make your life
easy when you parse HTML. It's merely a base class from which you can
build your own parser to deal with HTML in any way you want.
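As far as I can tell, "building your own parser" means subclassing
HTML::Parser and overriding its start, text and end methods. The sketch
below is only my reading of the module's documentation, not a tested
recipe: the CellGrabber package, the in_cell and cells keys, and the
sample HTML are all made up for illustration, and it assumes that
calling new() with no arguments selects the older subclass-style
interface in which those three methods are invoked.

```perl
use strict;
use warnings;

# Hypothetical subclass: collects the text of every <td> cell.
package CellGrabber;
use base 'HTML::Parser';

# Called for each opening tag; the tag name arrives lowercased.
sub start {
    my ($self, $tag, $attr, $attrseq, $origtext) = @_;
    if ($tag eq 'td') {
        $self->{in_cell} = 1;            # made-up state flag
        push @{ $self->{cells} }, '';    # start a new, empty cell
    }
}

# Called for runs of text; may fire more than once per cell,
# so append rather than assign.
sub text {
    my ($self, $text) = @_;
    $self->{cells}[-1] .= $text if $self->{in_cell};
}

# Called for each closing tag.
sub end {
    my ($self, $tag, $origtext) = @_;
    $self->{in_cell} = 0 if $tag eq 'td';
}

package main;

# Made-up sample input.
my $html = '<table><tr><td>alpha</td><td>beta</td></tr></table>';

my $p = CellGrabber->new;
$p->parse($html);
$p->eof;    # signal end of document so buffered text is flushed

print "$_\n" for @{ $p->{cells} || [] };
```

For a file rather than a string, parse_file($filename) should be able
to replace the parse/eof pair. Note that in this style the text comes
through with entities such as &amp; still encoded.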
I've been away from Perl for a couple of months (grant due). But now
I'm back to tasks that are way more fun.
I find I have to parse an HTML file to extract some data. I installed
HTML::Parser today, but I'm having trouble understanding how to write
the subs that get me what I want. Does anyone know of a good tutorial,
or some well-commented examples?
muchas gracias,
Tom K.
Thomas J. Keller, Ph.D.
Director, MMI Core Facility
Oregon Health & Science University
3181 SW Sam Jackson Park Rd.
Portland, OR, USA, 97239
http://www.ohsu.edu/research/core