[Pdx-pm] HTML::Parser help
Ovid
publiustemp-pdxpm at yahoo.com
Fri Mar 4 13:46:13 PST 2005
Hi Thomas,
HTML::Parser is great, but not everyone can wrap their head around it
that easily. Many prefer a procedural approach as this better maps to
how we're used to getting things done. If that is appealing at all,
you may find HTML::TokeParser::Simple of use. It's ridiculously easy
to use. For example, to only print out the "visible" text in an HTML
file:
    use HTML::TokeParser::Simple;

    my $parser = HTML::TokeParser::Simple->new($html_file);
    while (my $token = $parser->get_token) {
        print $token->as_is if $token->is_text;
    }
There are several more comprehensive examples in the distribution. Of
course, while I admit to being biased, I do find it easier to use than
HTML::Parser.
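
Since the original question was about extracting data rather than just
printing text, here is a minimal sketch of the same token loop pulling the
href out of every link. The sample HTML and the collect_links helper are made
up for illustration. The parser is handed a reference to a string here so the
example is self-contained; a filename works the same way as above.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical sample document, only here to make the sketch runnable.
my $html = <<'END_HTML';
<html><body>
<p>See <a href="http://www.perl.org/">Perl</a> and
<a href="http://pdx.pm.org/">PDX.pm</a>, or this <a name="top">anchor</a>.</p>
</body></html>
END_HTML

# collect_links is an illustrative helper, not part of the module.
# It takes a reference to a string of HTML and returns every href found.
sub collect_links {
    my ($html_ref) = @_;
    my $parser = HTML::TokeParser::Simple->new($html_ref);
    my @links;
    while ( my $token = $parser->get_token ) {
        # Skip everything that is not an opening <a> tag.
        next unless $token->is_start_tag('a');
        my $href = $token->get_attr('href');
        # Anchors without an href (e.g. <a name="...">) are ignored.
        push @links, $href if defined $href;
    }
    return @links;
}

print "$_\n" for collect_links( \$html );
```

The loop has the same shape as the text example: ask for the next token, test
what kind it is, and pull out what you need. No callback subs are required.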
Cheers,
Ovid
--- Thomas J Keller <kellert at ohsu.edu> wrote:
> Greetings all.
>
> The HTML::Parser module provides methods for, literally, parsing HTML.
> It can handle HTML text from a string or file and can separate out the
> syntactic structures and data. You shouldn't use HTML::Parser directly,
> however, since its interface hasn't been designed to make your life
> easy when you parse HTML. It's merely a base class from which you can
> build your own parser to deal with HTML in any way you want.
>
> I've been away from Perl for a couple of months (grant due). But now
> I'm back to tasks that are way more fun.
> I find I have to parse an HTML file to extract some data. I installed
> HTML::Parser today, but I'm having trouble understanding how to write
> the subs that get me what I want. Does anyone know of a good tutorial,
> or some well-commented examples?
>
> muchas gracias,
> Tom K.
>
> Thomas J. Keller, Ph.D.
> Director, MMI Core Facility
> Oregon Health & Science University
> 3181 SW Sam Jackson Park Rd.
> Portland, OR, USA, 97239
>
> http://www.ohsu.edu/research/core
> _______________________________________________
> Pdx-pm-list mailing list
> Pdx-pm-list at pm.org
> http://mail.pm.org/mailman/listinfo/pdx-pm-list
If this message is a response to a question on a mailing list, please send
follow up questions to the list.
Web Programming with Perl -- http://users.easystreet.com/ovid/cgi_course/