[Pdx-pm] NEWBIE question - Am i making this too complex parsing HTML
Pete Lancashire
nix at petelancashire.com
Sat May 7 14:06:12 PDT 2005
I have a URL that returns a page containing two MAP elements.
I need to extract all the HREFs from one of them, the one
named 'region'. Here is what I came up with:
#!/usr/local/bin/perl
use warnings;
use strict; $|++;
my $VERSION = "0.01";
use LWP;
use HTML::TokeParser::Simple;
# using LWP::UserAgent instead of LWP::Simple for future needs
my $browser = LWP::UserAgent->new;
my $url = "http://www.undeerc.org/wind/winddb";
my $response = $browser->get( $url );
die "Can't get $url -- ", $response->status_line
    unless $response->is_success;
my $content = $response->content;
$content =~ s/\r//g;
my $p=HTML::TokeParser::Simple->new(\$content);
my ($href, $token);
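while ( $token = $p->get_token ) {
    # get_attr returns undef for a MAP without a name attribute
    next unless $token->is_start_tag('map')
        && ( $token->get_attr('name') || '' ) eq 'region';
    # read up to the closing </map>, also stopping at end of input
    while ( $token = $p->get_token ) {
        last if $token->is_end_tag('map');
        next unless $token->is_start_tag('area');
        $href = $token->get_attr('href');
        print "HREF:$href\n" if defined $href;
    }
    last;
}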
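For comparison, a flatter variant might track an "inside the map" flag instead of nesting a second loop. This is only a sketch: the helper name map_hrefs and the sample HTML are made up for illustration and have nothing to do with the real page, and it assumes MAP elements are not nested.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper: return the HREFs of every AREA tag inside the
# MAP whose name attribute equals $map_name.
sub map_hrefs {
    my ( $html, $map_name ) = @_;
    my $p      = HTML::TokeParser::Simple->new( \$html );
    my $in_map = 0;
    my @hrefs;
    while ( my $token = $p->get_token ) {
        if ( $token->is_start_tag('map') ) {
            # a MAP without a name attribute yields undef, hence the || ''
            $in_map = ( $token->get_attr('name') || '' ) eq $map_name;
        }
        elsif ( $token->is_end_tag('map') ) {
            last if $in_map;    # done once the wanted map closes
        }
        elsif ( $in_map && $token->is_start_tag('area') ) {
            my $href = $token->get_attr('href');
            push @hrefs, $href if defined $href;
        }
    }
    return @hrefs;
}

# Made-up sample input, standing in for the fetched page content.
my $sample = <<'HTML';
<map name="other"><area href="/skip.html"></map>
<map name="region"><area href="/nd.html"><area href="/sd.html"></map>
HTML

print "HREF:$_\n" for map_hrefs( $sample, 'region' );
```

In the script above, the call would become map_hrefs( $content, 'region' ).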
TIA
-pete