Weather Forecast

Craig Sanders cas at taz.net.au
Sun Sep 15 23:02:32 CDT 2002


On Mon, Sep 16, 2002 at 11:16:02AM +1000, Jens Porup wrote:
> Given the fickle, strange weather we've had over the last several
> days, the following script might be of some use to you. 
> 
> With not enough work to keep me busy at the moment, I've been playing
> around with LWP and HTML::TokeParser, and I've come up with a 

both are excellent modules.  i've used them for several web robot tools,
including a script to fetch the weather (see below) and another script
to search the melbourne trading post web site and present the results in
a single page with simple html (IMO, much more useful than the
bletcherous 10-results-per-page, click-next-to-continue,
too-much-javascript-and-other-rubbish crap offered by the actual site).

i'd post a copy of my trading post script but a) they change the page
format often enough that the script needs to be updated regularly and b)
i strongly suspect that if more people did what i'm doing, they'd do
whatever it took to make it impossible.  past discussions with them
have highlighted the fact that usability of the site is not at all
important to them.


> Anyway, here it is. Any suggestions on coding style are, as always,
> most welcome.

neat.  that's very similar to the script i wrote a few years ago:


---cut here---
#! /usr/bin/perl

use strict;
use warnings;
use LWP::UserAgent;
use HTML::TokeParser;

# URLs
my $base_url = "http://www.BoM.GOV.AU";
my $forecasts = "$base_url/weather/vic/forecasts.shtml";
my %URLs;

# alternatives: pick up the proxy from the environment, or set it by hand
#my $ua = LWP::UserAgent->new(env_proxy => 1, keep_alive => 5);
my $ua = LWP::UserAgent->new(env_proxy => 0, keep_alive => 5);
#$ua->proxy('http', 'http://localhost:3128/');
$ua->agent('Mozilla/4.76 [en] (X11; U; Linux 2.4.2 i686; Nav)');

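my $request = HTTP::Request->new('GET', $forecasts);
my $response = $ua->request($request);
# without the index page there are no links to follow, so give up early
die "can't fetch $forecasts: " . $response->status_line . "\n"
	unless $response->is_success;

my $p = HTML::TokeParser->new(\$response->content);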

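# map link text => absolute URL (assumes the hrefs are site-relative)
while (my $token = $p->get_tag("a")) {
	my $url = $token->[1]{href} or next;	# skip anchors without an href
	my $text = $p->get_trimmed_text("/a");
	$URLs{$text} = "$base_url$url";
}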

my @forecasts = ( 'Melbourne Precis Forecast', 'Melbourne Forecast', 
                  'State Forecast', 'Future Developments' ) ;

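foreach my $name (@forecasts) {
	print $name, "\n", "-" x length($name), "\n";

	# the link text on the index page may have changed
	my $url = $URLs{$name};
	unless (defined $url) {
		print "(no link found)\n\n\n";
		next;
	}

	my $request = HTTP::Request->new('GET', $url);
	my $response = $ua->request($request);
	unless ($response->is_success) {
		print "(fetch failed: ", $response->status_line, ")\n\n\n";
		next;
	}

	# the forecast text is inside the first <pre> block
	my $p = HTML::TokeParser->new(\$response->content);
	$p->get_tag("pre");
	my $txt = $p->get_text("/pre");
	foreach (split(/\n/, $txt)) {
		s/^\s+$//;
		# drop the header lines repeated in every forecast
		next if (/^IDV|^BUREAU OF METEOROLOGY|^VICTORIAN REGIONAL OFFICE|^P\.O\. Box 1636/);
		print $_, "\n";
	}

	print "\n\n";
}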
---cut here---

it sends its output (stripping some repetitive stuff i don't want to see
4+ times in each message) to stdout so that i can see the results on the
command line and/or choose to mail it to myself or whatever.  i much
prefer writing tools so that they can be used in a pipeline.
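
the filtering step can also be tried on its own, without touching the
network.  what follows is only a minimal sketch: the helper name
filter_forecast and the sample text are made up for illustration (the
sample is not real BoM output), and the patterns are the ones from the
script, with the dots in "P.O." escaped so they match only literal dots.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# header lines to drop from every forecast
my $boilerplate =
    qr/^(?:IDV|BUREAU OF METEOROLOGY|VICTORIAN REGIONAL OFFICE|P\.O\. Box 1636)/;

# hypothetical helper: takes the text of one <pre> block and returns
# the lines worth printing
sub filter_forecast {
    my ($txt) = @_;
    my @keep;
    foreach my $line (split /\n/, $txt) {
        $line =~ s/^\s+$//;             # whitespace-only lines become empty
        next if $line =~ $boilerplate;  # skip the repeated header lines
        push @keep, $line;
    }
    return @keep;
}

# made-up sample input, for illustration only
my $sample = join "\n",
    'IDV10450',
    'BUREAU OF METEOROLOGY',
    'VICTORIAN REGIONAL OFFICE',
    '   ',
    'Melbourne: Fine. Max 18.',
    'Melbourne: Fine. Max 18.';

# plain lines on stdout, so the result can go through uniq, mail, etc.
print "$_\n" for filter_forecast($sample);
```

the duplicated last line is left in on purpose: the function only drops
header lines, and collapsing adjacent repeats is left to uniq further
down the pipeline.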

the script is called from cron like so:

/usr/local/bin/getweather.pl | uniq | mail -s "weather report $DT" cas at taz.net.au


craig

-- 
craig sanders <cas at taz.net.au>

Fabricati Diem, PVNC.
 -- motto of the Ankh-Morpork City Watch



More information about the Melbourne-pm mailing list