SPUG: downloading from web pages

Christopher Howard choward at indicium.us
Wed Nov 26 17:27:26 PST 2008


On Mon, 24 Nov 2008, DeRykus, Charles E wrote:

> LWP::Simple may be useful... eg,
>
> perl -MLWP::Simple -e 'getprint "http://foo.bar?x=input1&y=input2"'
>
> Or others in the LWP suite for more functionality such as
> LWP::UserAgent.
>
> -- 
> Charles DeRykus
>
> -----Original Message-----
> From: Christopher Howard [mailto:choward at indicium.us]
> Sent: Monday, November 24, 2008 5:30 PM
> To: Seattle Perl Users Group
> Subject: SPUG: downloading from web pages
>
> Hi.  Boss gave me a very simple task, but it's not something I've had to
> do before.
>
> Basically there is this scripted page on the Internet, where the visitor
> enters some date-time conditions, and the page spits back out a list of
> links to scientific datafiles that have been automatically generated on
> the server.  All the date-time information is passed to the script as
> simple GET data.
>
> So I need to make a simple script that accesses the page (using the
> correct GET data) and downloads all files listed on the page which match
> a certain pattern.  Kindergarten stuff for you guys I'm sure.  Could
> someone point me in the direction of a helpful module for the
> accessing/downloading part?
>
> --
> Christopher Howard
> choward at indicium.us
> http://www.indicium.us
> _____________________________________________________________
> Seattle Perl Users Group Mailing List
>     POST TO: spug-list at pm.org
> SUBSCRIPTION: http://mail.pm.org/mailman/listinfo/spug-list
>    MEETINGS: 3rd Tuesdays
>    WEB PAGE: http://seattleperl.org/
> _____________________________________________________________
>

Thanks for the help, everyone.  I ended up using LWP::Simple because it 
was, well... simple.

I don't know much about wget, but I think Perl was the better fit in 
this case.  There's a lot of dynamic stuff going on with the URIs, and 
the script was actually meant to be run by someone else on a different 
system.  All the user has to do is enter a start date, an end date, and 
a save directory, and the script takes care of the details.  Plus, there 
were several thousand files to download, each with a unique URI.
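For the archive, here's a minimal sketch of that approach with 
LWP::Simple: fetch the index page with the date range as GET data, pull 
out the links that match a pattern, then getstore each one.  The URL, 
parameter names, and ".dat" pattern below are made-up placeholders, not 
the actual site:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical index-page URL and GET parameter names -- substitute
# the real ones for the server in question.
sub build_query_url {
    my ($start, $end) = @_;
    return "http://example.com/datafiles.cgi?start=$start&end=$end";
}

# Pull href values out of the returned HTML and keep those matching
# a pattern.  A quick-and-dirty regex, not a real HTML parser.
sub links_matching {
    my ($html, $pattern) = @_;
    my @links = $html =~ /href="([^"]+)"/g;
    return grep { /$pattern/ } @links;
}

# Download loop (uncomment; needs LWP::Simple and network access):
#
#   use LWP::Simple qw(get getstore is_success);
#   my ($start, $end, $save_dir) = @ARGV;
#   my $html = get(build_query_url($start, $end))
#       or die "Couldn't fetch index page\n";
#   for my $uri (links_matching($html, qr/\.dat$/)) {
#       (my $file = $uri) =~ s{.*/}{};
#       my $status = getstore($uri, "$save_dir/$file");
#       warn "Failed to fetch $uri\n" unless is_success($status);
#   }
```

LWP::Simple's getstore returns the HTTP status code, so checking it 
with is_success is an easy way to notice partial failures when you're 
pulling down thousands of files.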

I finished the script and it works great! (Er, well, so far...)

--
Christopher Howard
choward at indicium.us
http://www.indicium.us
