LWP::UserAgent and referer?

Keary Suska aksuska at webflyer.com
Mon Nov 26 12:50:34 CST 2001


Assuming there is no authentication required, the server doesn't set any
cookies, and no JavaScript or plug-in is involved, it's hard to say what is
happening.

If by "If I go to the index.html and select "save as" it downloads fine" you
mean that it works in a browser but LWP gets a 404, it could mean:
1) Remember that www.domain.com/ and www.domain.com/index.html are not
necessarily (and not usually) the same thing. Sometimes it's the simple
things that escape us ;-)
2) The host is a name-based virtual server. If SSL isn't involved, this very
well could be the case. LWP is by default only HTTP 1.0 compliant; the
latest version offers 1.1, but it is considered "experimental" (I don't
know what that really means). A 1.1 browser sends a "Host" header naming the
site it wants (the form is "Host: www.domain.com"; RFC 2616 has the
details), and the server may need it to pick the right virtual host. You
could try adding it and see if it works, or try assembling a proper 1.1
request, although I am not familiar with how, or whether, you can construct
the "GET" request line yourself, or if that is necessary.
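
To make point 2 concrete, here is a minimal sketch of setting the "Host"
header by hand on an HTTP::Request. The URL and referring page are the
made-up foo.com ones from the example quoted below, not the real site, and
the script only prints the request instead of sending it. As far as I know,
current LWP releases fill in Host from the URL on their own, so treat this
as a way to check what goes out rather than a guaranteed fix.

```perl
use strict;
use warnings;
use HTTP::Request;
use URI;

# Hypothetical URL, borrowed from the foo.com example quoted in this thread.
my $url = URI->new('http://www.foo.com/files/patch011121.tar.gz');

my $request = HTTP::Request->new(GET => $url);

# A name-based virtual server picks the site from the Host header.
$request->header(Host => $url->host);

# Hypothetical referring page, also from the quoted example.
$request->referer('http://www.foo.com/patches.html');

# Print the request line and headers without touching the network.
print $request->as_string;
```

To actually send it, hand $request to LWP::UserAgent->new->request(), as in
the quoted script.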

If any of my assumptions above are not correct, there could be a host of
other issues. It would be difficult to troubleshoot further without more
detail (the actual URLs and code, etc.).
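
If it helps with gathering that detail, the sketch below boils one
request/response pair down to a few lines you can paste into a reply. The
helper name describe_exchange is my own invention, and the 404 response is
built by hand so the script runs offline; in real use you would pass in
whatever $client->request($request) returned.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;

# Hypothetical helper: summarise one request/response pair as plain text.
sub describe_exchange {
    my ($response) = @_;
    my $request = $response->request;
    my $referer = $request->referer;    # undef when no Referer was set
    return join "\n",
        'Request:  ' . $request->method . ' ' . $request->uri,
        'Referer:  ' . (defined $referer ? $referer : '(none)'),
        'Status:   ' . $response->status_line,
        '';
}

# Stand-in response built by hand (made-up URL from the quoted example).
my $request  = HTTP::Request->new(GET => 'http://www.foo.com/files/patch011121.tar.gz');
my $response = HTTP::Response->new(404, 'Not Found');
$response->request($request);

my $summary = describe_exchange($response);
print $summary;
```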

Keary Suska
Esoteritech, Inc.
"Leveraging Open Source for a better Internet"

> From: "Robert L. Harris" <Robert.L.Harris at rdlg.net>
> Date: Mon, 26 Nov 2001 10:42:47 -0700
> To: John Evans <evansj at kilnar.com>
> Cc: "Robert L. Harris" <Robert.L.Harris at rdlg.net>, Pikes-Peak Perl Mongers
> <pikes-peak-pm-list at happyfunball.pm.org>
> Subject: Re: LWP::UserAgent and referer?
> 
> 
> 
> Hmm, I did this one.  Looks nice.  Problem is now I'm getting a 404.  I put
> a few prints in.  If I go to the index.html and select "save as" it downloads
> fine.  If I go directly to the link I'm trying to get I get a 403, denied.
> When I run this segment of code I get a 404.  If I check the URL I'm
> accessing in "$url" below against my save as, they're identical.
> 
> Thoughts?
> 
> 
> 
> Thus spake John Evans (evansj at kilnar.com):
> 
>> On Wed, 21 Nov 2001, Robert L. Harris wrote:
>> 
>>> 
>>> 
>>> I'm trying to use a perl script to mirror some patches.  The maker wishes
>>> you to manually go and download each time, which gets a bit tedious.
>>> 
>>> I've got it getting a list of patches, etc, but when I use "getstore"
>>> from LWP::Simple I get an error message:
>>> 
>> 
>> Here's how I do it and an example:
>> 
>> Imagine that the page http://www.foo.com/patches.html has links to:
>> http://www.foo.com/files/patch011119.tar.gz
>> http://www.foo.com/files/patch011120.tar.gz
>> http://www.foo.com/files/patch011121.tar.gz
>> 
>> You want the tar.gz files:
>> 
>> Try this script. It was written quickly and not tested:
>> 
>> #!/usr/local/bin/perl
>> 
>> use strict;
>> use warnings;
>> use LWP::UserAgent;
>> use HTTP::Request;
>> use HTTP::Response;
>> use URI::Heuristic;
>> 
>> ### These two lines make sure that the URL is properly formatted
>> my $raw_url = shift or die "usage: $0 [URL to Fetch]\n";
>> my $url = URI::Heuristic::uf_urlstr($raw_url);
>> 
>> ### Build a new web client that will make our requests.
>> my $client = LWP::UserAgent->new();
>> 
>> ### Make our perl script look like IE 5.5 under Windows 98
>> $client->agent("Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)");
>> 
>> ### Build the request for the file we want.
>> my $request = HTTP::Request->new(GET => $url);
>> 
>> ### Make it look like we came from their site
>> $request->referer("http://www.foo.com/patches.html");
>> 
>> my $response = $client->request($request);
>> 
>> if ($response->is_error()) {
>>     printf("%s\n", $response->status_line);
>> }
>> else {
>>     ### Save the body under the last path component of the URL
>>     my ($name) = $url =~ m{([^/]+)$} or die "no file name in $url\n";
>>     open(my $out, '>', $name) or die "cannot write $name: $!\n";
>>     binmode($out);
>>     print $out $response->content();
>>     close($out);
>> }
>> 
>> 
>> -- 
>> John Evans
>> http://evansj.kilnar.com/
>> http://www.foo.com/files/patch011121.tar.gz
>> 
>> -----BEGIN GEEK CODE BLOCK-----
>> Version: 3.1
>> GCS d- s++:- a- C+++>++++ ULSB++++$ P+++$ L++++$
>> E--- W++ N+ o? K? w O- M V PS+ !PE Y+ PGP t(--) 5-- X++(+++)
>> R+++ tv+ b+++(++++) DI+++ D++>+++ G+ e h--- r+++ y+++
>> ------END GEEK CODE BLOCK------
>> 
>> 
>> 
> 
> 
> 
> :wq!
> ---------------------------------------------------------------------------
> Robert L. Harris                |  Micros~1 :
> Senior System Engineer          |    For when quality, reliability
> at RnD Consulting             |      and security just aren't
> \_       that important!
> DISCLAIMER:
> These are MY OPINIONS ALONE.  I speak for no-one else.
> FYI:
> perl -e 'print $i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10);'
> 
> 



