[San-Diego-pm] Photo download problem
Joel Fentin
joel at fentin.com
Sun Aug 22 16:37:25 PDT 2010
Chris Grau wrote:
>> http://fentin.com/cgi-bin/temp.pl
> [snip]
>
>> Wikimedia permits the downloading of photos and hotlinking. Yet I
>> can't get it to do anything with perl. The two URLs above (same photo
>> in each case) work in the FF browser. The third example is from my own
>> website as test. It downloads content.
>
> I seem to have the opposite results when running your test code. I
> receive an error attempting to download wikimedia's content (trimmed):
>
> $ HEAD http://upload.wikimedia.org/wikipedia/commons/1/19/PacificSilverFir_7645.jpg
> 403 Forbidden
> X-Squid-Error: ERR_ACCESS_DENIED 0
>
> They must be blocking LWP's user agent, because wget works fine. I
> didn't bother testing a modified user agent string in LWP.
Are you saying that it works fine in the browser because the
browser has a different user agent? And if so, are you saying that
I have to spoof a user agent to get the photos?
>> $f = 'http://commons.wikimedia.org/wiki/File:PacificSilverFir_7645.jpg';
>> $Content2 = get($f);
>> print length($Content2)."<br>";
>>
>> $f = 'http://upload.wikimedia.org/wikipedia/commons/1/19/PacificSilverFir_7645.jpg';
>> $Content2 = get($f);
>> print length($Content2)."<br>";
>>
>> $f = 'http://fentin.com/Ecuador/B_Tuncarta-Children.jpg';
>> $Content2 = get($f);
>> print length($Content2)."<br>";
>> print "<BR>DONE";
>
> What are you trying to accomplish?
My client wants to download thousands of plant pictures. He is
being very careful about photo credits and copyright issues.
I currently have a list of about 20,000 photos he wants. Wikimedia
is one of the sources. I wrote a Perl program to loop through the
list. It failed at the first photo. I've been trying a variety of
ways to get that first photo.
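A minimal sketch of such a loop, written so that one failed photo does
not stop the run. The name process_list and the callback interface are
made up for illustration; the callback is assumed to return true when a
download succeeds and false (or die) when it does not.

```perl
use strict;
use warnings;

# Walk a list of URLs and call $fetch->($url) for each one.
# A false return value or an exception counts as a failure; the loop
# records the URL and carries on with the next photo.
# Returns two array refs: the URLs that worked and the URLs that failed.
sub process_list {
    my ($fetch, @urls) = @_;
    my (@done, @failed);
    for my $url (@urls) {
        my $ok = eval { $fetch->($url) };
        if ($ok) {
            push @done, $url;
        } else {
            push @failed, $url;
        }
    }
    return (\@done, \@failed);
}

# Stand-in fetcher for demonstration only: pretends that any URL
# containing "bad" fails. A real one would download and save the file.
my ($done, $failed) = process_list(
    sub { $_[0] !~ /bad/ },
    qw(http://example.com/a.jpg http://example.com/bad.jpg http://example.com/c.jpg),
);
print "done: ", scalar(@$done), ", failed: ", scalar(@$failed), "\n";
```

With LWP::Simple, the callback could be something along the lines of
sub { is_success(getstore($_[0], $local_file)) }, where $local_file is a
placeholder for wherever the photo should be saved.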
> All three get() statements
> theoretically download the content of the URL. In fact, running your
> script, the file on your website was the only one to successfully
> download (again, the apparent user agent problem).
The fact that you have identified this as a user agent problem is
helpful. I'm not sure what to do with it yet, but it's much better
than nothing.
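If the 403 really is triggered by LWP's default agent string, one thing
to try is having the script identify itself with its own string. The
following is only a sketch under that assumption: the agent name, the
contact address, and the helper names make_ua and fetch are placeholders,
and it has not been confirmed here that Wikimedia accepts it.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Build a client that sends its own User-Agent header instead of the
# default "libwww-perl/x.xx". The name and address are placeholders.
sub make_ua {
    my $ua = LWP::UserAgent->new(timeout => 30);
    $ua->agent('PlantPhotoFetcher/0.1 (contact: you@example.com)');
    return $ua;
}

# Fetch one URL. Returns the raw body on success, undef on failure,
# and prints the HTTP status line as a warning when the request fails.
sub fetch {
    my ($ua, $url) = @_;
    my $res = $ua->get($url);
    unless ($res->is_success) {
        warn "$url: " . $res->status_line . "\n";
        return;
    }
    return $res->content;
}

# Only touch the network when URLs are given on the command line.
if (@ARGV) {
    my $ua = make_ua();
    for my $url (@ARGV) {
        my $body = fetch($ua, $url);
        print "$url: ",
            (defined $body ? length($body) . " bytes" : "failed"), "\n";
    }
}
```

The existing get() calls could probably stay as they are, since
LWP::Simple exports its underlying agent object on request:
use LWP::Simple qw($ua get); followed by $ua->agent('...'); before the
first get().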
--
Joel Fentin tel: 760-749-8863
Biz Website: http://fentin.com
Personal Website: http://fentin.com/me