[Chicago-talk] Testing if Page exists.
Jim Jacobus
JJacobus at PonyX.com
Mon Mar 30 12:09:44 PDT 2015
At 01:30 PM 3/30/2015, you wrote:
>How about the command "curl"?
Curl or wget would take longer since I'd have to
make a system() call and then parse the resulting file.
>On Mon, Mar 30, 2015 at 1:00 PM, Andy Lester
><andy at petdance.com> wrote:
>>LWP::Simple & LWP::UserAgent returned the page,
>>but the pages are fairly dense with a lot of
>>embedded javascript, embedded forms and ads that
>>are being served up. All of which I don't need.
>>It's just taking a lot of time and memory. I
>>was just looking for something that would just
>>give me a 404 or 200 or stop reading at
>>some place like the end of the /head tag. I'm
>>trying to test out thousands of URLs which is
>>the real problem. (This may not be possible.)
>
>You can use the LWP::Simple head() function like
>David said, but head() vs. get() is
>all-or-nothing. There’s no way to say
>“Give me the page up to the end of the <head> tag”.
>
>I’m curious as to how these pages are taking a
>lot of memory. You’re not storing them, are
>you? What memory problems are you running into?
>
>What’s the problem that you’re actually
>trying to solve? Is it taking too long to do
>those 1000 URL checks? How long is it taking,
>and how long would you like it to take?
>
>--
>Andy Lester => www.petdance.com
>
>
>_______________________________________________
>Chicago-talk mailing list
>Chicago-talk at pm.org
>http://mail.pm.org/mailman/listinfo/chicago-talk
>
>--
>-----------------
>Hal Wigoda
>Chicago
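
For anyone following along, here is a minimal sketch of the HEAD-request approach discussed in the thread. It uses LWP::UserAgent rather than LWP::Simple so the numeric status code can be read directly from the response object. The helper names classify_status and check_url, the 10-second timeout, and taking the URLs from the command line are all assumptions for illustration, not anything from the original posts.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Map an HTTP status code to a coarse label.
# (Hypothetical helper; the labels are an assumption.)
sub classify_status {
    my ($code) = @_;
    return 'ok'      if $code >= 200 && $code < 300;
    return 'missing' if $code == 404 || $code == 410;
    return 'other';
}

# Send a HEAD request so the server returns headers only,
# not the page body with its scripts, forms and ads.
sub check_url {
    my ($ua, $url) = @_;
    my $res  = $ua->head($url);    # returns an HTTP::Response
    my $code = $res->code;
    return ($code, classify_status($code));
}

# URLs come from the command line; with none given, nothing is fetched.
if (@ARGV) {
    # The timeout value is an assumption; tune it for the sites being checked.
    my $ua = LWP::UserAgent->new(timeout => 10);
    for my $url (@ARGV) {
        my ($code, $label) = check_url($ua, $url);
        print "$code $label $url\n";
    }
}
```

It could be run as "perl check_urls.pl URL1 URL2 ..." (the script name is hypothetical). Some servers reject or mishandle HEAD, for example by answering 405, so a fallback to get() may still be needed for those URLs.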