[Chicago-talk] Testing if Page exists.

Andy Lester andy at petdance.com
Mon Mar 30 11:00:25 PDT 2015


> LWP::Simple & LWP::UserAgent returned the page, but the pages are fairly dense with a lot of embedded javascript, embedded forms and ads that are being served up. All of which I don't need. It's just taking a lot of time and memory. I was just looking for something that would just give me a 404 or 200 or stop reading at some place like the end of the /head tag. I'm trying to test out thousands of URLs which is the real problem. (This may not be possible.)


You can use the LWP::Simple head() function like David said, but head() vs. get() is all-or-nothing.  There’s no way to say “Give me the page up to the end of the <head> tag”.
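
For reference, a minimal sketch of the head() approach might look like the following. The url_exists() helper name and the command-line loop are assumptions for illustration, not something from the original thread. head() sends a HEAD request, so the page body is never downloaded. Note that a failure only means the HEAD request did not succeed; some servers answer HEAD differently than GET, so this is not proof that the page is missing.

```perl
use strict;
use warnings;
use LWP::Simple qw(head);

# Hypothetical helper: returns 1 if a HEAD request for $url succeeds, else 0.
# In scalar context, LWP::Simple's head() is true on success and false on failure.
sub url_exists {
    my ($url) = @_;
    return head($url) ? 1 : 0;
}

# Check each URL given on the command line and print one line per URL.
for my $url (@ARGV) {
    printf "%s %s\n", ( url_exists($url) ? 'OK  ' : 'FAIL' ), $url;
}
```

Saved as, say, check_urls.pl (a made-up name), it could be run as: perl check_urls.pl http://example.com/ http://example.com/nope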

I’m curious as to how these pages are taking a lot of memory.  You’re not storing them, are you?  What memory problems are you running into?

What’s the problem that you’re actually trying to solve?  Is it taking too long to do those 1000 URL checks?  How long is it taking, and how long would you like it to take?

--
Andy Lester => www.petdance.com
