[Chicago-talk] How would you do this?

David Young davidy at nationalcycle.com
Mon Nov 12 09:55:42 PST 2007


I'm on another mailing list of colleagues, and this question came up.  What
would you all recommend?


-----Original Message-----
Sent: Monday, November 12, 2007 10:32 AM
To: techies at xxxxxxxxxxxxxxxxxxxx
Subject: How would you do this?

More and more, I'm finding the need to do some page-scraping of web pages
that are Web 2.0-ey.

My normal scraping tool is Perl's WWW::Mechanize and such.  But it is
completely JavaScript brain-dead.

For example, I'd like to write something that goes to YouTube, looks up
a set of videos, analyzes the comments and ratings, and presents a 
summary report.  That's just an example of something that is conceptually
simple (easily done manually) but can't be done with WWW::Mechanize.

So if you wanted to do something like that, what would you use?

I'm not averse to using something other than Perl, though something
Windows-based would be my last resort.  I'd consider a commercial product,
though preferably not a multi-hundred-dollar one.

Obviously, something that submits as well as scrapes would be ideal.
