[Chicago-talk] How would you do this?
warren.lindsey at gmail.com
Mon Nov 12 10:44:44 PST 2007
Half of Web 2.0 is flashy websites; the other half is making an API
available. Many sites push out data in JSON format to their own apps
and to the public. Some have published an API so you can work with the
site's data, while others have had theirs reverse-engineered.
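Once you have the JSON in hand, pulling fields out of it is trivial. A minimal sketch with core Perl (JSON::PP; the CPAN JSON/JSON::XS modules work the same way) -- note the feed shape below is made up for illustration, not any site's actual format:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP qw(decode_json);   # core module; JSON::XS is a drop-in speedup

# Made-up snippet standing in for whatever JSON the site hands back.
my $json_text = <<'END';
{ "videos": [
    { "title": "Cat on a Roomba",    "rating": 4.5, "comment_count": 120 },
    { "title": "Perl in 60 Seconds", "rating": 3.8, "comment_count": 42 }
] }
END

my $data = decode_json($json_text);   # plain hashrefs and arrayrefs
for my $video ( @{ $data->{videos} } ) {
    printf "%-20s rating %.1f, %d comments\n",
        $video->{title}, $video->{rating}, $video->{comment_count};
}
```

No DOM walking, no regexes against HTML -- the decoded structure is just nested hashes and arrays you can traverse directly.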
YouTube has an API that you can connect to and retrieve the info you
want. Much better than hacking functionality into WWW::Mechanize.
The first page of Google results for "youtube json feed" will give you
what you want to know, in the "YouTube Data API: Developer's Guide".
But why bother, when somebody has already written a module to do
the grunt work for you?
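Even without a wrapper module, the whole round trip is short. A sketch using only core modules (the feed URL is a placeholder, not a real YouTube endpoint; in 2007 you'd reach for LWP::UserAgent instead of HTTP::Tiny, and the summarize() helper here is my own invention for the report the original poster described):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;                  # core HTTP client; LWP::UserAgent also works
use JSON::PP qw(decode_json);

# Fetch the raw JSON for a feed URL; dies on HTTP errors.
sub fetch_feed {
    my ($url) = @_;
    my $res = HTTP::Tiny->new( timeout => 10 )->get($url);
    die "GET $url failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return $res->{content};
}

# Reduce a decoded feed to the summary report the poster asked for:
# video count, average rating, total comment count.
sub summarize {
    my ($data) = @_;
    my @videos = @{ $data->{videos} || [] };
    return { count => 0 } unless @videos;
    my ( $rating_sum, $comments ) = ( 0, 0 );
    for my $v (@videos) {
        $rating_sum += $v->{rating};
        $comments   += $v->{comment_count};
    }
    return {
        count          => scalar @videos,
        avg_rating     => $rating_sum / @videos,
        total_comments => $comments,
    };
}

# Only hit the network when a URL is supplied on the command line,
# so the helpers above can be exercised offline.
if (@ARGV) {
    my $summary = summarize( decode_json( fetch_feed( $ARGV[0] ) ) );
    printf "%d videos, avg rating %.2f, %d comments total\n",
        @{$summary}{qw(count avg_rating total_comments)};
}
```

Swap the placeholder URL for the real feed documented in the API guide and adjust the hash keys to match its actual field names; that's all a CPAN wrapper would be hiding from you anyway.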
On Nov 12, 2007 11:55 AM, David Young <davidy at nationalcycle.com> wrote:
> I'm on another mailing list of colleagues, and this question came up. What
> would you all recommend?
> -----Original Message-----
> Sent: Monday, November 12, 2007 10:32 AM
> To: techies at xxxxxxxxxxxxxxxxxxxx
> Subject: How would you do this?
> More and more, I'm finding the need to do some page-scraping of web pages
> that are Web 2.0-ey.
> My normal scraping tool is Perl's WWW::Mechanize and such. But it falls short on these.
> For example...I'd like to write something that goes to YouTube, looks up
> a set of videos, analyzes the comments and ratings, and presents a
> summary report. That's just an example of something that is conceptually
> simple (easily done manually) but can't be done with WWW::Mechanize.
> So if you wanted to do something like that, what would you use?
> I'm not averse to using something other than perl...something Windows-based
> would be my last resort. I'd consider a commercial product, though
> preferably not a multi-hundred-dollar one...
> Obviously, something that submits as well as scrapes would be ideal.
> Chicago-talk mailing list
> Chicago-talk at pm.org