[Chicago-talk] parsing HTML

Andy Lester andy at petdance.com
Fri Feb 23 13:35:36 PST 2007

On Feb 23, 2007, at 3:18 PM, Jay Strauss wrote:

> Would you suggest using a regex (that I can't get to work) or some
> module (like HTML::Parser)?

If all you want is the text, look at WWW::Mechanize's ->content()
method. From its documentation:


$mech->content( format => "text" )

    Returns a text-only version of the page, with all HTML markup
    stripped. This feature requires HTML::TreeBuilder to be installed,
    or a fatal error will be thrown.
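
For what it's worth, here is a minimal sketch of how that call might be used end to end. The sample HTML and the temp file are made up for illustration so the script runs without a network connection; in real use you would pass your own URL to get(). It assumes WWW::Mechanize and HTML::TreeBuilder are both installed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use URI::file;
use WWW::Mechanize;

# Hypothetical sample page, written to a temp file so the sketch runs offline.
my ( $fh, $path ) = tempfile( SUFFIX => '.html', UNLINK => 1 );
print {$fh} '<html><head><title>Demo</title></head>'
          . '<body><h1>Hello</h1><p>Some <b>bold</b> text.</p></body></html>';
close $fh;

# autocheck makes a failed fetch die instead of failing silently.
my $mech = WWW::Mechanize->new( autocheck => 1 );

# Fetch the page through a file:// URL; swap in a real http:// URL as needed.
$mech->get( URI::file->new_abs($path) );

# Ask for the text-only rendering. This dies if HTML::TreeBuilder is missing.
my $text = $mech->content( format => 'text' );
print "$text\n";
```

Note that the text comes back with the tags simply removed, so you may still want to tidy up whitespace yourself afterwards.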

