[Chicago-talk] WWW::Mechanize can't recognize an HTML file with no file extension.

Whitney Jackson whjackson at gmail.com
Mon Mar 6 21:36:54 PST 2006


On Sun, 2006-03-05 at 21:22 -0800, James.Q.L wrote:
> Hi,
> 
> I am reading many HTML files locally using WWW::Mechanize and URI.
> The HTML files have no extension, and WWW::Mechanize doesn't seem to
> recognize them as HTML.
> The following code prints the HTML source instead of text only.
> A file with an .htm or .html extension works fine. Is there any way to
> get around this without renaming all the files? (Not that it is a big
> deal, but I am wondering.)
> 
> use strict;
> use warnings;
> use URI::file;
> use WWW::Mechanize;
> 
> my $m = WWW::Mechanize->new();
> $m->get(URI::file->new('/home/pub/html/test/151683_10')); 
> print $m->content(format => 'text');
> 
> TIA,
> 
> Qiang


This works if you don't mind being a little evil:

use URI::file;
use WWW::Mechanize;

my $m = WWW::Mechanize->new();
$m->get(URI::file->new('/home/pub/html/test/151683_10'));
$m->{ct} = 'text/html'; # <-- the evil part: pokes at the object's internals
print $m->content(format => 'text');

It would be nice if the ct method could take an argument that lets you
set the content type.  Or even better if you could change the default to
whatever you want.  I once had to use WWW::Mechanize against a strange
web server that never sent a 'Content-Type' header and it was a real pain.
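
A less evil variant might be to fix the header on the response itself,
before Mechanize looks at it.  The following is only a sketch: it assumes
a libwww-perl new enough to provide LWP::UserAgent's add_handler, and
mech_defaulting_to_html is a made-up helper name, not part of any module.
It treats a missing or text/plain Content-Type as HTML, which is only
reasonable when you know everything you fetch really is HTML.

```perl
use strict;
use warnings;
use URI::file;
use WWW::Mechanize;

# Hypothetical helper (not part of WWW::Mechanize): builds a Mechanize
# object that relabels untyped or text/plain responses as text/html.
sub mech_defaulting_to_html {
    my $m = WWW::Mechanize->new();

    # response_done handlers run after the response has been received
    # and before the request call returns to Mechanize.
    $m->add_handler(
        response_done => sub {
            my ($response) = @_;
            my $type = $response->content_type;   # '' when the header is absent
            if ($type eq '' || $type eq 'text/plain') {
                # Assumption: everything fetched through this object is HTML.
                $response->content_type('text/html');
            }
            return;
        },
    );

    return $m;
}

# Usage: perl script.pl /home/pub/html/test/151683_10
if (@ARGV) {
    my $m = mech_defaulting_to_html();
    $m->get(URI::file->new($ARGV[0]));
    print $m->content(format => 'text');
}
```

Because the type is corrected before Mechanize processes the page, this
should also let it parse links and forms, which setting $m->{ct} after
the fact does not.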

Whitney



