[Charlotte.PM] Tidying HTML output / Temp Files

diona kidd diona at studio12a.com
Thu Apr 21 08:29:26 PDT 2005


For temp files, you could use File::Temp to generate unique temp file
names if you're at all concerned about overwriting or race conditions.
Couldn't you just hold it in memory, though? I'm not familiar with the
requirements of tidy.exe... does it need a file?
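
A rough sketch of the File::Temp idea, in case it helps. The helper name
write_temp_file and the 'tidyXXXXXX' template are made up for
illustration, and I haven't run this against your script:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write a string to a uniquely named temp file
# and return the path of that file.
sub write_temp_file {
    my ($content) = @_;

    # tempfile() picks an unused name and opens the file in one step,
    # which avoids the race between choosing a name and creating it.
    # TMPDIR => 1 puts the file in the system temp directory, and
    # UNLINK => 1 asks File::Temp to delete it when the program exits.
    my ($fh, $filename) = tempfile(
        'tidyXXXXXX',
        SUFFIX => '.html',
        TMPDIR => 1,
        UNLINK => 1,
    );

    print {$fh} $content;
    close $fh or die "Cannot close $filename: $!";

    return $filename;
}

my $path = write_temp_file("<html><body><p>hello</body></html>\n");
print "wrote $path\n";
```

The returned path could then be handed to tidy.exe and to
XML::XPath->new(filename => ...) in place of the hardcoded name.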

Diona



On Thu, 2005-04-21 at 09:23 -0400, Cory Foy wrote:

> This is a two-in-one post question. :)
> 
> I am working on a little script that works with some HTML being 
> returned. Because I ultimately need to make XPath queries into it, and 
> the HTML is not XHTML, I need to tidy it up.
> 
> The solution I have is to get the content back, write it to a temp 
> file, make an external call to Tidy telling it to write back to the 
> file, and then re-read the file. It looks like:
> 
> ####################
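> my $file = 'c:\\perl\\tmp1.1';
> 
> my $out = $mech->content();
> $out =~ s/&/&amp;/g;    # escape ampersands
> 
> # Write to the same path that tidy is told to rewrite.
> open my $temp, '>', $file or die "Cannot write $file: $!";
> print {$temp} $out;
> close $temp or die "Cannot close $file: $!";
> 
> # --write-back makes tidy overwrite the input file in place.
> `c:\\perl\\tidy.exe --write-back true --output-xhtml true $file`;
> 
> my $xp = XML::XPath->new(filename => $file);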
> ####################
> 
> (by the way, I'm sure my Perl is rusty - just getting back into it after 
> a while, so syntactic suggestions are welcome too)
> 
> So two questions:
> 
> 1) What is a better way to get a temp file? I don't like the hardcoding 
> of a file name - it smells to me.
> 
> 2) Is there a better way to tidy the output so that I don't have to rely 
> on writing to a temp file which has to be processed by tidy?
> 
> Thanks!
> 
> Cory