[Melbourne-pm] unicode html->pdf?

Tue Oct 23 17:42:28 PDT 2007

Guy Morton wrote:
> I've been struggling with this for a while so I thought I'd ask here  
> to see if someone else has come across this problem and found a  
> workable solution.
> 
> I have a client for whom I maintain a number of HTML-formatted
> document templates. These are editable via a web interface using the
> TinyMCE editor, which is very groovy and which works well. I use TT
> to render these templates into documents for further processing into
> PDFs.
> 
> HTMLDOC is the application I've used in the past to convert these
> documents into PDF; however, it does not support Unicode and therefore
> cannot render Chinese characters in documents.
> 
> So, what I need is a way to convert my UTF-8 HTML-formatted
> documents into PDF.
> 
> Anyone got a suggestion as to what might work?

I successfully did this with XSL-FO: produce the Unicode XML with TT,
use an XSLT to turn it into XSL-FO, then run Apache FOP to get a PDF
from it. You can do it all in Perl right up to the system() call that
runs fop, unless someone has produced Perl bindings for it by now.
(I was doing this in 2003.)
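
For illustration, a minimal sketch of that final step, shelling out to
fop. It is written in Python here, but the same argument list works
from Perl's list-form system(). It assumes a fop launcher is on the
PATH and relies on FOP's -xml/-xsl/-pdf options, which apply the
stylesheet and render the resulting XSL-FO in one run. The file names
and the helper names fop_command and render_pdf are hypothetical.

```python
import subprocess


def fop_command(xml_path, xsl_path, pdf_path, fop="fop"):
    """Build the argument list for one FOP run: XML + XSLT in, PDF out."""
    # -xml and -xsl ask FOP to apply the stylesheet itself, so the
    # intermediate XSL-FO never has to be written to disk.
    return [fop, "-xml", xml_path, "-xsl", xsl_path, "-pdf", pdf_path]


def render_pdf(xml_path, xsl_path, pdf_path, fop="fop"):
    """Run FOP and raise CalledProcessError if it exits non-zero."""
    # Passing a list (no shell) keeps paths with spaces intact.
    subprocess.run(fop_command(xml_path, xsl_path, pdf_path, fop), check=True)
```

Calling render_pdf("document.xml", "to-fo.xsl", "document.pdf") would
then write the PDF, given a stylesheet that maps the TT output to
XSL-FO.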

Toby