SPUG: PS/PDF parsing
El JoPe Magnifico
jope-spug at jope.net
Fri Jun 8 20:11:48 CDT 2001
On Fri, 8 Jun 2001, Richard Seymour UW-NPL wrote:
> Don't you consider PDF pretty much already "web-enabled",
> since (almost) all browsers carry around Acrobat Reader?
Personally, yes. I should point out that the original request that
spurred me on to this line of questioning is _not_ my sole purpose
for asking. As usual, it got me thinking off on a tangent. =)
There are other useful reasons to parse PS/PDF content besides
converting to HTML, which actually interests me very little:
batch-modifying PDFs, cataloguing content in PDFs found by a
crawler, serving as an import filter for a layout program, etc.
On Fri, 8 Jun 2001, Scott Blachowicz wrote:
> That's going to be rather tricky since PS is a programming language
> that can be abused...you need a language interpreter (e.g.
> Ghostscript) to make sense out of it.
First, that'd be cheatin'. =)
Second, COME ON, THIS IS PERL! If Damian hasn't taught you by now
that it can parse anything on the planet, then I'm not going to try.
Yes, it's a difficult problem, but therefore also an interesting one.
I just want to know whether anyone has already made headway on it.
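
To make Scott's point concrete, here is a minimal sketch (in Python, purely
for illustration) of why pattern-matching alone falls short: the text a
PostScript program shows can come off the operand stack or out of a
definition, so you only learn it by executing the program. The function
names run and naive_scan, and the tiny operator subset handled (def, show,
exch, pop), are assumptions made for this example. It is nowhere near a real
PostScript interpreter and ignores nested parentheses, escapes, procedures,
dictionaries, and everything else.

```python
import re

# Tokens: a (string) without nested parens, a /literal-name, or a bare word.
TOKEN = re.compile(r"\((?P<str>[^()]*)\)|/(?P<lit>[^\s()/]+)|(?P<word>[^\s()/]+)")


def run(program):
    """Execute a tiny, hypothetical PostScript subset; return the shown strings."""
    stack, names, shown = [], {}, []
    for m in TOKEN.finditer(program):
        if m.group("str") is not None:
            stack.append(m.group("str"))            # string literal: push it
        elif m.group("lit") is not None:
            stack.append(("name", m.group("lit")))  # /name: push a literal name
        else:
            word = m.group("word")
            if word == "def":                       # /key value def
                value = stack.pop()
                _, key = stack.pop()
                names[key] = value
            elif word == "show":                    # record the string on top
                shown.append(stack.pop())
            elif word == "exch":                    # swap the top two operands
                stack[-1], stack[-2] = stack[-2], stack[-1]
            elif word == "pop":                     # discard the top operand
                stack.pop()
            elif word in names:                     # executable name: look it up
                stack.append(names[word])
            else:
                raise ValueError("unsupported token: " + word)
    return shown


def naive_scan(program):
    """Regex-only approach: find strings written directly before 'show'."""
    return re.findall(r"\(([^()]*)\)\s*show", program)


if __name__ == "__main__":
    program = "/msg (Hello, SPUG) def msg show"
    print(naive_scan(program))  # the regex sees no string next to 'show'
    print(run(program))         # executing the program recovers the text
```

For the program "/msg (Hello, SPUG) def msg show", the regex scan finds
nothing, while the interpreter recovers the string, because the string only
reaches show by way of the definition of msg.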