SPUG: Day meeting in Bellevue
John Costello
cos at indeterminate.net
Wed Dec 7 15:41:56 PST 2005
On Wed, 7 Dec 2005, Tim Maher wrote:
> On Wed, Dec 07, 2005 at 02:57:57PM -0800, DeRykus, Charles E wrote:
> >
> > >>I know Word has a built-in "find" utility with its own (lame) regex dialect, but
> > >>I need to automate my searches, not babysit them with mouse in hand.
> >
> > May not help but the Open Source 'antiword' does a better job than 'strings' at
> > yanking text out of Word while preserving formatting. Feeding the stream into Perl
> > should be a win in many cases...
> > --
> > Charles DeRykus
>
> Thanks for mentioning "antiword", which I'd never heard of.
>
> It only provides access to the "plain text" of the Word doc, but
> it might be useful--and I get to have the source!
The freshmeat link is <http://freshmeat.net/projects/antiword/>, with
links to source code.
Antiword doesn't reveal everything in a Word doc, possibly because
MSFT hasn't released the file format yet, but it can handle several
different Word versions to some extent.
> -Tim
John
-----
John Costello - cos at indeterminate dot net
"You cannot propel yourself forward by patting yourself on the back."--Unknown