SPUG: Day meeting in Bellevue

John Costello cos at indeterminate.net
Wed Dec 7 15:41:56 PST 2005


On Wed, 7 Dec 2005, Tim Maher wrote:

> On Wed, Dec 07, 2005 at 02:57:57PM -0800, DeRykus, Charles E wrote:
> > 
> > >>I know Word has a built-in "find" utility with its own (lame) regex dialect, but 
> > >>I need to automate my searches, not babysit them with mouse in hand.
> > 
> > May not help but the Open Source 'antiword' does a better job than 'strings' at 
> > yanking text out of Word while preserving formatting. Feeding the stream into Perl
> > should be a win in many cases...
> > --
> > Charles DeRykus
> 
> Thanks for mentioning "antiword", which I'd never heard of.
> 
> It only provides access to the "plain text" of the Word doc, but
> it might be useful--and I get to have the source!

The freshmeat page is <http://freshmeat.net/projects/antiword/>, which 
links to the source code.

Antiword doesn't reveal everything in a Word doc, possibly because 
Microsoft hasn't published the file format yet, but it can handle 
several Word versions to some degree.
 
> -Tim 

John
-----
John Costello - cos at indeterminate dot net
"You cannot propel yourself forward by patting yourself on the back."--Unknown


