SPUG: Day meeting in Bellevue

Tim Maher tim at consultix-inc.com
Wed Dec 7 13:36:33 PST 2005


On Wed, Dec 07, 2005 at 12:50:19PM -0800, John Costello wrote:
> Duane,
> 
> Sorry for the late response.  Wednesday the 14th in Bellevue sounds great.  
> Azteca is close to my office, but Dixie's BBQ works as well.
> 
> In advance of the meeting:  Could someone point me to an app (preferable)
> or C library (less preferable but oh well) that decodes MS Word docs? 
> John
> -----
> John Costello - cos at indeterminate dot net

As someone who's recently been forced to convert a large
manuscript into Word (my upcoming Perl book), I suddenly
find myself in need of a grep-like utility for Word docs.

I'd naturally prefer an Open Source, Perlish solution, but I'd
consider other options that do the job well. Besides using
regexes to match and extract plain text, I'd like to match
text by /attributes/ such as style and font. (JGsoft's $149
"PowerGREP" sounds like "strings file.doc | grep 'pattern'",
which isn't quite good enough.)

I know Word has a built-in "find" utility with its own (lame)
regex dialect, but I need to automate my searches, not babysit
them with mouse in hand.

-Tim
*-------------------------------------------------------------------*
|  Tim Maher, PhD  (206) 781-UNIX   (866) DOC-PERL  (866) DOC-UNIX  |
|   tim(AT)Consultix-Inc.Com   TeachMePerl.Com   TeachMeUnix.Com    |
*-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-*
|  Watch for my upcoming book: "Minimal Perl for UNIX/Linux People" |
|  See MinimalPerl.com for details, ordering, and email-list signup |
*-------------------------------------------------------------------*

