Spell Check ...

Al Tobey albert.tobey at priority-health.com
Mon Jan 28 09:49:14 CST 2002


Matt,  you really can't keep doing this!  It takes me hours to clean the
flame-thrower! ;)

Umm.  There are already a bunch of spell checkers out there, written in
both perl and C/C++ (cross-platform).  Linux Journal actually published
a neat article about a program that will spell check your typos and make
sure they fit with the desired keyboard layout ;)  In the article, the
author talked about how typos differ from QWERTY keyboards to DVORAK
keyboards.  His program would detect QWERTY typos and translate them to
the proper keyboard-typo-type.

$500 for a spell checker in perl (or any other language) is ridiculous.
For one in C, it is knavery as there are 3 or 4 well-known APIs in the
wild that are free (as in beer or GPL).  Beyond that, AFAIK, in the
Microsoft world you have COM objects so you can tie into the same spell
checker used by Word and Outlook.  You can also connect to COM objects
from ActiveState perl, last I saw.  I'm not a Microsoft programmer, so
I'm not too sure about that.  I know you feel that software should have
a stiff price, but this is downright ridiculous.  None of the options I
mention below require open sourcing your project, and furthermore, a
couple of them don't even require you to give a link to the source
(although, with perl it would be a good idea if you didn't bundle it).

A quick search on CPAN results in these:
Lingua::*  There are at least two spell checkers for English in this
group on a quick scan ...
Text::Pspell, an interface to the C library pspell
Text::Ispell, an interface to the unix spell checker, ispell

There is also the Apache module mod_speling (yes, that's what it is
called), though it corrects misspelled URLs rather than checking text,
among many other options.

But, all in all, if there is some sucker^H^H^H^H^H^HCIO out there that
will pay $500 for something he can have (or might already have) for
free, then I guess that's up to him.  Maybe I can sell him some stock in
my website which will make money by not asking for any and not selling
anything ;)

-Al
Heckler Extraordinaire and Unix Administrator

On Mon, 2002-01-28 at 07:49, matthew_heusser at mcgraw-hill.com wrote:
> 
> This weekend, I was thinking that XP, Peopleware,
> SNMP, and wireless networking are cool, but they
> aren't really ... perl.
> 
>   So, my proposal is that interspersed with various CS-
> appealing topics, we drill down deep into perl and cover
> some things that many folks are 'scared' of.
> 
>   Regular Expressions.  Hash Tables of Arrays of Hash
> tables (and their uses).  Packages.
> 
>   Which got me to think: Making a spell checking package
> would be pretty darn easy(*).  In fact, I could do it for around
> $500, (assuming we find a sponsor ... isn't there some design
> firm that wants this functionality somewhere in GR? :-)
> and get GR.PM to Code Review it and QA it.  I would donate
> 80% of the proceeds to GR.PM, and we could use the money
> to pay our speakers, or subsidise lunch, or have door prizes, etc.
> (I'm guessing this would make attendance and presentations
> shoot up.)  This could make an interesting open-source project ...
> 
> Then again, if someone asked nicely, I'd probably do this for free.
> (Or use it as an independent study in Grad School or something ...)
> 
> Comments?
> 
> regards,
> 
> Matt H.
> (*) - Literally, you'd pass in a string, I'd call a function
> to remove punctuation (which also turns "can't" into "can not", etc.),
> then I'd call split() ... then read the dictionary into a hash
> table, do a foreach loop, and return an array ...
> 
> The biggest challenges are getting an ASCII English dictionary
> and performance ... I think if we embed the dictionary into the
> perl script and pre-compile it (then put it onto a caching server)
> we're okay ... hmm ..
> 
> 
> 
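For what it's worth, the steps in Matt's footnote (strip punctuation,
split, load the dictionary into a hash, loop, return a list) can be
sketched roughly as below.  This is only an illustration, written in
Python rather than perl, and it is not Matt's code: the function names,
the tiny contraction table, and the in-memory word list are all
assumptions made up for the example.

```python
import string

# Hypothetical contraction table; the footnote only names "can't".
CONTRACTIONS = {"can't": "can not", "won't": "will not"}

# Translation table mapping every ASCII punctuation character to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize(text):
    """Lower-case, expand known contractions, then blank out punctuation."""
    text = text.lower()
    for short, long_form in CONTRACTIONS.items():
        text = text.replace(short, long_form)
    return text.translate(_PUNCT_TO_SPACE)


def load_dictionary(words):
    """Build a set of known words (the analogue of a perl hash lookup)."""
    return {w.strip().lower() for w in words if w.strip()}


def misspelled(text, dictionary):
    """Return the words in text that are missing from the dictionary."""
    return [w for w in normalize(text).split() if w not in dictionary]


if __name__ == "__main__":
    # Assumed toy word list; a real one would be read from a dictionary file.
    known = load_dictionary(["i", "can", "not", "spell", "words"])
    print(misspelled("I can't spel words!", known))
```

Contractions are expanded before punctuation is removed, since stripping
the apostrophe first would leave "can" and "t" as separate words.  The
dictionary is a set so each lookup is constant time, which speaks to the
performance worry in the footnote.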








More information about the grand-rapids-pm-list mailing list