[Boulder.pm] Word Association Tool
Keanan Smith
KSmith at netLibrary.com
Mon Dec 30 10:41:02 CST 2002
This is a relatively simple, special-purpose Perl script; I can't imagine
anyone has bothered to turn it into a package. Something like:
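use strict;
use warnings;

# Assumption: the file to check is given on the command line.
my $filename = shift @ARGV or die "usage: $0 filename\n";
open(my $in, '<', $filename) or die "Can't open $filename: $!\n";
my $file = do { local $/; <$in> };   # undef $/ so <> reads the whole file
close($in);

my %assoctable =
(
    word => 'associated',
    # ...
);

foreach my $word (keys %assoctable)
{
    my $assoc = $assoctable{$word};
    # \Q...\E keeps any regex metacharacters in the words literal
    if ($file =~ /\b\Q$word\E\b/ && $file =~ /\b\Q$assoc\E\b/)
    {
        print "$filename contains both $word and $assoc!\n";
    }
}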
should do what you've described...
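For the optional "how many words away" part, here is a rough sketch rather than a tested tool. It assumes single-word entries, splits the text on non-word characters, and measures distance as the difference in word positions. The name within_distance and its arguments are made up for illustration; they don't come from any module.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $word and $assoc both occur in $text
# and some pair of occurrences is at most $max_gap word positions apart.
sub within_distance {
    my ($text, $word, $assoc, $max_gap) = @_;

    # Split into words; grep drops the empty leading field, if any.
    my @tokens = grep { length($_) } split /\W+/, $text;

    # Record the positions at which each word occurs.
    my (@word_pos, @assoc_pos);
    for my $i (0 .. $#tokens) {
        push @word_pos,  $i if $tokens[$i] eq $word;
        push @assoc_pos, $i if $tokens[$i] eq $assoc;
    }

    # Compare every pair of positions.
    for my $p (@word_pos) {
        for my $q (@assoc_pos) {
            return 1 if abs($p - $q) <= $max_gap;
        }
    }
    return 0;
}

print "close enough\n"
    if within_distance("the quick brown fox", "quick", "fox", 2);
```

Inside the foreach loop above you could call within_distance($file, $word, $assoctable{$word}, $n) in place of the two regex tests, with $n being whatever distance you decide on.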
-----Original Message-----
From: Walter Pienciak [mailto:walter at frii.com]
Sent: Saturday, December 28, 2002 11:54 AM
To: Lewis, Donald G
Cc: boulder-pm at mail.pm.org
Subject: Re: [Boulder.pm] Word Association Tool
On Sat, 28 Dec 2002, Walter Pienciak wrote:
> On Fri, 27 Dec 2002, Lewis, Donald G wrote:
>
> > Does anyone know of a table based word association tool. It would check
> > to see if a list of words is in a file and if so is another word in the
> > same file. Optionally you could say how many words away the association
> > can be detected.
> >
> > Thanks,
> >
> > Donald G. Lewis (303-581-4879)
> > Lockheed Martin M&DS
> > Boulder, CO
>
> I don't have the time right now to check out the various packages, but
> this sounds like something common to many search engines. I'd look
> at swish, glimpse, and htdig. If any of them return the offset in a file
> for a particular word, you're in.
>
> Walter
Replying to myself again . . .
Depending on what you want to do -- how many files, how often they
change, etc. -- you might also want to hack a script using GNU grep,
which has a --byte-offset switch.
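If you would rather stay in Perl than shell out to grep, the same offset idea can be sketched with the @- array, which holds the start offset of the last successful match. This is only an illustration: word_offsets is a made-up name, and the offsets are byte offsets only if the text was read as raw bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: 0-based start offsets of every whole-word
# match of $word in $text.
sub word_offsets {
    my ($text, $word) = @_;
    my @offsets;
    while ($text =~ /\b\Q$word\E\b/g) {
        push @offsets, $-[0];   # $-[0] is where this match started
    }
    return @offsets;
}

my @hits = word_offsets("cat dog cat", "cat");
print "offsets: @hits\n";
```

With the offsets for both words in hand, deciding whether they are "close" is a matter of comparing the two lists.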
I'm going outside now: it's a beautiful day, and it would be a crime
to spend it all puttering at the keyboard.
Walter
_______________________________________________
Boulder-pm mailing list
Boulder-pm at mail.pm.org
http://mail.pm.org/mailman/listinfo/boulder-pm