[Boulder.pm] Word Association Tool

Keanan Smith KSmith at netLibrary.com
Mon Dec 30 10:41:02 CST 2002


This is a relatively simple, special-purpose Perl script, and I can't imagine
anyone has bothered to turn it into a package. Something like:

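use strict;
use warnings;

# Assumption: the file to check is given as the first argument.
my $filename = shift @ARGV or die "usage: $0 FILE\n";
open(my $in, '<', $filename) or die "Cannot open $filename: $!";
my $file = do { local $/; <$in> }; # undef $/ so <> reads the whole file
close $in;

# Each key is a word to look for; its value is the associated word.
my %assoctable =
(
	word => 'associated',
	# ...
);

foreach my $word (keys %assoctable)
{
	my $assoc = $assoctable{$word};
	# \Q...\E keeps regex metacharacters in the words from being interpreted
	if ($file =~ /\b\Q$word\E\b/ && $file =~ /\b\Q$assoc\E\b/)
	{
		print "$filename contains both $word and $assoc!\n";
	}
}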


should do what you've described...
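
The optional "how many words away" part isn't covered by the script above.
A minimal sketch of one way to add it, assuming a "word" is a run of \w
characters and distance is counted in words (1 means adjacent). The name
words_within and its $max_gap parameter are made up for illustration:

```perl
use strict;
use warnings;

# Return 1 if $word and $assoc both occur in $text with their word
# positions at most $max_gap apart, otherwise 0.
sub words_within {
    my ($text, $word, $assoc, $max_gap) = @_;

    # Split the text into word tokens (case-sensitive, like the \b match)
    my @tokens = $text =~ /\w+/g;

    # Record the word index of every occurrence of each word
    my (@word_pos, @assoc_pos);
    for my $i (0 .. $#tokens) {
        push @word_pos,  $i if $tokens[$i] eq $word;
        push @assoc_pos, $i if $tokens[$i] eq $assoc;
    }

    # Any pair of occurrences close enough counts as an association
    for my $p (@word_pos) {
        for my $q (@assoc_pos) {
            return 1 if abs($p - $q) <= $max_gap;
        }
    }
    return 0;
}
```

Inside the foreach loop this could be called as
words_within($file, $word, $assoctable{$word}, 5). Comparing every pair of
positions is quadratic in the number of occurrences, which should be fine
for ordinary text files.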

-----Original Message-----
From: Walter Pienciak [mailto:walter at frii.com]
Sent: Saturday, December 28, 2002 11:54 AM
To: Lewis, Donald G
Cc: boulder-pm at mail.pm.org
Subject: Re: [Boulder.pm] Word Association Tool


On Sat, 28 Dec 2002, Walter Pienciak wrote:

> On Fri, 27 Dec 2002, Lewis, Donald G wrote:
>
> > Does anyone know of a table-based word association tool? It would check
> > to see if a list of words is in a file and, if so, whether another word
> > is in the same file. Optionally you could say how many words away the
> > association can be detected.
> >
> > Thanks,
> >
> > Donald G. Lewis (303-581-4879)
> > Lockheed Martin M&DS
> > Boulder, CO
>
> I don't have the time right now to check out the various packages, but
> this sounds like something common to many search engines.  I'd look
> at swish, glimpse, and htdig.  If any of them return the offset in a file
> for a particular word, you're in.
>
> Walter

Replying to myself again . . .

Depending on what you want to do -- how many files, how often they
change, etc. -- you might also want to hack a script using GNU grep,
which has a --byte-offset switch.
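
GNU grep prints the offset of each match itself when --byte-offset is
combined with --only-matching. The same kind of number can be collected
in Perl without calling grep at all; a rough sketch, not the grep-based
script itself, where word_offsets is a made-up name and the offsets are
byte offsets only if the text was read without a decoding layer:

```perl
use strict;
use warnings;

# Return the 0-based offsets at which $word occurs as a whole word in $text.
sub word_offsets {
    my ($text, $word) = @_;
    my @offsets;
    while ($text =~ /\b\Q$word\E\b/g) {
        push @offsets, $-[0];   # $-[0] is where the last successful match began
    }
    return @offsets;
}
```

Two words are then "near" each other when some pair of their offsets
differs by less than whatever byte distance you pick.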

I'm going outside now:  it's a beautiful day, and it would be a crime
to spend it all puttering at the keyboard.

Walter

_______________________________________________
Boulder-pm mailing list
Boulder-pm at mail.pm.org
http://mail.pm.org/mailman/listinfo/boulder-pm


