SPUG: Not again...
Andrew Sweger
andy at n2h2.com
Tue Jan 4 15:45:47 CST 2000
On Jan 4, 2000 @ 12:38pm, Steve Laybourn wrote:
> The other way I recently learned was:
>
> @hitz=grep(/$searchstring/, @array);
>
> which seems to almost work the quickest, except the duplicate matches
> really pile up quickly in this one...
Pretty good analysis of the problem. Most folks fall back on
sort-and-uniquify to remove duplicates from a set (such as the result of
some other operation, like a search). An alternative that Perl affords
takes advantage of a hash:
%hitz = map {$_, 1} grep {m/$searchstring/o} @array;
I'm not sure whether the 'o' after the search pattern helps here, since
grep already does the looping internally.
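For what it's worth, the 'o' modifier only asks Perl to interpolate and compile the pattern once. A minimal sketch of one alternative, assuming a Perl with qr// (5.005 or later), is to compile the pattern up front and reuse it inside the grep. The sample data and the variable $re are made up for illustration:

```perl
use strict;
use warnings;

# Sample data and search string are invented for this sketch.
my @array        = ('apple pie', 'banana', 'apple pie', 'crab apple');
my $searchstring = 'apple';

# Compile the pattern once, outside the loop that grep runs.
my $re = qr/$searchstring/;

# Each matching element becomes a hash key, so duplicates collapse.
my %hitz = map { $_ => 1 } grep { $_ =~ $re } @array;

print "$_\n" for sort keys %hitz;
```

Whether this is measurably faster than /o will depend on the data and the Perl version, so it is worth benchmarking rather than assuming.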
The list, keys %hitz, will be the matching elements minus duplicates. You
lose ordering information that was in the original @array. But there are
ways to deal with that too (of course).
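One common way to keep the original ordering is to filter with a "seen" hash instead of collecting keys. This is only a sketch with made-up sample data; %seen and the contents of @array are assumptions for illustration:

```perl
use strict;
use warnings;

# Sample data and search string are invented for this sketch.
my @array        = ('crab apple', 'banana', 'apple pie', 'crab apple', 'apple pie');
my $searchstring = 'apple';

# Keep an element only if it matches and has not been seen before.
# $seen{$_}++ is false the first time and true on every repeat.
my %seen;
my @hitz = grep { /$searchstring/ && !$seen{$_}++ } @array;

print "$_\n" for @hitz;
```

Here @hitz holds each match once, in the order of its first appearance in @array.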
Since it sounds like you're dealing with a text file, here are some
old-fashioned ways to accomplish similar results (assuming you have
Unix-like utilities at hand):
    grep <pattern> file                 # finds lines matching pattern
    grep <pattern> file | sort | uniq   # "removes" duplicates
    grep <pattern> file | sort -u       # same as the last one, if your
                                        # sort has -u
    grep <pattern> file | sort | uniq -c | sort -n
                                        # lists an ordered frequency
                                        # distribution of matches
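The last pipeline (the frequency distribution) can also be sketched in Perl with a counting hash. This is a rough equivalent under assumptions, not a drop-in replacement: the lines are hard-coded here rather than read from a file, and @lines, $pattern, and %count are names invented for the example:

```perl
use strict;
use warnings;

# Stand-in for the lines of a text file; invented for this sketch.
my @lines   = ('foo bar', 'baz', 'foo bar', 'foo qux', 'foo bar', 'foo qux');
my $pattern = qr/foo/;

# Count how many times each matching line occurs.
my %count;
$count{$_}++ for grep { $_ =~ $pattern } @lines;

# Order by ascending count, roughly like "sort -n"; ties are broken
# alphabetically so the output is stable.
my @ordered = sort { $count{$a} <=> $count{$b} or $a cmp $b } keys %count;

printf "%7d %s\n", $count{$_}, $_ for @ordered;
```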
> I'll just bet someone will be waiting for me outside the next SPUG
> meeting with a tar melter and a couple of dozen goose-down pillows...
I don't have any tar or feathers, but I'll be glad to bring the camel and
penguin (I don't have to carry them as far this time).
--
Andrew Sweger <andy at n2h2.com>      N2H2, Incorporated
                                     900 Fourth Avenue, Suite 3400
No thanks, I'll just have the        Seattle WA 98164-1059
Linux with a side of Perl            http://www.n2h2.com/ (206) 336-2947
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
POST TO: spug-list at pm.org PROBLEMS: owner-spug-list at pm.org
Seattle Perl Users Group (SPUG) Home Page: http://www.halcyon.com/spug/
SUBSCRIBE/UNSUBSCRIBE: Replace ACTION below by subscribe or unsubscribe
Email to majordomo at pm.org: ACTION spug-list your_address