SPUG: Not again...

Andrew Sweger andy at n2h2.com
Tue Jan 4 15:45:47 CST 2000


On Jan 4, 2000 @ 12:38pm, Steve Laybourn wrote:

>    The other way I recently learned was:
> 
> @hitz=grep(/$searchstring/, @array);
> 
>    which seems to almost work the quickest, except the duplicate matches 
> really pile up quickly in this one...

Pretty good analysis of the problem. Most folks are reduced to
sort-and-uniquify to remove duplicates in a set (as a result of some other
operation, like a search). An alternate method, that Perl affords, takes
advantage of a hash.

%hitz = map {$_, 1} grep {m/$searchstring/o} @array;

I'm not sure whether the 'o' after the search pattern helps here, since the
looping is already an internal function.

The list, keys %hitz, will be the matching elements minus duplicates. You
lose ordering information that was in the original @array. But there are
ways to deal with that too (of course).
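One way to keep the original ordering, sketched here under a couple of assumptions: the sample contents of @array and $searchstring are made up for illustration, and %seen is just a helper name. The idea is to let grep itself drop anything it has already passed through.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for @array and $searchstring.
my @array = ('perl rocks', 'python', 'perl rocks',
             'camel perl', 'penguin', 'camel perl');
my $searchstring = 'perl';

# %seen counts how often each matching element has turned up so far.
my %seen;

# Keep an element only if it matches AND this is its first appearance:
# $seen{$_}++ returns the old count, which is false (0/undef) the first time.
my @hitz = grep { /$searchstring/ && !$seen{$_}++ } @array;

print "$_\n" for @hitz;
```

Here @hitz holds each matching element once, in the order it first appeared in @array, at the cost of one extra hash.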

Since it sounds like you're dealing with a text file, here are some
old-fashioned ways to accomplish similar results (assuming you have Unix-like
utilities at hand):

grep <pattern> file                 # finds lines matching pattern

grep <pattern> file | sort | uniq   # "removes" duplicates

grep <pattern> file | sort -u       # same as last one if your sort has -u

grep <pattern> file | sort | uniq -c | sort -n

                                    # lists an ordered frequency dist. of
                                    # matches
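For comparison, roughly the same frequency listing can be done without leaving Perl. This is only a sketch: @lines and $pattern are invented sample values, and ties are broken alphabetically here, which may not match what your sort does.

```perl
use strict;
use warnings;

# Hypothetical sample input; in practice these would be lines read from a file.
my @lines = ('perl rocks', 'python', 'perl rocks',
             'camel perl', 'perl rocks');
my $pattern = 'perl';

# Count how many times each matching line occurs (the "uniq -c" step).
my %count;
$count{$_}++ for grep { /$pattern/ } @lines;

# Order by ascending count (the "sort -n" step), then alphabetically on ties.
my @ordered = sort { $count{$a} <=> $count{$b} or $a cmp $b } keys %count;

printf "%7d %s\n", $count{$_}, $_ for @ordered;
```

The hash does the work of both sort and uniq -c, so the matches only need sorting once, by count, at the end.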

>    I'll just bet someone will be waiting for me outside the next SPUG 
> meeting with a tar melter and a couple of dozen goose-down pillows...

I don't have any tar or feathers, but I'll be glad to bring the camel and
penguin (I don't have to carry them as far this time).

-- 
 Andrew Sweger <andy at n2h2.com>    N2H2, Incorporated
                                  900 Fourth Avenue, Suite 3400
 No thanks, I'll just have the    Seattle WA 98164-1059
     Linux with a side of Perl    http://www.n2h2.com/  (206) 336-2947





 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    POST TO: spug-list at pm.org        PROBLEMS: owner-spug-list at pm.org
 Seattle Perl Users Group (SPUG) Home Page: http://www.halcyon.com/spug/
 SUBSCRIBE/UNSUBSCRIBE: Replace ACTION below by subscribe or unsubscribe
        Email to majordomo at pm.org: ACTION spug-list your_address




