[kw-pm] unique

Wed Apr 19 00:47:07 PDT 2006

Dave Carr wrote:
> Found this snippet to do unique on a sorted array
>
> $prev = 'nonesuch';
> @uniq = grep($_ ne $prev && (($prev) = $_), @sorted);
>
> How could I modify this to require a string to occur at least two times
> to be included in the output array, in other words single occurrences
> will be skipped?
>
>   

This isn't the classical way of doing things.  Normally you'd use a hash
(%seen).

However, if you'd really like to do it with the same rough structure as
you provide above, you could use:

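my ($prev, $before_that);
@uniq = grep {
    my $second = defined $prev && $_ eq $prev
              && !(defined $before_that && $_ eq $before_that);
    ($before_that, $prev) = ($prev, $_);  $second
} @sorted;

This keeps only the second occurrence of each value, so anything that
appears at least twice shows up exactly once.  The bookkeeping is kept out
of the && chain on purpose: a scalar assignment such as
($before_that = $prev) evaluates to the value assigned, so using it as a
link in the chain breaks as soon as one of your sorted values is "0".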
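
The "0" problem comes from using a scalar assignment as a truth test.  Here is a minimal sketch with made-up data, contrasting it with the list-assignment form (($prev) = $_) from the quoted snippet, which in scalar context yields the number of elements assigned and is therefore always true:

```perl
use strict;
use warnings;

# Hypothetical sample data; "0" is a perfectly valid list element.
my @sorted = ('0', '0', 'a', 'a');
my $prev;

# Scalar assignment evaluates to the assigned value, so "0" counts as false
# and grep silently drops it.
my @scalar_assign = grep { ($prev = $_) } @sorted;

# List assignment in scalar context evaluates to the number of elements on
# the right-hand side (1 here), so it is true whatever the value is.
my @list_assign = grep { (($prev) = $_) } @sorted;

print "scalar: @scalar_assign\n";   # scalar: a a
print "list:   @list_assign\n";     # list:   0 0 a a
```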

Trust me, this is a crappy algorithm.  Use the hash key method.

For unique items (note that the input list doesn't have to be sorted, hence @items rather than @sorted):
====================================================================================================
my %seen = ();
$seen{$_}++ foreach @items;
@uniq = grep { $seen{$_} > 0 } keys %seen;
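
One thing to be aware of: keys %seen comes back in no particular order.  If you want the unique items in the order they first appeared, a common variant (a sketch with made-up data, not a drop-in for your program) does the counting inside the grep:

```perl
use strict;
use warnings;

# Hypothetical sample data, deliberately unsorted.
my @items = qw(pear apple pear fig apple pear);

# $seen{$_}++ returns the old count and then bumps it, so !$seen{$_}++
# is true only the first time a given value is met.
my %seen;
my @uniq = grep { !$seen{$_}++ } @items;

print "@uniq\n";   # pear apple fig
```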

This algorithm has the added benefit of being easily extended to require that an item be seen twice (or more):

my %seen = ();
$seen{$_}++ foreach @items;
@uniq = grep { $seen{$_} > 1 } keys %seen;
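
Here is the same idea as a complete script you can run and poke at.  The sample data and the $min_count name are my own assumptions for illustration; the keys are sorted only so the output is predictable:

```perl
use strict;
use warnings;

# Hypothetical sample data and threshold, for illustration only.
my @items     = qw(pear apple pear fig apple pear);
my $min_count = 2;

# Count how many times each item occurs.
my %seen;
$seen{$_}++ foreach @items;

# keys() has no guaranteed order, so sort first, then keep the items
# that were seen at least $min_count times.
my @repeated = grep { $seen{$_} >= $min_count } sort keys %seen;

print "@repeated\n";   # apple pear
```

Change $min_count to 1 and you get the plain unique list; change it to 3 and only "pear" survives.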

Finding Perl snippets and using them without knowing what they're doing / how they work will only bite you in the ass.  That includes the snippet I just gave here.  Learn how it works before you use it.

Cheers,
Richard