[kw-pm] unique

John Macdonald john at perlwolf.com
Wed Apr 19 08:15:49 PDT 2006


On Wed, Apr 19, 2006 at 03:47:07AM -0400, Richard Dice wrote:
> Dave Carr wrote:
> > Found this snippet to do unique on a sorted array
> >
> > $prev = 'nonesuch';
> > @uniq = grep($_ eq $prev && (($prev) = $_), @sorted);
> >
> > How could I modify this to require a string to occur at least two times
> > to be included in the output array, in other words single occurrences
> > will be skipped?
> >
> 
> This isn't the classical way of doing things.  Normally you'd use a hash
> (%seen).
> 
> However, if you'd really like to do it with the same rough structure as
> you provide above, you could use:
> 
> my ($prev, $before_that) = ();
> @uniq = grep($_ eq $prev && $prev eq $before_that && (($prev) = $_) &&
> ($before_that = $prev), @sorted);

You'd have to switch the assignments so that $before_that = $prev
is done before $prev is changed to $_.
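
A minimal sketch of one way to get there, assuming the goal is one
copy of every value that occurs at least twice in @sorted. It is not
Richard's code with the assignments swapped: it replaces the two
lookback variables with a run counter ($run, a name introduced here
for illustration), so the order of updates is explicit. The grep
condition is a numeric comparison rather than the value of an
assignment, so a list element of "0" is treated like any other. The
sample data is made up.

```perl
use strict;
use warnings;

# Sample data (not from the thread); string sort gives 0 0 a a a b b c.
my @sorted = sort qw(b a c a 0 b a 0);

my ($prev, $run);
my @dups = grep {
    # Extend the current run if this value equals the previous one,
    # otherwise start a new run of length 1.
    $run  = (defined $prev && $_ eq $prev) ? $run + 1 : 1;
    $prev = $_;
    # Keep only the second member of each run, so every repeated
    # value appears exactly once in the output.
    $run == 2;
} @sorted;

print "@dups\n";    # prints "0 a b"
```

Changing the final test to $run == 3 (or any N) would require N
occurrences instead.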

> Better hope that none of your sorted values are "0", though.
> 
> Trust me, this is a crappy algorithm.  Use the hash key method.

It does have the advantage of not needing to make a hash that
has as many elements as the original list (minus duplicates).
If the program was already pushing its memory limits, that could
be significant, but normally it would be immaterial.
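
To illustrate the memory point, here is a rough sketch, assuming the
sorted values arrive one per line on a filehandle (for example from
an external sort). Only the previous line and a counter are held
between iterations. The in-memory filehandle, the sample data, and
the names $run and @dups are assumptions made for the example;
printing each hit instead of pushing it onto @dups would keep memory
use constant.

```perl
use strict;
use warnings;

# Sample sorted input held in a string; any readable filehandle that
# delivers sorted lines would work the same way.
my $data = "apple\napple\nbanana\ncherry\ncherry\ncherry\n";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";

my ($prev, $run, @dups);
while (my $line = <$fh>) {
    chomp $line;
    # Count how many times in a row the current value has been seen.
    $run  = (defined $prev && $line eq $prev) ? $run + 1 : 1;
    $prev = $line;
    # Record a value the moment its second occurrence shows up.
    push @dups, $line if $run == 2;
}
close $fh;

print "$_\n" for @dups;    # prints "apple" then "cherry"
```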

> For unique items (note that input list doesn't have to be sorted, thus @items rather than @sorted):
> ====================================================================================================
> my %seen = ();
> $seen{$_}++ foreach @items;
> @uniq = grep { $seen{$_} > 0 } keys %seen;
> 
> 
> This algorithm has the added benefit of being easily expandable to requiring an item to be seen twice (or more):
> 
> my %seen = ();
> $seen{$_}++ foreach @items;
> @uniq = grep { $seen{$_} > 1 } keys %seen;
> 
> 
> Finding Perl snippets and using them without knowing what they're doing / how they work will only bite you in the ass.  That includes the snippet I just gave here.  Learn how it works before you use it.
> 
> Cheers,
> Richard
> 
> 
> _______________________________________________
> kw-pm mailing list
> kw-pm at pm.org
> http://mail.pm.org/mailman/listinfo/kw-pm
