[Pdx-pm] common elements in two lists

Joshua Keroes joshua at keroes.com
Fri Nov 4 15:43:25 PDT 2011


A few more libraries:

http://search.cpan.org/~turugina/Set-Intersection-0.02/lib/Set/Intersection.pm
http://search.cpan.org/~duncand/Set-Relation-0.12.7/lib/Set/Relation.pm
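
For comparison, here is a minimal core-Perl sketch of the hash-lookup idea behind these intersection modules. It is not the code from any of them. It assumes the regex from the quoted message can be applied to both lists to get a comparable key, which is a different test from the original "does any key match somewhere inside the name" check, so results may differ on real data. The helpers name_key() and common_by_key() and the sample names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a name to a comparison key using the
# regex from the quoted code. Returns undef when the name does not match.
sub name_key {
    my ($name) = @_;
    my ($key) = $name =~ m/\w+[-_]*(\w*[-_]*\d+[a-z]*).*/;
    return $key;
}

# Hypothetical helper: return the elements of @$names1 whose key also
# occurs among the keys of @$names2. Each list is scanned only once.
sub common_by_key {
    my ($names1, $names2) = @_;

    # Index the keys of the second list in a hash for constant-time lookup.
    my %seen;
    for my $name (@$names2) {
        my $key = name_key($name);
        $seen{$key} = 1 if defined $key;
    }

    # One pass over the first list, one hash lookup per name.
    return grep {
        my $key = name_key($_);
        defined $key && $seen{$key};
    } @$names1;
}

# Made-up sample data, only to show the call.
my @names1 = ('setA-probe-21', 'setA-probe-7a', 'setA-probe-155');
my @names2 = ('setB-probe-21', 'setB-probe-999');

my @common = common_by_key(\@names1, \@names2);
print join(', ', @common), "\n";
```

With lists of 46227 and 5726 names, this does roughly one regex match and one hash operation per name, instead of running up to 5726 pattern matches for each of the 46227 names.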

-Joshua

On Fri, Nov 4, 2011 at 3:41 PM, Joshua Keroes <joshua at keroes.com> wrote:

> Well, grep is always going to check every single item in the list, so if
> you can avoid that, you'll save time.
>
> Out of curiosity, have you tried out or benchmarked against something like
> List::Compare's get_intersection()?
>
> -Joshua
>
> 2011/11/4 Tom Keller <kellert at ohsu.edu>
>
>>  Greetings,
>> I have two very long lists of names. They have slightly different
>> conventions for naming the same thing, so I devised a regex to compare the
>> two lists. I need to extract the names common to both. (Acknowledgement:
>> "Effective Perl Programming, 1st ed.")
>> But it is taking an ungodly amount of time, since
>> names1 contains 46227 names.
>> names2 contains 5726 names.
>>
>> Here's the code:
>> ########
>> my @names1 = get_names($file1);
>> my @names2 = get_names($file2);
>> #say join(", ", @names1);
>>
>> my @out = map { $_ =~ m/\w+[-_]*(\w*[-_]*\d+[a-z]*).*/ } @names2;
>> my @index = grep {
>>     my $c = $_;
>>     if ( $c > $#names1  or # always false
>>         ( grep { $names1[$c] =~ m/$_/ } @out ) > 0 ) {
>>         1;  ## save
>>     } else {
>>         0;  ## skip
>>     }
>> } 0 .. $#names1;
>>
>> my @common = map { $names1[$_] } @index;
>> ########
>>
>> Is there a faster/better way to do this?
>>
>> thanks,
>> Tom
>> MMI DNA Services Core Facility<http://www.ohsu.edu/xd/research/research-cores/dna-analysis/>
>> 503-494-2442
>> kellert at ohsu.edu
>> Office: 6588 RJH (CROET/BasicScience)
>>
>> OHSU Shared Resources<http://www.ohsu.edu/xd/research/research-cores/index.cfm>