[Pdx-pm] common elements in two lists
Tom Keller
kellert at ohsu.edu
Fri Nov 4 15:34:26 PDT 2011
Greetings,
I have two very long lists of names. They have slightly different conventions for naming the same thing, so I devised a regex to compare the two lists. I need to extract the names common to both. (Acknowledgement: "Effective Perl Programming, 1st ed.")
But it is taking an ungodly amount of time, since the lists are large:
names1 contains 46227 names.
names2 contains 5726 names.
Here's the code:
########
my @names1 = get_names($file1);
my @names2 = get_names($file2);
#say join(", ", @names1);
my @out = map { $_ =~ m/\w+[-_]*(\w*[-_]*\d+[a-z]*).*/ } @names2;
my @index = grep {
    my $c = $_;
    if ( $c > $#names1 or # always false
         ( grep { $names1[$c] =~ m/$_/ } @out ) > 0 ) {
        1; ## save
    } else {
        0; ## skip
    }
} 0 .. $#names1;
my @common = map { $names1[$_] } @index;
########
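For comparison, here is a minimal sketch of one possible speedup, not a tested drop-in. As written, the inner grep can run up to 46227 x 5726 (about 265 million) matches, and each m/$_/ builds a pattern from a different key. The sketch joins the extracted keys into a single alternation compiled once with qr//, so each entry of names1 is tested once. The sample names are hypothetical stand-ins for whatever get_names() returns, and the substring-match behavior is kept (a key like miR-21 would still match miR-210).

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for get_names($file1) and get_names($file2).
my @names1 = ('miR-21_st', 'let-7a_st', 'miR-155_st', 'U6_snRNA');
my @names2 = ('hsa-miR-21', 'mmu-let-7a', 'control');

# Same extraction regex as above: in list context a match returns the
# captured key, and a non-match contributes nothing.
my @out = map { $_ =~ m/\w+[-_]*(\w*[-_]*\d+[a-z]*).*/ } @names2;

# Drop duplicate keys so the combined pattern stays as small as possible.
my %seen;
my @keys = grep { !$seen{$_}++ } @out;

my @common;
if (@keys) {    # an empty alternation would match every name, so guard against it
    my $alt = join '|', map { quotemeta($_) } @keys;
    my $re  = qr/$alt/;    # compiled once, reused for every name
    # One regex test per name in @names1 instead of one per (name, key) pair.
    @common = grep { /$re/ } @names1;
}

print join(", ", @common), "\n";
```

With the sample data this prints "miR-21_st, let-7a_st". The quotemeta call should not change what matches here, since the captured keys can only contain word characters, hyphens and underscores; it is there in case the extraction regex changes later.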
Is there a faster/better way to do this?
thanks,
Tom
MMI DNA Services Core Facility<http://www.ohsu.edu/xd/research/research-cores/dna-analysis/>
503-494-2442
kellert at ohsu.edu
Office: 6588 RJH (CROET/BasicScience)
OHSU Shared Resources<http://www.ohsu.edu/xd/research/research-cores/index.cfm>