Well, grep always checks every single item in the list, and your inner grep does that once for each name in names1, so if you can avoid that, you can save time.<div><br></div><div>Out of curiosity, have you tried or benchmarked against something like List::Compare's get_intersection()?<br>
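<div>One way to avoid the nested grep is a hash lookup. This is only a rough sketch, not a drop-in replacement: name_key() and common_names() are names I made up, the sample data is invented, and it assumes that running your regex over both lists gives keys that can be compared for exact equality. That is a stricter test than the substring match in your code, so check it against a small sample first.</div>

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a name to a comparison key using the regex
# from the original post. Returns undef when the name does not match.
sub name_key {
    my ($name) = @_;
    my ($key) = $name =~ m/\w+[-_]*(\w*[-_]*\d+[a-z]*).*/;
    return $key;
}

# Hypothetical helper: return the entries of the first list whose key also
# occurs among the keys of the second list.
sub common_names {
    my ($names1, $names2) = @_;

    # Build the lookup table once: a single pass over the second list.
    my %seen;
    for my $name (@$names2) {
        my $key = name_key($name);
        $seen{$key} = 1 if defined $key;
    }

    # A single pass over the first list, one hash lookup per name.
    return grep {
        my $key = name_key($_);
        defined $key && exists $seen{$key};
    } @$names1;
}

# Invented sample data standing in for get_names($file1) and get_names($file2).
my @names1 = ('setA-gene-101', 'setA-gene-202b', 'setA-ctrl-7');
my @names2 = ('setB-gene-101', 'setB-ctrl-9');

my @common = common_names(\@names1, \@names2);
print join(", ", @common), "\n";
```

<div>With this shape each list is walked once, instead of testing every name in names1 against every pattern from names2. If exact key equality turns out to be too strict for your naming conventions, the hash approach will miss matches that the regex comparison finds.</div>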


<br>-Joshua</div><div><br><div class="gmail_quote">2011/11/4 Tom Keller <span dir="ltr"><<a href="mailto:kellert@ohsu.edu">kellert@ohsu.edu</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


<div style="word-wrap:break-word">Greetings,<div>I have two very long lists of names. They have slightly different conventions for naming the same thing, so I devised a regex to compare the two lists. I need to extract the names common to both. (Acknowledgement: "Effective Perl Programming, 1st ed.")</div>


<div>But it is taking an ungodly amount of time, since </div><div><div>names1 contains 46227 names.</div><div>names2 contains 5726 names.</div></div><div><br></div><div>Here's the code:</div><div>########</div><div><div>


my @names1 = get_names($file1);</div><div>my @names2 = get_names($file2);</div><div>#say join(", ", @names1);</div><div><br></div><div>my @out = map { $_ =~  m/\w+[-_]*(\w*[-_]*\d+[a-z]*).*/ } @names2;</div><div>


my @index = grep {</div><div><span style="white-space:pre-wrap">    </span>my $c = $_;</div><div><span style="white-space:pre-wrap">      </span>if ( $c > $#names1  or <span style="white-space:pre-wrap">              </span># always false</div>


<div><span style="white-space:pre-wrap">          </span>( grep { $names1[$c] =~ m/$_/ } @out ) > 0) {</div><div><span style="white-space:pre-wrap">         </span>1;  ## save</div><div><span style="white-space:pre-wrap">      </span>} else {</div>


<div><span style="white-space:pre-wrap">          </span>0;  ## skip</div><div><span style="white-space:pre-wrap">      </span>}</div><div>} 0 .. $#names1;</div><div><br></div><div>my @common = map { $names1[$_] } @index;</div></div>

<div>

########</div><div><br></div><div>Is there a faster/better way to do this?</div><div><br></div><div>thanks,<br><div>

Tom<br><a href="http://www.ohsu.edu/xd/research/research-cores/dna-analysis/" target="_blank">MMI DNA Services Core Facility</a><br><a href="tel:503-494-2442" value="+15034942442" target="_blank">503-494-2442</a><br>kellert at <a href="http://ohsu.edu" target="_blank">ohsu.edu</a><br>


Office: 6588 RJH (CROET/BasicScience)<br><br><a href="http://www.ohsu.edu/xd/research/research-cores/index.cfm" target="_blank">OHSU Shared Resources</a><br><br><br><br><br><br>

</div>

<br></div></div><br>_______________________________________________<br>

Pdx-pm-list mailing list<br>

<a href="mailto:Pdx-pm-list@pm.org">Pdx-pm-list@pm.org</a><br>

<a href="http://mail.pm.org/mailman/listinfo/pdx-pm-list" target="_blank">http://mail.pm.org/mailman/listinfo/pdx-pm-list</a><br></blockquote></div><br></div>