[Omaha.pm] Find all the keys present in two files

Jay Hannah jay at jays.net
Mon Mar 3 13:38:48 PST 2008


Another quickie I was given today.

Problem:

   Given one file which is a list of keys, and another file which is 
tab-delimited and the key is the 0th column, find all the keys that are 
present in both files.



Five-minute solution and example run below.

j




$ cat /tmp/j.pl
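#!/usr/bin/perl
use strict;
use warnings;

# Usage: j.pl KEY_LIST TAB_DELIMITED_FILE
my ($key_file, $lookup_file) = @ARGV;
die "Usage: $0 key_list tab_delimited_file\n" unless defined $lookup_file;

# Load every key from the first file into a hash for fast lookup.
my %hash;
open my $keys_fh, '<', $key_file or die "Can't open $key_file: $!";
while (<$keys_fh>) {
   chomp;
   $hash{$_} = 1;
}
close $keys_fh;

# Keep the 0th column of each line in the second file if it is a known key.
my @matches;
open my $lookup_fh, '<', $lookup_file or die "Can't open $lookup_file: $!";
while (<$lookup_fh>) {
   chomp;
   my @l = split /\t/;
   if (@l && $hash{$l[0]}) {
      push @matches, $l[0];
   }
}
close $lookup_fh;

print join ", ", @matches;
print "\n";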

$ /tmp/j.pl probeName.list2 lookup.file
200644_at, 200754_x_at, 200761_s_at, 200892_s_at, 200923_at, 
201277_s_at, 201381_x_at, 201663_s_at, 201885_s_at, 202007_at, 
202095_s_at, 202164_s_at, 202181_at, 202336_s_at, 202376_at, 202589_at, 
202779_s_at, 203234_at, 203418_at, 203432_at, 203434_s_at, 204281_at, 
204441_s_at, 205053_at, 205240_at, 205574_x_at, 205676_at, 206074_s_at, 
206102_at, 206316_s_at, 206336_at, 207165_at, 207345_at, 208079_s_at, 
208084_at, 208779_x_at, 209183_s_at, 209714_s_at, 209774_x_at, 
209974_s_at, 209980_s_at, 210987_x_at, 211066_x_at, 211747_s_at, 
211762_s_at, 212281_s_at, 212417_at, 212438_at, 212503_s_at, 213008_at, 
213226_at, 213462_at, 213861_s_at, 214710_s_at, 215446_s_at, 
216250_s_at, 217783_s_at, 218016_s_at, 218115_at, 218804_at, 219212_at, 
219725_at, 219770_at, 219933_at, 219981_x_at, 221729_at, 221922_at, 
221986_s_at, 222077_s_at, 38149_at, 59625_at, 222549_at, 222673_x_at, 
223194_s_at, 223307_at, 224779_s_at, 224944_at, 225300_at, 225541_at, 
225687_at, 226104_at, 226325_at, 226932_at, 227212_s_at, 227379_at, 
228286_at, 228654_at, 229538_s_at, 229610_at, 231823_s_at, 235509_at, 
242517_at, 242873_at, 1552348_at, 1552619_a_at, 1553768_a_at, 
1554696_s_at, 1555007_s_at, 1555758_a_at, 1564911_at
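
A possible variation, not part of the script above: if the tab-delimited file can repeat a key, or either file might have DOS line endings, the same hash lookup can be wrapped in a small function that reports each key once. The name common_keys and the sample data are made up for illustration (the probe IDs are borrowed from the run above, the second column is invented), and the data is read through in-memory filehandles so the sketch runs without any files on disk.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the keys that appear in both inputs,
# in lookup-file order, each reported once.
sub common_keys {
   my ($keys_fh, $lookup_fh) = @_;

   my %wanted;
   while (my $line = <$keys_fh>) {
      $line =~ s/\r?\n\z//;               # tolerate Unix and DOS line endings
      $wanted{$line} = 1 if length $line;
   }

   my (@matches, %seen);
   while (my $line = <$lookup_fh>) {
      $line =~ s/\r?\n\z//;
      my ($key) = split /\t/, $line, 2;   # only the 0th column is needed
      next unless defined $key && $wanted{$key};
      push @matches, $key unless $seen{$key}++;
   }
   return @matches;
}

# Made-up sample data, read through in-memory filehandles.
my $keys   = "200644_at\n38149_at\n59625_at\n";
my $lookup = "38149_at\tfoo\n999_at\tbar\n200644_at\tbaz\n38149_at\tqux\n";

open my $k, '<', \$keys   or die "Can't open in-memory key list: $!";
open my $l, '<', \$lookup or die "Can't open in-memory lookup data: $!";

print join(", ", common_keys($k, $l)), "\n";
```

With this sample data the script prints "38149_at, 200644_at": the repeated 38149_at line is reported once, and 59625_at is dropped because it never shows up in the lookup data.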




