revamped source code

nkuipers nkuipers at uvic.ca
Thu Jun 20 13:17:27 CDT 2002


Hey all,

I implemented *some* of the changes y'all suggested yesterday...the parts I 
could follow, basically. =)  You'll note the shorter regexes, passing by 
reference, and the complete absence of any arrays.

Here is what I have, and [I think] it works; at least, it is giving output 
that looks suspiciously like what I am looking for.  So now it's just a matter 
of running it by more brains to make sure I am not being hoodwinked by some 
Perl internal I am not aware of, and seeing if I can squeeze out some more 
speed...

Eventually it will accept multiple filter strengths at the command line and 
iterate through all of them, each iteration being faster than the last because 
build_filter deletes entries from the "find-this" hash (%queryid), leaving 
fewer keys to match against on every pass.
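
A minimal sketch of that multi-strength loop, under some assumptions: 
prune_below and remaining_per_strength are hypothetical names (they are not in 
the script), and the counts are made up. Because the delete is destructive, the 
sketch sorts the strengths in ascending order; a lower strength run after a 
higher one would find its keys already gone.

```perl
use strict;
use warnings;

# Hypothetical stand-in for build_filter: drop every key whose count
# is below $min and report how many keys are left.
sub prune_below {
    my ($min, $counts) = @_;
    for my $key (keys %$counts) {
        delete $counts->{$key} if $counts->{$key} < $min;
    }
    return scalar keys %$counts;
}

# Run every strength in ascending order against the same hash, so each
# pass starts from what the previous pass left behind.
sub remaining_per_strength {
    my ($counts, @strengths) = @_;
    my @remaining;
    for my $strength (sort { $a <=> $b } @strengths) {
        push @remaining, prune_below($strength, $counts);
        # The real script would call get_filtered for $strength here.
    }
    return @remaining;
}

# Made-up counts standing in for %queryid.
my %queryid = ('(query: a)' => 1, '(query: b)' => 3, '(query: c)' => 5);
my @left = remaining_per_strength(\%queryid, 4, 2);
print "@left\n";    # prints "2 1"
```
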

#!/usr/bin/perl -w

#Program name: getparsedfasta
#Author: Nathanael Kuipers, nkuipers at uvic.ca
#Date written: June 12, 2002
#Last updated: June 20, 2002
#Purpose: conditionally formats parsed blastn xml as FASTA
#Use: >perl getparsedfasta inputfilename int

use strict;

my $infile = shift;
my $filterstrength = shift;
my %queryid = ();
my $header = '';
my %wholeid = ();

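# First pass: count how many times each "(query:..." string occurs.
# while reads one line at a time; for (<IN>) would slurp the whole file first.
open(IN, '<', $infile) or die "Cannot open $infile: $!";
while (<IN>) {
	$queryid{$1}++ if /(\(query:.*)/;
}
close IN;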

&build_filter($filterstrength, \%queryid);

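# Second pass: map each HSP sequence to the most recent Hit header.
open(IN, '<', $infile) or die "Cannot open $infile: $!";
while (<IN>) {
	if (/Hit:\s(.*)/) {
		$header = $1;
	}
	elsif (s/\s+HSP\s\d+\s=\s//) {
		chomp;
		# Keep the first header seen for a given sequence.
		$wholeid{$_} = $header unless exists $wholeid{$_};
	}
}
close IN;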

&get_filtered($filterstrength, \%queryid, \%wholeid);

#########################################################

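# Delete every query id that was seen fewer than $int times.
sub build_filter {
  my ($int, $href) = @_;
  # keys takes a snapshot, so deleting inside the loop is safe.
  for my $key (keys %$href) {
	delete $$href{$key} if $$href{$key} < $int;
  }
}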

# Print, as FASTA, every sequence whose header contains a surviving query id.
sub get_filtered {
  my ($int, $href1, $href2) = @_;
  my $outfile = "$infile.fil.$int";
  open(OUT, '>', $outfile) or die "Cannot write $outfile: $!";
  for my $key1 (keys %$href1) {
	my $regex = quotemeta $key1;
	while (my ($seq, $hdr) = each %$href2) {
		print OUT ">$hdr\n$seq\n" if $hdr =~ /$regex/;
	}
  }
  close OUT or die "Cannot close $outfile: $!";
}

#########################################################
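
On the speed question: get_filtered runs a regex for every pairing of a query 
id with a header. A possible shortcut, sketched below, is to pull the 
"(query:..." text out of each header and look it up in the hash directly. This 
assumes the text extracted from a header is exactly equal to one of the 
%queryid keys; the original substring match is looser, so the two can differ 
on real data. filtered_records is a hypothetical name and the sample data is 
made up.

```perl
use strict;
use warnings;

# Hypothetical alternative to the nested loops in get_filtered: one
# pass over the sequences, with a hash lookup instead of a regex per
# query id. Returns the FASTA records rather than printing them.
sub filtered_records {
    my ($queryid, $wholeid) = @_;
    my @records;
    for my $seq (sort keys %$wholeid) {
        my $header = $wholeid->{$seq};
        # Same pattern the first pass uses to build the %queryid keys.
        next unless $header =~ /(\(query:.*)/;
        push @records, ">$header\n$seq\n" if exists $queryid->{$1};
    }
    return @records;
}

# Made-up data in the shape the script builds.
my %queryid = ('(query: contig1)' => 3);
my %wholeid = (
    'ACGT' => 'gi|1 sample hit (query: contig1)',
    'TTGA' => 'gi|2 other hit (query: contig9)',
);
print filtered_records(\%queryid, \%wholeid);
```

With the sample data only the ACGT record is printed, since contig9 is not in 
%queryid.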

"Luckily, we have computers."



