revamped source code
nkuipers
nkuipers at uvic.ca
Thu Jun 20 13:17:27 CDT 2002
Hey all,
I implemented *some* of the changes y'all suggested yesterday...the parts I
could follow, basically. =) You'll note the shorter regexes, passing by
reference, and the complete absence of any arrays.
Here is what I have, and [I think] it works; at least, it is giving output
that looks suspiciously like what I am looking for. So now it's just running
it by more brains to make sure I am not being hoodwinked by some Perl internal
I am not aware of, and see if I can squeak by with some more speed...
Eventually it will accept multiple filter strengths at the command line and
iterate through all of them, each iteration being faster than the last because
of the delete call to the "find-this" hash (%queryid).
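A minimal sketch of how that multi-strength loop might go. The sample hash contents and the strengths (4 and 2) are made up for illustration. It assumes the strengths are processed in ascending order: build_filter deletes keys in place, so a lower threshold run afterwards could not bring them back.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: query id => number of occurrences.
my %queryid = (
    '(query: A)' => 1,
    '(query: B)' => 3,
    '(query: C)' => 5,
);

# Delete every key whose count is below the threshold.
sub build_filter {
    my ($int, $href) = @_;
    for my $key (keys %$href) {
        delete $href->{$key} if $href->{$key} < $int;
    }
}

# Sort ascending so each pass only has to look at the survivors
# of the previous, weaker filter.
my @strengths = sort { $a <=> $b } (4, 2);

my %remaining;    # strength => number of ids left after that pass
for my $strength (@strengths) {
    build_filter($strength, \%queryid);
    $remaining{$strength} = scalar keys %queryid;
    print "strength $strength: $remaining{$strength} ids left\n";
}
```

With this sample data, two ids survive strength 2 and one survives strength 4. The output for each strength would have to be written before moving on to the next, because the hash shrinks permanently.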
#!/usr/bin/perl -w
#Program name: getparsedfasta
#Author: Nathanael Kuipers, nkuipers at uvic.ca
#Date written: June 12, 2002
#Last updated: June 20, 2002
#Purpose: conditionally formats parsed blastn xml as FASTA
#Use: >perl getparsedfasta inputfilename int
use strict;
my $infile = shift;
my $filterstrength = shift;
my %queryid = ();
my $header = '';
my %wholeid = ();
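# First pass: count how often each query id occurs.
open IN, '<', $infile or die "Cannot open $infile: $!";
while (<IN>) {    # read line by line instead of slurping the whole file
    if (/(\(query:.*)/) {
        $queryid{$1}++;
    }
}
close IN;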
&build_filter($filterstrength, \%queryid);
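# Second pass: map each HSP sequence to the header of the hit it belongs to.
open IN, '<', $infile or die "Cannot open $infile: $!";
while (<IN>) {
    if (/Hit:\s(.*)/) {
        $header = $1;
        next;
    }
    elsif (s/\s+HSP\s\d+\s=\s//) {
        chomp;
        # keep only the first header seen for a given sequence
        $wholeid{$_} = $header unless exists $wholeid{$_};
    }
}
close IN;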
&get_filtered($filterstrength, \%queryid, \%wholeid);
#########################################################
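# Remove every query id that occurs fewer than $int times.
sub build_filter {
    my ($int, $href) = @_;
    # deleting the key most recently returned by each() is safe
    while (defined(my $key = each %$href)) {
        delete $href->{$key} if $href->{$key} < $int;
    }
}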
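# Write, as FASTA, every sequence whose hit header contains a surviving query id.
sub get_filtered {
    my ($int, $href1, $href2) = @_;
    my $outfile = "$infile.fil.$int";
    open OUT, '>', $outfile or die "Cannot write $outfile: $!";
    while (defined(my $key1 = each %$href1)) {
        my $regex = quotemeta $key1;
        while (defined(my $key2 = each %$href2)) {
            if ($href2->{$key2} =~ /$regex/) {
                print OUT ">$href2->{$key2}\n$key2\n";
            }
        }
    }
    close OUT or die "Cannot close $outfile: $!";
}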
#########################################################
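One possible speed-up for get_filtered, offered as an untested-on-real-data sketch rather than a drop-in replacement: join all surviving query ids into a single alternation regex, so %wholeid is walked once instead of once per query id. The sub name filtered_records and the sample data are hypothetical. One behavioural difference is assumed to be acceptable: a header that contains several query ids yields one record here, not one per matching id.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for %queryid and %wholeid.
my %queryid = ('(query: C)' => 5);
my %wholeid = (
    'ACGT' => 'hit one (query: C)',
    'TTGA' => 'hit two (query: B)',
);

# Return FASTA records whose header contains any surviving query id.
sub filtered_records {
    my ($href1, $href2) = @_;
    # An empty alternation would match every header, so bail out early.
    return () unless %$href1;

    # Escape each id, then compile one regex that matches any of them.
    my $alternation = join '|', map { quotemeta($_) } keys %$href1;
    my $regex = qr/$alternation/;

    my @records;
    for my $seq (sort keys %$href2) {
        push @records, ">$href2->{$seq}\n$seq\n"
            if $href2->{$seq} =~ $regex;
    }
    return @records;
}

my @records = filtered_records(\%queryid, \%wholeid);
print @records;
```

Whether this is actually faster depends on how many ids survive the filter; with a very large number of ids the alternation itself can get expensive, so it would be worth timing both versions on a real input file.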
"Luckily, we have computers."