I am glad I am using Perl and not Python or anything else. Thanks to Bob, who got me started with Perl, and to Jay, who has helped me with BioPerl every step of the way. :-)<br><br>
<div><span class="gmail_quote">On 2/28/07, <b class="gmail_sendername">Jay Hannah</b> &lt;<a href="mailto:jay@jays.net">jay@jays.net</a>&gt; wrote:</span>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">Reading this article:<br><a href="http://www.linuxjournal.com/article/6977">http://www.linuxjournal.com/article/6977
</a><br>Sequencing the SARS Virus - Linux Journal, Nov 2003<br><br>This guy needs Perl and/or BioPerl. :)<br><br>> The sequence file is in FASTA format consisting of a header line<br>> and the sequence, split into fixed-width lines. The following
<br>> counts the number of Gs and Cs in the sequence and presents the<br>> total as a fraction of the total number of bases:<br>><br>> > grep -v "^>" AY274119.fa | fold -w 1 |<br>> tr "ATGC" "..xx" | sort | uniq -c |
<br>> sed 's/[^0-9]//g' | tr -s "\012" " " |<br>> sed 's/\([0-9]*\) \([0-9]*\)/scale = 3;<br>> ↪\2 \/ (\1+\2)/' |<br>> bc -i<br>> scale = 3; 12127 / (17624+12127)<br>> .407
<br>><br>> Out of the 29,751 bases in our sequence, 12,127 are either G or C,<br>> giving a GC content of 41%.<br><br>BioPerl version:<br><br>use Bio::SeqIO;<br>my $io = Bio::SeqIO->new(<br> -file => 'AY274119.fa',
<br> -format => 'Fasta'<br>);<br>my $seq = $io->next_seq->seq;<br>print ( ($seq =~ tr/GC/GC/) / length ($seq) );<br><br>Command-line Perl:<br><br>perl -e '$/ = undef; $_ = <>; s/>.*//; s/\n//g; print tr/GC/GC/ /
<br>length($_)' AY274119.fa<br><br>I'm sure you can Perl Golf my stabs at it. :)<br><br>j<br><a href="http://seqlab.net">seqlab.net</a><br><a href="http://www.bioperl.org/wiki/User:Jhannah">http://www.bioperl.org/wiki/User:Jhannah
</a><br><br><br><br><br>_______________________________________________<br>Omaha-pm mailing list<br><a href="mailto:Omaha-pm@pm.org">Omaha-pm@pm.org</a><br><a href="http://mail.pm.org/mailman/listinfo/omaha-pm">http://mail.pm.org/mailman/listinfo/omaha-pm
</a></blockquote></div><br><br clear="all"><br>-- <br>Dhundy R. Bastola<br>Assistant Professor<br>Department of Pediatrics<br>University of Nebraska Medical Center<br>Omaha NE 68198<br>Always reply to: <a href="mailto:dbastola@unmc.edu">
dbastola@unmc.edu</a><br>