[Omaha.pm] "Command-Line Bioinformatics"

kiran bina kiranbina at gmail.com
Wed Feb 28 07:07:55 PST 2007


I am glad I am using perl and not python or any thing else. Thanks to Bob
who got me started with perl and Jay with bio-perl helping me every step of
the way. :-)

On 2/28/07, Jay Hannah <jay at jays.net> wrote:
>
> Reading this article:
> http://www.linuxjournal.com/article/6977
> Sequencing the SARS Virus - Linux Journal, Nov 2003
>
> This guy needs Perl and/or BioPerl.  :)
>
> > The sequence file is in FASTA format consisting of a header line
> > and the sequence, split into fixed-width lines. The following
> > counts the number of Gs and Cs in the sequence and presents the
> > total as a fraction of the total number of bases:
> >
> > > grep -v "^>" AY274119.fa | fold -w 1 |
> > tr "ATGC" "..xx" | sort | uniq -c |
> > sed 's/[^0-9]//g' | t -s "\012" " " |
> > sed 's/\([0-9]*\) \([0-9]*\)/scale = 3;
> > ↪\2 \/ (\1+\2)/' |
> > bc -i
> > scale = 3; 12127 / (17624+12127)
> > .407
> >
> > Out of the 29,751 bases in our sequence, 12,127 are either G or C,
> > giving a GC content of 41%.
>
> BioPerl version:
>
> use Bio::SeqIO;
> my $io = Bio::SeqIO->new(
>   -file   => 'AY274119.fa',
>   -format => 'Fasta'
> );
> my $seq = $io->next_seq->seq;
> print ( ($seq =~ tr/GC/GC/) / length ($seq) );
>
> Command-line Perl:
>
> perl -e '$/ = undef; $_ = <>; s/>.*//; s/\n//g; print tr/GC/GC/ /
> length($_)' AY274119.fa
>
> I'm sure you can Perl Golf my stabs at it.  :)
>
> j
> seqlab.net
> http://www.bioperl.org/wiki/User:Jhannah
>
>
>
>
> _______________________________________________
> Omaha-pm mailing list
> Omaha-pm at pm.org
> http://mail.pm.org/mailman/listinfo/omaha-pm




-- 
Dhundy R. Bastola
Assistant Professor
Department of Pediatrics
University of Nebraska Medical Center
Omaha NE 68198
Always reply to: dbastola at unmc.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.pm.org/pipermail/omaha-pm/attachments/20070228/8536b9ab/attachment.html 


More information about the Omaha-pm mailing list