[Purdue-pm] Meeting Post-Mortem and Challenge

Phillip San Miguel pmiguel at purdue.edu
Tue Oct 19 12:50:23 PDT 2010


On 10/19/2010 3:10 PM, Dave Jacoby wrote:
> We had a selection of new faces at our Perl Mongers meeting, seeking 
> to learn Perl to understand and adapt the tools they have at hand to 
> their further purpose. Mark and I and mostly Rick went through an 
> example program they had used, RADpools, as a teaching tool, covering 
> many of the basics and some of the advanced elements of our favorite 
> programming language, until we got to line 143.
>
>             foreach my $mid (@pool_mid_list) {
>                 $mid_length = length $mid;
>                 if ( !$fuzzy_MIDs ) {
>                     push @{ $mid_pools{$mid} }, $pool_name;
>                 }
>                 else {
>                     for my $i ( 1 .. $mid_length ) {
>                         for my $base (qw{A C G T}) {
>                             my $fuzzycode  = $mid;
>                             my $prebase_i  = $i - 1;
>                             my $postbase_i = $mid_length - $i;
>                             $fuzzycode =~ s{^([ACGT]{$prebase_i})
>                             ([ACGT])
>                             ([ACGT]{$postbase_i})$}
>                             {$1$base$3}xms;
>                             push @{ $mid_pools{$fuzzycode} }, $pool_name;
>                         }
>                     }
>                 }
>             }
>
> And, actually, the if(!$fuzzy_MIDs){} section is straightforward. The 
> else{} section is ... well, something else.
>
> Specifically, it's the kind of regular expression that causes folks 
> like JWZ to say "now you have TWO problems". Rick specifically 
> mentioned that he would not use regular expressions for this problem, 
> and Mark said it was best to ignore my attempts to explain this.
>
>     (I will point out now that this code is
>     copyright 2008,2010 by John Davey of the
>     University of Edinburgh and
>     is licenced by the GPL v3.)
>
> I think that a good challenge for our November meeting would be to 
> come up with variations of the else{} code that are more suitable for 
> Perl 101. Does that sound good to anybody?
>
> Also, I have just reserved WSLR116 for next Tuesday from Noon to 2pm 
> to continue the analysis of RADpools.
>
I don't know that regexes are to blame for this. I'm not even sure that 
the code is *that* unwieldy. I was not there to hear the definition of 
"fuzzy codes", but the code appears to just iterate over each base of 
the MID sequence supplied to it and create 4 sequences at each step that 
have the base at that position replaced with each possible base. So if 
the MID were "GAC", then 12 sequences would be pushed into the mid_pools 
hash (10 distinct keys, since GAC itself recurs once per position):

AAC
CAC
GAC
TAC
GAC
GCC
GGC
GTC
GAA
GAC
GAG
GAT

I think that is what it does, although my understanding is based more on 
the biology that would be supported than on the code itself.
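
For what it's worth, here is a rough sketch of the same enumeration done 
with substr instead of a regex. The name fuzzy_variants is a hypothetical 
helper of mine and is not part of RADpools. It also assumes the MID 
contains only A, C, G and T; the original regex leaves a MID untouched if 
it contains anything else, and this version does not reproduce that.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RADpools): return every sequence made by
# replacing one base of $mid with A, C, G or T, in the same order as the
# nested loops in the quoted code. Assumes $mid contains only A, C, G, T.
sub fuzzy_variants {
    my ($mid) = @_;
    my @variants;
    for my $pos ( 0 .. length($mid) - 1 ) {
        for my $base (qw{A C G T}) {
            my $fuzzycode = $mid;
            substr( $fuzzycode, $pos, 1 ) = $base;    # overwrite one base
            push @variants, $fuzzycode;
        }
    }
    return @variants;
}

# Print the variants for the example MID, one per line.
print "$_\n" for fuzzy_variants('GAC');
```

In the original loop each of these would then be used as a key in 
%mid_pools, with $pool_name pushed onto its list.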

Phillip


