[Purdue-pm] Meeting Post-Mortem and Challenge

Rick Westerman westerman at purdue.edu
Tue Oct 19 12:56:36 PDT 2010


On 10/19/2010 03:50 PM, Phillip San Miguel wrote:
>  On 10/19/2010 3:10 PM, Dave Jacoby wrote:
>> We had a selection of new faces at our Perl Mongers meeting, seeking 
>> to learn Perl to understand and adapt the tools they have at hand to 
>> their further purpose. Mark and I and mostly Rick went through an 
>> example program they had used, RADpools, as a teaching tool, covering 
>> many of the basics and some of the advanced elements of our favorite 
>> programming language, until we got to line 143.
>>
>>             foreach my $mid (@pool_mid_list) {
>>                 $mid_length = length $mid;
>>                 if ( !$fuzzy_MIDs ) {
>>                     push @{ $mid_pools{$mid} }, $pool_name;
>>                 }
>>                 else {
>>                     for my $i ( 1 .. $mid_length ) {
>>                         for my $base (qw{A C G T}) {
>>                             my $fuzzycode  = $mid;
>>                             my $prebase_i  = $i - 1;
>>                             my $postbase_i = $mid_length - $i;
>>                             $fuzzycode =~ s{^([ACGT]{$prebase_i})
>>                             ([ACGT])
>>                             ([ACGT]{$postbase_i})$}
>>                             {$1$base$3}xms;
>>                             push @{ $mid_pools{$fuzzycode} }, $pool_name;
>>                         }
>>                     }
>>                 }
>>             }
>>
>> And, actually, the if(!$fuzzy_MIDs){} section is straightforward. The 
>> else{} section is ... well, something else.
>>
>> Specifically, it's the kind of regular expression that causes folks 
>> like JWZ to say "now you have TWO problems". Rick specifically 
>> mentioned that he would not use regular expressions for this problem, 
>> and Mark said it was best to ignore my attempts to explain this.
>>
>>     (I will point out now that this code is
>>     copyright 2008,2010 by John Davey of the
>>     University of Edinburgh and
>>     is licenced by the GPL v3.)
>>
>> I think that a good challenge for our November meeting would be to 
>> come up with variations of the else{} code that are more suitable for 
>> Perl 101. Does that sound good to anybody?
>>
>> Also, I have just reserved WSLR116 for next Tuesday from Noon to 2pm 
>> to continue the analysis of RADpools.
>>
> I don't know that regexes are to blame for this. I'm not even sure 
> that the code is *that* unwieldy. I was not there to hear the 
> definition of "fuzzy codes", but the code appears to just iterate over 
> each base of the MID sequence supplied to it and create 4 sequences at 
> each step that have the base at that position replaced with each 
> possible base. So if the MID were "GAC", then 12 sequences would be 
> stored in the mid_pools hash:
>
> AAC
> CAC
> GAC
> TAC
> GAC
> GCC
> GGC
> GTC
> GAA
> GAC
> GAG
> GAT
>
> I think that is what it does. Although my understanding is based more 
> on the biology that would be supported than the code itself.

That is my understanding of it as well.  In which case, why regex?  Why not 
use substrings?  And even if one wanted to use regexes, then why bother 
capturing the middle portion and then throwing it away?

The program up to that point was well written.  But that regex seems 
unnecessarily complex to me.
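
For comparison, here is a minimal sketch of the substring approach. It is 
not taken from RADpools: the fuzzy_variants helper is a hypothetical name, 
and it returns a list rather than pushing into %mid_pools, so it can be 
tried on its own.  One difference I am aware of: the original regex only 
substitutes when the MID is made up entirely of A, C, G and T, while this 
version does no such check.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RADpools): return every one-base
# variant of a MID, in the same order the original nested loops
# would generate them. Assumes the MID contains only A, C, G, T.
sub fuzzy_variants {
    my ($mid) = @_;
    my @variants;
    for my $i ( 0 .. length($mid) - 1 ) {
        for my $base (qw{A C G T}) {
            my $fuzzycode = $mid;
            # Overwrite the single character at position $i.
            substr( $fuzzycode, $i, 1 ) = $base;
            push @variants, $fuzzycode;
        }
    }
    return @variants;
}

# Print the variants for the "GAC" example from the thread.
my @variants = fuzzy_variants('GAC');
print "$_\n" for @variants;
```

For "GAC" this prints the same 12 sequences Phillip listed, with GAC itself 
appearing three times.  In the original loop, the push onto 
@{ $mid_pools{$fuzzycode} } would take the place of the push onto @variants.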


Anyway that portion of the code does make for a good challenge problem.





-- 
-- Rick

