Phoenix.pm: Tried Perl debugger on this code (Was Please help, my brain is fried)

Kevin Buettner kev at primenet.com
Thu Jan 27 23:14:25 CST 2000


On Jan 27,  7:39pm, Shay Harding wrote:

> I like the above code, but I think my whole question was missed...

[...]

>   sub get_char(){
>       srand;
>       $count++;
> 
>       my $char = int(rand(57))+65;
>       print $count, "  ", chr($char), "\n";
> 
>       get_char() if ($char >= 91 && $char <= 96);
>       return $char;
>   }

There were a number of questions in the section I snipped.  I'll attempt
to answer what I think are the relevant ones.  (I'll skip the ones
regarding the perl debugger since I rarely use it.  It confuses me
too.)

Before I get into a discussion of why the above code doesn't work,
I hope you'll permit me to simplify it somewhat by
removing code which I suspect was added for debugging purposes.  My
rewrite (with line numbers added to make discussion easier) looks like
this:

 1      sub get_char {
 2          my $char = int(rand(57))+65;
 3    
 4          get_char() if ($char >= 91 && $char <= 96);
 5          return $char;
 6      }

The intent of the above code is to return the ASCII code for a
randomly chosen alphabetic character, i.e., one of 'A'..'Z','a'..'z'.

Let us begin our discussion with line 2.  We would like line 2 to
set $char to an integer between 65 ('A') and 122 ('z') inclusive.
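As written, this line can never set $char to 122: rand(57) returns a
value strictly less than 57, so int(rand(57)) is at most 56 and $char
tops out at 121 ('y').  It should be changed to read as follows: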

 2a         my $char = int(rand(58))+65;

In order to verify that this is correct, remember that int(rand(58))
will evaluate to an integer between 0 and 57 inclusive.  This means
that the lowest value achievable by $char will be 65 and the greatest
is 65+57=122, which is precisely what we want.
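
If you'd rather see this than take my word for it, here is a rough
sketch that samples both expressions and records the largest value
each one produces.  The variable names and the sample count are
arbitrary choices of mine, not anything from your program.

```perl
use strict;
use warnings;

# Sample both expressions many times and record the largest value seen.
my ($max57, $max58) = (0, 0);
for (1 .. 100_000) {
    my $v57 = int(rand(57)) + 65;   # original line 2
    my $v58 = int(rand(58)) + 65;   # corrected line 2a
    $max57 = $v57 if $v57 > $max57;
    $max58 = $v58 if $v58 > $max58;
}

print "rand(57): largest value seen was $max57\n";   # can never exceed 121
print "rand(58): largest value seen was $max58\n";   # can never exceed 122
```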

Now let's turn our attention to the 'my' declaration.  The 'my'
declaration declares a statically scoped local variable.  The
scope extends from the point of the declaration to the ending
right curly brace on line 6.  This means that when get_char
is invoked, $char is only visible between lines 2 and 6 for that
invocation.  (Go back and read "for that invocation" again; it's
important.)

Let's consider an example.  Suppose you enter get_char and were
unfortunate enough to have $char set to 91 ('[').  Once you get down
to line 4, perl will invoke get_char again because $char happens to
fall between 91 and 96 inclusive.

Now the next bit is *very* important.  When you enter get_char again,
you get a brand spanking new $char that is only visible between
lines 2 and 6 (for this invocation).  The $char in the calling
invocation is *not* visible to the current one.  Let's suppose in
this invocation that you get $char set to 65 ('A').  That means that
line 4 will *NOT* cause get_char to be invoked yet again.  Line 5
will cause 65 to be returned.

So now we're back in the original invocation of get_char.  What
happened to our 65 returned by the recursive call?  It is discarded
because the code, as written, doesn't do anything with it.  $char
in this invocation is still set to 91 and that is what is returned.
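
The scenario just described can be reproduced on demand if the random
draw is replaced by a scripted list.  This is only a sketch:
@scripted and get_char_buggy are names I made up for illustration, and
the list stands in for int(rand(58))+65 so that the first call sees 91
and the second sees 65.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the random draw: hands out 91, then 65.
my @scripted = (91, 65);

sub get_char_buggy {
    my $char = shift @scripted;     # each call gets its own $char

    # Same shape as line 4: the recursive call's return value is dropped.
    get_char_buggy() if ($char >= 91 && $char <= 96);
    return $char;
}

my $result = get_char_buggy();
print "$result\n";    # prints 91; the 65 from the inner call was discarded
```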

Let's consider what happens with one attempt to "fix" this code:

 4a         get_char() while ($char >= 91 && $char <= 96);

In the above line (4a), the 'if' has been replaced by a 'while'.  This
is quite possibly worse than the original version because (as you
observed), it is prone to hanging.  Reread my above example.  The only
difference here is that when get_char returns, the while statement
will check again to see if $char has changed so that it is no longer
in the indicated range.  Well, it *can't* have changed.  There's no
way it could have changed since the recursive invocation of get_char
has no way to affect the instance of $char in the frame under
consideration.  Thus, line 4a will continue to call get_char() over
and over again in the vain hope that $char will somehow get changed.
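
To watch this happen without actually hanging your terminal, here is
a sketch with a safety valve bolted on.  The counter $calls, the limit
of 200, and the optional argument that forces the first draw are all
additions of mine for the sake of the demonstration; none of them
belong in real code.

```perl
use strict;
use warnings;

my $calls = 0;    # hypothetical counter, only here to stop the demo

sub get_char_while {
    my ($forced) = @_;    # lets the demo force the first draw
    my $char = defined $forced ? $forced : int(rand(58)) + 65;

    $calls++;
    die "gave up after $calls calls\n" if $calls > 200;

    # Same shape as line 4a: this frame's $char is never assigned again.
    get_char_while() while ($char >= 91 && $char <= 96);
    return $char;
}

# Force the outermost $char to 91 ('['), so the loop can never end.
my $ok = eval { get_char_while(91); 1 };
print "stopped by the safety valve: $@" unless $ok;
print "calls made: $calls\n";
```

Without the die, the outermost while would simply keep going.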

Finally, let's consider a correct fix:

 4b         $char = get_char() if ($char >= 91 && $char <= 96);

This version does two things that the original version did not.

First, it actually does something with the return value of get_char. 
If you have a subroutine which returns a value and you also have
instances of calls to that subroutine that do nothing with the return
value, that should be a red flag that something is likely wrong.

Second, and more importantly, line 4b sets $char to the return value
of the recursive call to get_char.  This means that $char in the
outermost invocation of get_char will be set to an in-range
character.

Here is the final corrected version of get_char:

 1      sub get_char {
 2a         my $char = int(rand(58))+65;
 3    
 4b         $char = get_char() if ($char >= 91 && $char <= 96);
 5          return $char;
 6      }
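
If you want some reassurance, a quick sanity check along the following
lines should do.  The listing is repeated without line numbers so that
it runs as is; the loop count and the %seen hash are my own choices.

```perl
use strict;
use warnings;

sub get_char {
    my $char = int(rand(58)) + 65;

    $char = get_char() if ($char >= 91 && $char <= 96);
    return $char;
}

# Call it many times and complain about anything that isn't a letter.
my %seen;
for (1 .. 10_000) {
    my $c = chr(get_char());
    die "out of range: $c\n" unless $c =~ /^[A-Za-z]$/;
    $seen{$c}++;
}
print scalar(keys %seen), " distinct letters seen\n";
```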

You might wonder if this version is susceptible to infinite recursion.
It is not, so long as the pseudo-random number generator doesn't get
stuck in a cycle of generating only out-of-range characters.  Eventually,
there will be a recursive invocation which will pick an in-range
character.  When this happens, the in-range character will propagate
back to the outermost invocation due to the assignment statement
added to 4b.

I suspect that much of your confusion stems from the difference between
'my' and 'local'.  I took a look at the perlfaq7 man page in hopes
that it would contain a cogent discussion of the issues that I could
refer you to, but after reading it I concluded that you may well have
been misled by the cursory treatment that it gives to the matter.  (It
says nothing about what happens when you have recursive subroutines.
In fact, it does give the impression that there's only one copy, which
is certainly not the case.)

So, instead, I'll direct your attention to the perlsub man page.  See
the section called "Private Variables via my()".  In particular, the
following section is especially relevant...

       Unlike dynamic variables created by the "local" operator,
       lexical variables declared with "my" are totally hidden
       from the outside world, including any called subroutines
       (even if it's the same subroutine called from itself or
       elsewhere--every call gets its own copy).

Note that last parenthesized bit, "every call gets its own copy".
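
Here is a small sketch of the difference.  All of the names in it
($dynamic, show, with_local, countdown) are invented for this example.
A value installed with 'local' is visible to any subroutine called
while it is in effect, whereas each call to a recursive subroutine
gets a 'my' variable of its own that the inner calls cannot touch.

```perl
use strict;
use warnings;

our $dynamic = "global";

sub show { return $dynamic }    # sees whatever value "local" has installed

sub with_local {
    local $dynamic = "localized";    # visible to called subs until we return
    return show();
}

sub countdown {
    my ($n) = @_;
    my $mine = $n;                   # a fresh $mine for every call
    countdown($n - 1) if $n > 0;     # inner calls have their own $mine
    return $mine;
}

print with_local(), "\n";    # prints "localized"
print show(), "\n";          # prints "global"
print countdown(3), "\n";    # prints 3; the inner calls left $mine alone
```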

Kevin

-- 
Kevin Buettner
kev at primenet.com, kevinb at redhat.com


