[Neworleans-pm] Fwd: Solutions and Discussion for Perl Quiz of the Week #24 (Turing Machine simulation)

E. Strade, B.D. estrabd at yahoo.com
Tue Sep 21 12:11:06 CDT 2004




=====
http://www.brettsbsd.net/~estrabd

__________________________________
Do you Yahoo!?
Yahoo! SiteBuilder - Free, easy-to-use web site design software
http://sitebuilder.yahoo.com

----- Original message -----
From: "Zed Lopez" <zed at apricot.com>
To: perl-qotw at plover.com
Date: Tue, 21 Sep 2004 11:23:41 -0400
Subject: Solutions and Discussion for Perl Quiz of the Week #24 (Turing
Machine simulation)

Sample solutions and discussion
Perl Quiz of The Week #24 (20040915)

There were 14 Perl solutions, one in Python, one in Ruby, and a whole
bunch of interesting Turing Machines contributed -- I was thrilled by
the play this quiz inspired.

Here's my solution, with annotations:

    #!/usr/bin/perl

    use strict;
    use warnings;

    my ($filename, $input) = @ARGV;
    die "Usage: $0 filename [input]" if !defined $filename or @ARGV > 2;

Make sure the program has exactly 1 or 2 input parameters.

    my (%table, $state, @tape, $head);

Define the Turing Machine's basic elements as global variables.

    my %movehead = (L => -1, R => 1);

Initialize a hash for converting the state transition table's
direction symbol to a value suitable for adding to an array subscript.

    open (FH, $filename) or die "Can't open $filename: $!";
    while (<FH>) {

Open and start reading the state transition file.

      next if !/\S/  or /^\s*#/;

Skip all-blank and blank-but-for-comment lines.

      chomp;
      my ($cur_state, $cur_symbol, $new_state, $new_symbol, $dir) = 
        /^\s*(\w+)\s+(\w)\s+(\w+)\s+(\w)\s+(L|R)(?:\s*#.*)?/
        or die "Bad instruction at $filename line $.: $_";

Use a complicated regexp to parse the line, error-checking at the same
time.

      $state = $cur_state unless defined $state;

Set the initial state.

      $table{$cur_state, $cur_symbol} = [$new_state, $new_symbol,
      $movehead{$dir}];

Load the %table hash with an arrayref consisting of the corresponding
info. $table{$cur_state, $cur_symbol} is equivalent to $table{join $;,
$cur_state, $cur_symbol}. In Perl 4, this was the usual implementation
of multidimensional arrays; it can still be a convenient way to simulate
a
hash whose keys have multiple parts.

    }
    close (FH);
    die "No instructions present in $filename" unless defined $state;

Die if there was no state table (which also means there's no current
state.)
   
    @tape = defined $input ? split //, $input : ('_');

The tape is implemented as an ordinary Perl array, which will be
extended in either direction as necessary. Initialize the tape to the
input string or a single '_' character.

    $head = 0;
   
Initialize $head, the current position of the tape, which will be used
as a
subscript to @tape.

    while (defined (my $instruction = $table{$state, $tape[$head]})) {

This is the Turing Machine algorithm. While there is a state
transition defined for the current state and the current character...

      my $move;
      ($state, $tape[$head], $move) = @$instruction;
      $head += $move;

Change the state to the new state; write the new character to the
tape; move the head.

      push @tape, '_' if $head > $#tape; # grow tape to the right
      unshift @tape, '_' and $head++ if $head < 0; # grow tape to the
      left

Extend the tape to either direction, if necessary.

    }
   
If the loop exits, there was no transition defined for the current
state and current character, so the Turing Machine should halt.

    shift @tape while @tape and $tape[0] eq '_';
    pop @tape while @tape and $tape[-1] eq '_';

Clean up '_' characters from both sides of the tape.

    print join '', @tape, "\n";

Print the contents of the tape.
   
As some early respondents noted, this quiz was simpler than it might
have seemed at first glimpse.

A Turing Machine's definition spells out the algorithm that will drive
any implementation. Two major points on which the submissions differed
were the data structure of the state table and the implementation of
the infinite tape.

(I was happy to see the Python and Ruby submissions, but as I don't
speak those languages, I'm not going to comment on their internals.)

For the data structure, a hash of hashrefs whose values were arrayrefs
was most popular, used by Burton West, Dominus, Ericson, Fuglerud,
Modi, Nielsen, and Sanderson. For instance, from Dominus' program:

   $transition{$instate}{$intape} = [$outstate, $outtape,
   $motion{$motion}];

Instead of arrayrefs, Pletinckx, Rafferty, Tucker, and Varga used
a hashref so they could name the values, as shown by Rafferty's program:

   $states{$start}{$input} = { end    => $end,
                               output => $output,
                               dir    => $dir    };

Kimball's program, which uses only Llama Book features, used a hash
whose key contains the state name and current symbol delimited by a
space, and whose value was space delimited string of the next state,
next symbol, and direction.

  $instructions{"$current_state $current_char"} =
    "$new_state $new_char $direction";

(This program is available at

        http://perl.plover.com/qotw/misc/r024/kimball.pl )

Dominus outlined some approaches to representing the infinite tape:

        http://perl.plover.com/~alias/list.cgi?1:mss:2230

 >Getting an infinite tape is not as difficult as some people seemed to
 >think it would be.  There are at least three techniques that occurred
 >to me.  One would be to have
 >
 >        TAPE(n) is stored in $tape[1 - 2*$n] when n <  0
 >                          in $tape[    2*$n] when n >= 0       
 >
 >and an alternative would use this as a backend to a tied array which
 >would interpret the negative subscripts transparently.
 >
 >A third way is the way I did it in this program, which I think is
 >simple and elegant.


Most people used that third way (Dominus, Ericson, Fuglerud,
Pletinckx, Quint, Sanderson, Tucker, and myself, as shown above.)

[ MJD: I should have pointed out that the tied array technique can
  also be used in conjunction with the third approach, so that I
  really listed either two or four techniques, depending on how you
  count.  ]

Everyone else came up with still more ways. Burton West, Kimball,
Nielsen, and Varga used a hash; Rafferty used an object based an a
hash.   Modi used a string.
 
To test the programs' behavior on good programs, I used
binary_incr.tm, hello_world.tm (complete with typo), and multiply.tm
(unary multiplication) from the examples in the quiz, plus the
binary_add, stack and parens tm's that people submitted to
qotw-discuss, plus the following:

echo.tm:

        s0 _ s1 _ R

This should just echo the input string. If passed nothing, or if
passed a string containing only '_' characters, it should print
nothing.

0.tm:

        0 1 0 1 R
        0 0 0 0 R
        0 _ 1 _ L
        1 1 1 0 L
        1 0 2 1 L
        1 _ 2 1 L
        2 1 2 1 L
        2 0 2 0 L
        2 _ 3 _ R

This is binary_incr with a state named 0. As Dominus noted, if a
program depends on the truth of a state name as opposed to whether
it's defined, it will fail for this valid case.

And two more tests suggested by discussion on qotw-discuss.

nostate1.tm:

        A 1 B 1 L

This should work the same as echo.

nostate2.tm

        a _ b 1 R
        b _ c _ R
        c _ d _ R
        d _ e _ R
        e X stop X R

If there was no input string, output 1. If the input string begins
with a blank, replace the blank with a 1 and print it. Otherwise, just
print the input string.

Finally, I tested them on a series of invalid inputs, for which they
should die:

bad1.tm:

        a@ a b 1 R # bad state name

bad2.tm:

        a b c d q # bad direction

bad3.tm

        a ? b c L # bad current symbol

bad4.tm

        a b c $ L # bad new symbol

bad5.tm

        a b * c L # bad new state

bad6.tm

        a a a a # no more!

bad7.tm

        1 2 3 4 5 6 7 # too many elements

Finally, I tested them on empty.tm, which was an empty file. The
specification did not make explicit what the behavior in this case
should be. From the definition of a Turing Machine, I think it's most
sensible to consider this an error case: a Turing Machine must have a
state transition table and a current state. But I don't consider any
particular behavior correct or incorrect for this case, as it wasn't
defined.

Some of the programs went into infinite loops. In some cases, I'm
giving people credit for dying when I shouldn't because I broke out of
my test program while it was running a submission that was in an
infinite loop.

The solutions of Dalke, Tucker, Rafferty, Ericson, Gray, Kimball, and
Quint passed all of the tests.

Of the remaining submissions, one didn't parse comments. A couple of
them didn't correctly strip leading and trailing underscores. echo.tm
posed a problem for some, as did dealing with state names of '0' or
input of '0'.  The nostate tms were also a common source of trouble.
One went into an infinite loop if it ended with a blank tape.

So though Turing Machines are simple, getting everything right in an
implementation of even a simple thing isn't easy.

My testing program, the tm files, the verbose results, and everyone's
submissions are available from

        http://perl.plover.com/qotw/misc/r024/

for those who'd like to play along at home. Note that the testing
program goes into infinite loops when the submissions do.

Thanks to everyone who participated (including those who didn't submit
solutions.)


More information about the NewOrleans-pm mailing list