[ABE.pm] metaprogramming is way cool

Sun May 18 20:22:47 PDT 2008

Remember that technical meeting where I engaged in a lengthy digression about
something that seemed fairly irrelevant?

Wait, I need to be more specific?

At the meeting on May 7th, I went on and on about something that's going on
with a possible new feature for perl (either as a new module or as part of
5.12).  It's overloading for the method invocation operator.

Huh?  Well, it's actually pretty simple.

When you write this:

  $object->method($arg1, $arg2);

This happens (but note that this code is just simplified enough to not work):

  my $class = ref $object;  # we find the name of the class of the object
  no strict 'refs';         # we look things up by name (symbolic references)

  while ($class) {          # we start a loop (this will make sense soon)

    my $code = \&{"${class}::method"}; # we get the sub "method" in that class
    if (defined &$code) {                     # ...and if it exists
      return $code->($object, $arg1, $arg2);  # call it with these args
    }

    $class = ${"${class}::ISA"}[0];  # otherwise, try our (first) parent class
  }

  die "no such method!";    # if we didn't return by now, there's no method!

Like I said, this isn't quite right, for a few reasons, but it illustrates the
point.  Here are some of the assumptions built into how method calls work in
Perl:

  * objects are members of a class, and a class is represented by a package
  * @ISA is the way that super/sub-classes are set up
  * the object method X is a sub named X in the class's package
  * ...unless it's a sub named X in a package in that package's @ISA variable
  * the class method X is a sub named X in the class's package (or in @ISA)
  * ...which means that object and class methods share the same namespace
  * ...and that methods are always found by name
  * if no package in @ISA defines X, look in UNIVERSAL
  * failing that, call the sub AUTOLOAD, if defined in the class package (or
    in @ISA)
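
To make those rules concrete, here is a small sketch that exercises most of
them.  The Animal and Dog classes are made up for illustration; nothing here
is special beyond plain packages, @ISA, UNIVERSAL, and AUTOLOAD.

```perl
use strict;
use warnings;

package Animal;
sub new   { my ($class, %args) = @_; return bless {%args}, $class }
sub speak { return 'generic noise' }
# One sub serves as both object method and class method: same namespace.
sub name  { my ($self) = @_; return ref($self) ? $self->{name} : "class $self" }

package Dog;
our @ISA = ('Animal');           # superclasses are found through @ISA
sub speak { return 'Woof' }      # found in Dog before Animal is consulted

our $AUTOLOAD;
sub AUTOLOAD {                   # last resort, after @ISA and UNIVERSAL
  my ($self) = @_;
  (my $method = $AUTOLOAD) =~ s/.*:://;
  return if $method eq 'DESTROY';
  return "AUTOLOAD handled $method";
}

package main;

my $dog = Dog->new(name => 'Rex');   # new() is inherited from Animal
print $dog->speak, "\n";             # Woof
print $dog->name, "\n";              # Rex
print Dog->name, "\n";               # class Dog
print $dog->fetch, "\n";             # AUTOLOAD handled fetch
print $dog->isa('Animal') ? "isa came from UNIVERSAL\n" : "not an Animal\n";
```

Note that isa is never defined in Dog or Animal; it is found in UNIVERSAL,
which is why the AUTOLOAD above is never asked about it.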

This isn't ideal, but it's pretty good for most things that most people need.
Here are a few things that you can't do with it:

You can't...

  * ever have your AUTOLOAD handle a method that UNIVERSAL already defines
  * clearly divide object methods from class methods
  * have distinct classes that do not map to packages
  * determine how to find superclasses any other way than @ISA
  * define methods via anything other than AUTOLOAD (which fails as noted
    above) or as named subroutines in packages
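
The AUTOLOAD limitation is easy to see in a small sketch.  The Proxy class
below is hypothetical; it tries to answer every method call through AUTOLOAD,
but the name "can" is already defined in UNIVERSAL, so AUTOLOAD never gets a
chance at it.

```perl
use strict;
use warnings;

package Proxy;   # hypothetical class that wants to handle every method

our $AUTOLOAD;

sub new { my ($class) = @_; return bless {}, $class }

sub AUTOLOAD {
  my ($self, @args) = @_;
  (my $method = $AUTOLOAD) =~ s/.*:://;
  return if $method eq 'DESTROY';
  return "proxied $method";
}

package main;

my $proxy = Proxy->new;

# No sub named frobnicate exists anywhere, so AUTOLOAD handles the call.
print $proxy->frobnicate, "\n";                        # proxied frobnicate

# "can" is found in UNIVERSAL first, so AUTOLOAD is never consulted, and
# UNIVERSAL::can reports that the method we just called does not exist.
print $proxy->can('frobnicate') ? "yes\n" : "no\n";    # no
```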

These are fairly unusual cases, but they come in handy.  If you want a lot of
similar objects that have different methods available, or different behaviors
for their methods, it might be useful to do "classless" or "prototype-based" OO
the way that JavaScript does.  Separating class from instance methods is just
wonderfully useful in general, but most Perl programmers have learned to live
without it.  Having classes without packages is not wildly useful per se, but
because packages in Perl are difficult to track, inspect, or destroy, replacing
a "package-based" class with something else would provide many benefits.
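
As a rough illustration of the "prototype-based" idea, here is about as close
as today's Perl gets, using AUTOLOAD.  The Proto class and its slot layout are
made up for this sketch.  Each object carries its own methods and falls back
to the object it was cloned from.

```perl
use strict;
use warnings;

package Proto;   # hypothetical name for a prototype-style object

our $AUTOLOAD;

sub new {
  my ($class, %slots) = @_;
  return bless { parent => undef, slots => {%slots} }, $class;
}

# A clone starts with its own (possibly empty) slots and delegates the rest.
sub clone {
  my ($self, %slots) = @_;
  return bless { parent => $self, slots => {%slots} }, ref $self;
}

sub AUTOLOAD {
  my ($self, @args) = @_;
  (my $name = $AUTOLOAD) =~ s/.*:://;
  return if $name eq 'DESTROY';
  ref $self or die "$name must be called on an object\n";

  # Walk the chain of prototypes instead of walking @ISA.
  for (my $obj = $self; $obj; $obj = $obj->{parent}) {
    my $slot = $obj->{slots}{$name};
    return $slot->($self, @args) if ref $slot eq 'CODE';
  }
  die "no such method: $name\n";
}

package main;

my $animal = Proto->new(speak => sub { 'generic noise' }, legs => sub { 4 });
my $dog    = $animal->clone(speak => sub { 'Woof' });

print $dog->speak, "\n";      # Woof (the dog's own slot)
print $dog->legs, "\n";       # 4 (inherited from the prototype)
print $animal->speak, "\n";   # generic noise
```

It works, but only up to a point: a slot named new, clone, can, or isa can
never be reached this way, because the normal lookup finds those names before
AUTOLOAD is tried.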

So, the solution under discussion is to say that when the user writes:

  $object->method($arg1, $arg2);

...that the author of a class may choose to say, "don't do the usual thing
(approximated in the code snippet above).  Instead, do my special thing."
Because the method call looks totally normal, the user doesn't need to know
that behind the scenes it's doing something special.

The idea here is that the user -- the guy using this $object -- knows that
there is a defined way to deal with objects.  You say:

  my $object = Class->new( ... );

and later you say:

  $object->method($arg1, $arg2);

over and over until you're done.  It's the same way he knows to use "my @array"
to make an array and "push @array, $item" to append to it.  It's the language's
contract or protocol for users.  As long as the user sees the same interface,
he can understand what's happening.

For a long time, now, Perl has had the ability to "tie" a variable.  This
basically means you can make a thing called @array that has more behavior than
a normal array.  When you push to it, something more can happen -- but you
still /push/ to it, with "push @array, $item".  As far as the user is
concerned, it's still an array.  He can be blissfully unaware of whatever magic
has been added, and doesn't need to learn some new interface like:

  my $array = Magic::Tracker->new;
  $array->push($item);
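
For comparison, here is a minimal sketch of what the tied version might look
like.  The Tracked::Array class and its pushes method are invented for this
example; Tie::Array is the core module that supplies defaults for the
operations not written out here.

```perl
use strict;
use warnings;

package Tracked::Array;   # hypothetical tie class that counts pushes

use Tie::Array;
our @ISA = ('Tie::Array');

sub TIEARRAY  { my ($class) = @_; return bless { items => [], pushes => 0 }, $class }
sub FETCH     { my ($self, $i) = @_; return $self->{items}[$i] }
sub STORE     { my ($self, $i, $value) = @_; return $self->{items}[$i] = $value }
sub FETCHSIZE { my ($self) = @_; return scalar @{ $self->{items} } }
sub STORESIZE { my ($self, $n) = @_; $#{ $self->{items} } = $n - 1; return }

# The "something more" that happens on push: we count the call.
sub PUSH {
  my ($self, @new) = @_;
  $self->{pushes}++;
  push @{ $self->{items} }, @new;
  return $self->FETCHSIZE;
}

sub pushes { my ($self) = @_; return $self->{pushes} }

package main;

tie my @array, 'Tracked::Array';

push @array, 'apple';             # the user still just pushes
push @array, 'pear', 'plum';

print "@array\n";                 # apple pear plum
print scalar(@array), "\n";       # 3
print tied(@array)->pushes, "\n"; # 2
```

The last line is the catch: the only way to reach the extra behavior is to
dig out the object hiding behind the array with tied.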

Unfortunately, implementing tied variables is ugly.  Tied variables are looked
down on for many reasons, rightly and wrongly.  One good reason is that they're
hard to extend.  If you say, "this routine expects to be passed a reference to
an array," then you know you expect something that you build with \@array.
That is, something that you can do these things to:

  push @$array, $foo;
  splice @$array, 0, 1, @list;
  unshift @$array, $bar;
  # etc.

That list is determined by Perl.  You can't add to it.  If later you are going
to demand something that's an array reference, but has extra methods, you end
up passing in something that has overloaded array dereference logic (as awful
as it sounds) or you have to do something insane like:

  (tied @$array)->method(...)

Fail!
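
For the curious, the "overloaded array dereference" alternative looks roughly
like this.  CountedList is a made-up class; the overload pragma and its '@{}'
key are real.

```perl
use strict;
use warnings;

package CountedList;   # hypothetical: an object that can also act as \@array

use overload
  '@{}'    => sub { $_[0]->{items} },   # what @$object should mean
  fallback => 1;

sub new {
  my ($class, @items) = @_;
  return bless { items => [@items], added => 0 }, $class;
}

# The "extra method" that a plain array reference could never have.
sub add {
  my ($self, @new) = @_;
  $self->{added} += @new;
  push @{ $self->{items} }, @new;
  return $self;
}

sub added { my ($self) = @_; return $self->{added} }

package main;

my $list = CountedList->new(qw(a b));
$list->add('c');
push @$list, 'd';                  # goes straight to the inner array

print join(',', @$list), "\n";     # a,b,c,d
print scalar(@$list), "\n";        # 4
print $list->added, "\n";          # 1
```

The final count is 1, not 2: the plain push bypassed add entirely, which is
one of the ways this approach gets awkward.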

autobox, which I mentioned recently, helps get around this problem.  You can
define a set of needed methods -- push, pop, etc -- and then say that any
object that provides those methods can be passed in.  Users know just what they
have to implement to meet the requirements.  You don't have to worry about the
limited growth potential of tied variables.  With autobox, you can start off
saying, "since I know that I can do everything I need based on an array, I will
treat plain old array references like objects with these methods by using
autobox to say that an array reference has all the methods defined in my
Array::Adapter::ForMyCode class."

So, again, you provide a formal interface (basically a list of required
methods) that your code requires.  Basic users only need to know that they can
pass in an array reference or a MagicList object.  Advanced users have clear
instructions for what kind of object they can build on their own and pass in.
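
Here is a sketch of that "list of required methods" idea in core Perl.  It
does not use autobox itself: where autobox would let a plain array reference
answer method calls directly, this version wraps the reference by hand.  The
as_list and record_event helpers and the push/pop/size interface are
assumptions made up for the example.

```perl
use strict;
use warnings;

package Array::Adapter::ForMyCode;

# Gives a plain array reference the methods my code requires.
sub new  { my ($class, $aref) = @_; return bless { array => $aref }, $class }
sub push { my ($self, @items) = @_; CORE::push @{ $self->{array} }, @items; return $self }
sub pop  { my ($self) = @_; return CORE::pop @{ $self->{array} } }
sub size { my ($self) = @_; return scalar @{ $self->{array} } }

package main;

use Scalar::Util qw(blessed);

# Hypothetical helper: accept an array reference or any object that
# provides the required methods, and reject everything else.
sub as_list {
  my ($thing) = @_;
  return Array::Adapter::ForMyCode->new($thing) if ref $thing eq 'ARRAY';
  blessed $thing or die "need an array reference or a list-like object\n";
  for my $method (qw(push pop size)) {
    $thing->can($method) or die "object lacks required method $method\n";
  }
  return $thing;
}

# Code written against the interface, not against arrays.
sub record_event {
  my ($list, $event) = @_;
  $list = as_list($list);
  $list->push($event);
  return $list->size;
}

my @plain;
print record_event(\@plain, 'login'), "\n";    # 1
print record_event(\@plain, 'logout'), "\n";   # 2
print "@plain\n";                              # login logout
```

Anybody who wants to pass in something fancier only has to implement push,
pop, and size.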

There are two related ideas at work here.  One is the ability to have known
interfaces that can be made to behave in a different way without changing the
interface.  The other is the ability to use this technique not only for
making two similar classes, but for replacing any part at all of the
programming language with something that looks the same but is implemented
differently.

At another technical meeting, I talked a little about Moose, which steals some
of its ideas from CLOS, the Common Lisp Object System.  They're both made very
powerful by the idea of metaprogramming and a meta-object protocol.  It's a
very simple idea, if you get past the name.

The idea is that if you can define the way that objects, classes, and all those
things are used by programmers, then you can change the way they behave behind
the scenes.  That is, you can let the programmer say:

  standard class Foo {
    method bar(int i): { ... }
    ...
  }

and something like this will happen behind the scenes:

  standard->new_class(
    name    => 'Foo',
    methods => {
      bar => {
        args => [ "int i" ],
        code => sub { ... },
      },
    },
  );

What happens?  Well, it's up to the definition of 'standard', which is just a
class implementing the interface that says "I make classes!"  If you want to
write your own class builder, called "funky," you can.  Then someone else can
write:

  funky class Bar { ... }

They don't need to know anything about how it works.  They know it will make a
class, and they probably know what the end-user documentation says about how
funky classes differ from normal classes.
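
Without the imaginary "standard class Foo { ... }" syntax, the behind-the-scenes
half can already be sketched in plain Perl.  Everything here is an assumption
for illustration: new_class takes the arguments shown earlier (the args entry
is simply ignored), wrap_method is a hook I made up, and "funky" differs from
"standard" only in that it counts method calls.

```perl
use strict;
use warnings;

package standard;   # a class whose job is to make classes

sub new_class {
  my ($builder, %spec) = @_;
  my $name = $spec{name};

  no strict 'refs';   # we install subs into a package chosen at run time
  *{"${name}::new"} = sub { my ($class, %args) = @_; return bless {%args}, $class };
  for my $method (keys %{ $spec{methods} }) {
    my $code = $spec{methods}{$method}{code};
    *{"${name}::$method"} = $builder->wrap_method($method, $code);
  }
  return $name;
}

# Hook for other builders to override; standard installs methods as given.
sub wrap_method { my ($builder, $method, $code) = @_; return $code }

package funky;      # same interface, different behavior behind the scenes

our @ISA = ('standard');
our %CALLS;

sub wrap_method {
  my ($builder, $method, $code) = @_;
  return sub { $CALLS{$method}++; return $code->(@_) };
}

package main;

my %methods = (
  bar => { args => ['int i'], code => sub { my ($self, $i) = @_; return $i * 2 } },
);

standard->new_class(name => 'Foo', methods => {%methods});
funky->new_class(name => 'Bar', methods => {%methods});

print Foo->new->bar(21), "\n";                          # 42
my $bar = Bar->new;
$bar->bar(1);
$bar->bar(2);
print "Bar::bar called $funky::CALLS{bar} times\n";     # 2 times
```

The code that uses Foo and Bar is identical; only the class builder knows
that Bar's methods are being counted.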

Perl really lacks any kind of built-in protocol for this kind of thing.  At
best, classes are defined by how objects respond to 'ref' and the methods 'isa'
and 'can' and 'DOES' and how methods are dispatched.  Once method dispatch can
be customized, it will be possible to write any sort of object system (within
some very, very broad limits, anyway) in pure Perl, and it will look just like
normal Perl.  The end user won't need to care.

The question of "code that writes code" has come up a few times, here, but I
think a much more useful and interesting idea is the idea of a "programmable
programming language."  That label is often applied to Lisp, in which basically
every part of the language can be extended or altered as needed.

I realize that this is a long, somewhat disjointed and rambling post.  I want
to get across, though, the importance and fundamental value of the ability to
programmatically alter a programming language's core behavior.
"The Art of the Metaobject Protocol" says that meta-object programming
transforms a single language into a large area of related and interoperable
languages, each better suited for specific usages.  This is an excellent
summary.

Any serious programmer would be really well served to look into this topic,
either by learning more Lisp, playing with Moose (and its metaclasses, not just
Moose itself) and Class::MOP, or just by thinking about how his next common
problem might be made easier not with more code to combat the problem, but with
more code to change the language to make the problem vanish.

I'm going to bed now, to dream of lambdas.

-- 
rjbs