[Phoenix-pm] "Perl 6 Now" chapter 1

Scott Walters scott at illogics.org
Thu Oct 14 14:01:48 CDT 2004


Again, but chapter 1, which needs some assurance (hint? suggestion?)
of technical accuracy. The editor will look at it after this and
suggest structural and content changes, and the copy editor will fix
any spelling errors I've introduced since I last spell checked it
and correct grammar and style - so don't worry about those things.

This chapter exists to get some of the technical things out of the way
readers might not be familiar with (building Perl from source, etc),
and also to serve as a more traditional first chapter by defining
the terms the book is about. I think it overlaps with the intro too
much still. Hrm.

Thanks!

-scott



=head1 1. The Programmer's Introduction to the Perl Computer Programming Language

Perl is a technical creation as well as a social creation, and the license it's distributed under
tells a good deal about what kind of social creation Perl is.

Perl 5 borrows heavily from other languages, most of which are procedural or special purpose languages,
and by far, the most powerful features come from the special purpose languages.
Perl 6 too borrows heavily but favors features from object oriented languages from industry and from
functional languages from academia.

As I write this, the latest, unreleased version of the language is needed to run the examples,
and this chapter will help you download, patch, and build a development version of Perl from sources.
This book uses Perl modules heavily to demonstrate Perl 6 concepts, and in the interest of
easily installing these, I'll show you how to set Perl to fetch CPAN modules automatically on
demand, which is itself a Perl 6 concept.
Many of these modules tailor, or let you tailor, Perl's behavior to data types, so a summary
of Perl 5 data types is presented which doubles as a reader's guide to the standard documentation
on built-in functions.

Chapters after this one outline in detail the changes and additions to Perl 5 that create
Perl 6, but this chapter first sets the stage for the kinds of changes I'm going to detail.


=head2 Hello World

F<perl> is the program that parses and executes Perl programs:

  $ perl -e 'print "Hello, world!\n";'

Run that at the command shell for the familiar greeting.


=head2 Dual-License

Perl is a copyrighted work, but the owner of the copyright, Larry Wall,
makes Perl available under two license agreements, as is the privilege of a copyright holder.
Generally speaking, licenses grant others certain rights to do things
with a copyrighted work they otherwise wouldn't have under the terms of copyright law.
Often these licenses grant precious few rights, such as:  One person at a time may use this
software on one computer; the software may not be moved to another computer or sold
with the computer; you may not reverse engineer this software to build utilities which
work with it; the copyright holder may revoke at any time this license for you to use
this software, even though you paid money for it.
This is in addition to not allowing you to make or distribute copies.

Perl's license is far less restrictive than this, but there are certain controls in place.
You have the choice of two licenses, the I<Artistic License> and the I<GNU General Public License>. 
Under the terms of either license, you're not limited in how you may use Perl,
and, provided you abide by the terms of the license, the license is irrevocable.

The Artistic License, included with Perl in the file named F<Artistic>, allows
for creation of commercial versions of Perl provided they don't conflict with,
and are properly distinguished from, the community maintained Perl.
This allows Larry Wall creative control over the official Perl while others
are able to make improved and specialized versions.
The Artistic License also grants programmers the right to embed Perl in
commercial creations of their own 
(provided the primary selling point of that creation isn't that Perl is embedded in it).
This lets you build a business around Perl and even make user extensibility using Perl a 
feature of products you create.
Furthermore, commercial interest may cause Perl to be ported to new platforms where
the free software developers don't have the knowledge or inclination.

Under the terms of the GNU GPL, the software community may modify and enhance Perl, and 
may redistribute, and even sell, copies of Perl (as long as the license is perpetuated in copies).
As Perl cannot be taken off the market by any single entity, you'll never 
find yourself helpless to obtain bug or security
fixes, as any competent C programmer can, with some work, perform this service for you.

This is a rough summary; read the file F<Artistic>, included with Perl's source code, and the
GNU GPL at F<http://www.gnu.org/copyleft/gpl.html> for details.


=head2 Perl's Influences

I can't think of a better introduction to the Perl language than listing the features
it sports, much like an advertising brochure might.

It's widely known Perls 1 through 4 borrowed from "C, sed, awk, and sh" and
"language historians will also note some vestiges of csh, Pascal, and even BASIC-PLUS"
(I'm quoting the F<perldoc perl> manual page here).
Regular expressions originated in I<ed>, the Unix line mode text editor, of which
I<sed> is a non-interactive version intended for batch processing text data.
I<awk> is a text scanning and reporting language also possessing a decidedly C influence
(which is little wonder as awk and C shared creators).
I<csh> apparently contributed some notions of lists and list processing
(including the C<$#arr> syntax, which counts the elements of an array in csh and gives the index of the last element of C<@arr> in Perl),
but the ideas of readily piping data to and from other programs and expanding variables
in strings come from shell scripting languages such as csh and sh
(sh is the Bourne shell).


=head3 Perl 5 Influences

Perls 1 through 4 took from procedural, special purpose text processing, and scripting
languages, but Perl 5 branched out into functional and object oriented languages.

Perl 5.00, released in 1994, introduced objects, a feature first realized in Simula (1968), where it
languished in academia for nearly 20 years before it was popularized by C++.
Perl would add support for overloaded operators soon afterwards, allowing programmers
to create objects that behave as ordinary strings and numbers as far as the operators
are concerned, or even invent new meanings for operators when used on their objects.
Perl 5.00 also took the idea of lexically scoped variables from Lisp.
(I<Lexical> is a functional programming concept where it refers to a system of 
scope and reference counting in which variables are limited in scope
to the current block).
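
As a rough sketch of the operator overloading mentioned above, the core C<overload> pragma lets a class say what C<+> and string interpolation mean for its objects.
The C<Temperature> class and its method names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical class: a temperature that takes part in arithmetic
# and prints itself in a readable form.
package Temperature;

use overload
    '+'  => \&add,         # $t + 5, 5 + $t, or $t1 + $t2
    '""' => \&as_string;   # "$t" and print $t

sub new {
    my ($class, $degrees) = @_;
    return bless { degrees => $degrees }, $class;
}

sub add {
    my ($self, $other) = @_;
    # The other operand may be a plain number or another Temperature.
    my $other_degrees = ref($other) ? $other->{degrees} : $other;
    return Temperature->new($self->{degrees} + $other_degrees);
}

sub as_string {
    my ($self) = @_;
    return "$self->{degrees} degrees";
}

package main;

my $sum = Temperature->new(20) + 5;
print "$sum\n";
```

As far as C<+> and C<print> are concerned, C<$sum> is just another value, even though it is an object underneath.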

From PL/I (1965), Perl 5.005 took multiple concurrent threads of execution.

Lisp also introduced a concept of I<code as data>, which would later become known as I<reflection>, that
allows a running program to inspect itself to learn about such things as functions defined and variables
defined.
Perl borrowed this idea as well (though no languages outside of the Lisp family of languages
have truly completed and generalized this idea because Lisp source code is just lists of data).
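
Here is a small sketch of what that inspection can look like in Perl 5: the package symbol table, C<%main::>, lists the names defined in package C<main>, and C<can> hands back a code reference for a name if a subroutine by that name exists.
The subroutine names C<greet> and C<farewell> are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical subroutines for the program to discover.
sub greet    { return "hello" }
sub farewell { return "goodbye" }

# The keys of %main:: are the names in package main's symbol table.
# Keep only lower-case identifiers that name a subroutine.
my @found = sort grep { /^[a-z]\w*$/ && main->can($_) } keys %main::;
print "subroutine: $_\n" for @found;

# Look a subroutine up by name at run time and call it.
my $code   = main->can('greet');
my $result = $code->();
print "$result\n";
```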


=head3 Perl 6 Influences

Perl 6 continues in the footsteps of Perl 5, drawing from functional, object oriented,
and specialized languages.

In Python, arguments may be passed into functions using the parameter names rather than position in a list.
In other words, if a function had a parameter named C<fred>, a call to that function
could supply a value for C<fred> with an expression of the form C<function(fred = 10)>.
(Apologies if there is prior art here I'm not aware of).
Perl 6 adopts this feature.
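
Perl 5 programmers commonly approximate named parameters by passing name and value pairs and collecting them into a hash.
The following is only a sketch of that idiom, not Perl 6 syntax; C<make_label> and its parameter names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical function taking name => value pairs instead of a
# positional argument list.
sub make_label {
    my %args = (
        fred  => 1,         # default values
        label => 'value',
        @_,                 # pairs from the caller override the defaults
    );
    return "$args{label}=$args{fred}";
}

my $default = make_label();
my $named   = make_label(fred => 10);
my $both    = make_label(label => 'count', fred => 3);

print "$default $named $both\n";
```

The caller names only the parameters it cares about, in any order, and the rest fall back to their defaults.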

The Lisp folks have a saying about how all other languages slowly evolve towards Lisp.
Perhaps this is true, if they don't mean evolve I<exclusively> towards Lisp.
Lisp macros are defined just as regular functions are, but when called, the 
call executes immediately, stopping compilation momentarily to do so. 
The output of the macro may be code that's substituted back in in place of the call
(which may be nothing more than a constant value or variable reference computed at compile time).

Also useful for extending the language is an idea from the ML family of languages:
allowing programmers to extend the language by introducing entirely new operators rather
than merely overloading existing operators.

Perl 6 better rounds out Perl's reflection facilities, allowing inspection of the parameter names and types
expected by subroutines and the ability to inspect lexical variables in other lexical contexts.

Perl 6 adopts Smalltalk's (early 1970's) concept of making everything an object to generalize
the language and add flexibility to objectish things people want to do.
In Perl 6, for example, subroutines are objects, and may be queried for information such as
the parameters they expect, and operators are a kind of subroutine, and likewise may be
queried for meta-information.

Python and C both had a concept of user defined types beyond the character and numeric
types included with the language, and both languages attempted to check for consistent
use of these types, as, unlike numeric types, they couldn't automatically be converted between.
Object oriented languages and functional languages alike ran with the idea of 
program validation, and Perl 6 optionally, on a variable by variable basis, performs this checking.

The Icon programming language introduced a combinational behavior, where logic tests
are performed of sets of data, where each set may be composed of other sets.
This overloads the meanings of the logic operators.
For example, C<< (1|2|3) == (0|1) >> is true as C<1> exists in both sets.
Perl 6 includes this feature.
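
To make the idea concrete, the test "is any value on the left equal to any value on the right" can be spelled out by hand in plain Perl 5.
This is only a sketch of the behavior, not how Perl 6 or any module implements it; C<any_equal> is a made-up helper.

```perl
use strict;
use warnings;

# Hypothetical helper: true when some value appears in both lists.
# Values are compared as strings because they are used as hash keys.
sub any_equal {
    my ($left, $right) = @_;
    my %seen = map { $_ => 1 } @$left;
    return scalar grep { $seen{$_} } @$right;
}

my $overlap    = any_equal([1, 2, 3], [0, 1]);   # 1 is in both
my $no_overlap = any_equal([1, 2, 3], [4, 5]);   # nothing shared

print "overlap\n"    if $overlap;
print "no overlap\n" if !$no_overlap;
```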

Perl 5 implementations are available for most of the features mentioned here,
including combinational logic, the idea of making everything an object,
better type checking, and to a degree, named parameters and reflection.
Other topics are covered in this book as well.


=head3 Functional and Object Oriented Languages

I<Object oriented> languages introduce a sense of multiplicity to programming, where
instead of merely having one module with its own data and functions, you may have
multiple independent copies of that module, and rather than having a single implementation
of that module, multiple versions of it may be swapped out and multiple versions may
be used concurrently in a single program.

Functions in I<functional languages> are guaranteed to return the same results given the
same arguments, which implies a lack of global variables and a lack of side-effects from expressions.
For example, there are no special variables to alter the basic rules of pattern matching
or built-in functions to change the working directory for the entire program.
As a result of restricting the design of the language, functional languages developed
powerful primitives for expressing solutions recursively and in terms of pipelining lists
of data through series of operations.
Just as many languages are primarily, but not completely, object oriented, most
functional languages are primarily, but not entirely, functional.
Lisp, Scheme, Haskell, and the ML family of languages represent functional languages, with
the last two being the most pure of these.
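
The pipelining style described above carries over to Perl's own list operators.
A small sketch, using a made-up word list, where each step takes a list and hands a new list to the next without changing the original:

```perl
use strict;
use warnings;

my @words = qw(pear Apple fig banana kiwi);

# Read right to left: keep the words longer than three letters,
# upper-case each one, then sort the result.
my @result = sort map { uc $_ } grep { length($_) > 3 } @words;

print "@result\n";
```

None of the three steps has a side effect, so C<@words> is untouched afterwards.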

Many functional languages, such as Common Lisp and OCaml (an ML language), 
have object systems and are thus hybrid languages.
As Perl 6 adds more list processing built-in functions, it seems fair to call
it a hybrid language as well.


=head2 Get and Build Perl

Examples for this book require Perl 5.8.4 or later.
Some features described require "bleeding edge" (unstable) version 5.9.2.
One module used heavily throughout this book, F<autobox>, requires patches to the Perl interpreter.

=head3 Microsoft Windows Users

ActiveState Perl directly interfaces to the Microsoft Windows operating system and because
of that is normally the preferred version of Perl for Microsoft Windows machines.
However, it doesn't come with sources and modules aren't built inside of a full POSIX 2 compliant
environment, and this hobbles some of the more demanding modules documented in this book.
You're going to need the Cygwin environment to take advantage of this book.
Download Cygwin from F<< http://www.cygwin.com >>.
You'll get a small installer that automates fetching and installing potentially hundreds of optional
software bundles.
You're going to need a full compiler build environment including F<make>, F<gcc>, and F<bash>.
After installing Cygwin and the appropriate bundles, you have a Unix-like system
running on top of Microsoft Windows.
Cygwin's command shell is F<bash>, the GNU Bourne Again Shell, which works with the
command line examples in this book.
Modules install using the process shown XXX section.
Code examples in this book require no modification to run on Microsoft Windows (especially under
Cygwin), but you should consult F<perldoc perlport> for information about potential portability
problems.
If you don't have F<perldoc> installed, the same documentation is available online at
F<http://www.perldoc.com>. 
Follow the instructions in the next section, "Building Perl", for the next steps in
building Perl from source.


=head3 Building Perl

For POSIX 2 compliant systems with full build environment (such as Cygwin, Linux, FreeBSD, and Apple's MacOS X),
get the portable version of Perl 5 from the CPAN at F<http://www.cpan.org>.
Fetch the latest source code from F<< ftp://ftp.cpan.org/pub/CPAN/src >>.
Here are two of the entries right now:

  -rw-rw-r--   1 ftp     ftp     11930764 Jul 19 21:57 perl-5.8.5.tar.gz
  -rw-rw-r--   1 ftp     ftp     11995887 Mar 16  2004 perl-5.9.1.tar.gz

Odd minor numbered versions are development versions. 
C<< 5.9.1 >> is development (C<9> is the minor version) and C<< 5.8.5 >> is stable (C<8> is the minor version). 
If you have the F<bunzip2> utility, get the F<bz2> version instead of the F<gz>
version to save bandwidth.
On a Unix-like system, the install process might go something like this, assuming Perl
version 5.8.5:

  tar -xzvf perl-5.8.5.tar.gz
  cd perl-5.8.5
  ./Configure -de && make && make install

After moving into the directory containing the sources and before running
F<Configure>, apply the F<< autobox-0.xx/patch/perl-5.x.x.diff >> patch for your version of
F<perl> and your version of F<autobox>.
You'll first need to download and uncompress the F<autobox> module from CPAN.
Find it using the CPAN search engine at F<http://search.cpan.org>.

  tar -xzvf autobox-0.12.tar.gz
  tar -xzvf perl-5.8.5.tar.gz
  cd perl-5.8.5
  patch < ../autobox-0.12/patch/perl-5.8.5.diff
  ./Configure -de && make && make install
  
Perl's install process will no longer replace F</usr/bin/perl> by default, so if you have a vendor supplied
version that you'd like to install over top of, use the C<< -Dprefix >> argument
to C<Configure>:

  ./Configure -de -Dprefix=/usr && make && make install

After C<make install>, run F<perl -v> and you should see a message similar to:

  This is perl, v5.9.4 built for i386-netbsd
  (with 1 registered patch, see perl -V for more detail)
  
  Copyright 1987-2004, Larry Wall
  
  Perl may be copied only under the terms of either the Artistic License or the
  GNU General Public License, which may be found in the Perl 5 source kit.
  
  Complete documentation for Perl, including FAQ lists, should be found on
  this system using `man perl' or `perldoc perl'.  If you have access to the
  Internet, point your browser at http://www.perl.com/, the Perl Home Page.

Your version and host platform will likely differ from mine. 
If you do remove the version of Perl supplied with your operating system, scripts 
written in Perl that came with your operating system may fail to function - newer versions 
of Perl 5 are more aware of syntax errors, deprecated features, and grossly unsafe
operations than earlier versions of Perl 5. 

If you elect not to replace the F<perl> that came with the system, the default (for most platforms) is to
install into F</usr/local>. 
Use F<< #!/usr/local/bin/perl >> as the she-bang line to use the new version you've installed.
I like to reference specific versions of Perl in my scripts to ease upgrades.
I can then change each script, one by one, to use the new version of Perl and test it.
This way, no script breaks completely and escapes attention.
Specific versions of Perl may be referenced explicitly as long as they aren't uninstalled:

  #!/usr/local/bin/perl5.9.0

Modules dependent on a specific version of Perl will need to be installed
for the new version of Perl.
The old version of the module will stay installed for the old version of Perl.
The F<cpan> shell's F<autobundle> command creates an installable bundle of modules installed
for an existing version of Perl and places the bundle under the F<.cpan> directory of your
home directory (or wherever F<cpan> was configured to place its data).
This is useful if you've installed modules for the previous version of Perl and programs depend on those modules.
After installing a new version of Perl, install this bundle as if it were any other
Perl module.
XXX reference CPAN module install instructions

=begin WARNING RedHat's RPM Package Manager

It is strongly recommended that RedHat users doing serious development
install Perl from source and not use F<< rpm >> to install modules or 
updates. Perl does a lot of work to manage different versions of
module dependencies and RPM was never designed for this complexity.

=end WARNING


=head3 Development Versions of Perl

Development versions include bug fixes as well as new features and usually work well.
Use the F<rsync> utility to download (or update an already downloaded copy of)
Perl:

  mkdir bleedperl
  rsync -avz rsync://ftp.linux.activestate.com/perl-current/ bleedperl/

To build a development version of F<perl>, you'll need the F<-Dusedevel>
command line argument to C<./Configure>:

  cd bleedperl
  ./Configure -de -Dusedevel && make && make install

Module incompatibilities and errors with some expressions are common problems. 
If you don't mind dealing with and reporting these hiccups, running the development version
helps the testing process.


=head2 Perl as the Script Interpreter

The first line of any Perl program should read:

  #!/usr/bin/perl

If you're writing Perl on Win32 with the intention of uploading it to a Unix-like system, turn MS-DOS
style newlines off in your text editor and use this example first line instead:

  #!/usr/bin/perl --

This prevents a common problem where Unix-like systems look for a program named
F<< /usr/bin/perl^M >> (where F<< ^M >> is the character generated by holding down
the Control key and pressing the M key) and fail to find it. 
The F<< -- >> tells F<perl> that no further arguments follow and to ignore the rest of the command line.
Alternatively, the single line command C<< perl -pi.bak -e 's/\r$//' program_name.pl >>
will strip the carriage returns from MS-DOS style line endings. 
You will need to trade the single quotes for double quotes on MS-DOS based systems.
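
The same clean-up can be done from inside a program.
A minimal sketch, assuming the file's contents are already in a string; C<strip_cr> is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: turn MS-DOS line endings (carriage return plus
# newline) into Unix line endings (newline only).
sub strip_cr {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

my $dos  = "#!/usr/bin/perl\r\nprint \"hi\\n\";\r\n";
my $unix = strip_cr($dos);

print "carriage returns removed\n" if $unix !~ /\r/;
```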

Unix-like systems use the first line of a program to specify which interpreter
should run the program when the first line starts with the magic sequence
C<< #! >>, pronounced I<shebang>. 
Even Win32 users should use the shebang line. 
Environments such as Apache use this and Perl itself looks for command line arguments there on the Win32 platform.

Perl versions 1 through 5 have been mostly backwards compatible so the 
executable name F<< perl >> has been kept. 
Perl 6 currently installs
as F<< perl6 >> and will probably continue to do so at least until an
automatic translation or emulation mechanism is put in place. 

To distinguish Perl 6 programs from ones written for Perl 5 and earlier, use this
she-bang line:

  #!/usr/bin/perl6

Give programs their she-bang lines and mark them executable with F<< chmod u+x program.pl >>
and run them as if they were a binary program.
Common Gateway Interface programs should end in F<.cgi> as most webservers, even on Unix-like
systems, look at the file extension. 
Command-line programs written in Perl don't require F<.pl> on the end of their name.
(What if you rewrite the program in another language later?)


=head2 Perl Basics

Comments start with the C<#> character and may appear on the end of the line.
Not all C<#> characters start comments - they may appear in quoted strings,
regular expressions, or as user-selected quote characters.

  # This is a comment

The she-bang line is a comment to F<perl>.

Import modules with the C<use> syntax: C<use module @options;>.
Pragmatic modules modify the environment in some way
rather than providing routines or objects or code libraries. 
These traditionally go next after the she-bang line:

  #!/usr/bin/perl

  use strict;
  use warnings;

Some modules, such as F<CGI::Carp>, are considered quasi-pragmatic. 

  # Make sure headers are sent regardless of most compile time errors - Perl 5
  use CGI::Carp 'fatalsToBrowser';

This intercepts warning messages and fatal errors, queuing up warnings in case of a fatal error, and sending
HTTP headers, fatal errors, and accumulated warnings to C<< STDOUT >>, formatted for display
by a web browser. 
F<< CGI::Carp >> is a must for web development.
As mentioned in L<StrictureByDefault>, C<< use diagnostics >> is handy too,
especially for those new to Perl needing explanations of what the errors mean. 
Alternatively, consult F<perldoc perldiag> to discover the meaning of error messages.
L<DebuggingPerl> deals with common problems.

C<use> also requests versions. 
It can request a minimum version of a module or a minimum version of F<perl> itself.

  use 5.10.0;   # Require at least perl version 5.10
  use 6;        # Require perl 6 to run

Failure looks something like:

  Perl v5.10.0 required--this is only v5.9.0, stopped at -e line 1.

This and the C<class> statement are two ways Perl 6 code is distinguished
from Perl 5 code.
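
A program can also test the interpreter version without dying.
The sketch below assumes it is run by a Perl 5 interpreter; it compares against the special variable C<$]> and wraps a version requirement in a string C<eval>.
The name C<have_perl> is made up.

```perl
use strict;
use warnings;

# $] holds the version of the running perl as a number, such as 5.008005.
# Hypothetical helper: 1 if this perl is at least the wanted version.
sub have_perl {
    my ($wanted) = @_;
    return $] >= $wanted ? 1 : 0;
}

my $ok_old = have_perl(5.006);
my $ok_six = have_perl(6);

# "use VERSION" is checked when the code is compiled, so a string eval
# traps the failure and leaves the message in $@.
my $needs_six = eval "use 6; 1" ? 1 : 0;

print "5.6 or later: $ok_old\n";
print "6 or later: $ok_six\n";
```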


=head2 Installing Modules

Perl is a user-extensible language. 
Modules are the means by which the language is extended.
The most important features are the ones we haven't even thought of yet.
This book makes heavy use of optional modules to extend Perl 5 into doing Perl 6-like things. 

Perl 6 will fetch and install modules for you, should you attempt to use a module
you don't currently have installed. 
We can do this too with Perl 5 with the F<< Acme::Intraweb >> 
module and the F<< CPANPLUS >> module. 
F<< CPANPLUS >> also gives you an easy shell to use to install modules manually, invokable with
the command F<< cpanp >>. 
Once in the shell, use the F<< install >> command
with the exact name of a module, without version number, and the module will
download and install. 
F<< Acme::Intraweb >> will likely fail on at least one
module in this book, so remember that using F<cpan> or F<< cpanp >> are the traditional methods of
installing modules.
Fetch F<CPANPLUS> from F<< http://search.cpan.org/search?q=CPANPLUS >>,
fetch F<Acme::Intraweb> from F<< http://search.cpan.org/search?q=Acme::Intraweb >>,
and install them by hand. 
In the search results, the bolded name takes you to the documentation, 
and the smaller version with version number takes you to a screen with a download link.
Click on F<Acme-Intraweb-1.01> in the search results for C<< Acme::Intraweb >>, for example.

  tar -xzvf CPANPLUS-0.048.tar.gz
  cd CPANPLUS-0.048
  perl Makefile.PL
  make
  make test
  make install

This procedure is standard. 
Repeat the process for F<< Acme::Intraweb >>.


=head2 Data Types

Perl features true X<lexical> variables. 
Lexicals are a rare treat for a language frequently used outside of academia to write production code. 
Lexical variables are valid in any code until the end of the block. 
Even if a reference to a portion of that code is returned or stored and then 
executed later, those variables will still have their values.
The values will be remembered even if the routine has since been called again.
This feature comes to us by way of Lisp. 
Declare lexical variables using the C<my> keyword.
Perl's basic data types are hashes, arrays, and scalars, any of which can be lexicals when
declared with C<my>:

  # Lexical variables - Perl 5 and Perl 6

  my $foo;   # declares a scalar - references are scalars
  my @foo;   # declares an array
  my %foo;   # declares a hash

Scalars may hold references to other things such as regular expressions, objects, and code.
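
The claim that lexicals keep their values after the enclosing routine has returned, and even after it has been called again, is easiest to see with a closure.
A short sketch, with the hypothetical name C<make_counter>:

```perl
use strict;
use warnings;

# Each call creates a fresh lexical $count. The anonymous subroutine
# returned keeps that particular $count alive for as long as it exists.
sub make_counter {
    my $count = shift;
    return sub { return $count++ };
}

my $first  = make_counter(1);
my $second = make_counter(100);

# Calling make_counter a second time did not disturb the first counter.
my @seen = ($first->(), $first->(), $second->(), $first->());

print "@seen\n";
```

Each code reference carries its own private C<$count>, which is the behavior the text attributes to Lisp.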


=begin NOTE Naming Things

Variables traditionally have lower case names with underscores separating words,
C<< $like_this >>. 
Constants traditionally are written in all capital letters
with or without underscores, C<< $LIKETHIS >>. 
Class names are written in mixed case with no underscores, C<< SuchAsThis >>.

=end NOTE

Typing F<< perldoc perlfunc >> at the shell pulls up documentation for the built-in functions. 
The Perl core is rich in functions that interface to the operating
system, and work on arrays and hashes.
F<perldoc perlfunc> uses a sort of short-hand when listing the functions.
For example, C<chomp> is listed with three forms:
 
   chomp VARIABLE
   chomp( LIST )
   chomp   

C<chomp> will work on any variable, a list of terms, or without an argument at all. 
Built-in functions like C<chomp> may do completely different
things depending on how you use the result (which context they execute in) and what you pass to them.
Possibilities for the kinds of things you pass to built-ins are:

=over 1

=item C<VARIABLE> - any variable, such as C<$scalar_var>, C<@array_var>, C<%hash_var>

=item C<LIST> - one or more values of arbitrary origin, all in a list

=item C<ARRAY> - an array variable, which looks like C<@foo>

=item C<HASH> - a hash variable, which looks like C<%foo>

=item C<EXPR> - sometimes an expression that produces a hash value, sometimes an expression that picks a value out of an
array, other times other things

=item C<NUMBER> - a number

=item C<FILENAME> - a string representing the name of a file

=item C<FILEHANDLE> - the result of opening a file as well as C<STDOUT>, C<STDIN>, and C<STDERR>

=item C<BLOCK> - a block of code, either right there, or else a reference to a subroutine

=item Others - other values with less formal meanings are used, too

=item None - operates on the default variable, C<$_>

=back

Operations that require an C<ARRAY> won't accept a C<LIST>. 
An array can be modified in various ways where a list is read-only. 
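
To tie the short-hand back to real code, here is a sketch that exercises the three listed forms of C<chomp>, one built-in (C<localtime>) that behaves differently by context, and one built-in (C<push>) that insists on an C<ARRAY>.
The variable names and sample strings are invented.

```perl
use strict;
use warnings;

# chomp VARIABLE
my $line = "one\n";
chomp $line;

# chomp( LIST )
my ($second, $third) = ("two\n", "three\n");
chomp($second, $third);

# chomp with no argument operates on $_
my @lines = ("four\n", "five\n");
for (@lines) { chomp; }

# Context: localtime returns nine fields in list context and a single
# formatted string in scalar context.
my @fields = localtime(0);
my $stamp  = localtime(0);

# push wants an ARRAY, which it modifies in place. A bare list such as
# (1, 2, 3) in that position is a compile-time error.
push @lines, "six";

print join(",", $line, $second, $third, @lines), "\n";
```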


=head2 Chapter Summary

That's the penny tour. 
F<perldoc perlintro> is another quick introduction to Perl, complementary to this chapter,
that introduces Perl from the perspective of its basic syntax.
You should have F<perl> installed and built from source
(though many examples will still work with only a reasonably modern but otherwise unmodified F<perl>).
You should know how to install modules and know where to go for documentation
for modules, for operators, and for built-in functions.
I've tried to convey how Perl builds on the ideas of other languages, and how
the individual ideas are more valuable for being combined in a greater-than-the-sum-of-their-parts sort of way.


