Network Programming with Perl

Tim Chambers tbc at spamcop.net
Fri Dec 22 23:49:22 CST 2000


Network Programming with Perl

Lincoln D. Stein

Preface

This is a book about network programming in Perl. It will show you how to
write client applications for existing network protocols such as Mail, News,
and HTTP, as well as how to create customized client/server applications
that intercommunicate via TCP/IP. The examples in this book are typical of
real-world applications. Among the working programs we develop here are a
real-time chat and messaging system, a program for fetching and processing
e-mail containing MIME attachments, a script for mirroring data from an FTP
site, and a program for uploading and analyzing text files. My emphasis
throughout is on showing how to design networking applications that are
robust and maintainable, and that reuse existing technology whenever
possible.

The Web makes it possible to create network applications using a universal
server and a universal client, the Web browser. Applications programmers can
easily extend Web servers by using protocols like CGI, servlets, and module
APIs. Thus, programmers can write Internet-based applications without
worrying about what's going on behind the scenes. However, Web-based
applications are fundamentally limited by the need to do things "the Web
way." Applications that have other requirements, such as the need for a
long-running relationship between the client and the server or a requirement
that a single client exchange messages with hundreds of peers
simultaneously, must move beyond the Web protocols.

Ironically, the great popularity of the Web has vastly increased the number
of Internet connections to offices and households. By making networking
ubiquitous, the Web has breathed new life into TCP/IP as developers rush to
take advantage of the opportunities that this new landscape offers. In a
span of less than one year we have seen the introduction of entirely novel
peer-to-peer systems like Napster, Gnutella, and Freenet; of Jini, an
adaptive protocol that allows such devices as thermostats, stereos, and
television sets to interact over the Internet; and of a host of instant
messaging systems. This trend will accelerate as more of the world is wired.

The common denominator in all the new networking protocols is TCP/IP, the
underlying technology that makes it possible to move an item of information
from point A to point B across the Internet. An understanding of TCP/IP is
fundamental to interoperating with the new breed of network protocol and to
creating your own protocol.

However, there is more to network programming than knowing the steps for
opening a connection to a remote server and sending some data. You must
understand the conventions for the interaction of clients, servers, and
peers; what can go wrong during the course of an Internet connection, and
what to do to correct problems. To interact meaningfully with a server, you
need to know in some detail how an application-level protocol works.

Setting up TCP/IP connections involves a lot of detail, and networking books
based on the C and Java languages tend to spend so much time on the
lower-level details that higher-level details--such as how to actually talk
to a particular type of server--can get lost. Fortunately, Perl has a
straightforward but very powerful interface to the TCP/IP networking system.
In addition, Perl comes with a rich and wonderful set of libraries for
interacting at a high level with Web servers, the mail system, and other
popular Internet services. This allows programmers to get beyond the
low-level details quickly so that they can focus on the interesting part of
network application programming.
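
As a rough illustration of how little code that interface needs, here is a
minimal sketch (not one of the book's own examples) that uses the standard
IO::Socket::INET module to open a listening socket, connect to it, and
exchange one line. It assumes everything runs in a single process over the
loopback address 127.0.0.1 and that the operating system may pick any free
port; the variable names are invented for the illustration.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Listen on the loopback interface; LocalPort => 0 asks the OS for a free port.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Proto     => 'tcp',
    Listen    => 1,
) or die "cannot listen: $@";

# Connect a client to whatever port the OS assigned.
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $listener->sockport,
    Proto    => 'tcp',
) or die "cannot connect: $@";

# The kernel has already completed the handshake, so accept() returns at once.
my $server_side = $listener->accept or die "accept failed: $!";
$_->autoflush(1) for $client, $server_side;

# One request/response round trip: the "server" upper-cases the line.
print {$client} "hello\n";
my $line = <$server_side>;
print {$server_side} uc($line);
my $reply = <$client>;

print "reply: $reply";
```

A real server would normally run in its own process and loop over accept();
keeping both ends in one script only keeps the sketch self-contained.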

You can build extremely sophisticated networking applications on top of
Perl, as well as quick and dirty scripts designed just "to get the job
done." Even if you don't plan to write the next Napster application in Perl,
I think you will find it fun and instructive to see how to build network
applications on top of this language.

This Book's Audience

Network Programming with Perl is written for novice and intermediate Perl
programmers. I assume you know the basics of Perl programming, including how
to write loops, how to construct if-else statements, how to write regular
expression pattern matches, the concept of the automatic $_ variable, and
the basics of arrays and hashes.

You should have access to a Perl interpreter and some experience writing,
running, and debugging scripts. Just as important, you should have access to
a computer that is connected both to a local area network and to the
Internet! Although the recipes in Chapter 10 on setting up Perl-based network
servers to start automatically when a machine is booted do require superuser
(administrative) access, none of the other examples require privileged
access to a machine.

Although this book does take advantage of the object-oriented features in
Perl version 5 and higher, most chapters do not assume a deep knowledge of
this system. Chapter 1 addresses all the details you will need as a casual
user of Perl objects.

This book is not a thorough review of the TCP/IP protocol at the lowest level,
or a guide to installing and configuring network hubs, routers, and name
servers. Many good books on the mechanics of the TCP/IP protocol and network
administration are listed in the references in Appendix D.

Roadmap

This book is organized into four main parts: Basics, Developing Clients for
Common Services, Developing TCP Client/Server Systems, and Advanced Topics.

Part I, Basics, introduces the fundamentals of TCP/IP network
communications.

Chapters 1 and 2, Networking Basics and Processes, Pipes, and Signals, review
Perl's functions and variables for input and output, discuss the
exceptions that can occur during I/O operations, and use the piped
filehandle as the basis for introducing sockets. These chapters also review
Perl's process model, including signals and forking, and introduce Perl's
object-oriented extensions.

Chapter 3, Introduction to Berkeley Sockets, introduces the basics of
Internet networking and discusses IP addresses, network ports, and the
principles of client/server applications. It then turns to the Berkeley
Socket API, which provides the programmer's interface to TCP/IP.

Chapters 4 and 5, The TCP Protocol and The IO::Socket API and Simple TCP
Applications, show the basics of TCP, the networking protocol that provides
reliable stream-oriented communications. These chapters demonstrate how to
create client and server applications and then introduce examples that show
the power of these techniques as well as some common roadblocks.

Part II, Developing Clients for Common Services, looks at a collection of
the best third-party modules that developers have contributed to the
Comprehensive Perl Archive Network (CPAN).

Chapter 6, FTP and Telnet, introduces modules that provide access to the FTP
file-sharing service, as well as the flexible Net::Telnet module, which
allows you to create clients to access all sorts of network services.

E-mail is still the dominant application on the Internet, and Chapter 7,
SMTP: Sending Mail, introduces half of the equation. This chapter shows you
how to create e-mail messages on the fly, including binary attachments, and
send them to their destinations.

Chapter 8, POP, IMAP, and NNTP: Processing Mail and Netnews, covers the
other half of e-mail, explaining modules that make it possible to receive
mail from mail drop systems and process their contents, including binary
attachments.

Chapter 9, HTTP: Talking to the Web, discusses the LWP module, which provides
everything you need to talk to Web servers, download and process HTML
documents, and parse XML.

Part III, Developing TCP Client/Server Systems--the longest part of the
book--discusses the alternatives for designing TCP-based client/server
systems. The major example used in these chapters is an interactive
psychotherapist server, based on Joseph Weizenbaum's classic Eliza program.

Chapter 10, Forking Servers and the inetd Daemon, covers the common type of
TCP server that forks a new process to handle each incoming connection. This
chapter also covers the UNIX and Windows inetd daemons, which allow programs
not specifically designed for networking to act as servers.

Chapter 11, Multithreaded Applications, explains Perl's experimental
multithreaded API, and shows how it can greatly simplify the design of TCP
clients and servers.

Chapters 12 and 13, Multiplexed Operations and Nonblocking I/O, discuss the
select() call, which enables an application to process multiple I/O streams
concurrently without using multiprocessing or multithreading.
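
To make the idea of multiplexing concrete, here is a minimal sketch (an
illustration only, not taken from those chapters) that uses the standard
IO::Select module, an object-oriented wrapper around select(), to drain two
pipes from a single process. It assumes a UNIX-like system, since select() on
pipes is not portable everywhere, and the stream names are hypothetical
labels.

```perl
use strict;
use warnings;
use IO::Select;

# Two independent input streams, simulated here with pipes.
pipe(my $r1, my $w1) or die "pipe: $!";
pipe(my $r2, my $w2) or die "pipe: $!";

# Put some data on each stream, then close the write ends to signal EOF.
syswrite($w1, "from one\n");
syswrite($w2, "from two\n");
close $w1;
close $w2;

my %name = (fileno($r1) => 'stream1', fileno($r2) => 'stream2');
my %got;

# Watch both read ends at once; can_read() blocks until at least one is ready.
my $sel = IO::Select->new($r1, $r2);
while ($sel->count) {
    for my $fh ($sel->can_read) {
        my $label = $name{ fileno($fh) };
        my $bytes = sysread($fh, my $buf, 1024);
        if ($bytes) {
            $got{$label} .= $buf;      # data arrived on this stream
        }
        else {
            $sel->remove($fh);         # EOF or error: stop watching it
            close $fh;
        }
    }
}

print "$_: $got{$_}" for sort keys %got;
```

The sketch reads with sysread() rather than the buffered <> operator, because
buffered reads can hold data that select() knows nothing about.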

Chapter 14, Bulletproofing Servers, discusses techniques for enhancing
the reliability and maintainability of network servers. Among the topics are
logging, signal handling, and exceptions, as well as the important topic of
network security.

Chapter 15, Preforking and Prethreading, presents enhancements of the forking and
threading models discussed in earlier chapters. These enhancements increase
a server's ability to perform well under heavy loads.

Chapter 16, The IO::Poll Module, discusses an alternative to select()
available on UNIX platforms. This module allows applications to multiplex
multiple I/O streams using an API that some people find more natural than
select()'s.

Part IV, Advanced Topics, addresses techniques that are useful for
specialized applications.

Chapter 17, TCP Urgent Data, is devoted to TCP urgent or "out of band" data.
This technique is often used in highly interactive applications in which the
user urgently needs to signal the remote server.

Chapters 18 and 19, The UDP Protocol and UDP Servers, introduce the User
Datagram Protocol, which provides an unreliable message-oriented
communications service. Chapter 18 introduces the protocol, and Chapter 19
shows how to design UDP servers. The major example in this and the next two
chapters is a live online chat and messaging system written entirely in
Perl.

Chapters 20 and 21, Broadcasting and Multicasting, extend the UDP discussion
by showing how to build one-to-all and one-to-many message broadcasting
systems. In these chapters we extend the chat system to take advantage of
automatic server discovery and multicasting.

Chapter 22, UNIX Domain Sockets, shows how to create lightweight
communications channels between processes on the same machine. This can be
useful for specialized applications such as loggers.
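
As a small illustration of such a channel, the following sketch (not the
book's logger example) uses the standard IO::Socket::UNIX module to pass one
line through a socket that lives at a filesystem path. It assumes a UNIX-like
system, keeps both ends in one process for brevity, and uses a hypothetical
socket name, demo.sock, inside a temporary directory.

```perl
use strict;
use warnings;
use IO::Socket::UNIX;
use Socket qw(SOCK_STREAM);
use File::Temp qw(tempdir);

# The socket is addressed by a path; put it in a throwaway directory.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/demo.sock";

# Listening end, as a logger daemon might create it.
my $listener = IO::Socket::UNIX->new(
    Type   => SOCK_STREAM,
    Local  => $path,
    Listen => 1,
) or die "cannot listen on $path: $!";

# Connecting end, as a program that wants to log something.
my $client = IO::Socket::UNIX->new(
    Type => SOCK_STREAM,
    Peer => $path,
) or die "cannot connect to $path: $!";

my $conn = $listener->accept or die "accept failed: $!";

# Send one log line and read it back on the listening side.
$client->autoflush(1);
print {$client} "disk almost full\n";
my $entry = <$conn>;

print "logger received: $entry";
```

No IP address or port is involved, so the channel is reachable only by
processes on the same machine.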

The Many Versions of Perl

All good things evolve to meet changing conditions, and Perl has gone
through several major changes in the course of its short life. This book was
written for versions of Perl in the 5.X series (5.003 and higher
recommended). At the time I wrote this preface (August 2000), the most
recent version of Perl was 5.6, with the release of 5.7 expected imminently.
I expect that Perl versions 5.8 and 5.9 (assuming there will be such
versions) will be compatible with the code examples given here as well.

Over the horizon, however, is Perl version 6. Version 6, which is expected
to be in early alpha form by the summer of 2001, will fix many of the
idiosyncrasies and misfeatures of earlier versions of Perl. In so doing,
however, it is expected to break most existing scripts. Fortunately, the
Perl language developers are committed to developing tools to automatically
port existing scripts to version 6. With an eye to this, I have tried to
make the examples in this book generic, avoiding the more obscure Perl
constructions.

Cross-Platform Compatibility

More serious are the differences between implementations of Perl on various
operating systems. Perl started out on UNIX (and Linux) systems, but has
been ported to many different operating systems, including Microsoft
Windows, the Macintosh, VMS, OS/2, Plan9, and others. A script written for
the Windows platform will run on UNIX or Macintosh without modifications.

The problem is that the I/O subsystem (the part of the system that manages
input and output operations) is the part that differs most dramatically from
operating system to operating system. This restricts the ability of Perl to
make its I/O system completely portable. While Perl's basic I/O
functionality is identical from port to port, some of the more sophisticated
operations are either missing or behave significantly differently on
non-UNIX platforms. This affects network programming, of course, because
networking is fundamentally about input and output.

In this book, Chapters 1 through 9 use generic networking calls that will
run on all platforms. The exceptions to this rule are the last example in
Chapter 5, which calls fork(), a function that isn't implemented on the
Macintosh, and some of the introductory discussion in Chapter 2 of process
management on UNIX systems. The techniques discussed in these chapters are
all you need for the vast majority of client programs, and are sufficient to
get a simple server up and running. Chapters 10 through 22 deal with more
advanced topics in server design.

The nice thing is that the non-UNIX ports of Perl are improving rapidly, and
there is a good chance that new features will be available at the time you
read this.

Getting the Code for the Code Examples

All the sample scripts and modules discussed in this book are available on
the Web in ZIP and TAR/GZIP formats. The URL for downloading the source is
http://www.modperl.com/perl_networking. This page also includes instructions
for unpacking and installing the source code.

Installing Modules

Many of Perl's networking modules are preinstalled in the standard
distribution. Others are third-party modules that you must download and
install from the Web. Most third-party modules are written in pure Perl, but
some, including several that are mentioned in this book, are written partly
in C and must be compiled before they can be used.

CPAN is a large Web-based collection of contributed Perl modules. You can
get access to it via a Web or FTP browser, or by using a command-line
application built into Perl itself.

Installing from the Web

To find a CPAN site near you, point your Web browser at
http://www.cpan.org/. This will present a page that allows you to search for
specific modules, or to browse the entire list of contributed modules sorted
in various ways. When you find the module you want, download it to disk.
Perl modules are distributed as gzipped tar archives. You can unpack them
like this:

% gunzip -c Digest-MD5-2.00.tar.gz | tar xvf -




