perl, cgi, win32, oh my! (was: Re: System)

Tkil tkil-sdpm at
Thu Feb 20 02:16:12 CST 2003


Joel --

First off, let me make sure that I fully understand what you are
trying to accomplish.

When you open up Internet Explorer, or Windows Explorer, you go to the
address bar and type in:

When you press enter, it either provides you with the output that you
are looking for, or it causes some other change in state on the
server.  Ok so far?

The quick way to do exactly the same thing in perl -- presuming you
have libwww-perl ("LWP") installed -- is:

   use LWP::Simple qw( get );
   get "";

The slightly less quick way is to build the connection yourself; using
IO::Socket::INET, it's pretty easy:

   use IO::Socket::INET;
   my $sock = IO::Socket::INET->new( PeerHost => '',
                                     PeerPort => 80 )
       or die "connect failed: $@";
   my $CRLF = "\x0d\x0a"; # HTTP standard EOL marker
   print $sock "GET /cgi-bin/ HTTP/1.0", $CRLF, $CRLF;
   my $response = do { local $/; <$sock> };
   $response =~ s/$CRLF/\n/g;
   my ( $header, $body ) = split /\n\n/, $response, 2;

This should work as-is, assuming that $sock is flushed when we wait
for input on it -- this is usually the case, but if it's not, you'll
have to jump through some hoops, something like:

       my $old_fh = select $sock;
       $| = 1;
       select $old_fh;

Although, since IO::Socket::INET inherits from IO::Handle (by way of
IO::Socket), you should be able to just do:

       $sock->autoflush(1);

(Doing this with raw socket() calls is left as an exercise to the
reader.)
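
To make the "take the response apart" step a bit more concrete, here
is a minimal sketch that works on a canned response string rather than
a live socket.  The sub name parse_http_response and the sample
response are my own inventions for illustration; like the snippet
above, it normalizes CRLF across the whole response, which is fine for
text but would mangle a binary body.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw HTTP response into its parts.
sub parse_http_response {
    my ($raw) = @_;

    # Normalize the HTTP end-of-line marker, as in the snippet above.
    $raw =~ s/\x0d\x0a/\n/g;

    # The first blank line separates the header block from the body.
    my ( $head, $body ) = split /\n\n/, $raw, 2;

    # The first header line is the status line: "HTTP/1.0 200 OK".
    my ( $status_line, @header_lines ) = split /\n/, $head;
    my ( $version, $code, $reason ) = split ' ', $status_line, 3;

    # Remaining lines are "Name: value"; header names are
    # case-insensitive, so store them lowercased.
    my %header;
    for my $line (@header_lines) {
        my ( $name, $value ) = $line =~ /^([^:]+):\s*(.*)$/ or next;
        $header{ lc $name } = $value;
    }

    return {
        version => $version,
        code    => $code,
        reason  => $reason,
        header  => \%header,
        body    => defined $body ? $body : '',
    };
}

# Made-up sample response, standing in for what <$sock> would return.
my $sample = join "\x0d\x0a",
    "HTTP/1.0 200 OK",
    "Content-Type: text/plain",
    "",
    "hello",
    "";

my $response = parse_http_response($sample);
print "status: $response->{code}\n";
print "type:   $response->{header}{'content-type'}\n";
print "body:   $response->{body}";
```

In real use you would check the status code before trusting the body.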

As various people have pointed out, is the IP "loopback"
address, also known as "localhost".  Since you're using the "http"
scheme in your URL, you're asking Explorer to connect to that host on
the standard HTTP port 80.

Since this works on your local machine, we can surmise that you are
running a web server of some sort on this machine.  When your web
server receives this request, it has to figure out what to do with it.
It does this by taking apart the URL that is passed to it; in this
case, it sees:

   scheme:  http
   path:    /cgi-bin/
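
If you want to see that "taking apart" done in perl, here is a rough
sketch using nothing but a regex.  The sub name split_http_url and the
example URL are made up for illustration (they are not from your
setup), and a real server or the URI module handles many more cases
than this does.

```perl
use strict;
use warnings;

# Hypothetical helper: break a simple http URL into its pieces.
# Returns a hash ref, or nothing if the URL does not look right.
sub split_http_url {
    my ($url) = @_;

    my ( $scheme, $host, $port, $path, $query ) = $url =~ m{
        ^ (\w+) ://             # scheme
        ([^/:?\#]+)             # host
        (?: : (\d+) )?          # optional port
        ( / [^?\#]* )?          # optional path
        (?: \? ([^\#]*) )?      # optional query string
    }x or return;

    return {
        scheme => lc $scheme,
        host   => $host,
        port   => defined $port ? $port : 80,    # http default
        path   => defined $path ? $path : '/',
        query  => $query,
    };
}

# Made-up example URL; substitute your own.
my $parts = split_http_url('http://localhost/cgi-bin/example.pl?foo=bar');
print "$_: $parts->{$_}\n" for qw( scheme host port path query );
```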

This is where the configuration of your web server can vary.  Most web
servers are configured to have "/cgi-bin/" map to a particular
directory on the server's disk, and it knows that it should execute
any script in that directory.  Before it executes a given script,
however, it sets up the "Common Gateway Interface" environment, better
known as CGI.  This is mostly a matter of setting the appropriate
environment variable values, and perhaps modifying the user and group
ID settings (probably less relevant on your win32 system than on
Unix).
I'm assuming that is a straightforward CGI-compliant script.  (If
it's not, there's no real point in calling it through the web server,
as it *should* get confused.  If it *doesn't* get confused, then
you're getting very very lucky.)
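
For a feel of what "CGI-compliant" means on the script's side, here is
a small sketch of how a script might pull its parameters out of the
QUERY_STRING environment variable.  The sub name parse_query_string is
hypothetical, and in practice you would normally let a CGI library do
this for you; the point is only to show that the input arrives through
the environment.

```perl
use strict;
use warnings;

# Hypothetical helper: decode "a=1&b=two%20words" into a hash ref.
sub parse_query_string {
    my ($qs) = @_;
    my %param;

    for my $pair ( split /[&;]/, defined $qs ? $qs : '' ) {
        my ( $key, $value ) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $value = '' unless defined $value;

        # Undo URL encoding: "+" means space, %XX is a hex-encoded byte.
        for ( $key, $value ) {
            tr/+/ /;
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
        }
        $param{$key} = $value;
    }
    return \%param;
}

# A web server would set this before running the script; we fake it here.
local $ENV{QUERY_STRING} = 'foo=bar&quux=baz%20gibber';

my $param = parse_query_string( $ENV{QUERY_STRING} );

# A CGI script answers with a header block, a blank line, then the body.
print "Content-Type: text/plain\n\n";
print "$_ = $param->{$_}\n" for sort keys %$param;
```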

Here's what Chris was trying to say (I think): is executed *by
the web server* in response to an HTTP request.  So, when you typed

in the address bar, the flow of control went something like this:

   Explorer: open a connection to

   HTTPD: hi there!  listening on port 80! [*]

   Explorer: GET /cgi-bin/ HTTP/1.0

   HTTPD: hrm... /cgi-bin/ is my ScriptAlias directory.
          does exist?  good!
          is executable?  good!
          ok, need to set up some environment...
          alright, time to run the script...

   Script: i'm awake!  ok, stuff to do, stuff to do.
           do some stuff.
           i'm done!  here's my output, mr. httpd.

   HTTPD: cool, thanks.  now go away.
          hey, Explorer, here's your data.

   Explorer: thanks.  bye!

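[*] the server's listening socket stays bound to port 80; when it
    "accepts" a connection, it gets back a *new* socket for that one
    connection.  That socket still has port 80 as its local endpoint;
    it's the client's end that gets an "ephemeral port" (something
    picked by the OS, typically a high number).  Connections are told
    apart by the full (local address, local port, remote address,
    remote port) combination, so the listening socket is kept
    available for more incoming connections.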
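
If you'd like to check the port behaviour around accept() for
yourself, here is a small sketch that does it all in one process over
the loopback interface.  It asks the OS for any free port (LocalPort
0) rather than port 80, so it should not clash with a running web
server; the variable names are mine.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Listening socket: LocalPort => 0 lets the OS pick a free port.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Listen    => 5,
    Proto     => 'tcp',
) or die "listen failed: $@";
my $listen_port = $listener->sockport;

# The "browser" side: connect to the listening port.
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $listen_port,
    Proto    => 'tcp',
) or die "connect failed: $@";

# The "server" side: accept() hands back a new socket for this
# one connection; the listener itself keeps listening.
my $conn = $listener->accept or die "accept failed: $!";

printf "listener local port:        %d\n", $listen_port;
printf "accepted socket local port: %d\n", $conn->sockport;
printf "client local port:          %d\n", $client->sockport;
printf "accepted socket peer port:  %d\n", $conn->peerport;
```

The first two numbers should match each other, and so should the last
two: the server keeps its port, and the client is the one with the
OS-assigned port.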

The other half of what Chris was trying to convey is that, since this
is on the same machine, you can cut Explorer and HTTPD out of the loop
by calling directly.  This is where understanding the server
configuration becomes important.

Let's assume that you're using Apache HTTPD for Win32.  Let's further
assume you're using the default configuration, where "/cgi-bin/" is
mapped to "C:\Program Files\Apache Group\Apache\cgi-bin\", via the use
of a ScriptAlias directive in the standard configuration files (by
default, "C:\Program Files\Apache Group\Apache\conf\httpd.conf").

Following the above execution outline, Apache httpd will get the
request to run /cgi-bin/, set up an execution environment, then
execute:

   C:\Program Files\Apache Group\Apache\cgi-bin\

Then it takes the output and returns it as the content of the HTTP
response.  (Unless something bad happens, in which case it will
probably form a 500 header and some boilerplate body.)

Finally, Chris's ultimate point was that, since it's on the same
machine, you could just invoke directly, instead of making this
"side trip" through the http server.  If doesn't use the CGI
environment at all, and doesn't produce any useful output, you can
indeed use a "naked" system() invocation:

   my $cgi_bin_dir = "C:/Program Files/Apache Group/Apache/cgi-bin";
   system( "$cgi_bin_dir/" );

(Yes, you *can* use forward slashes -- the win32 api supports them
natively; it's only the command shell (e.g. cmd.exe) that interprets
forward slashes as option specifiers.)
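
One thing worth sketching here: a path with spaces in it (like
"Program Files") is easier to deal with if you give system() a list,
so that the program and its arguments are passed as separate items
instead of one string to be re-parsed.  The run_program wrapper below
is hypothetical, and it runs the current perl binary ($^X) as a
stand-in for your script so that the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical wrapper: run a program without going through the
# shell, and return its exit code.
sub run_program {
    my @command = @_;

    # The block form forces list-style system() even for one argument.
    my $rc = system { $command[0] } @command;
    die "could not start $command[0]: $!" if $rc == -1;

    # The exit code lives in the high byte of $?.
    return $? >> 8;
}

# Stand-in for the real script: a one-liner that exits with code 3.
my $exit_code = run_program( $^X, '-e', 'exit 3' );
print "exit code: $exit_code\n";
```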

If *does* produce useful output, you will want to capture that,
presumably with backticks or qx:

   my $x_pl_output = qx( $cgi_bin_dir/ );

If it uses the CGI environment, you have to set that up yourself:

   my $x_pl_output = do {
       local %ENV = %ENV;
       $ENV{REQUEST_METHOD} = "GET";   # most CGI libraries check this
       $ENV{QUERY_STRING} = "foo=bar&quux=baz%20gibber";
       qx( $cgi_bin_dir/ );
   };
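
Here is a self-contained sketch of the same idea that you can run
anywhere.  Since I don't have your script, it writes a tiny stand-in
CGI script to a temporary file first; that stand-in, the run_cgi sub,
and the particular environment variables chosen are all assumptions
for illustration, not a complete CGI environment.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Stand-in for the real CGI script: it echoes its query string.
my ( $fh, $script ) = tempfile( SUFFIX => '.pl', UNLINK => 1 );
print {$fh} 'print "Content-Type: text/plain\n\n$ENV{QUERY_STRING}\n";',
    "\n";
close $fh or die "close failed: $!";

# Hypothetical helper: run a script with a minimal CGI environment
# and return everything it printed.
sub run_cgi {
    my ( $script_path, $query ) = @_;
    my $perl = $^X;    # the perl binary running this program

    # Changes to %ENV are undone when this sub returns.
    local %ENV = %ENV;
    $ENV{GATEWAY_INTERFACE} = 'CGI/1.1';
    $ENV{REQUEST_METHOD}    = 'GET';
    $ENV{QUERY_STRING}      = $query;

    my $output = qx{"$perl" "$script_path"};
    die "script failed, status $?" if $? != 0;
    return $output;
}

my $output = run_cgi( $script, 'foo=bar&quux=baz%20gibber' );

# Same header/body split as for an HTTP response.
my ( $header, $body ) = split /\n\n/, $output, 2;
print "header: $header\n";
print "body:   $body";
```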

Finally, if the CGI script and the script calling it are on the same
machine, consider whether you can factor out the common functionality
into a library, and just call directly into that library from perl.
This avoids the cost of starting a new process entirely!
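
A rough sketch of what that factoring might look like, squeezed into
one file so it runs as-is.  The package My::Greeter, its greeting()
sub, and the cgi_front_end() wrapper are all hypothetical names; in
real life the package would live in its own .pm file and both the CGI
script and your other script would "use" it.

```perl
use strict;
use warnings;

# --- the shared library (would normally be My/Greeter.pm) ---
package My::Greeter;

# The actual work, free of any CGI or HTTP concerns.
sub greeting {
    my (%arg) = @_;
    my $name = ( defined $arg{name} && length $arg{name} )
        ? $arg{name}
        : 'world';
    return "hello, $name!";
}

# --- the callers ---
package main;

# Thin CGI front end: pull a parameter out of the query string,
# call the library, wrap the result in a CGI response.
sub cgi_front_end {
    my ($query_string) = @_;
    $query_string = '' unless defined $query_string;
    my ($name) = $query_string =~ /(?:^|&)name=([^&]*)/;
    return "Content-Type: text/plain\n\n"
        . My::Greeter::greeting( name => $name ) . "\n";
}

# Any other script on the same machine can skip the web server and
# the extra process, and call the library directly.
my $direct = My::Greeter::greeting( name => 'Joel' );
print "$direct\n";

print cgi_front_end('name=Joel');
```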


