[sf-perl] parallelize system tasks, collecting statuses and output

Fri Mar 14 22:12:09 PDT 2008

David,

I think the sequentialism is in your code. You fork() in a loop, and  
each child does its thing, but the parent process -- the one that is  
about to fork the next child -- is sitting there waiting on the  
child's output:

	chomp( my $line = <CHILD_READER> );

Until the child gives it something, you've got blocking I/O there. It
won't start the next child until after it's received the output from
the previous child. To make it actually parallel, you have to hold off
reading from any child until after you've created all the children:
keep each child's read handle around and read from them in a second
loop. You can also do it via a double-fork server, some sort of
signaling, or...
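
For the plain fork() route, here is a minimal sketch of the "fork everything first, read afterwards" idea. It is an illustration under assumptions, not a drop-in replacement for your script: the job names are placeholders, and `sleep 1; echo` stands in for the ssh command so the sketch runs without any remote hosts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Placeholder work items (assumption); in the real script these would be
# host names and the command would be the ssh call.
my @jobs = qw( job_1 job_2 job_3 );

my @children;    # one record per child, in the order they were started

# Phase 1: start every child. The parent reads nothing yet.
for my $job (@jobs) {
    pipe( my $reader, my $writer ) or die "can't pipe: $!\n";

    my $pid = fork();
    defined $pid or die "can't fork: $!\n";

    if ( $pid == 0 ) {    # child
        close $reader;
        my $out = qx{ sleep 1; echo $job };    # stand-in for the ssh command
        print {$writer} $out;
        close $writer;
        exit( $? >> 8 );    # hand the command's exit status to the parent
    }

    close $writer;          # parent keeps only the read end
    push @children, { pid => $pid, reader => $reader };
}

# Phase 2: every child is already running, so blocking here costs nothing
# extra. Collect output and exit status in start order.
my ( @output, @status );
for my $child (@children) {
    my $fh    = $child->{reader};
    my @lines = <$fh>;      # read until the child closes its end
    close $fh;
    waitpid( $child->{pid}, 0 );
    push @output, join( '', @lines );
    push @status, $? >> 8;
}

for my $i ( 0 .. $#jobs ) {
    chomp( my $out = $output[$i] );
    print "$jobs[$i]: status $status[$i]; output => $out\n";
}
```

The parent still reads the children in single file, but since they all run at once the total time is roughly that of the slowest child rather than the sum of all of them.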

Have you considered using threads? Perl now comes with real live  
working thread support (if you compiled it in). The threads API is  
pretty brilliant, IMO, as it makes it almost as easy as in Java to  
write multi-threaded code.

http://search.cpan.org/~jdhedden/threads-1.69/threads.pm

If you didn't happen to compile in thread support, or are running on
a copy of perl built by someone who didn't, there's a CPAN module that
emulates the threads API using fork(). It's much slower than real
threads, but lets you write code that works without threads now and
with them later when you can update perl.

http://search.cpan.org/~rybskej/forks-0.27/lib/forks.pm

In either case, it explicitly lets you start threads and let them run
at their own pace, gathering the results later with join(). (Don't
detach() a thread whose result you want; a detached thread's return
value is thrown away.) Your driving script can only pick up the
results in single file, but threads may finish in a different order
than you started them. I've used this a couple of times and it's
pretty cool. ;-)
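
A minimal sketch of that pattern with the threads module, assuming a perl built with thread support. The job names and the `sleep 1; echo` command are placeholders standing in for the host list and the ssh call. It uses join() rather than detach(), because join() is what hands back a thread's return value.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;    # dies at compile time if perl was built without threads

# Placeholder work items (assumption); substitute real host names.
my @jobs = qw( job_1 job_2 job_3 );

# Start one thread per job. Nothing blocks here, so all of them run at once.
my @workers;
for my $job (@jobs) {
    push @workers, threads->create(
        { context => 'list' },    # the thread returns ( status, output )
        sub {
            my $out = qx{ sleep 1; echo $job };    # stand-in for the ssh command
            return ( $? >> 8, $out );
        }
    );
}

# join() waits for each thread in start order and returns what it returned.
# A thread that finished early simply waits to be collected.
my @results;
for my $i ( 0 .. $#workers ) {
    my ( $status, $out ) = $workers[$i]->join;
    push @results, { job => $jobs[$i], status => $status, output => $out };
}

for my $r (@results) {
    chomp( my $out = $r->{output} );
    print "$r->{job}: status $r->{status}; output => $out\n";
}
```

With the forks module installed, swapping `use threads;` for `use forks;` should be the only change needed, since it aims to provide the same API.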

I think the threads API would be a good solution to your problem, if  
it's available to you.

-- Mike

On Mar 14, 2008, at 7:38 PM, David Alban wrote:

> greetings,
>
> our sysadmins regularly need to restart a service on, say thirty
> machines.  the service takes two and a half minutes to restart.  i'd
> like to write a perl program they can use to parallelize the restart.
> so that the whole operation takes three minutes rather than an hour or
> more.  i want to collect the statuses and any output of the service
> restart commands.
>
> i found the Bidirectional Communication with Yourself section of the
> perlipc man page.  i'm trying to hack their example so that only the
> child writer writes to the parent reader.  my hacked version forks two
> child processes, which i want to run in parallel.  each child process
> ssh's to a host, sleeps a small amount of time, and then runs the
> hostname command.  but the child procs seem to be running serially.
>
> #!/usr/bin/perl
>
> use warnings;
> use strict;
>
>     # log_timestamp() below comes from this module
> use <LOCAL LOGGING MODULE>;
>
> use IO::Handle;
>
> my @hosts = qw( hostname_1 hostname_2 );
> my $numprocs = @hosts;
>
> my @output;
>
> for my $instance ( 1..$numprocs ) {
>  my $index = $instance - 1;
>
>  pipe( CHILD_READER, PARENT_WRITER )
>    or die "can't pipe( CHILD_READER, PARENT_WRITER ): $!\n";
>
>  PARENT_WRITER->autoflush( 1 );
>
>  my $pid;
>
>       # parent
>  if ( $pid = fork() ) {
>    close PARENT_WRITER;
>
>    chomp( my $line = <CHILD_READER> );
>    $output[ $index ] = $line;
>
>    close CHILD_READER;
>  } # if
>
>       # child
>  else {
>    not defined $pid and die "can't fork: $!\n";
>
>    close CHILD_READER;
>
>    my $host = $hosts[ $index ];
>    my @results = qx{ ssh $host "sleep 5; hostname" };
>
>    print PARENT_WRITER
>          log_timestamp(),
>          " child pid $$; instance $instance; results => ",
>          join( '', @results );
>
>    close PARENT_WRITER;
>
>    exit;
>  } # if
> } # for
>
> for my $instance ( 1..$numprocs ) {
>  my $index = $instance - 1;
>  print $output[ $index ], "\n";
> } # for
>
>
>
> --------------------------------------------------------------------------------
>
> i execute this as:
>
>  $ date; perl junk; date
>
>
> and get:
>
>  Sat Mar 15 02:30:22 UTC 2008
>  2008-03-15 02:30:27 +0000 child pid 5469; instance 1; results =>  
> hostname_1
>  2008-03-15 02:30:32 +0000 child pid 5471; instance 2; results =>  
> hostname_2
>  Sat Mar 15 02:30:32 UTC 2008
>
>
> it's running the second child process only after the first one
> finishes, which defeats my goal of parallelizing.  what am i missing?
>
> there's so much stuff out there that promises to help with this and i
> don't know whether i'm going down the wrong path.  surely you fine
> folks must have done stuff like this before.  do indeed tell me to
> rtfm, but please tell me which fm (or other doc) to r.
>
> please also feel free to tell me i'm going about it totally the wrong
> way, perhaps with a pointer in the general (better) direction.
>
> thanks,
> david
> -- 
> Live in a world of your own, but always welcome visitors.

---------------------------------------------------------------------
Michael Friedman                     HighWire Press
Phone: 650-725-1974                  Stanford University
FAX:   270-721-8034                  <friedman at highwire.stanford.edu>
---------------------------------------------------------------------