APM: FTP

Mike South msouth at shodor.org
Sun Apr 18 00:35:41 CDT 2004


>From austin-bounces at mail.pm.org  Sat Apr 17 16:50:01 2004
>Date: Sat, 17 Apr 2004 13:49:50 -0700 (PDT)
>From: Peter botros <peterbotros at yahoo.com>
>Subject: APM: FTP
>
>am looking for ftp scripts to send every 15 min if the
>files are in a directory and/or to run scripts
>if files are in a directory

I'm assuming you want something that "reacts",
so to speak, to files being in a directory and, when
it sees some, does something to them (where "something"
includes moving them out of the directory so that
they don't trigger a re-run or otherwise hang around
in the way).

We have to do something like that, and maybe I can
save you some headaches by describing what I think
we do (I haven't seen it firsthand, just know the
description).

First, for the "every fifteen minutes" part, we use
cron.  That way you can just have a script that
does whatever is supposed to be done with the files,
and you won't have to write the "every fifteen 
minutes" part, or make sure it gets started again
when the system reboots, or whatever.

Second, we have a lockfile that prevents two instances
of the script from running at the same time.  One 
day, you might have so much going on that your script
isn't done in fifteen minutes, and then cron fires off
another run of your script and all hell breaks loose
as they both try to work on the same files.
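
One way to get that "only one instance at a time" behavior without
worrying about stale lockfiles is to let the operating system hold
the lock.  This is a minimal sketch using flock, not what we actually
run; the lock path and the acquire_lock name are made up for the
example.  The lock disappears by itself when the process exits, even
if it crashes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Try to become the only running instance.  Returns the open
# filehandle on success (keep it around for the life of the
# script) or undef if another instance already holds the lock.
sub acquire_lock {
    my ($path) = @_;
    open(my $fh, '>>', $path) or die "couldn't open $path: $!\n";
    # LOCK_NB makes flock return right away instead of waiting
    unless (flock($fh, LOCK_EX | LOCK_NB)) {
        close $fh;
        return;
    }
    return $fh;
}

# hypothetical lock path; pick somewhere writeable on your system
my $lock = acquire_lock('/tmp/handle_files.lock');
unless ($lock) {
    warn "another instance is running, I'm not running\n";
    exit;
}
# ... do the real work here; the lock is released when we exit
```

Since the kernel drops the lock when the filehandle is closed, there
is nothing to clean up after a crash.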

Third, we don't look for "files in the directory", but
"a trigger file in that directory".  The trigger file
lists all the files that are to be processed.  The point
here is that the trigger file gets transferred into
the directory with the other files, but it gets transferred
last.  Why?  Because sooner or later your "every fifteen
minutes" is going to wake the script up right in the middle
of a file getting dumped into the directory, and then 
you'll do your work on half a file.
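
On the sending side, the ordering could look something like the
sketch below.  It is untested against a real server and is only an
assumption about how a sender might be written: send_batch and
trigger_text are names I made up, and $ftp is expected to be a
logged-in Net::FTP object (or anything else with a put method).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename);

# Build the text of a trigger file listing the given names.
sub trigger_text {
    my @names = @_;
    return join("\n", 'BEGIN_FILES', @names, 'END_FILES') . "\n";
}

# Upload the data files, then the trigger file, in that order.
# $ftp is any object with a put($local, $remote) method, such as
# a logged-in Net::FTP object.
sub send_batch {
    my ($ftp, $trigger_path, @files) = @_;

    for my $file (@files) {
        $ftp->put($file, basename($file))
            or die "couldn't send $file\n";
    }

    # write the trigger locally, listing what we just sent
    open(my $out, '>', $trigger_path)
        or die "couldn't write $trigger_path: $!\n";
    print {$out} trigger_text(map { basename($_) } @files);
    close $out or die "couldn't close $trigger_path: $!\n";

    # the trigger goes last, so its arrival means the rest is complete
    $ftp->put($trigger_path, 'trigger.txt')
        or die "couldn't send trigger file\n";
    return 1;
}

# Usage, with a made-up host and login:
#   use Net::FTP;
#   my $ftp = Net::FTP->new('ftp.example.com') or die "can't connect: $@\n";
#   $ftp->login('user', 'password') or die "can't log in\n";
#   $ftp->binary;
#   send_batch($ftp, '/tmp/trigger.txt', '/data/yo', '/data/ya');
#   $ftp->quit;
```

The trigger file itself can still be caught half-transferred, which
is why a closing marker line in it is worth checking for.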

So, something like this goes in your crontab:

0-59/15 * * * * /home/msouth/bin/handle_files.pl

handle_files.pl would be something like this:

#!/usr/bin/perl -w
# UNTESTED UNTESTED UNTESTED
# (the #! line has to be the very first line of the file)
use strict;

# put a lockfile in the same place as us named
# same thing as us with '.lock' appended
my @files_to_unlink;

my $lockfile = $0 . '.lock';
if (-e $lockfile ) {
    warn "a lockfile exists, I'm not running\n";
    exit;
    # would be better to see if the file just didn't get
    # cleaned up, and wipe it out if that's the case.
    #
    # you can probably "kill 0, PID" or something to see
    # if the PID that put the lockfile there is still
    # running, and just wipe out the file if it isn't
    # (that is, if you put the PID in the lockfile)

    # Also, in real life you will probably have to
    # keep the lockfile somewhere else, because
    # the directory where the script lives is likely not
    # to be writeable
}
else { 
    open(LOCK, ">$lockfile") or die "couldn't open lockfile:$!\n";

    # put the PID in the lockfile so future instances of 
    # this script can check whether we are still running
    print LOCK "$$\n";

    close LOCK;
    push @files_to_unlink, $lockfile ;
}

my $dir = '/home/msouth/dump';

my $trigger = "$dir/trigger.txt";
&cleanup_and_exit unless ( -e $trigger );

unless (open (TRIGGER, "<$trigger")) {
    warn "couldn't open $trigger:$!\n";
    &cleanup_and_exit;  # a plain die here would leave our lockfile behind
}
chomp( my @lines = <TRIGGER> );
close TRIGGER;

my $saw_end = 0;
foreach my $line (reverse @lines) {
    if ($line eq 'END_FILES') {
        $saw_end++;
        last;
    }
}

unless ($saw_end) {
    warn qq{trigger file $trigger is missing "END_FILES" line.  I am bailing, hopefully it's still being transferred\n};
    &cleanup_and_exit;
}

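# the "@lines and" check stops the loop once the array is empty;
# without it a missing 'BEGIN_FILES' would loop forever
shift @lines while @lines and $lines[0] ne 'BEGIN_FILES';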

unless (@lines) {
    warn "trigger file $trigger does not have 'BEGIN_FILES', this is not good\n";
    &cleanup_and_exit;
}

shift @lines; # $lines[0] is just 'BEGIN_FILES', remember
    
foreach my $line (@lines) {
    next if $line =~ /^\s*#/;
    last if $line eq 'END_FILES';
    my $this_file = "$dir/$line";
    &process_file($this_file);
    push @files_to_unlink, $this_file;
}
push @files_to_unlink, $trigger;

&cleanup_and_exit;

sub process_file {
    my $file = shift;
    # \Q...\E backslash-escapes spaces and shell metacharacters in the name
    if (system "cat \Q$file\E >> /home/msouth/dump/all_dumped_files") {
        warn "$file didn't process\n";
        # cp file to error directory
    }
    else {
        # cp file to success directory
    }
}

sub cleanup_and_exit {
    unlink $_ for @files_to_unlink;
    exit(0);
}
__END__

Then you can use a trigger file like this:

BEGIN_FILES
yo
ya
ye
END_FILES
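
Back in the script, the comments in the lockfile branch suggest
checking whether the PID stored in the lockfile is still running.
Here is a rough sketch of that check, assuming the lockfile holds
just the PID on its first line; lock_is_stale is a name I made up.
Keep in mind PIDs get reused, so a "live" answer could be some
unrelated process.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return true if the lockfile names a PID that no longer exists.
sub lock_is_stale {
    my ($lockfile) = @_;
    open(my $fh, '<', $lockfile) or return 0;  # can't read it, leave it alone
    my $pid = <$fh>;
    close $fh;
    # anything that isn't a positive PID is treated as "not stale"
    return 0 unless defined $pid && $pid =~ /^(\d+)$/ && $1 > 0;
    $pid = $1;
    # Signal 0 sends nothing; it only asks whether we could signal $pid.
    return 0 if kill(0, $pid);   # process exists
    return 0 if $!{EPERM};       # exists, but belongs to another user
    return 1;                    # no such process
}

# Usage inside the "lockfile exists" branch might be:
#   unlink $lockfile if lock_is_stale($lockfile);
```

This errs on the side of leaving the lockfile alone whenever it
can't tell for sure.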

good luck,

mike


