[VPM] monitor process, april meeting

Tue Apr 1 12:12:21 CST 2003

At 09:43 AM 4/1/2003 -0800, nkuipers wrote:
>Hello all,
>
>I have a script that connects to a remote database, extracts one DNA sequence,
>runs it through a BLAST program on a different remote machine, and then
>updates the database with the new information.  Rinse and repeat, for every
>sequence.  I want this to be automated, such that if something goes horribly
>askew, my script can detect this and attempt to pick up where it left off,
>until it succeeds in doing so.  For example, let's say the BLAST server goes
>down and my BLAST system call hangs.  In this case, I want to pound the server
>with a "let me in?", say once every five minutes, until it connects, and then
>restart from the last successful sequence, which is stored in a tempfile.  I
>also want an interruption to be printed to a log file, and I think that both
>of these tasks could include altering the SIG handlers, though I am not sure
>if this is the best way, or what SIG handlers would need attention.  I suppose
>a crontab is an option, but I would like to keep everything as contained in my
>script as possible, rather than having little files flying around here and
>there, to make it more portable.  But if cron is the best way, so be it.  So,
>can anyone enlighten me as to what is involved in having a script "listen" to
>a process, and facilitate its continuity in the event of interruptions?

Basically, perldoc perlipc.

It's not clear to me that you need an extra process.  You could do this
with a single process that runs forever and loops over the operations
you need to do.  It would put a timeout on any call that might hang,
such as the server connection (perldoc -q timeout), and so forth.  I
don't think you need to go the full IPC route.
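
To make the timeout idea concrete, here is a minimal sketch of the eval/alarm idiom applied to one step of the loop. The names run_with_timeout and retry_until_ok are made up for illustration, and the sketch assumes a Unix-like system where alarm is available. One caveat: if the step launches an external program, the timeout stops Perl from waiting but the child may keep running.

```perl
use strict;
use warnings;

# Run $code, giving up after $seconds.
# Returns 1 if it finished, 0 if it timed out; other errors are re-thrown.
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $code->();
        alarm 0;
        1;
    };
    my $err = $@;
    alarm 0;    # never leave an alarm pending
    return 1 if $ok;
    return 0 if $err eq "timeout\n";
    die $err;
}

# Keep retrying $code until it finishes within $seconds,
# sleeping $wait seconds between attempts and reporting each timeout.
sub retry_until_ok {
    my ($seconds, $wait, $code, $log) = @_;
    until (run_with_timeout($seconds, $code)) {
        $log->("timed out after $seconds s, retrying in $wait s") if $log;
        sleep $wait;
    }
    return 1;
}
```

For the case in the question, the step would be the BLAST call, $wait would be 300 for "once every five minutes", and the $log callback would append a line to the log file.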

If this process is so important that you want to make sure it restarts
no matter what might happen to it, then I'd run it from cron (say)
every hour, and have it quit immediately if it detects that it's
already running (write the pid to a .pid file, read that file, check ps
to see if that process is still running).  If you don't use cron, then
having another process just to watch the first one raises the question:
what if the monitor script quits as well?
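
A rough sketch of the pid file check follows. It swaps the ps lookup for kill 0, which sends no signal and only tests whether the pid exists. The name claim_pidfile is hypothetical, and there is no locking here, so two copies started at the same instant could both get through; treat it as a starting point only.

```perl
use strict;
use warnings;

# Returns 1 if this process now owns the pid file,
# 0 if the file names another process that is still alive.
sub claim_pidfile {
    my ($pidfile) = @_;
    if (open my $in, '<', $pidfile) {
        my $line = <$in>;
        close $in;
        my ($old) = defined $line ? $line =~ /^(\d+)/ : ();
        # kill 0 fails with EPERM when the pid exists but belongs
        # to another user, so count that as "still running" too.
        if ($old && $old != $$ && (kill(0, $old) || $!{EPERM})) {
            return 0;
        }
    }
    # No file, an empty file, or a stale pid: record our own pid.
    open my $out, '>', $pidfile or die "cannot write $pidfile: $!";
    print {$out} "$$\n";
    close $out or die "cannot close $pidfile: $!";
    return 1;
}
```

The script would call this first thing and exit quietly when it returns 0, so the hourly cron run is a no-op while the previous copy is still working. Note that a recycled pid can make a stale file look live.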

And it sounds like you need a checkpoint-restart capability, so write 
out your progress to a file as you complete operations (could get into 
whole discussion about ACID semantics here, won't) and use that on a 
restart.  Not the easiest thing in the world to get right.
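
As one possible shape for the checkpoint file, the sketch below writes the last completed sequence id to a temporary file and renames it into place, so a crash mid-write does not leave a half-written checkpoint. The function names and the one-id-per-file layout are assumptions for illustration. It does not make the database update and the checkpoint atomic together, so the step after a restart should be safe to repeat.

```perl
use strict;
use warnings;

# Record the id of the last sequence that was fully processed.
sub save_checkpoint {
    my ($file, $id) = @_;
    my $tmp = "$file.tmp";
    open my $out, '>', $tmp or die "cannot write $tmp: $!";
    print {$out} "$id\n";
    close $out or die "cannot close $tmp: $!";
    # rename replaces the old checkpoint in a single step
    rename $tmp, $file or die "cannot rename $tmp to $file: $!";
    return 1;
}

# Return the saved id, or undef if there is no usable checkpoint.
sub load_checkpoint {
    my ($file) = @_;
    open my $in, '<', $file or return;
    my $id = <$in>;
    close $in;
    return unless defined $id;
    chomp $id;
    return $id;
}

# Given the last completed id and the full ordered list of ids,
# return the ids that still need processing.
sub remaining {
    my ($last, @ids) = @_;
    return @ids unless defined $last;
    for my $i (0 .. $#ids) {
        return @ids[$i + 1 .. $#ids] if $ids[$i] eq $last;
    }
    return @ids;    # unknown checkpoint id: start from the beginning
}
```

On startup the script would call load_checkpoint in scalar context, pass the result to remaining, and call save_checkpoint after each successful database update.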

-- 
Peter Scott
Pacific Systems Design Technologies
http://www.perldebugged.com/