[sf-perl] Server downtime reporting and recovery
mason at singlefeed.com
Tue Feb 24 17:19:51 PST 2009
I'd say yes, you'll want to look at something like Nagios or
Big Brother, which can check the status of known services, ports, web apps,
etc. as frequently as you need and then invoke scripts to restart things. You
can actually get quite a bit done with home-grown Perl scripts, but
there are plugins and other add-ons available for tools like Nagios that
you'll probably find save you time (once you learn the system, of course).
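
If you do stay with a home-grown script for now, here is a rough sketch of the idea, not a drop-in solution. cron can't schedule anything more often than once a minute, so the script stays resident and loops instead. Because your app tends to hang rather than crash, it makes an actual HTTP request with a timeout instead of just checking that the process exists. The URL, the restart command, and all of the timing values are placeholders I made up, and service_ok, next_state, and watch are just names I picked for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;    # in the Perl core since 5.14

# Hypothetical settings: replace every value with what fits your setup.
my %CONFIG = (
    url         => 'http://localhost:8080/',      # assumed health-check URL
    restart_cmd => '/etc/init.d/myapp restart',   # assumed restart command
    timeout     => 10,    # seconds to wait before treating the app as hung
    interval    => 30,    # seconds between checks
    max_fails   => 2,     # consecutive failed checks before restarting
);

# True if the URL answers with a 2xx status. HTTP::Tiny does not die on
# network errors or timeouts; it returns a response with success set false.
sub service_ok {
    my ($url, $timeout) = @_;
    my $response = HTTP::Tiny->new(timeout => $timeout)->get($url);
    return $response->{success} ? 1 : 0;
}

# Takes the current failure count and the latest check result; returns
# the new failure count and a flag saying whether a restart is due.
# The count resets after a successful check or after a restart.
sub next_state {
    my ($fails, $ok, $max_fails) = @_;
    return (0, 0) if $ok;
    $fails++;
    return $fails >= $max_fails ? (0, 1) : ($fails, 0);
}

# Check, restart if needed, sleep, repeat. Never returns.
sub watch {
    my (%c) = @_;
    my $fails = 0;
    while (1) {
        my $ok = service_ok($c{url}, $c{timeout});
        my ($new_fails, $restart) = next_state($fails, $ok, $c{max_fails});
        $fails = $new_fails;
        if ($restart) {
            warn scalar(localtime) . ": $c{url} unresponsive, restarting\n";
            system($c{restart_cmd}) == 0
                or warn "restart command failed (status $?)\n";
        }
        sleep $c{interval};
    }
}

# Only start looping when asked to, so the file can be loaded for testing.
watch(%CONFIG) if @ARGV && $ARGV[0] eq '--run';
```

Start it with the --run argument under whatever keeps long-running processes alive on your box. Requiring two failed checks in a row is a guess at a sensible default to avoid restarting on a single slow response; drop max_fails to 1 if you would rather restart immediately.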
On Tue, Feb 24, 2009 at 5:13 PM, Matt Barkovich <barko192 at gmail.com> wrote:
> Hi all,
> I was curious about how those of you who work with web apps deal with
> minimizing downtime when a particular service dies for whatever
> reason. I'm not a sysadmin by training, rather it is a responsibility
> that no one else seemed willing to take. Right now I have a Perl
> script that runs as a cron job every five minutes, checking the status
> of the various services on the server and restarting and reporting if
> anything is amiss.
> I've been told that my production schedule needs to be pushed forward
> and five minutes of downtime will soon be unacceptable. Since I've
> got a .NET app running in mono (which has not been kind to me) I need
> to catch problems as quickly as possible and restart the service.
> Most frequently the mono app will just hang indefinitely, not crash
> outright. With the new schedule I don't have time to fix (read:
> replace) the problematic app before I go live.
> So my question, what do you folks recommend as far as checking the
> status of services more frequently than every 5 minutes? Would you
> recommend sticking with Perl, or is there some FOSS that would
> better serve my purposes? In my research, I've found programs like
> Nagios, but don't know much about them. I'd prefer not to add too much
> in the way of overhead, but I also don't want to reinvent the wheel.
> Sorry if this is a little off topic.