[LA.pm] Problem with FastCGI under Apache 2

Thu Feb 9 16:07:33 PST 2006

"Benjamin J. Tilly " < ben_tilly at operamail.com > wrote:
> "Loren Osborn" <lsosborn at dis-sol-inc.com> wrote:
> > 
> > We are having a serious problem with signals in FastCGI under Apache 2
> > and could use any assistance you could offer.  It appears that FastCGI
> > is attempting to shut down a Perl script that it's running by sending it
> > a SIGTERM, but the process is asleep and the signal is queued for when
> > the process is next awakened.  When FastCGI gets a request for a page of
> > the type that was sent the SIGTERM, it sees the process as still alive,
> > and sends it another request.  The process is awakened, sees the SIGTERM,
> > and dies without providing a response to FastCGI.  FastCGI then produces
> > an error indicating that the script produced no output.
>
> Set the environment variable PERL_SIGNALS to unsafe for the script.
>
> Here is the issue.  Perl has 2 ways to handle signals.  One is safe,
> which will interrupt and handle the error when the current opcode
> finishes.  The other is unsafe, which uses setjmp/longjmp to break out
> of whatever Perl is doing right now.
>
> Unsafe signals guarantee that signals are handled immediately, but run a
> risk of dumping core.  (The more you do in your signal handler, the
> higher the probability of dumping core.)  Safe signals are handled
> eventually but won't dump core.
>
> The default in Perl 5.8 is safe signal handling.

We did try this before the latest incarnation of our issue.  This is
actually how we arrived at our work-around.  When we tried
    PERL_SIGNALS=unsafe
we found that other behaviors broke.  (All our signal handlers do is
toggle some Boolean variables, and possibly exit, depending on
their state.)  We thought the work-around might be a less invasive
change.
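
To make the handler description concrete, here is a minimal sketch of
that pattern.  It is an illustration only, not our production code: the
names $in_request, $term_requested and serve_one are hypothetical, and
it assumes the exit is deferred until the current response is finished.

```perl
use strict;
use warnings;

# Hypothetical flags for this sketch.
my $in_request     = 0;   # true while a request is being served
my $term_requested = 0;   # true once SIGTERM has been seen

# The handler only records the signal; it exits at once only when idle.
$SIG{TERM} = sub {
    $term_requested = 1;
    exit 0 unless $in_request;
};

# Serve one request.  Returns false when the caller should leave its
# accept loop because a SIGTERM arrived at some point.
sub serve_one {
    my ($work) = @_;
    $in_request = 1;
    $work->();            # produce the response
    $in_request = 0;
    return !$term_requested;
}
```

The accept loop would then stop as soon as serve_one() returns false,
so a response is always sent before the process goes away.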

Can you advise what other behaviors PERL_SIGNALS=unsafe might alter?

>
> > We do have one work around, but just discovered that this work around
> > does not always solve the problem.  The work around was, if the process
> > was asleep when the SIGTERM was received, then it answers one final
> > request before exiting.  We are finding issues, apparently, when the
> > process has been awakened after excessive periods of time.  (hours)
>
> I don't have a solution to this problem, and strongly suspect that it is
> a separate bug.
>
> There are lots of possibilities though.  For example suppose that your
> script has a database connection, and there is a firewall between the
> script and the database.  The database connection goes over a TCP/IP
> connection, and after a certain time without activity the firewall will
> terminate the TCP/IP connection.  After that the script will fail
> because it doesn't have a database connection.
>
> Solutions to that could include carefully testing whether the database
> connection works (and reconnecting if need be) on every request, or
> having the script wake itself up periodically to do something trivial
> with the database.

We were experiencing this issue earlier, but we added code at the start
of each request that specifically tests for stale/invalid database
handles.  We have confirmed that we recover correctly from bad database
handles: the script works correctly after we shut down and restart our
MySQL server.
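
The per-request check is roughly of the following shape.  This is a
sketch only, assuming a DBI handle (ping is a standard DBI method);
fresh_handle and the reconnect callback are hypothetical names, not our
actual code.

```perl
use strict;
use warnings;

# Return a usable database handle: the existing one if it still answers
# ping(), otherwise whatever the caller's reconnect callback produces.
# fresh_handle() and $reconnect are hypothetical names for this sketch.
sub fresh_handle {
    my ($dbh, $reconnect) = @_;
    # eval guards against a ping() that throws instead of returning false
    return $dbh if $dbh && eval { $dbh->ping };
    return $reconnect->();
}

# At the top of each request (DSN and credentials are placeholders):
# $dbh = fresh_handle($dbh, sub {
#     DBI->connect($dsn, $user, $password, { RaiseError => 1 });
# });
```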

I do agree that this might be symptomatic of another issue, but I'm
currently at a loss to think of what this might be.  This requires more
investigation, and we've already devoted a substantial amount of
resources to this issue.

>
> > Any assistance you can offer is very much appreciated,
>
> You're welcome, and I hope that my suggestions work.
>
> Cheers,
> Ben

Thanks for the prompt and relevant response,

-Loren