Re: file descriptor leak in current with Qmail?

From: Arjan van Leeuwen <avleeuwen_at_piwebs.com>
Date: Mon, 7 Jun 2004 21:41:09 +0200
On Monday 07 June 2004 21:31, Arjan van Leeuwen wrote:
> Hi,
>
> On Monday 07 June 2004 19:06, Robert Watson wrote:
> > On Mon, 7 Jun 2004, David A. Benfell wrote:
>
> (...)
>
> > However, I think the more serious element here is the reason why you
> > reach the limit: this happens "naturally" under some workloads simply
> > because of large numbers of open files and network connections.  However,
> > in some workloads, it's a symptom of a system or application bug, such as
> > a resource leak.
> >
> > Because the resources were returned when qmail was killed, that largely
> > eliminates the possibility of a kernel resource leak (not entirely, but
> > largely), as most kernel resource leaks involving file descriptors have
> > the symptom that even after the process exits, the resources aren't
> > > released (i.e., a reference counting bug or race).  This suggests a user
> > space issue -- that doesn't eliminate a system bug, as it could be a bug
> > in a library that manages descriptors, but it also suggests the
> > possibility of an application bug, or at least, a poor application
> > interaction with a system bug.  Occasionally, we've seen bugs in the
> > threading libraries that result in leaked descriptors, but my
> > recollection is that qmail doesn't use threads.  So that suggests either
> > a support library (perhaps crypto or the like), or qmail itself.  Or that
> > you just hit an extremely high load. :-)
> >
> > > In terms of debugging it: your first task is to identify if there's one
> > process that's holding all the fd's, or if it is distributed over many
> > > processes.  After that, you want to track down what kind of fd is being
> > left open, which may help you track down why it's left open...
>
> Just as I'm reading this, I'm seeing the same thing on my -CURRENT server,
> which has a _very_ low load (atm, it's only routing the internet traffic
> for 3 users and serving SMTP for 2 of them). I'm also running qmail. The
> kernel is from June 6. How do I go about investigating this further?

Replying to myself -
fstat shows all open files evenly distributed among the running processes.
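
To put numbers on "evenly distributed", the per-process totals can be tallied from the fstat listing. What follows is only a rough sketch under some assumptions: the function name count_fds_per_process and the SAMPLE text are made up for illustration, the column layout is assumed to be the usual USER CMD PID FD ... header, and command names are assumed to contain no spaces. Only numeric FD entries are counted (a trailing "*" marks a socket or FIFO); special entries such as root, wd and text are skipped.

```python
from collections import Counter

# Hypothetical sample in the usual fstat column layout (not real output).
SAMPLE = """\
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
qmails   qmail-send   412 root /             2 drwxr-xr-x     512  r
qmails   qmail-send   412   wd /var      12345 drwxr-xr-x     512  r
qmails   qmail-send   412    0 /var      23456 -rw-------       0  w
qmails   qmail-send   412    1 /var      23456 -rw-------       0  w
root     sshd         500    3* internet stream tcp c1234567
"""


def count_fds_per_process(fstat_output):
    """Return a Counter mapping (command, pid) to its number of open fds."""
    counts = Counter()
    for line in fstat_output.splitlines():
        fields = line.split()
        # Skip blank or short lines and the header row.
        if len(fields) < 4 or fields[0] == "USER":
            continue
        _user, cmd, pid, fd = fields[:4]
        if not pid.isdigit():
            continue
        # Count only real descriptor numbers; "3*" is a socket or FIFO.
        # Entries like root, wd and text are not descriptors.
        if not fd.rstrip("*").isdigit():
            continue
        counts[(cmd, int(pid))] += 1
    return counts
```

Feeding it the saved output of fstat and printing counts.most_common(10) would show whether any one process stands out, which is the first question Robert raised above.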

Arjan

Received on Mon Jun 07 2004 - 17:41:08 UTC
