Re: file descriptor leak in 5.2-RC

From: Oliver Brandmueller <ob_at_e-Gitt.NET>
Date: Sat, 27 Dec 2003 13:31:48 +0100

Hello David, hello everybody.

On Sat, Dec 27, 2003 at 12:18:20AM +0000, David Malone wrote:
> > while the machine is running under high load and after going to single
> > user mode. You can clearly see, that even though kern.openfiles still 
> > shows a high number, pstat -f only finds very few files.
> 
> Ahhh crud - the kern.file sysctl isn't completely calculated from
> the list of all open files - it iterates through all the processes
> to form the final list. Could you try rerunning pstat with the patch
> below - it walks the full open file list, rather than checking each
> process (this may leak open file info to people within jails on the
> machine, hopefully that is not a problem for you...)
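
To illustrate the difference David describes, here is a toy model in
Python. It is not kernel code: the names global_file_list,
process_tables, walk_processes and walk_global_list are made up for
this sketch, and the entries are plain strings. It only shows why a
walk over per-process tables can miss entries that are still on the
global open file list.

```python
# Toy model only: plain Python objects standing in for kernel structures.
# All names here are hypothetical and exist just for this illustration.

# Every allocated open-file entry, including ones no process refers to.
global_file_list = ["f1", "f2", "f3", "leaked1", "leaked2"]

# Per-process descriptor tables; the leaked entries appear in none of them.
process_tables = {
    "init": ["f1"],
    "sshd": ["f2", "f3"],
}

def walk_processes(tables):
    """Collect entries by iterating over each process's table."""
    seen = []
    for fds in tables.values():
        for f in fds:
            if f not in seen:
                seen.append(f)
    return seen

def walk_global_list(files):
    """Collect entries by walking the full open file list."""
    return list(files)

per_process = walk_processes(process_tables)
full = walk_global_list(global_file_list)
missing = [f for f in full if f not in per_process]

print(len(full), len(per_process), missing)
# prints: 5 3 ['leaked1', 'leaked2']
```

In this model the counter (the length of the global list) stays high
while the per-process walk reports only a few entries, which is the
same kind of mismatch as between kern.openfiles and the unpatched pstat.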

Though I'm running out of time, the machine is not yet in production.
It has no users and no jails, so that is no problem at all.

> (You'll need to recompile your kernel, but not anything else...)

No problem here either, not even with building a new world ;-)

> If the files start to show up here, then we can begin to figure out
> where they're coming from.

OK, fstat and lsof still don't see the files, but pstat does now!

The output is quite long and I'm not sure everybody here likes mails of
250 kilobytes, so I'll just give the URL here:

http://the.addict.de/~ob/pstat-patched.txt

(If someone would like to see it in a mail, I can of course send it.)

The main thing here is now:

4333/262144 open files
   LOC   TYPE   FLG  CNT MSG   DATA        OFFSET
c7757154 inode    RW   5   0 c7540000             474d
c75a2e14 inode     W   1   0 c861f820                0
c8606198 inode    RW   1   0 c7540000                0
c8594220 inode    RW   1   0 c72ebb2c                0
c85a1b28 inode    RW   1   0 c72ebb2c                0
c84b2dd0 inode    RW   1   0 c72ebb2c                0
c852d908 inode    RW   1   0 c72ebb2c                0
[...]

I had a quick look over the rest of the table, and nearly every other
line looks the same as the last few lines above, with only the LOC
value changing. kern.openfiles shows 4332, so the pstat -f output
corresponds with that value just fine.

Does that mean that entries with the same "DATA" value all refer to the
same file, opened over and over? Can I somehow find the corresponding file?
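
To see how the entries are distributed over DATA values, the pstat
output can be tallied with a few lines of Python. This is only a
sketch: the function name count_by_data is made up here, and it assumes
every entry row has exactly the seven columns shown in the excerpt
above (LOC TYPE FLG CNT MSG DATA OFFSET). Rows with an empty column
would be skipped. It counts entries per DATA value and does not by
itself say which file a value belongs to.

```python
from collections import Counter

def count_by_data(pstat_text):
    """Count open-file entries per DATA value in pstat -f style output.

    Assumes seven whitespace-separated columns per entry row:
    LOC TYPE FLG CNT MSG DATA OFFSET (as in the excerpt above).
    """
    counts = Counter()
    for line in pstat_text.splitlines():
        fields = line.split()
        # Skip the summary line, the column header and anything else
        # that does not look like a seven-column entry row.
        if len(fields) != 7 or fields[0] == "LOC":
            continue
        counts[fields[5]] += 1
    return counts

# The rows quoted in this mail, used as sample input.
sample = """\
4333/262144 open files
   LOC   TYPE   FLG  CNT MSG   DATA        OFFSET
c7757154 inode    RW   5   0 c7540000             474d
c75a2e14 inode     W   1   0 c861f820                0
c8606198 inode    RW   1   0 c7540000                0
c8594220 inode    RW   1   0 c72ebb2c                0
c85a1b28 inode    RW   1   0 c72ebb2c                0
c84b2dd0 inode    RW   1   0 c72ebb2c                0
c852d908 inode    RW   1   0 c72ebb2c                0
"""

counts = count_by_data(sample)
for data, n in counts.most_common():
    print(data, n)
# prints:
# c72ebb2c 4
# c7540000 2
# c861f820 1
```

Run against the full output from the URL above, this would show at a
glance whether one DATA value accounts for most of the entries.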

Thanks for the help, Oliver

PS: The machine has to go live by the end of the year, and that
includes about 16-24 hours of testing and a period of 24 hours of close
monitoring. This means I need a running system by tomorrow evening at
the latest. I currently plan to install 4.9 if I cannot see a quick fix
within the next 24 hours. I would really like to track the problem down
further, as it seems I'm one of the few who can currently reproduce it,
but I don't have any hardware powerful enough to put into the test
setup at the moment. If someone is willing to go through all this over
the weekend, I can offer IRC chat, a phone call and maybe even access
to the machine.

-- 
| Oliver Brandmueller | Offenbacher Str. 1  | Germany       D-14197 Berlin |
| Fon +49-172-3130856 | Fax +49-172-3145027 | WWW:   http://the.addict.de/ |
|               I am the Internet. As surely as I help God.                |
|     Commercial use of any address contained herein is not permitted!     |
Received on Sat Dec 27 2003 - 03:32:00 UTC
