On Tue, Jan 22, 2008 at 09:59:33PM -0800, Xin LI wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Kostik Belousov wrote:
> > On Tue, Jan 22, 2008 at 03:45:32PM -0800, Xin LI wrote:
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> Hash: SHA1
> >>
> >> Hi,
> >>
> >> I have got a lot of this in dmesg output for RELENG_7_0 as of today:
> >>
> >> vm_thread_new: kstack allocation failed
> >> vm_thread_new: kstack allocation failed
> >> vm_thread_new: kstack allocation failed
> >> vm_thread_new: kstack allocation failed
> >> vm_thread_new: kstack allocation failed
> >> vm_thread_new: kstack allocation failed
> >>
> >> Any idea?
> >
> > Does it cause any problems aside from printing these messages ?
>
> It causes some fork() to fail.
>
> > What workload do you put on the machine ?
>
> It was an rsync from NFS to ZFS with ~15M of files, and rsync will
> consume basically all physical memory. I end up with some 2GB active,
> 4GB wired thing. (The system has 8GB of RAM), and I added a "make -j9
> buildworld" into the chaos to see if things get worse, and it did :-)
>
> > The messages came from the failure of the kernel to allocate address
> > space for the kernel stack for a thread being created. Previously, the
> > system would panic encountering this situation.
>
> Yes, I knew, previously it just panic and hangs there, and thanks a lot
> for fixing it =-)
>
> > This may happen due to kernel_map address space depletion, for instance,
> > by having a lot (on i386 machines with > 1Gb memory, ~40000) threads.
>
> It seems that I have hit some sort of "leak" or some exhaustion issue.
> Say, when the workload is gone, the system did not recover from the
> situation, and reboot worked fine.
>
> The system is sort of in production and it is about 20 miles away from
> my office. Do you want me to do some experiments for this?

Yes, I want to know what exactly leaked.
Ideally, I would like to see a series of vmstat -z and vmstat -m outputs collected over some time before the system bogs down. But even a single snapshot of the vmstat -z/-m output, taken immediately before things stop working, would be good to look at. The output of ps auxwwH would be helpful too.
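
One possible way to collect such a series is sketched below. It is only an illustration: the script name, the OUTDIR and INTERVAL defaults, the take_snapshot helper, and the "run" argument are all assumptions of the sketch. Only the three commands themselves (vmstat -z, vmstat -m, ps auxwwH) come from the request above. Note that each snapshot needs to fork, so the last snapshots may be incomplete once thread creation starts failing.

```shell
#!/bin/sh
# Hypothetical collector; the directory, interval and function name are
# assumptions for this sketch, not anything mandated by the thread.
OUTDIR="${OUTDIR:-/var/tmp/kstack-snapshots}"   # where snapshots are written
INTERVAL="${INTERVAL:-60}"                      # seconds between snapshots

# Write one timestamped snapshot into directory $1 and print its path.
take_snapshot() {
    ts=$(date +%Y%m%d-%H%M%S)
    f="$1/snapshot-$ts.txt"
    {
        echo "=== date ===";       date
        echo "=== vmstat -z ===";  vmstat -z    # UMA zone usage
        echo "=== vmstat -m ===";  vmstat -m    # malloc(9) usage by type
        echo "=== ps auxwwH ===";  ps auxwwH    # all processes, with threads
    } > "$f" 2>&1
    echo "$f"
}

# Loop only when invoked as "sh collect.sh run"; otherwise just define
# the helper so it can be called by hand.
if [ "${1:-}" = "run" ]; then
    mkdir -p "$OUTDIR"
    while :; do
        take_snapshot "$OUTDIR" > /dev/null
        sleep "$INTERVAL"
    done
fi
```

Started with something like "sh collect.sh run" before the rsync and buildworld load begins, this leaves one file per interval, so the snapshots closest to the first "kstack allocation failed" message can be compared against the earlier ones.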