Hi all,

In our system we've got three jail hosts running FreeBSD-CURRENT as of July 2nd. The following problem has been recurring since about April 15th.

Each host is set up with four "base" directories and ~250 jail environments. The jails mount the base contents (userland applications, installed ports, etc.) via nullfs and store only the data and config files specific to their jail. Each jail is a webserver with its own IP.

When doing large filesystem operations, such as a 'cp -R' of the data, tar'ing it, or dumping the filesystems for backup, the servers have a 50% chance of "hanging". From having had top(1) running while the system becomes unresponsive, it seems processes go into an infinite loop waiting for the filesystem. (I once had it hang while running df(1). ;-) ) And indeed, all the running webservers keep functioning until they make filesystem requests that are not cached.

I'm not an experienced kernel debugger, and hitting Ctrl-Alt-Esc only gives me a trace of the keyboard. If this is a good route to take, hints on how to go about it are most welcome.

The extra sysctl settings I've got in /boot/loader.conf are:

    beastie_disable="YES"
    kern.ipc.maxpipekva="104857600"
    kern.maxfiles="65536"
    net.inet.ip.portrange.lowfirst="79"
    net.inet.ip.portrange.reservedhigh="79"

kern.ipc.maxpipekva and kern.maxfiles were raised because all the jails together required more than the default values. The values are now about 5 times what is really used. The last two allow users to be in control of their webserver that binds to port 80.

I've replaced all the hardware. I've rebuilt FreeBSD and the ports regularly. Still the servers go down about once every 24 hours, particularly when I'm asleep. ;-) And I'm fresh out of ideas on how to go about solving this.

All suggestions are very much appreciated.

Cheers
Nik

Received on Sun Jul 04 2004 - 07:42:15 UTC