At work, we've been having a few hangs that are apparently caused by a fragmented buffer cache... We are running some UFS2 file systems with 64KB/64KB and 64KB/32KB block/fragment sizes, which I believe is a contributing factor to the fragmentation.

Luckily, only I/O to the large-block file systems is hung, and I've been able to run kgdb on /dev/mem, which has helped tremendously. I am still running kgdb on the box, so I can get any additional information requested. Disk I/O on the devices that house the file systems is fully functional, as I can run ffsinfo and dd from the disks.

Most of the processes are stuck in nbufkv (from getnewbuf) w/ needsbuffer set to VFS_BIO_NEED_BUFSPACE. This can only get set if it needs to defrag the buffer cache because a call to vm_map_findspace() on buffer_map fails. The bufdaemon is stuck in qsleep. The syncer is also stuck in nbufkv.

BKVASIZE, which is the minimum allocation size of space in the buffer_map, was increased to 16KB to be 2x the size (at the time) of the UFS block size of standard file systems (8KB). We have since increased the standard block size to 16KB, but have not made a corresponding increase to BKVASIZE. I see raising it as a possible workaround, but not one that is guaranteed to make 64KB-block file systems work.

I have walked part of the buffer_map, but have not seen any adj_free or max_free >= 64KB; they are usually either 32KB or 48KB...

Some information about the box: 6.2-RELEASE using an SMP kernel w/ debugging enabled, nothing else special. I have attached sysctl -a, vmstat -m, vmstat -z and dmesg.

I believe this hang is possible w/ -current also, as the buffer cache has not changed significantly.

Comments? Help?

-- 
John-Mark Gurney Voice: +1 415 225 5579
"All that I will do, has been done, All that I have, has not."
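The failure mode described above (plenty of free space in the map, but no single hole of 64KB) can be illustrated with a small user-space sketch. This is not the kernel's vm_map code: `struct range`, `struct free_stats`, `scan_free` and `can_allocate` are hypothetical names made up for the illustration, and the layout used in the usage note is invented, chosen only so that the holes come out at 32KB and 48KB like the ones seen in the partial walk of buffer_map.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one allocated range in an address map. */
struct range {
    size_t start;   /* first byte of the allocation */
    size_t end;     /* one past the last byte of the allocation */
};

/* Summary of the free space left in the map. */
struct free_stats {
    size_t total;   /* sum of all free holes */
    size_t largest; /* largest single contiguous hole */
};

/*
 * Walk the allocations (assumed sorted by address and non-overlapping)
 * inside [map_start, map_end) and measure the holes between them.
 */
struct free_stats
scan_free(const struct range *r, size_t n, size_t map_start, size_t map_end)
{
    struct free_stats st = { 0, 0 };
    size_t cursor = map_start;

    for (size_t i = 0; i <= n; i++) {
        /* The last iteration measures the hole up to the end of the map. */
        size_t next = (i < n) ? r[i].start : map_end;
        assert(next >= cursor);     /* input must be sorted */
        size_t hole = next - cursor;

        st.total += hole;
        if (hole > st.largest)
            st.largest = hole;

        if (i < n) {
            assert(r[i].end >= r[i].start);
            cursor = r[i].end;
        }
    }
    return st;
}

/* A request succeeds only if a single contiguous hole is large enough. */
int
can_allocate(struct free_stats st, size_t size)
{
    return st.largest >= size;
}
```

For example, with a 224KB map holding allocations at 16KB-64KB, 112KB-128KB and 160KB-192KB, the holes are 16KB, 48KB, 32KB and 32KB. That is 128KB free in total, yet `can_allocate` returns false for a 64KB request and true for a 32KB one, which is the same shape as a 64KB-block file system waiting for space while smaller requests still succeed.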