2010/4/20 David Ehrmann <ehrmann_at_gmail.com>:
> Initially, I noticed a problem where reading a file on this machine seemed
> to stop--something like a video would just stop playing. At first, I
> thought it was the machine, but a new motherboard, CPU, and RAM later, the
> problem persists. The network card uses a different chipset, too.
>
> The files are on zfs, but scrubs are fine, and zpool status lists no errors
> of any kind. Trying to reproduce the problem, I set up a script that
> reads a random 1M block every 60 seconds off the drive backing zfs.
> That's when I noticed something: one disk seems to be causing the problems.
> I logged the dd times, and some of them were huge--more than a minute. The
> times on the other disk in the mirrored vdev were low.
>
> I've only seen the problem when I have a vm's disk image hosted on the
> machine. That said, the network interface is configured at 100 Mbps, so
> there's no reason for that to saturate the disk's throughput. Top reports
> that almost 20% of the CPU is going towards interrupts. I can read a file
> off the zfs pool at over 50MB/s, so that shouldn't be a problem. One thing
> I'm wondering is why the disk read doesn't time out quickly? At least that
> way zfs could try to use the other drive in the mirrored vdev.
>
> Any ideas? One thing I should try is switching the drive, see if the
> problem follows the disk or stays with the lowest /dev/adX device. I'm
> using geli, but the read problems happen with both /dev/adX AND
> /dev/adX.eli, so I don't think that's it. I've seen the problem with
> Samba, NFS, and dd.

David, would you be willing to re-create the problem and do a PMC
analysis on it?
(If you need any guidance let me know, I will be happy to give it.)

Attilio

-- 
Peace can only be achieved by understanding - A. Einstein

Received on Tue Apr 20 2010 - 10:29:15 UTC
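
For anyone trying to re-create the problem, the probe described in the quoted message (a random 1M read every 60 seconds, with the time logged) could look roughly like the sketch below. This is not David's actual script, which used dd and was not posted. The names timed_random_read and probe, the 1-second "slow" threshold, and the log format are all assumptions made for illustration. It also assumes that seeking to the end of the target reports its size; if that does not hold for a raw disk device on your system, pass the size in explicitly.

```python
import os
import random
import time

BLOCK_SIZE = 1024 * 1024  # 1 MiB, matching the "1M block" in the report


def timed_random_read(path, size=None, block_size=BLOCK_SIZE, rng=random):
    """Read one block at a random block-aligned offset.

    Returns (offset, bytes_read, elapsed_seconds).
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        if size is None:
            # Assumption: seeking to the end reports the size of the target.
            size = os.lseek(fd, 0, os.SEEK_END)
        blocks = max(size // block_size, 1)
        offset = rng.randrange(blocks) * block_size
        start = time.monotonic()
        data = os.pread(fd, block_size, offset)
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    return offset, len(data), elapsed


def probe(path, interval=60.0, count=None, slow_threshold=1.0, sleep=time.sleep):
    """Log one timed read per interval; count=None runs until interrupted."""
    results = []
    done = 0
    while count is None or done < count:
        offset, nbytes, elapsed = timed_random_read(path)
        # The threshold is arbitrary (hypothetical); tune it to the disk.
        flag = " SLOW" if elapsed > slow_threshold else ""
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} offset={offset} bytes={nbytes} seconds={elapsed:.3f}{flag}")
        results.append((offset, nbytes, elapsed))
        done += 1
        if count is None or done < count:
            sleep(interval)
    return results
```

Running probe() against each disk of the mirror in separate processes gives two logs to compare side by side. Reads that go through a filesystem may be served from cache, so the timings only say something about the disk when the target is the raw device.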