Henri Hennebert wrote:
> Kris Kennaway wrote:
>> Ivan Voras wrote:
>>> On 06/01/2008, Peter Schuller <peter.schuller_at_infidyne.com> wrote:
>>>>> This number is not so large. It seems to be easily crashed by rsync,
>>>>> for example (speaking from my own experience, and also some of my
>>>>> colleagues).
>>>> I can definitely say this is not *generally* true, as I do a lot of
>>>> rsyncing/rdiff-backup:ing and similar stuff (with many files / large
>>>> files) on ZFS without any stability issues. Problems for me have been
>>>> limited to 32bit and the memory exhaustion issue rather than "hard"
>>>> issues.
>>>
>>> It's not generally true since kmem problems with rsync are often hard
>>> to repeat - I have them on one machine, but not on another, similar
>>> machine. This nonrepeatability is also a part of the problem.
>>>
>>>> But perhaps that's all you are referring to.
>>>
>>> Mostly. I did have a ZFS crash with rsync that wasn't kmem related,
>>> but only once.
>>
>> kmem problems are just tuning. They are not indicative of stability
>> problems in ZFS. Please report any further non-kmem panics you
>> experience.
>
> I encounter 2 times a deadlock during high I/O activity (the last one
> during rsync + rm -r on a 5GB hierarchy (openoffice-2/work).
>
> I was running with this patch:
> http://people.freebsd.org/~pjd/patches/zgd_done.patch
>
> db> show allpcpu
> Current CPU: 1
>
> cpuid        = 0
> curthread    = 0xa5ebe440: pid 3422 "txg_thread_enter"
> curpcb       = 0xeb175d90
> fpcurthread  = none
> idlethread   = 0xa5529aa0: pid 12 "idle: cpu0"
> APIC ID      = 0
> currentldt   = 0x50
>
> cpuid        = 1
> curthread    = 0xa56ab220: pid 47 "arc_reclaim_thread"
> curpcb       = 0xe6837d90
> fpcurthread  = none
> idlethread   = 0xa5529880: pid 11 "idle: cpu1"
> APIC ID      = 1
> currentldt   = 0x50
>
> With the 2 times arc_reclaim_thread `running`

Backtraces of the affected processes (or just alltrace) are usually
required to proceed with debugging, and lock status is also often vital
(show alllocks, requires witness).

Also, in the case when threads are actually running (not deadlocked),
then it is often useful to repeatedly break/continue and sample many
backtraces to try and determine where the threads are looping.

Kris

Received on Sun Jan 06 2008 - 15:03:49 UTC
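
The sampling advice above (break into the debugger repeatedly, grab a backtrace each time, then see which functions keep showing up) can be sketched roughly as follows. This is not part of the original message and is only an illustration: it assumes each sampled backtrace has been saved as text, and that frames look like `name(args) at name+0xOFFSET`. That frame format, the regular expression, and the helper names `frames` and `tally` are all assumptions made for this sketch.

```python
import re
from collections import Counter

# Assumed frame format (an assumption, adjust to your actual output):
#   some_function(arg1,arg2) at some_function+0x2f1
FRAME_RE = re.compile(r"\bat\s+([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-fA-F]+")


def frames(trace_text):
    """Return the function names found in one saved backtrace, in order."""
    return FRAME_RE.findall(trace_text)


def tally(samples):
    """Count, for each function, how many samples it appears in.

    A function present in nearly every sample is a candidate for where
    the thread is looping; one that appears once is probably incidental.
    """
    counts = Counter()
    for sample in samples:
        # set() so a recursive frame is counted once per sample
        counts.update(set(frames(sample)))
    return counts


if __name__ == "__main__":
    # Placeholder traces with made-up function names, not real output.
    samples = [
        "func_a(1,2) at func_a+0x10\nfunc_b(3) at func_b+0x2c\n",
        "func_a(1,2) at func_a+0x18\nfunc_c() at func_c+0x4\n",
    ]
    for name, n in tally(samples).most_common():
        print(f"{n}/{len(samples)}  {name}")
```

Feeding it a dozen or so samples taken a few seconds apart is usually enough to tell a thread that is stuck in one loop from one that is merely busy.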