Henri Hennebert wrote:
> Kris Kennaway wrote:
>> Henri Hennebert wrote:
>>> Kris Kennaway wrote:
>>>> Ivan Voras wrote:
>>>>> On 06/01/2008, Peter Schuller <peter.schuller_at_infidyne.com> wrote:
>>>>>>> This number is not so large. It seems to be easily crashed by rsync,
>>>>>>> for example (speaking from my own experience, and also some of my
>>>>>>> colleagues).
>>>>>> I can definitely say this is not *generally* true, as I do a lot of
>>>>>> rsyncing/rdiff-backup:ing and similar stuff (with many files /
>>>>>> large files) on ZFS without any stability issues. Problems for me
>>>>>> have been limited to 32bit and the memory exhaustion issue rather
>>>>>> than "hard" issues.
>>>>>
>>>>> It's not generally true since kmem problems with rsync are often hard
>>>>> to repeat - I have them on one machine, but not on another, similar
>>>>> machine. This nonrepeatability is also a part of the problem.
>>>>>
>>>>>> But perhaps that's all you are referring to.
>>>>>
>>>>> Mostly. I did have a ZFS crash with rsync that wasn't kmem related,
>>>>> but only once.
>>>>
>>>> kmem problems are just tuning. They are not indicative of stability
>>>> problems in ZFS. Please report any further non-kmem panics you
>>>> experience.
>>>
>>> I encounter 2 times a deadlock during high I/O activity (the last one
>>> during rsync + rm -r on a 5GB hierarchy (openoffice-2/work).
>>>
>>> I was running with this patch:
>>> http://people.freebsd.org/~pjd/patches/zgd_done.patch
>>>
>>> db> show allpcpu
>>> Current CPU: 1
>>>
>>> cpuid        = 0
>>> curthread    = 0xa5ebe440: pid 3422 "txg_thread_enter"
>>> curpcb       = 0xeb175d90
>>> fpcurthread  = none
>>> idlethread   = 0xa5529aa0: pid 12 "idle: cpu0"
>>> APIC ID      = 0
>>> currentldt   = 0x50
>>>
>>> cpuid        = 1
>>> curthread    = 0xa56ab220: pid 47 "arc_reclaim_thread"
>>> curpcb       = 0xe6837d90
>>> fpcurthread  = none
>>> idlethread   = 0xa5529880: pid 11 "idle: cpu1"
>>> APIC ID      = 1
>>> currentldt   = 0x50
>>>
>>> With the 2 times arc_reclaim_thread `running`
>>
>> Backtraces of the affected processes (or just alltrace) are usually
>
> noted for next time
>
>> required to proceed with debugging, and lock status is also often
>> vital (show alllocks, requires witness).
>
> I add it to my kernel config
>
> Also, in the case when threads are
>> actually running (not deadlocked), then it is often useful to
>> repeatedly break/continue and sample many backtraces to try and
>> determine where the threads are looping.
>
> I do this after the second deadlock and arc_reclaim_thread was always
> there and second cpu was idle.

To repeat, it is important not just to note which thread is running, but
*what the thread is doing*. This means repeatedly comparing the
backtraces, which will allow you to build up a picture of which part of
the code it is looping in.

Kris

Received on Sun Jan 06 2008 - 16:13:38 UTC
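The advice above, to sample many backtraces and compare them, can be sketched offline once the traces have been copied out of the debugger. The following is only a minimal illustration under stated assumptions: each sample is taken to be a list of function names with the innermost frame first, and every name in it (`thread_entry`, `worker_loop`, `worker_step`, `leaf_a`, `leaf_b`) is a made-up placeholder, not a frame from this report. The idea is that frames present in every sample bound the region the thread never leaves, while frames that come and go are what the loop calls into.

```python
from collections import Counter


def common_outer_frames(samples):
    """Return the frames shared by every sample, outermost frame first.

    Each sample is a list of function names, innermost frame first
    (an assumed format; adapt the parsing to the real debugger output).
    """
    if not samples:
        return []
    # Reverse each trace so that index 0 is the outermost frame.
    reversed_traces = [list(reversed(s)) for s in samples]
    shared = []
    # Walk inwards from the outermost frame until the samples diverge.
    for frames in zip(*reversed_traces):
        if all(f == frames[0] for f in frames):
            shared.append(frames[0])
        else:
            break
    return shared


def frame_counts(samples):
    """Count in how many samples each function name appears at least once."""
    counts = Counter()
    for s in samples:
        counts.update(set(s))
    return counts


# Hypothetical samples from repeated break/continue cycles (placeholder names).
samples = [
    ["leaf_a", "worker_step", "worker_loop", "thread_entry"],
    ["leaf_b", "worker_step", "worker_loop", "thread_entry"],
    ["worker_step", "worker_loop", "thread_entry"],
]

shared = common_outer_frames(samples)
print("frames present in every sample (outermost first):", shared)
print("deepest shared frame:", shared[-1] if shared else None)
for name, n in frame_counts(samples).most_common():
    print(f"{n}/{len(samples)} samples contain {name}")
```

With these placeholder samples the deepest shared frame is `worker_step`, which would suggest the loop lives in or just below that function. With real traces, more samples give a more trustworthy picture.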