On Sun, Dec 26, 2004 at 05:11:53PM +0100, Peter Holm wrote:
>
> Yes, I think that I have verified your excellent analysis of the
> problem: http://www.holm.cc/stress/log/freeze04.html
>
> So, do you have any fix suggestions? :-)

Not yet, because the problem is non-obvious from the trace. I need to
know exactly when the UMA RCntSlabs zone recurses _first_, and I need
to confirm that it is an actual recursion. I've looked at the VM code
and I don't see how/why recursion on the RCntSlabs zone would happen.

Please modify the printf code to look exactly like this:

    if (keg->uk_flags & UMA_ZFLAG_INTERNAL && keg->uk_recurse != 0) {
            if ((zone == slabzone) || (zone == slabrefzone))
                    panic("Zone %s forced to fail due to recurse non-null: %d\n",
                        zone->uz_name, keg->uk_recurse);
            return (NULL);
    }

(You don't need to check any global counter -- the counter is imperfect
anyway -- because even a single recursion on slabzone or slabrefzone
should be illegal.)

I'd like to see the trace from the above panic, if possible.

Also, from your current crash dump, see if you can print the value of
keg->uk_recurse (from frame 11, pid 74804).

It appears that the other KASSERT being triggered from
propagate_priority() is due to some weird side-effect of process 74804
looping with the UMA RCntSlabs zone lock held (without it ever being
dropped). We'll have to see.

The point is: the trace is useless unless it shows where/when the
recursion on slabrefzone _begins_ to happen (not that it has already
happened; that part is obvious now).

Happy holidays,
-- 
Bosko Milekic
bmilekic_at_technokratis.com
bmilekic_at_FreeBSD.org

Received on Sun Dec 26 2004 - 17:17:41 UTC