Re: Panic on -CURRENT after LDT changes

From: Tor Egge <Tor.Egge_at_cvsup.no.freebsd.org>
Date: Mon, 28 May 2007 16:20:23 +0000 (UTC)

> > Could you please try this better approach on a VANILLA kernel and say if
> > it still works for you:
> > http://users.gufi.org/~rookie/works/patches/schedlock/ldt2.diff
> 
> Still works, no crash.

I got similar crashes (page fault in i386_ldt_grow) and tried this patch.
During testing, my development machine repeatedly got the following panic:

spin lock 0xa0ae4378 (descriptor tables) held by 0xadf77360 (tid 100161) too long
exclusive spin mutex descriptor tables r = 0 (0xa0ae4378) locked @ i386/i386/sys_machdep.c:414
panic: spin lock held too long
cpuid = 0


This looked like a lock leak, with user_ldt_free() as the suspect, since it
initially appeared to be able to return with dt_lock still held.  But that path
seems to be impossible since the callers first check that mdp->md_ldt is
non-NULL.

During the hunt for the real reason, I found that unsharing of user LDT in
cpu_fork() seems broken since the call to user_ldt_free() frees the newly
allocated user LDT.

Finally, I found that i386_ldt_grow() called smp_rendezvous() without
temporarily unlocking dt_lock.  That caused a deadlock.  Adding a temporary
unlock of dt_lock seems to solve the problem for me.
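
To illustrate why holding dt_lock across the rendezvous can hang, here is a
small single-threaded model in C.  It is only a sketch, not the kernel code:
dt_lock_held, other_cpu_step(), rendezvous_model() and the two ldt_grow_*
functions are hypothetical names, and the model assumes that a CPU spinning on
a spin mutex cannot answer the rendezvous until the lock is released.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state, not kernel data structures. */
static bool dt_lock_held;	/* stand-in for the dt_lock spin mutex */
static bool other_cpu_acked;	/* set once the other CPU joins the rendezvous */

/*
 * Model of a second CPU that wants dt_lock.  Assumption: while it spins on
 * the lock it cannot service the rendezvous request, so it only
 * acknowledges once the lock is free.
 */
static void
other_cpu_step(void)
{
	if (!dt_lock_held)
		other_cpu_acked = true;
}

/*
 * Model of the rendezvous: wait for the other CPU, giving up after a bounded
 * number of spins.  Returns true if the rendezvous completed.
 */
static bool
rendezvous_model(void)
{
	other_cpu_acked = false;
	for (int spins = 0; spins < 1000; spins++) {
		other_cpu_step();
		if (other_cpu_acked)
			return (true);
	}
	return (false);		/* a real spin-wait would never return here */
}

/* Grow path that keeps the lock across the rendezvous (the hang). */
static bool
ldt_grow_locked_model(void)
{
	bool ok;

	dt_lock_held = true;
	ok = rendezvous_model();
	dt_lock_held = false;
	return (ok);
}

/* Grow path that drops the lock around the rendezvous (the workaround). */
static bool
ldt_grow_unlocked_model(void)
{
	bool ok;

	dt_lock_held = true;
	/* ... install the new LDT pointer under the lock ... */
	dt_lock_held = false;		/* temporary unlock */
	ok = rendezvous_model();
	dt_lock_held = true;		/* reacquire for the rest of the work */
	/* ... */
	dt_lock_held = false;
	return (ok);
}
```

In this model ldt_grow_locked_model() never completes the rendezvous, while
ldt_grow_unlocked_model() does.  The price of the temporary unlock is that
anything protected by dt_lock may change while it is dropped, so state read
before the unlock has to be revalidated afterwards.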

Separately, smp_rendezvous_action() fails to make a local copy of
smp_rv_teardown_func before bumping smp_rv_waiters[1], so the other CPUs might
end up calling the teardown function for the next rendezvous instead of the
teardown function for the current rendezvous.
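
A minimal userland sketch of the local-copy idea follows.  It is not the
kernel source: the rv_* variables, rv_after_bump_hook and the teardown_*
helpers are hypothetical stand-ins, and the hook only exists to model the
initiator setting up the next rendezvous right after the counter is bumped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the shared rendezvous state. */
static void (*rv_teardown_func)(void *);
static void *rv_func_arg;
static atomic_int rv_waiters[2];
static int rv_ncpus = 1;

/* Hypothetical hook: models the initiator reusing the globals. */
static void (*rv_after_bump_hook)(void);

static void
rendezvous_exit_sketch(void)
{
	/*
	 * Snapshot the shared values first.  Once the counter below is
	 * bumped, the initiator may consider this rendezvous finished and
	 * overwrite the globals for the next one.
	 */
	void (*local_teardown)(void *) = rv_teardown_func;
	void *local_arg = rv_func_arg;

	atomic_fetch_add(&rv_waiters[1], 1);
	while (atomic_load(&rv_waiters[1]) < rv_ncpus)
		;	/* spin until every CPU has arrived */

	if (rv_after_bump_hook != NULL)
		rv_after_bump_hook();	/* globals may change here */

	/* Use only the snapshot from here on. */
	if (local_teardown != NULL)
		local_teardown(local_arg);
}

/* Demo helpers: count which teardown function actually ran. */
static int current_calls, next_calls;

static void
teardown_current(void *arg)
{
	(void)arg;
	current_calls++;
}

static void
teardown_next(void *arg)
{
	(void)arg;
	next_calls++;
}

static void
start_next_rendezvous(void)
{
	rv_teardown_func = teardown_next;
}
```

With rv_teardown_func set to teardown_current and rv_after_bump_hook set to
start_next_rendezvous, the sketch still calls teardown_current, because the
pointer was copied before the bump.  Reading rv_teardown_func after the bump
instead would pick up teardown_next.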

- Tor Egge
Received on Mon May 28 2007 - 14:53:37 UTC
