Re: Kernel page fault with non-sleepable locks held error with kernel r270837

From: John-Mark Gurney <jmg_at_funkthat.com>
Date: Sun, 31 Aug 2014 13:34:19 -0700
Hiroo Ono wrote this message on Sun, Aug 31, 2014 at 20:43 +0900:
> Thank you for taking a look into this.
> 
> 2014-08-31 15:47 GMT+09:00 John-Mark Gurney <jmg_at_funkthat.com>:
> > Hiroo Ono wrote this message on Sun, Aug 31, 2014 at 14:01 +0900:
> >> During upgrading world and kernel from r26939 to r270837, I got the
> >> following problem.
> >> a) the arch is i386
> >> b) kernel is of r270837, userland is of r26939 (make kernel is done
> >> and rebooted, make installworld not yet).
> >> c) booting in single user mode is OK.
> >> d) during startup of multi-user mode, when dhclient is run, the
> >> following message appears, and the system freezes:
> >>
> >> Starting devd.
> >> wlan0: link state changed to UP
> >> Starting webcamd.
> >> Attached to ugen4.2[0]
> >> Starting webcready running for ugen4.2.0
> >> /usr/local/etc/rc.d/webcamd: WARNING: failed to start webcamd
> >> Starting dhclient.
> >> DHCPREQUEST on wlan0 to 255.255.255.255 port 67
> >> DHCPACK from 192.168.8.2
> >> Kernel page fault with the following non-sleepable locks held:
> >> exclusive sleep mutex so_rcv (so_rcv) r = 0 (0xc713f078) locked _at_
> >> /usr/src/sys/kern/kern_event.c:2005
> >
> > I'm puzzled by this line number...  This line number doesn't do any
> > locks, it is in the function knlist_remove_inevent...
> 
> The line 2005 is "mtx_lock((struct mtx *)arg);" of knlist_mtx_lock()
> https://svnweb.freebsd.org/base/head/sys/kern/kern_event.c?revision=268843&view=markup#l2005
> 
> this function is assigned to (struct knlist *)->kl_lock in knlist_init()
> https://svnweb.freebsd.org/base/head/sys/kern/kern_event.c?revision=268843&view=markup#l2058

Sorry, turns out I had a local patch to my kern_event.c...
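
For anyone following along, here is a rough userland sketch of why the
witness report can name the knlist_mtx_lock() line even though the
fault happens inside the filter. It assumes the register path takes the
knlist lock through the stored callback and then calls the filter with
that lock held. Every toy_* name is made up for illustration; this is
not the kernel code, only a model of the callback arrangement described
above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mtx: only records whether it is held. */
struct toy_mtx {
	int held;
	const char *name;
};

/* Hypothetical stand-in for struct knlist: lock ops are stored callbacks. */
struct toy_knlist {
	void (*kl_lock)(void *);
	void (*kl_unlock)(void *);
	void *kl_lockarg;
};

/* Plays the role of knlist_mtx_lock(): the lock is taken on this line. */
static void
toy_mtx_lock(void *arg)
{
	struct toy_mtx *m = arg;

	assert(!m->held);
	m->held = 1;
}

static void
toy_mtx_unlock(void *arg)
{
	struct toy_mtx *m = arg;

	assert(m->held);
	m->held = 0;
}

/* Plays the role of knlist_init_mtx(): remember the mutex and callbacks. */
static void
toy_knlist_init_mtx(struct toy_knlist *kl, struct toy_mtx *m)
{
	kl->kl_lock = toy_mtx_lock;
	kl->kl_unlock = toy_mtx_unlock;
	kl->kl_lockarg = m;
}

/* Plays the role of the read filter: it expects the lock to be held. */
static int
toy_filt_read(struct toy_mtx *rcv, int bytes_available)
{
	assert(rcv->held);	/* analogous to a lock assertion */
	return (bytes_available > 0);
}

/* Plays the role of the register path: lock via callback, run filter. */
static int
toy_register(struct toy_knlist *kl, struct toy_mtx *rcv, int bytes_available)
{
	int event;

	kl->kl_lock(kl->kl_lockarg);
	event = toy_filt_read(rcv, bytes_available);
	kl->kl_unlock(kl->kl_lockarg);
	return (event);
}
```

In this model any fault inside toy_filt_read() happens while the mutex
taken in toy_mtx_lock() is still held, so a lock checker would report
the lock line, not the filter.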

Can you find out which source line filt_soread+0x9d corresponds to?  That
will help figure out whether the bad pointer is kn or so...  If you could
get the address of the page fault, that would also be helpful...

Ok, a similar fix was committed in r133794, and a quick look at the code
doesn't show any knotes that are allocated on the stack anymore...

> >> KDB stack backtrace:
> >>  rapper+0x2d/frame 0xe8f42710
> >> kdb_backtrace(c11aaf80,0,c713f078,c119a9e8,7d5,...) at 0xc0b4b160 =
> >> kdb_backtrace+0x30/frame 0xe8f42778
> >> witness_warn(5,0,c136b0a0,76fb000,c1833d58,...) at 0xc8b68a52 =
> >> witness_warn+0x402/frame 0xe8f427c8
> >> trap_pfault(18,3fd,c0dcc2d0,c1f64a80,c75fa000,...) at 0xc102f46b =
> >> trap_pfault+0x5b/frame 0xe8f42840
> >> trap(e8f42988) at 0xc102edcf = trap+0x6cf/frame 0xe8f4297c
> >> calltrap() at 0xc1017c4c = calltrap+0x6/frame 0xe8f4297c
> >> filt_soread(c75f7828,0,c119a9e8,48d,0,...) at 0xc0b9837d =
> >> filt_soread+0x9d/frame 0xe8f429f0
> >> kqueue_register(c6f59310,1,1,4f5,0,...) at 0xc0ad1457 =
> >> kqueue_register+0x807/frame 0xe8f42a68
> >> kern_kevent(c6f59310,7,12c217ce1 = Xint0x80), eip =
> 
> calltrap() seems to be invoked by
>     SOCKBUF_LOCK_ASSERT(&so->so_rcv);
> of filt_soread() in sys/kern/uipc_socket.c
> https://svnweb.freebsd.org/base/head/sys/kern/uipc_socket.c?revision=270664&view=markup#l3250
> 
> but I do not know where &so->so_rcv was previously locked.
> knlist_init_mtx (which then calls knlist_init) is called with
> so->so_rcv in sys/kern/uipc_socket.c in
> line 517:  	socreate()
> https://svnweb.freebsd.org/base/head/sys/kern/uipc_socket.c?revision=270664&view=markup#l517
> and
> line 606: sonewconn()
> https://svnweb.freebsd.org/base/head/sys/kern/uipc_socket.c?revision=270664&view=markup#l606
> 
> so the problem may be around there.
> but, I cannot track any further.  the system freezes, so I cannot deal with ddb.
> 
> > But notice the knlist_remove_inevent doesn't appear in the back
> > trace...
> >
> > Can you confirm that your kern_event.c is:
> > __FBSDID("$FreeBSD: head/sys/kern/kern_event.c 268843 2014-07-18 14:27:04Z bapt
> > $");
> 
> I checked that it was this revision.
> 
> >> instruction poi         = 0x28:0xe8f429f0     fff, type 0x1b
> >> DHCPREQUEST on wlan0 to 255.255.255.255 port 67
> >> DHCPACK from 192.168.8.2
> >>
> >> e) kernel configuration differs from GENERIC on the following point
> >> options      VIMAGE
> >> options      DDB_NUMSYM
> >> nocpu        I486_CPU
> >> nooptions  VESA
> >>

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
Received on Sun Aug 31 2014 - 18:34:21 UTC
