Re: Instant panic while trying run ports-mgmt/poudriere

From: Pawel Pekala <pawel_at_FreeBSD.org> Date: Wed, 15 Jul 2015 17:40:27 +0200

Hi John-Mark,

On 2015-07-14 15:27 -0700, John-Mark Gurney <jmg_at_funkthat.com> wrote:
>Pawel Pekala wrote this message on Tue, Jul 14, 2015 at 22:47 +0200:
>> On 2015-07-13 23:28 +0200, Mateusz Guzik <mjguzik_at_gmail.com> wrote:
>> >On Mon, Jul 13, 2015 at 11:12:05PM +0200, Pawel Pekala wrote:
>> >> Hi
>> >> 
>> >> I'm getting a 100% reproducible kernel crash while trying to build
>> >> ports with poudriere on my system. This started to show up about
>> >> 2-3 weeks ago. I upgrade my system on a weekly basis, usually on
>> >> Saturday. Here's the backtrace:
>> >> 
>> >> (kgdb) bt
>> >[..]
>> >>     at /hdd/src/sys/amd64/amd64/trap.c:201
>> >> #25 0xffffffff80dace32 in calltrap ()
>> >>     at /hdd/src/sys/amd64/amd64/exception.S:235
>> >> #26 0xffffffff80941430 in knote (list=0xfffff801a2589408, hint=2147483648,
>> >>     lockflags=<value optimized out>)
>> >>     at /hdd/src/sys/kern/kern_event.c:1920
>> >> #27 0xffffffff80946a51 in exit1 (td=0xfffff801b84014d0, rv=<value optimized out>)
>> >>     at /hdd/src/sys/kern/kern_exit.c:560
>> >> #28 0xffffffff80945f1e in sys_sys_exit (td=0x0, uap=<value optimized out>)
>> >>     at /hdd/src/sys/kern/kern_exit.c:178
>> >> #29 0xffffffff80dcdaa2 in amd64_syscall (td=0xfffff801b84014d0, traced=0)
>> >>     at subr_syscall.c:133
>> >> #30 0xffffffff80dad11b in Xfast_syscall ()
>> >>     at /hdd/src/sys/amd64/amd64/exception.S:395
>> >> #31 0x0000000800922eea in ?? ()
>> >> Previous frame inner to this frame (corrupt stack?)
>> >> Current language:  auto; currently minimal
>> >> 
>> >> Let me know if you need more details.
>> >
>> >
>> >Well, if the problem is really that reproducible it would be best if
>> >you narrowed it down to the exact commit.
>> >
>> >However, a quick look suggests you may be a "victim" of r284861.
>> 
>> After further testing I can confirm that this panic was introduced in
>> r284861, thanks for the hint!
>
>Can you tell me what your line 1920 of kern_event.c is (and the
>context around it)? Or at least the $FreeBSD$ line from
>kern_event.c? Because in HEAD, the line is:
>		} else if ((lockflags & KNF_NOKQLOCK) != 0) {
>
>and there isn't a way to fault on that code...

Yes, this is strange. Here is line 1920 in my tree, with the context
around it:

                if ((kn->kn_status & (KN_INFLUX | KN_SCAN)) == KN_INFLUX) {
                        /*
                         * Do not process the influx notes, except for
                         * the influx coming from the kq unlock in the
                         * kqueue_scan().  In the later case, we do
                         * not interfere with the scan, since the code
                         * fragment in kqueue_scan() locks the knlist,
                         * and cannot proceed until we finished.
                         */
                        KQ_UNLOCK(kq);
===> line 1920  } else if ((lockflags & KNF_NOKQLOCK) != 0) {
                        kn->kn_status |= KN_INFLUX;
                        KQ_UNLOCK(kq);
                        error = kn->kn_fop->f_event(kn, hint);
                        KQ_LOCK(kq);
                        kn->kn_status &= ~KN_INFLUX;
                        if (error)
                                KNOTE_ACTIVATE(kn, 1);
                        KQ_UNLOCK_FLUX(kq);
                } else {

Id line:

__FBSDID("$FreeBSD: head/sys/kern/kern_event.c 284215 2015-06-10 10:48:12Z mjg $");
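
For reference, the first test in the excerpt above only matches a knote
whose status has KN_INFLUX set and KN_SCAN clear. Below is a minimal
userland sketch of that mask comparison. The DEMO_ names, their bit
values, and the helper function are made up for illustration; the real
flag definitions live in sys/event.h, and this is not the kernel code
itself.

```c
#include <stdbool.h>

/*
 * Placeholder bit values, assumed for illustration only.
 * The real KN_INFLUX and KN_SCAN are defined in sys/event.h.
 */
#define DEMO_KN_INFLUX	0x10
#define DEMO_KN_SCAN	0x100

/*
 * Hypothetical helper with the same shape as the first test in the
 * excerpt: true only when the influx bit is set and the scan bit is
 * clear.
 */
static bool
demo_skips_note(int status)
{
	return ((status & (DEMO_KN_INFLUX | DEMO_KN_SCAN)) == DEMO_KN_INFLUX);
}
```

With these placeholder values, a status carrying both bits does not
match, so a note marked influx by the scan path falls through to the
later branches, as the comment in the excerpt describes.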

-- 
pozdrawiam / with regards
Paweł Pękala