On Fri, Dec 21, 2012 at 4:21 PM, Garrett Cooper <yanegomi_at_gmail.com> wrote: > On Mon, Dec 10, 2012 at 3:18 PM, <mdf_at_freebsd.org> wrote: >> On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd <adrian_at_freebsd.org> wrote: >>> 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf >>> after it's finalised/freed. >>> >>> I have a similar bug showing up on ath(4) RX. :( >> >> Compile with DEBUG_MEMGUARD in the kernel configuration, and then set >> vm.memguard.desc to the name of the UMA zone used for the 9216 byte >> allocations, mbuf_jumbo_9k. This should cause a panic when the memory >> is touched after free. > > Tada (dang, that's nifty stuff)! > > # sysctl vm.memguard.desc=mbuf_jumbo_9k > vm.memguard.descM: -> mbuf_jumboem_9k > # ory modified after free 0xffffff8000401000(9216) val=0 _at_ 0xffffff8000401000 > Memory modified after free 0xffffff8000405000(9216) val=0 _at_ 0xffffff8000405000 > Memory modified after free 0xffffff8000409000(9216) val=5a5a5a5a _at_ > 0xffffff8000409000 > > > > > > > Fatal trap 1: privileged instruction fault while in kernel mode > Fatal trap 1: privileged instruction fault while in kernel mode > Memory modified after free 0xffffff800040d000(9216) val=5a5a5a5a _at_ > 0xffffff800040d000 > Fatal trap 1: privileged instruction fault while in kernel mode > cpuid = 3; > cpuid = 1; > apic id = 02 > cpuid = 0; apic id = 06 > apic id = 00 > instruction pointer = 0x20:0xffffffff80af5099 > instruction pointer = 0x20:0xffffffff80af5099 > instruction pointer = 0x20:0xffffffff80af5099 > Fatal trap 1: privileged instruction fault while in kernel mode > stack pointer = 0x28:0xffffff8496fff880 > stack pointer = 0x28:0xffffff8496fe1880 > cpuid = 2; frame pointer = 0x28:0xffffff8496fff8b0 > frame pointer = 0x28:0xffffff8496fe18b0 > stack pointer = 0x28:0xffffff849705d880 > code segment = base 0x0, limit 0xfffff, type 0x1b > frame pointer = 0x28:0xffffff849705d8b0 > apic id = 04 > code segment = base 0x0, limit 0xfffff, type 0x1b > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > = DPL 0, pres 1, long 1, def32 0, gran 1 > instruction pointer = 0x20:0xffffffff80af5099 > processor eflags = = DPL 0, pres 1, long > 1, def32 0, gran 1 > interrupt enabled, processor eflags = stack pointer = > 0x28:0xffffff8497067880 > interrupt enabled, resume, resume, frame pointer = > 0x28:0xffffff84970678b0 > IOPL = 0 > code segment = base 0x0, limit 0xfffff, type 0x1b > current process = = DPL 0, pres 1, long > 1, def32 0, gran 1 > processor eflags = 12 (irq280: ix0:que 3) > ilock order reversal: (Giant after non-sleepable) > 1st 0xfffffe0078148b38 ix0:rx(3) (ix0:rx(3)) _at_ > /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4296 > 2nd 0xffffffff814457b8 Giant (Giant) _at_ /usr/src/sys/dev/usb/input/ukbd.c:1946 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff8496fff320 > kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff8496fff3d0 > witness_checkorder() at witness_checkorder+0xc47/frame 0xffffff8496fff450 > __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xffffff8496fff490 > ukbd_poll() at ukbd_poll+0x28/frame 0xffffff8496fff4b0 > kbdmux_poll() at kbdmux_poll+0x5b/frame 0xffffff8496fff4d0 > cngrab() at cngrab+0x35/frame 0xffffff8496fff4f0 > kdb_trap() at kdb_trap+0x124/frame 0xffffff8496fff550 > trap_fatal() at trap_fatal+0x345/frame 0xffffff8496fff5b0 > trap() at trap+0x836/frame 0xffffff8496fff7c0 > calltrap() at calltrap+0x8/frame 0xffffff8496fff7c0 > --- trap 0x1, rip = 0xffffffff80af5099, rsp = 0xffffff8496fff880, rbp > = 0xffffff8496fff8b0 --- > uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xffffff8496fff8b0 > mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xffffff8496fff8e0 > uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xffffff8496fff960 > m_getjcl() at m_getjcl+0xce/frame 0xffffff8496fff9a0 > ixgbe_refresh_mbufs() at ixgbe_refresh_mbufs+0x77/frame 0xffffff8496fffa20 > ixgbe_rxeof() at ixgbe_rxeof+0x5ce/frame 0xffffff8496fffad0 > ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xffffff8496fffb20 > intr_event_execute_handlers() at > intr_event_execute_handlers+0x90/frame 0xffffff8496fffb60 > ithread_loop() at ithread_loop+0x161/frame 0xffffff8496fffbb0 > fork_exit() at fork_exit+0x84/frame 0xffffff8496fffbf0 > fork_trampoline() at fork_trampoline+0xe/frame 0xffffff8496fffbf0 > --- trap 0, rip = 0, rsp = 0xffffff8496fffcb0, rbp = 0 --- > [ thread pid 12 tid 100218 ] > Stopped at uma_find_refcnt+0x79: > db> > > This looks interesting: > > $ git diff sys/dev/ixgbe/ixgbe.c > diff --git a/sys/dev/ixgbe/ixgbe.c b/sys/dev/ixgbe/ixgbe.c > index 40f5488..19f842b 100644 > --- a/sys/dev/ixgbe/ixgbe.c > +++ b/sys/dev/ixgbe/ixgbe.c > _at__at_ -897,8 +897,10 _at__at_ ixgbe_qflush(struct ifnet *ifp) > > for (int i = 0; i < adapter->num_queues; i++, txr++) { > IXGBE_TX_LOCK(txr); > - while ((m = buf_ring_dequeue_sc(txr->br)) != NULL) > + while ((m = buf_ring_dequeue_sc(txr->br)) != NULL) { > m_freem(m); > + m = NULL; > + } > IXGBE_TX_UNLOCK(txr); > } > if_qflush(ifp); > > I'm going to recompile the driver and restart the network a few > times (that's the sticking point) and see what happens. Now the T4 is angry: ^C^C^C^C^C^C^C^C^C^C^C^C^C^C^CMemory modified after free 0xffffff8000401000(9216) val=0 _at_ 0xffffff8000401000 Fatal trap 1: privileged instruction fault while in kernel mode cpuid = 0; apic id = 00 instruction pointer = 0x20:0xffffffff80af5099 stack pointer = 0x28:0xffffff8496f41910 frame pointer ^C = 0x28:0xffffff8496f41940 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 12 (irq265: t4nex0:1.0) lock order reversal: (Giant after non-sleepable) 1st 0xfffffe00790eb690 cxgbe1 rxq0-fl (cxgbe1 rxq0-fl) _at_ /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_sge.c:1008 2nd 0xffffffff814457b8 Giant (Giant) _at_ /usr/src/sys/dev/usb/input/ukbd.c:1946 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff8496f413b0 kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff8496f41460 witness_checkorder() at witness_checkorder+0xc47/frame 0xffffff8496f414e0 __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xffffff8496f41520 ukbd_poll() at ukbd_poll+0x28/frame 0xffffff8496f41540 kbdmux_poll() at kbdmux_poll+0x5b/frame 0xffffff8496f41560 cngrab() at cngrab+0x35/frame 0xffffff8496f41580 kdb_trap() at kdb_trap+0x124/frame 0xffffff8496f415e0 trap_fatal() at trap_fatal+0x345/frame 0xffffff8496f41640 trap() at trap+0x836/frame 0xffffff8496f41850 calltrap() at calltrap+0x8/frame 0xffffff8496f41850 --- trap 0x1, rip = 0xffffffff80af5099, rsp = 0xffffff8496f41910, rbp = 0xffffff8496f41940 --- uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xffffff8496f41940 mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xffffff8496f41970 uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xffffff8496f419f0 refill_fl() at refill_fl+0x28f/frame 0xffffff8496f41a60 service_iq() at service_iq+0x821/frame 0xffffff8496f41b00 t4_intr() at t4_intr+0x2e/frame 0xffffff8496f41b20 intr_event_execute_handlers() at intr_event_execute_handlers+0x90/frame 0xffffff8496f41b60 ithread_loop() at ithread_loop+0x161/frame 0xffffff8496f41bb0 fork_exit() at fork_exit+0x84/frame 0xffffff8496f41bf0 fork_trampoline() at fork_trampoline+0xe/frame 0xffffff8496f41bf0 --- trap 0, rip = 0, rsp = 0xffffff8496f41cb0, rbp = 0 --- [ thread pid 12 tid 100196 ] Stopped at uma_find_refcnt+0x79: Guess I'll rebuild the modules with the changes that np_at_ just made today and see if that changes anything... Thanks, -GarrettReceived on Fri Dec 21 2012 - 23:53:24 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:33 UTC