Re: "Memory modified after free" - by whom?

From: Garrett Cooper <yanegomi_at_gmail.com>
Date: Fri, 21 Dec 2012 16:47:15 -0800
On Fri, Dec 21, 2012 at 4:21 PM, Garrett Cooper <yanegomi_at_gmail.com> wrote:
> On Mon, Dec 10, 2012 at 3:18 PM,  <mdf_at_freebsd.org> wrote:
>> On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd <adrian_at_freebsd.org> wrote:
>>> 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf
>>> after it's finalised/freed.
>>>
>>> I have a similar bug showing up on ath(4) RX. :(
>>
>> Compile with DEBUG_MEMGUARD in the kernel configuration, and then set
>> vm.memguard.desc to the name of the UMA zone used for the 9216 byte
>> allocations, mbuf_jumbo_9k.  This should cause a panic when the memory
>> is touched after free.
>
> Tada (dang, that's nifty stuff)!
>
> # sysctl vm.memguard.desc=mbuf_jumbo_9k
> vm.memguard.descM:  -> mbuf_jumboem_9k
> # ory modified after free 0xffffff8000401000(9216) val=0 _at_ 0xffffff8000401000
> Memory modified after free 0xffffff8000405000(9216) val=0 _at_ 0xffffff8000405000
> Memory modified after free 0xffffff8000409000(9216) val=5a5a5a5a _at_
> 0xffffff8000409000
>
>
>
>
>
>
> Fatal trap 1: privileged instruction fault while in kernel mode
> Fatal trap 1: privileged instruction fault while in kernel mode
> Memory modified after free 0xffffff800040d000(9216) val=5a5a5a5a _at_
> 0xffffff800040d000
> Fatal trap 1: privileged instruction fault while in kernel mode
> cpuid = 3;
> cpuid = 1;
> apic id = 02
> cpuid = 0; apic id = 06
> apic id = 00
> instruction pointer     = 0x20:0xffffffff80af5099
> instruction pointer     = 0x20:0xffffffff80af5099
> instruction pointer     = 0x20:0xffffffff80af5099
> Fatal trap 1: privileged instruction fault while in kernel mode
> stack pointer           = 0x28:0xffffff8496fff880
> stack pointer           = 0x28:0xffffff8496fe1880
> cpuid = 2; frame pointer                = 0x28:0xffffff8496fff8b0
> frame pointer           = 0x28:0xffffff8496fe18b0
> stack pointer           = 0x28:0xffffff849705d880
> code segment            = base 0x0, limit 0xfffff, type 0x1b
> frame pointer           = 0x28:0xffffff849705d8b0
> apic id = 04
> code segment            = base 0x0, limit 0xfffff, type 0x1b
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> instruction pointer     = 0x20:0xffffffff80af5099
> processor eflags        =                       = DPL 0, pres 1, long
> 1, def32 0, gran 1
> interrupt enabled, processor eflags     = stack pointer         =
> 0x28:0xffffff8497067880
> interrupt enabled, resume, resume, frame pointer                =
> 0x28:0xffffff84970678b0
> IOPL = 0
> code segment            = base 0x0, limit 0xfffff, type 0x1b
> current process         =                       = DPL 0, pres 1, long
> 1, def32 0, gran 1
> processor eflags        = 12 (irq280: ix0:que 3)
> ilock order reversal: (Giant after non-sleepable)
>  1st 0xfffffe0078148b38 ix0:rx(3) (ix0:rx(3)) _at_
> /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4296
>  2nd 0xffffffff814457b8 Giant (Giant) _at_ /usr/src/sys/dev/usb/input/ukbd.c:1946
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff8496fff320
> kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff8496fff3d0
> witness_checkorder() at witness_checkorder+0xc47/frame 0xffffff8496fff450
> __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xffffff8496fff490
> ukbd_poll() at ukbd_poll+0x28/frame 0xffffff8496fff4b0
> kbdmux_poll() at kbdmux_poll+0x5b/frame 0xffffff8496fff4d0
> cngrab() at cngrab+0x35/frame 0xffffff8496fff4f0
> kdb_trap() at kdb_trap+0x124/frame 0xffffff8496fff550
> trap_fatal() at trap_fatal+0x345/frame 0xffffff8496fff5b0
> trap() at trap+0x836/frame 0xffffff8496fff7c0
> calltrap() at calltrap+0x8/frame 0xffffff8496fff7c0
> --- trap 0x1, rip = 0xffffffff80af5099, rsp = 0xffffff8496fff880, rbp
> = 0xffffff8496fff8b0 ---
> uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xffffff8496fff8b0
> mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xffffff8496fff8e0
> uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xffffff8496fff960
> m_getjcl() at m_getjcl+0xce/frame 0xffffff8496fff9a0
> ixgbe_refresh_mbufs() at ixgbe_refresh_mbufs+0x77/frame 0xffffff8496fffa20
> ixgbe_rxeof() at ixgbe_rxeof+0x5ce/frame 0xffffff8496fffad0
> ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xffffff8496fffb20
> intr_event_execute_handlers() at
> intr_event_execute_handlers+0x90/frame 0xffffff8496fffb60
> ithread_loop() at ithread_loop+0x161/frame 0xffffff8496fffbb0
> fork_exit() at fork_exit+0x84/frame 0xffffff8496fffbf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xffffff8496fffbf0
> --- trap 0, rip = 0, rsp = 0xffffff8496fffcb0, rbp = 0 ---
> [ thread pid 12 tid 100218 ]
> Stopped at      uma_find_refcnt+0x79:
> db>
>
> This looks interesting:
>
> $ git diff sys/dev/ixgbe/ixgbe.c
> diff --git a/sys/dev/ixgbe/ixgbe.c b/sys/dev/ixgbe/ixgbe.c
> index 40f5488..19f842b 100644
> --- a/sys/dev/ixgbe/ixgbe.c
> +++ b/sys/dev/ixgbe/ixgbe.c
> _at__at_ -897,8 +897,10 _at__at_ ixgbe_qflush(struct ifnet *ifp)
>
>         for (int i = 0; i < adapter->num_queues; i++, txr++) {
>                 IXGBE_TX_LOCK(txr);
> -               while ((m = buf_ring_dequeue_sc(txr->br)) != NULL)
> +               while ((m = buf_ring_dequeue_sc(txr->br)) != NULL) {
>                         m_freem(m);
> +                       m = NULL;
> +               }
>                 IXGBE_TX_UNLOCK(txr);
>         }
>         if_qflush(ifp);
>
>     I'm going to recompile the driver and restart the network a few
> times (that's the sticking point) and see what happens.

Now the T4 is angry:

^C^C^C^C^C^C^C^C^C^C^C^C^C^C^CMemory modified after free
0xffffff8000401000(9216) val=0 _at_ 0xffffff8000401000


Fatal trap 1: privileged instruction fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff80af5099
stack pointer           = 0x28:0xffffff8496f41910
frame pointer        ^C   = 0x28:0xffffff8496f41940
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq265: t4nex0:1.0)
lock order reversal: (Giant after non-sleepable)
 1st 0xfffffe00790eb690 cxgbe1 rxq0-fl (cxgbe1 rxq0-fl) _at_
/usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_sge.c:1008
 2nd 0xffffffff814457b8 Giant (Giant) _at_ /usr/src/sys/dev/usb/input/ukbd.c:1946
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff8496f413b0
kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff8496f41460
witness_checkorder() at witness_checkorder+0xc47/frame 0xffffff8496f414e0
__mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xffffff8496f41520
ukbd_poll() at ukbd_poll+0x28/frame 0xffffff8496f41540
kbdmux_poll() at kbdmux_poll+0x5b/frame 0xffffff8496f41560
cngrab() at cngrab+0x35/frame 0xffffff8496f41580
kdb_trap() at kdb_trap+0x124/frame 0xffffff8496f415e0
trap_fatal() at trap_fatal+0x345/frame 0xffffff8496f41640
trap() at trap+0x836/frame 0xffffff8496f41850
calltrap() at calltrap+0x8/frame 0xffffff8496f41850
--- trap 0x1, rip = 0xffffffff80af5099, rsp = 0xffffff8496f41910, rbp
= 0xffffff8496f41940 ---
uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xffffff8496f41940
mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xffffff8496f41970
uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xffffff8496f419f0
refill_fl() at refill_fl+0x28f/frame 0xffffff8496f41a60
service_iq() at service_iq+0x821/frame 0xffffff8496f41b00
t4_intr() at t4_intr+0x2e/frame 0xffffff8496f41b20
intr_event_execute_handlers() at
intr_event_execute_handlers+0x90/frame 0xffffff8496f41b60
ithread_loop() at ithread_loop+0x161/frame 0xffffff8496f41bb0
fork_exit() at fork_exit+0x84/frame 0xffffff8496f41bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xffffff8496f41bf0
--- trap 0, rip = 0, rsp = 0xffffff8496f41cb0, rbp = 0 ---
[ thread pid 12 tid 100196 ]
Stopped at      uma_find_refcnt+0x79:

    Guess I'll rebuild the modules with the changes that np_at_ just made
today and see if that changes anything...
Thanks,
-Garrett
Received on Fri Dec 21 2012 - 23:53:24 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:33 UTC