On Fri, Aug 19, 2011 at 08:10:31AM -0400, John Baldwin wrote:
> On Friday, August 19, 2011 3:17:12 am Garrett Cooper wrote:
> > On Thu, Aug 18, 2011 at 9:31 PM, <mdf_at_freebsd.org> wrote:
> > > On Thu, Aug 18, 2011 at 5:50 PM, Garrett Cooper <yanegomi_at_gmail.com> wrote:
> > >> When loading if_alc as a module on my netbook and running
> > >> /etc/rc.d/netif restart, I can deterministically panic my netbook with
> > >> the following message:
> >
> > These repro steps were overly simplified. The complete steps are:
> >
> > 1. Attach ethernet cable to alc(4) enabled NIC.
> > 2. Boot up machine.
> > 3. Login.
> > 4. Physically remove ethernet cable from alc(4) enabled NIC.
> > 5. Run `/etc/rc.d/netif restart' as root.
> >
> > >> ) at _bus_dmamap_sync+0x51
> > >> alc_stop(c3dbb000,0,c0c51844,93a,80206910,...) at alc_stop+0x24e
> > >> alc_ioctl(c3d07400,80206910,c40423c0,c06a7935,c0914e3c,...) at alc_ioctl+0x22e
> > >> ifioctl(c45029c0,80206910,c40423c0,c40505c0,c4528c00,...) at ifioctl+0xc98
> > >> soo_ioctl(c4574e00,80206910,c40423c0,c413e680,c40505c0,...) at soo_ioctl+0x401
> > >> kern_ioctl(c40505c0,3,80206910,c40423c0,c40423c0,...) at kern_ioctl+0x1d7
> > >> ioctl(c40505c0,e6ca3cec,e6ca3d28,c08e929d,0,...) at ioctl+0x118
> > >> syscallenter(c40505c0,e6ca3ce4,e6ca3ce4,0,0,...) at syscallenter+0x23f
> > >> syscall(e6ca3d28) at syscall+0x2e
> > >> Xint0x80_syscall() at Xint0x80_syscall+0x21
> > >> --- syscall (54kernel trap 12 with interrupts disabled
> > >> Kernel page fault with the following non-sleepable locks held:
> > >> exclusive sleep mutex alc0 (network driver) r = 0 (0xc3dbc608) locked
> > >> _at_ /usr/src/sys/modules/alc/../../dev/alc/if_alc.c:2362
> > >> KDB: stack backtrace:
> > >> db_trace_self_wrapper(c08e727a,80,6e726500,74206c65,20706172,...) at
> > >> db_trace_self_wrapper+0x26
> > >> kdb_backtrace(93a,0,ffffffff,c0ad6114,e6ca323c,...) at kdb_backtrace+0x2a
> > >> _witness_debugger(c08e9f67,e6ca3250,4,1,0,...) at _witness_debugger+0x1e
> > >> witness_warn(5,0,c0924fe1,c097df50,c3e42b00,...) at witness_warn+0x1f1
> > >> trap(e6ca32dc) at trap+0x15a
> > >> calltrap() at calltrap+0x6
> > >>
> > >> I tried to track down what the exact issue was, but I got lost
> > >> (the locking sort of looks ok to me, but I'm still not an expert with
> > >> mutex(9)).
> > >> I still have the vmcore and can provide more helpful details when requested.
> > >
> > > The locking itself is almost certainly fine. The error message is not
> > > very helpful, but what went wrong was the page fault. You just happen
> > > to panic on a witness warning before vm_fault can panic due to a bad
> > > address.
> > >
> > > The alc(4) maintainer would probably like info on the trap (line of
> > > code and where the bad pointer came from).
> >
> > I talked to Xin a bit and as he noted the panic was just a symptom
> > of the actual issue at hand. I think the problem is that the rx ring's
> > rx_m value isn't set to NULL when an error occurred, but getting to
> > the exact problem at hand, the following call is failing:
> >
> > if (bus_dmamap_load_mbuf_sg(sc->alc_cdata.alc_rx_tag, // <-- HERE
> >     sc->alc_cdata.alc_rx_sparemap, m, segs, &nsegs, 0) != 0) {
> >         m_freem(m);
> >         return (ENOBUFS);
> > }
> >
> > It's failing with ENOMEM. Still trying to determine what the exact
> > reason for ENOMEM is from the x86 busdma code though..
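For context, the call Garrett quotes is from the driver's RX buffer
refresh path, which follows the common spare-map pattern: the new mbuf
is loaded into a spare DMA map first, and the ring slot (including
rx_m) is only updated once the load succeeds, so a failed load leaves
the old state in place. A rough sketch of that pattern, with names
modeled on if_alc.c but simplified; treat it as illustrative, not the
driver's exact code:

    /*
     * Sketch of the spare-map RX refresh pattern described above.
     */
    static int
    alc_newbuf_sketch(struct alc_softc *sc, struct alc_rxdesc *rxd)
    {
            struct mbuf *m;
            bus_dma_segment_t segs[1];
            bus_dmamap_t map;
            int nsegs;

            m = m_getcl(M_DONTWAIT, MT_DATA, M_PKTHDR);
            if (m == NULL)
                    return (ENOBUFS);
            m->m_len = m->m_pkthdr.len = MCLBYTES;

            /* Load into the spare map; the ring slot is untouched on failure. */
            if (bus_dmamap_load_mbuf_sg(sc->alc_cdata.alc_rx_tag,
                sc->alc_cdata.alc_rx_sparemap, m, segs, &nsegs, 0) != 0) {
                    m_freem(m);
                    /* rxd->rx_m keeps whatever value it had before. */
                    return (ENOBUFS);
            }

            /* Success: unload the old mbuf and swap in the spare map. */
            if (rxd->rx_m != NULL) {
                    bus_dmamap_sync(sc->alc_cdata.alc_rx_tag, rxd->rx_dmamap,
                        BUS_DMASYNC_POSTREAD);
                    bus_dmamap_unload(sc->alc_cdata.alc_rx_tag, rxd->rx_dmamap);
            }
            map = rxd->rx_dmamap;
            rxd->rx_dmamap = sc->alc_cdata.alc_rx_sparemap;
            sc->alc_cdata.alc_rx_sparemap = map;
            rxd->rx_m = m;
            return (0);
    }

If that error path runs while the ring is being (re)initialized, a slot
can be left with an rx_m that was never loaded for it, which would be
consistent with Garrett's observation about rx_m and with the
bus_dmamap_sync() fault in alc_stop() in the backtrace.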
>
> ENOMEM  The load request has failed due to insufficient
>         resources, and the caller specifically used the
>         BUS_DMA_NOWAIT flag.
>
> (bus_dmamap_load_mbuf*() imply BUS_DMA_NOWAIT.)
>
> You couldn't allocate enough bounce pages:
>
>     /* Reserve Necessary Bounce Pages */
>     if (map->pagesneeded != 0) {
>         mtx_lock(&bounce_lock);
>         if (flags & BUS_DMA_NOWAIT) {
>             if (reserve_bounce_pages(dmat, map, 0) != 0) {
>                 mtx_unlock(&bounce_lock);
>                 return (ENOMEM);
>             }
>
> Of course, now the question is why you even need bounce pages for alc(4):
>
>     /* Create DMA tag for Rx buffers. */
>     error = bus_dma_tag_create(
>         sc->alc_cdata.alc_buffer_tag, /* parent */
>         ALC_RX_BUF_ALIGN, 0,          /* alignment, boundary */
>         BUS_SPACE_MAXADDR,            /* lowaddr */
>         BUS_SPACE_MAXADDR,            /* highaddr */
>         NULL, NULL,                   /* filter, filterarg */
>         MCLBYTES,                     /* maxsize */
>         1,                            /* nsegments */
>         MCLBYTES,                     /* maxsegsize */
>         0,                            /* flags */
>         NULL, NULL,                   /* lockfunc, lockarg */
>         &sc->alc_cdata.alc_rx_tag);
>
> It can handle 64-bit DMA just fine, and mbuf clusters used for RX should
> always be aligned and never need bounce pages.

Right. alc(4) hardware has no DMA address limit for TX/RX buffers, but
its descriptor/status block DMA addresses should sit within a single
4GB region. alc(4) explicitly checks whether the allocated
descriptor/status blocks crossed a 4GB boundary. If alc(4) detects that
condition, it limits the DMA address space of the descriptor/status
blocks to 4GB, and that can use bounce pages, but that still does not
explain why bounce buffers are used in RX buffer allocation.

>
> --
> John Baldwin
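The descriptor-side fallback Pyun describes looks roughly like the
following. This is a simplified sketch of the pattern in the driver's
DMA allocation path; the alc_check_boundary() call stands in for the
driver's actual 4GB-crossing test, and the surrounding names
(alc_parent_tag, alc_dev) are modeled on if_alc.c rather than copied
from it:

    static int
    alc_dma_alloc_sketch(struct alc_softc *sc)
    {
            bus_addr_t lowaddr;
            int error;

            lowaddr = BUS_SPACE_MAXADDR;
    again:
            /* Parent tag covering descriptor rings and the status block. */
            error = bus_dma_tag_create(
                bus_get_dma_tag(sc->alc_dev),   /* parent */
                1, 0,                           /* alignment, boundary */
                lowaddr,                        /* lowaddr */
                BUS_SPACE_MAXADDR,              /* highaddr */
                NULL, NULL,                     /* filter, filterarg */
                BUS_SPACE_MAXSIZE_32BIT,        /* maxsize */
                0,                              /* nsegments */
                BUS_SPACE_MAXSIZE_32BIT,        /* maxsegsize */
                0,                              /* flags */
                NULL, NULL,                     /* lockfunc, lockarg */
                &sc->alc_cdata.alc_parent_tag);
            if (error != 0)
                    return (error);

            /* ... create child tags and allocate the rings here ... */

            /*
             * If any ring or the status block crossed a 4GB boundary,
             * redo the allocation with a 32-bit lowaddr.  This 32-bit
             * restriction is what can make busdma reserve bounce pages
             * for the descriptor tags.
             */
            if (lowaddr != BUS_SPACE_MAXADDR_32BIT &&
                alc_check_boundary(sc) != 0) {
                    lowaddr = BUS_SPACE_MAXADDR_32BIT;
                    /* ... tear down the tags created above ... */
                    goto again;
            }
            return (0);
    }

Note that this restriction applies only to the descriptor/status-block
tags, not to alc_rx_tag, which is why the ENOMEM from the RX buffer
load remains unexplained.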