Re: Panic using QLogic NetXtreme II BCM57810 with latest CURRENT snapshot

From: Sergey Kandaurov <pluknet_at_gmail.com>
Date: Wed, 13 May 2015 08:45:30 +0300
On 13 May 2015 at 00:21, Niclas Zeising <zeising_at_freebsd.org> wrote:
> Hi!
> I got the following panic with a QLogic NetXtreme II BCM57810 when
> trying to assign an IP address using dhclient.  The network card uses
> the bxe driver.  The machine in question is a HP DL380 Gen9.
>
> Kernel page fault with the following non-sleepable locks held:
> shared rw if_addr_lock (if_addr_lock) locked @ /usr/src/sys/net/if.c:1539
> exclusive sleep mutex bxe0_mcast_lock locked @
> /usr/src/sys/dev/bxe/bxe.c:12548
>
> See screenshots at the links below for details and a stack trace.
> I can provoke this panic at will, let me know if you need more details.
>  Unfortunately I don't have access to a console where I can copy things
> out currently, so screenshots have to do.
>
> Screenshot 1: https://people.freebsd.org/~zeising/panic1.png
> Screenshot 2: https://people.freebsd.org/~zeising/panic2.png
>

I'm not in any way a network/bxe expert, and this is probably unrelated,
but I see at least one missing unlock on the error path.

Index: sys/dev/bxe/bxe.c
===================================================================
--- sys/dev/bxe/bxe.c   (revision 282468)
+++ sys/dev/bxe/bxe.c   (working copy)
@@ -12551,6 +12551,7 @@
     rc = ecore_config_mcast(sc, &rparam, ECORE_MCAST_CMD_DEL);
     if (rc < 0) {
         BLOGE(sc, "Failed to clear multicast configuration: %d\n", rc);
+        BXE_MCAST_UNLOCK(sc);
         return (rc);
     }

BXE_MCAST_LOCK acquires two locks: the sc mutex and if_maddr_rlock(ifp).
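
To illustrate the error-path problem outside the kernel, here is a minimal
userland sketch. It is not driver code: a pthread mutex stands in for
BXE_MCAST_LOCK, and fake_config_mcast(), set_mc_list_leaky(),
set_mc_list_fixed() and mcast_lock_is_held() are hypothetical names made up
for this example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Userland stand-in for the driver's multicast lock (hypothetical). */
static pthread_mutex_t mcast_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for ecore_config_mcast(): returns whatever the caller passes. */
static int
fake_config_mcast(int rc)
{
    return (rc);
}

/* Buggy shape: the early return leaves mcast_lock held. */
static int
set_mc_list_leaky(int rc_in)
{
    int rc;

    pthread_mutex_lock(&mcast_lock);
    rc = fake_config_mcast(rc_in);
    if (rc < 0)
        return (rc);            /* lock is never dropped on this path */
    pthread_mutex_unlock(&mcast_lock);
    return (0);
}

/* Fixed shape: every exit path drops the lock before returning. */
static int
set_mc_list_fixed(int rc_in)
{
    int rc;

    pthread_mutex_lock(&mcast_lock);
    rc = fake_config_mcast(rc_in);
    if (rc < 0) {
        pthread_mutex_unlock(&mcast_lock);
        return (rc);
    }
    pthread_mutex_unlock(&mcast_lock);
    return (0);
}

/* Returns 1 if mcast_lock is currently held, 0 otherwise. */
static int
mcast_lock_is_held(void)
{
    int error;

    error = pthread_mutex_trylock(&mcast_lock);
    if (error == 0) {
        pthread_mutex_unlock(&mcast_lock);
        return (0);
    }
    assert(error == EBUSY);
    return (1);
}
```

Calling set_mc_list_fixed(-1) leaves the mutex free, while
set_mc_list_leaky(-1) returns with it still held, so the next locker
would block or trip an assertion.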

OTOH, further down the path in bxe_init_mcast_macs_list(), if_maddr_rlock is
acquired (and released) a second time, inside the if_multiaddr_array /
if_multiaddr_count functions. Is that lock recursive?

Another suspect is the bcopy done under the lock. The calling function is
probably inlined into bxe_handle_rx_mode_tq() in the ddb trace, so the
actual place where bcopy is called is not visible.
My guess is the bcopy in bxe_init_mcast_macs_list():

         bcopy((mta + (i * ETHER_ADDR_LEN)), mc_mac->mac, ETHER_ADDR_LEN);

Previously, there was a pointer assignment, see stable/10:

        mc_mac->mac = (uint8_t *)LLADDR((struct sockaddr_dl *)ifma->ifma_addr);

mc_mac itself is malloc(M_ZERO)'ed, so mc_mac->mac is NULL and the bcopy
writes through a NULL pointer.
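
A small userland sketch of the difference, assuming mac is a plain
uint8_t pointer member as the stable/10 assignment suggests. The struct
mcast_elem type and the elem_alloc() / elem_set_mac() helpers are
hypothetical and only mirror the shape of the driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ETHER_ADDR_LEN 6

/* Simplified stand-in for the list element; only the pointer matters. */
struct mcast_elem {
    uint8_t *mac;
};

/* Zeroed allocation, analogous to malloc(..., M_ZERO) in the kernel. */
static struct mcast_elem *
elem_alloc(void)
{
    return (calloc(1, sizeof(struct mcast_elem)));
}

/*
 * Point the element at the i-th address in the flat table mta.
 * Nothing is copied, so no destination storage is required.
 * A bcopy/memcpy into e->mac here would instead write to address 0.
 */
static void
elem_set_mac(struct mcast_elem *e, uint8_t *mta, int i)
{
    assert(e != NULL && mta != NULL);
    e->mac = mta + (i * ETHER_ADDR_LEN);
}
```

With the assignment, the element only borrows storage from mta, so mta
has to stay valid for as long as the list is in use.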

Probably the bcopy should be reverted to the assignment (not even compile tested):

Index: sys/dev/bxe/bxe.c
===================================================================
--- sys/dev/bxe/bxe.c   (revision 282468)
+++ sys/dev/bxe/bxe.c   (working copy)
@@ -12506,7 +12506,7 @@
                                                       to be  different */
     for(i=0; i< mcnt; i++) {

-        bcopy((mta + (i * ETHER_ADDR_LEN)), mc_mac->mac, ETHER_ADDR_LEN);
+        mc_mac->mac = (uint8_t *)(mta + (i * ETHER_ADDR_LEN));
         ECORE_LIST_PUSH_TAIL(&mc_mac->link, &p->mcast_list);

         BLOGD(sc, DBG_LOAD,

-- 
wbr,
pluknet
Received on Wed May 13 2015 - 03:45:32 UTC
