Re: sk ethernet driver: watchdog timeout

From: Bruce Evans <bde_at_zeta.org.au>
Date: Thu, 8 Apr 2004 00:17:06 +1000 (EST)
On Wed, 7 Apr 2004, Palle Girgensohn wrote:

> --On Wednesday, March 17, 2004 00.21.44 +0100 "Arno J. Klaassen"
> <arno_at_heho.snv.jussieu.fr> wrote:
>
> > Hello,
> >
> >> I have an ASUS motherboard A7V8X-E Deluxe with onboard 10/100/1000
> >> Mbit/s NIC from Marvell Semiconductor.
> >>
> >> My problem is that it sometimes locks up with the error message
> >>
> >>  sk0: watchdog timeout
> >
> > I have a similar problem with 3Com cards on an ASUS A7N266;
> > I just post in case this might be related (and in hope for
> > a hint for a solution )
>
> Hi again,
>
> Since this thread started I've tried this on several more systems, with
> exactly the same results. Is anyone else experiencing this? Is there
> anything I can do to help fix it?

The following patch reduces the problem on A7V8X-E a little.  It limits
the tx queue to 1 packet and fixes handling of the timeout on txeof.
The first part probably makes the second part a no-op.  Without this,
my A7V8X-E hangs on even light nfs activity (e.g., copying a 1MB file
to nfs).  With it, it takes heavier nfs activity to hang (makeworld
never completes, and a flood ping always hangs).
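
As a rough illustration of what the two changes are meant to do, here is a
small stand-alone model.  It is not driver code: struct tx_model,
model_start() and model_txeof() are names made up for this sketch, and the
5-second arm value in the start path is an assumption chosen to match the
value the patch uses in txeof.

```c
/* Hypothetical stand-in for the per-interface state; not struct sk_if_softc. */
struct tx_model {
	int tx_cnt;	/* packets handed to the chip and not yet reclaimed */
	int if_timer;	/* watchdog countdown in seconds; 0 means disarmed */
};

/*
 * Start path: queue at most one packet.  Returns 1 if a packet was
 * queued, 0 if one is still outstanding and nothing was sent.
 */
static int
model_start(struct tx_model *m)
{
	if (m->tx_cnt > 0)
		return (0);
	m->tx_cnt++;
	m->if_timer = 5;	/* assumed: arm the watchdog for the new packet */
	return (1);
}

/*
 * Completion path: reclaim up to 'done' packets, then set the timer once,
 * based on what is still pending, instead of clearing it per packet.
 */
static void
model_txeof(struct tx_model *m, int done)
{
	while (done-- > 0 && m->tx_cnt > 0)
		m->tx_cnt--;
	/* Disarm only when nothing is pending; otherwise keep watching. */
	m->if_timer = (m->tx_cnt == 0) ? 0 : 5;
}
```

In this model a second model_start() call while a packet is pending does
nothing, and a model_txeof() call that reclaims nothing leaves the watchdog
armed, which is the behaviour the patch aims for.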

I first suspected an interrupt-related bug, but the bug seems to be
more hardware-specific.  Examination of the output queues shows that
the tx sometimes just stops before processing all packets.  Resetting
in sk_watchdog() doesn't always fix the problem, and the timeout usually
stops firing after a couple of unsuccessful resets, giving a completely
hung device.  But the problem may be related to interrupt timing, since
it occurs much less often under RELENG_4.  RELENG_4 hangs about as often
without this hack as -current does with it.

nv0 hangs similarly.  fxp0 just works.

%%%
Index: if_sk.c
===================================================================
RCS file: /home/ncvs/src/sys/pci/if_sk.c,v
retrieving revision 1.78
diff -u -2 -r1.78 if_sk.c
--- if_sk.c	31 Mar 2004 12:35:51 -0000	1.78
+++ if_sk.c	1 Apr 2004 07:33:58 -0000
@@ -1830,4 +1830,9 @@
 	SK_IF_LOCK(sc_if);

+	if (sc_if->sk_cdata.sk_tx_cnt > 0) {
+		SK_IF_UNLOCK(sc_if);
+		return;
+	}
+
 	idx = sc_if->sk_cdata.sk_tx_prod;

@@ -1853,4 +1858,5 @@
 		 */
 		BPF_MTAP(ifp, m_head);
+		break;
 	}

@@ -2000,5 +2031,4 @@
 		sc_if->sk_cdata.sk_tx_cnt--;
 		SK_INC(idx, SK_TX_RING_CNT);
-		ifp->if_timer = 0;
 	}

@@ -2007,4 +2037,6 @@
 	if (cur_tx != NULL)
 		ifp->if_flags &= ~IFF_OACTIVE;
+
+	ifp->if_timer = (sc_if->sk_cdata.sk_tx_cnt == 0) ? 0 : 5;

 	return;
%%%

Bruce
Received on Wed Apr 07 2004 - 05:17:50 UTC
