Re: xl0: watchdog timeout

From: Matt Smith <matt_at_xtaz.net>
Date: Mon, 24 Nov 2003 23:26:12 +0000
Matt Smith wrote:
> Matt Smith wrote:
> 
>> Jimmy Selgen wrote:
>>
>>> On Fri, 2003-11-21 at 21:29, Kris Kennaway wrote:
>>>
>>>> On Fri, Nov 21, 2003 at 09:22:49PM +0100, Jimmy Selgen wrote:
>>>> I saw this with some of sam's locking changes that (temporarily) broke
>>>> DUMMYNET.  I see you're using ipfilter - it's possible that this
>>>> configuration has not been well-tested.  Are you passing much traffic
>>>> through ipfilter on this box?
>>>
>>>
>>>
>>> The box in question is my workstation, so I guess I'm not passing that
>>> much traffic through ipfilter. Also, when I said that the NIC still
>>> worked, I might have misled you a bit. I had about 5-10 timeouts while
>>> scp'ing the dmesg output to my other workstation.
>>> Data seems to move from userland to the kernel, then get stuck in
>>> buffers there for 10-15 seconds, "generating" timeouts, before it is
>>> shipped off. I assume this is expected behaviour when a NIC isn't
>>> behaving correctly.
>>>
>>>
>>>> It would be helpful if you can do a binary search to narrow down when
>>>> the problem started.
>>>
>>>
>>>
>>> What would you have me search? I'm a fairly seasoned C programmer (12
>>> years experience, some of them doing RTOS kernel work), but don't know
>>> much about FreeBSD kernel development, or the process of checking out
>>> different kernel revisions.
>>>
>>>
>>> I've tried a build without IPFILTER, and the problem still exists.
>>> I've also tried booting with ACPI disabled, and the problem is still
>>> there.
>>>
>>> I have attached a copy of my kernel config file, in case I'm doing
>>> something wrong.
>>>
>>
>> <snip kernel file>
>>
>> I have just noticed that my xl0 card is misbehaving as well. I have a 
>> 3c905c in my desktop and noticed that an ftp of a file from another 
>> machine on the lan (100 meg switched) was only going at around 
>> 70KB/sec. Normally I get around 9MB/sec.
>>
>> A netstat -bi xl0 shows lots of errors:
>>
>> Name    Mtu Network       Address              Ipkts Ierrs     Ibytes    Opkts Oerrs     Obytes  Coll
>> xl0    1500 <Link#1>      00:04:76:8d:c5:fd  3081878 217616 3778632119  2451968     6  368229701     0
>>
>> I also have this in my messages file:
>>
>> xl0: transmission error: 90
>> xl0: tx underrun, increasing tx start threshold to 180 bytes
>> xl0: transmission error: 90
>> xl0: tx underrun, increasing tx start threshold to 240 bytes
>> xl0: transmission error: 90
>> xl0: tx underrun, increasing tx start threshold to 300 bytes
>> xl0: transmission error: 90
>> xl0: tx underrun, increasing tx start threshold to 360 bytes
>> xl0: transmission error: 90
>> xl0: tx underrun, increasing tx start threshold to 420 bytes
>>
>> I do not currently have any debugging options compiled into this kernel.
>>
>> FreeBSD fraggle.xtaz.co.uk 5.1-CURRENT FreeBSD 5.1-CURRENT #0: Tue Nov 
>> 18 20:05:52 GMT 2003 
>> root_at_fraggle.xtaz.co.uk:/usr/obj/usr/src/sys/FRAGGLE  i386
>>
>> I am actually in the process of building a new world/kernel to update 
>> it again as I thought it might be something that's fixed. I 
>> unfortunately cannot boot the old kernel to see if it works fine in 
>> that because of the statfs changes, so it *could* possibly be that the 
>> NIC has gone funny.
>>
>> I also have a 3c905a and a 3c905b in my router machine and this is 
>> showing no issues at all with the same dated kernel.
>>http://xtaz.net/
>> Matt.
>>
> 
> I am now running a 5.2-BETA kernel from today and still have the problem 
> with my xl0 card here. I can only get a max throughput of around 
> 110KB/sec through it. And I am getting huge amounts of errors in the 
> interface stats (5 minutes after booting):
> 
> Name    Mtu Network       Address              Ipkts Ierrs     Ibytes    Opkts Oerrs     Obytes  Coll
> xl0    1500 <Link#1>      00:04:76:8d:c5:fd   217042  1290   57669634   309460     0  208178476     0
> 
> So the question is: has my network card died, so that I need to throw 
> it out, or is this related to Jimmy Selgen's email about the watchdog 
> timeouts?
> 
> It's a shame I can't boot an old kernel to test really.
> 
> Matt.
> 

I have done some testing on this. I've changed the network cable, switch 
port etc. No effect.

I've found though that if I ftp to this box and GET a file it goes at 
around 6MB/sec. But if I PUT a file it goes at 100KB/sec.
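
A PUT to this box means the box is receiving, which is the direction
the Ierrs counter covers. As a rough sketch, the counters from the two
netstat -bi outputs quoted above can be turned into error percentages.
The labels and the error_rate() helper are made up for illustration,
and treating Ierrs as a fraction of Ipkts is an assumption about how
the counters relate, not something the driver documents here.

```python
# Counters copied from the two netstat -bi outputs quoted earlier in
# this thread. The dictionary labels are illustrative only.
samples = {
    "5.1-CURRENT, Nov 18 kernel": {
        "ipkts": 3081878, "ierrs": 217616, "opkts": 2451968, "oerrs": 6,
    },
    "5.2-BETA, 5 minutes after boot": {
        "ipkts": 217042, "ierrs": 1290, "opkts": 309460, "oerrs": 0,
    },
}


def error_rate(errors, packets):
    """Return errors as a percentage of packets (0.0 if no packets)."""
    return 100.0 * errors / packets if packets else 0.0


for label, c in samples.items():
    rx = error_rate(c["ierrs"], c["ipkts"])
    tx = error_rate(c["oerrs"], c["opkts"])
    print(f"{label}: input errors {rx:.2f}%, output errors {tx:.4f}%")
```

On both kernels the input side comes out far worse than the output
side, which at least fits the slow PUT and the faster GET.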

Previously this has worked at around 9-10MB/sec both ways. I can't place 
a date on it though because I've not tried to do large file transfers 
for a long time and only just noticed it this week.

So it looks like it is driver related I guess. The "buffer" scenario 
Jimmy reported looks likely.
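
If it is driver related, the binary search Kris suggested earlier in
the thread could be sketched roughly as below. This is only an outline
under assumptions: first_bad() and is_bad() are hypothetical names, the
dates are invented, and the real test step would be "check out sources
as of that date, build, boot, and repeat the ftp PUT". Because of the
statfs changes mentioned above, each step may well need a matching
world and not just a kernel.

```python
from datetime import date, timedelta


def first_bad(good, bad, is_bad):
    """Return the earliest date in (good, bad] for which is_bad(d) is True.

    Assumes `good` tests clean, `bad` shows the problem, and the problem
    does not go away again once it has been introduced.
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid      # problem already present at mid: look earlier
        else:
            good = mid     # mid is still fine: look later
    return bad


def is_bad(d):
    # Hypothetical stand-in for building and testing the sources as of
    # date d. The cut-off date here is invented for the example.
    return d >= date(2003, 11, 10)


print(first_bad(date(2003, 10, 1), date(2003, 11, 18), is_bad))
```

Each round halves the date range, so a window of about seven weeks
needs around six builds rather than one per day.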

Matt.
Received on Mon Nov 24 2003 - 14:26:16 UTC