em(4) watchdog timeout

From: Jeremie Le Hen <jeremie_at_le-hen.org>
Date: Fri, 21 Jul 2006 14:34:48 +0200

Hi,

I am running a two-month-old -CURRENT (dated May 24), and I am
experiencing watchdog timeouts with my em(4) adapter when running
a CPU-bound workload involving a computational Perl script.
Unfortunately this bug occurs very infrequently; I can't trigger
it each time I run this job.

FWIW, the command line is something like this:
%   gzip -dc data.gz | perlscript > chewed_data

I recompiled em(4) with DEBUG_INIT, DEBUG_IOCTL and DEBUG_HW
all set to 1, but the output doesn't seem to provide any valuable
information:

% Jul 21 11:17:14 neuneuf kernel: em0: watchdog timeout -- resetting
% Jul 21 11:17:14 neuneuf kernel: em_init: begin
% Jul 21 11:17:14 neuneuf kernel: em_stop: begin
% Jul 21 11:17:14 neuneuf kernel: free_transmit_structures: begin
% Jul 21 11:17:14 neuneuf kernel: free_receive_structures: begin
% Jul 21 11:17:14 neuneuf kernel: em_init: pba=48K
% Jul 21 11:17:14 neuneuf kernel: em_hardware_init: begin
% Jul 21 11:17:14 neuneuf kernel: em_initialize_transmit_unit: begin
% Jul 21 11:17:14 neuneuf kernel: Base = 1ebf9000, Length = 1000
% Jul 21 11:17:14 neuneuf kernel: 
% Jul 21 11:17:14 neuneuf kernel: em_set_multi: begin
% Jul 21 11:17:14 neuneuf kernel: em_initialize_receive_unit: begin
% Jul 21 11:17:14 neuneuf kernel: em0: link state changed to DOWN
% Jul 21 11:17:16 neuneuf kernel: em0: link state changed to UP
% Jul 21 11:17:16 neuneuf kernel: ioctl rcv'd: SIOCxIFMEDIA (Get/Set Interface Media)
% Jul 21 11:17:16 neuneuf kernel: em_media_status: begin
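
Because the timeouts are rare, it can help to pull their timestamps out of the system log and compare them against the times the job was running. The following is only a minimal sketch, assuming syslog lines shaped like the excerpt above; the function name `watchdog_timeouts` and the regular expression are illustrative and not part of the driver or any existing tool.

```python
import re

# Matches syslog lines shaped like:
#   Jul 21 11:17:14 neuneuf kernel: em0: watchdog timeout -- resetting
# (the pattern is an assumption based on the log excerpt above)
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<ifname>\w+):\s+watchdog timeout"
)


def watchdog_timeouts(lines):
    """Return (timestamp, interface) pairs, one per watchdog timeout line."""
    hits = []
    for line in lines:
        match = WATCHDOG_RE.match(line.strip())
        if match:
            hits.append((match.group("stamp"), match.group("ifname")))
    return hits


# Two lines copied from the log above; only the first is a timeout.
sample_log = [
    "Jul 21 11:17:14 neuneuf kernel: em0: watchdog timeout -- resetting",
    "Jul 21 11:17:14 neuneuf kernel: em_init: begin",
]

for stamp, ifname in watchdog_timeouts(sample_log):
    print(stamp, ifname)
```

In practice the list would be replaced by the lines of the real log file.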

The chip is:
% em0@pci3:11:0:  class=0x020000 card=0x02871014 chip=0x10138086 rev=0x00 hdr=0x00
%     vendor   = 'Intel Corporation'
%     device   = '82541EI Gigabit Ethernet Controller (Copper)'
%     class    = network
%     subclass = ethernet

The interrupt is shared with uhci0:
% neuneuf:/sys:112# vmstat -i
% interrupt                          total       rate
% irq1: atkbd0                       39216          0
% irq14: ata0                      4801030          3
% irq16: em0 uhci0++             919491852        688
% irq19: uhci1                       35141          0
% irq23: ehci0                           1          0
% cpu0: timer                   2670435076       1999
% Total                         3594802316       2692
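
To check for shared interrupt lines without reading the table by eye, the `vmstat -i` output can be scanned for IRQs that list more than one device. This is a rough sketch only: the function name `shared_irqs` is hypothetical, and it assumes that trailing `+` characters on a device name stand for further handlers whose names did not fit in the column.

```python
def shared_irqs(vmstat_output):
    """Map each IRQ that has more than one handler to its visible device names."""
    shared = {}
    for line in vmstat_output.splitlines():
        source, sep, rest = line.partition(":")
        source = source.strip()
        if not sep or not source.startswith("irq"):
            continue  # skip the header, the cpu timer line and the Total line
        fields = rest.split()
        if len(fields) < 3:
            continue  # need at least one device name plus total and rate
        devices = fields[:-2]  # the last two fields are total and rate
        # Assumption: '+' (or '*') suffixes mark handlers whose names were cut off.
        truncated = any(dev.endswith(("+", "*")) for dev in devices)
        if len(devices) > 1 or truncated:
            shared[source] = [dev.rstrip("+*") for dev in devices]
    return shared


# Sample taken from the vmstat -i output above.
sample = """interrupt                          total       rate
irq1: atkbd0                       39216          0
irq14: ata0                      4801030          3
irq16: em0 uhci0++             919491852        688
irq19: uhci1                       35141          0
cpu0: timer                   2670435076       1999
Total                         3594802316       2692"""

for irq, devices in shared_irqs(sample).items():
    print(irq, "shared by", ", ".join(devices))
```

If the assumption about `+` holds, irq16 may carry more handlers than the two names shown.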

I can't try DEVICE_POLLING right now since, IIRC, I would have to recompile
the whole kernel (right now I am using the if_em module so that I can tune
the driver without rebooting).

Thank you.
Regards,
-- 
Jeremie Le Hen
< jeremie at le-hen dot org >< ttz at chchile dot org >
Received on Fri Jul 21 2006 - 10:34:16 UTC
