Still troubles with indefinite wait buffer errors

From: Martin Welk <mw_at_theatre.sax.de>
Date: Mon, 22 Mar 2004 07:11:57 +0100
Good morning,

I'm still having trouble with "indefinite wait buffer" errors: they have
been coming up from time to time since I upgraded to 5.2.1-RELEASE(-p1),
and I'm still trying to find the cause. I'm quite sure that I can rule out
hardware errors, but please read my "full story"... I sent it to
-questions a while ago, without any reply.

From time to time, I get the following error messages under heavy load, for
example when copying a large amount of data or when the machine is busy
compiling something from the ports:

Mar  4 03:29:00 theatre kernel: swap_pager: indefinite wait buffer: device: vinum/scratch, blkno: 190, size: 4096
Mar  4 03:29:00 theatre kernel: swap_pager: indefinite wait buffer: device: vinum/var, blkno: 536, size: 4096

This sometimes happens during the daily jobs running at night. The block
numbers differ every time, so I don't think it is a bad-block problem.
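
To back up the "block numbers differ every time" observation, the messages
could be tallied per device. This is only a minimal sketch, assuming the
messages always have the format quoted above; the function name
blocks_by_device is hypothetical and not part of any FreeBSD tool.

```python
import re
from collections import defaultdict

# Pattern for the swap_pager message format quoted in this mail.
PATTERN = re.compile(
    r"swap_pager: indefinite wait buffer: device:\s*(?P<device>\S+),"
    r"\s*blkno:\s*(?P<blkno>\d+),\s*size:\s*(?P<size>\d+)"
)


def blocks_by_device(log_text):
    """Return {device: sorted list of distinct block numbers} for all matches."""
    # Log lines may have been wrapped by a mail client, so collapse all
    # whitespace (including newlines) into single spaces before matching.
    flat = " ".join(log_text.split())
    seen = defaultdict(set)
    for match in PATTERN.finditer(flat):
        seen[match.group("device")].add(int(match.group("blkno")))
    return {device: sorted(blocks) for device, blocks in seen.items()}


# The two messages from this mail, wrapped as they appeared in the original.
sample = """
Mar  4 03:29:00 theatre kernel: swap_pager: indefinite wait buffer: device:
vinum/scratch, blkno: 190, size: 4096
Mar  4 03:29:00 theatre kernel: swap_pager: indefinite wait buffer: device:
vinum/var, blkno: 536, size: 4096
"""

print(blocks_by_device(sample))
```

Fed with a longer stretch of /var/log/messages, a block number that keeps
recurring on one device would point towards a bad block, while widely
scattered numbers would not.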

The machine is based on an Asus P2B-S board with a P-II CPU (350 MHz), 256
MBytes of memory and three ATA hard disks. In the meantime I have replaced
the ATA controller, so I'm no longer using the on-board chipset but a cheap
Silicon Image SiI0680-based one, which appears to be recognized correctly.

This machine has been running in this configuration for about a year now
and was rock-solid before (I think I started with 5.1-REL on it).

I have organized all my disks with Vinum and I'm quite happy with it: there
are two bootable 120 GByte disks, with all volumes mirrored through Vinum,
including root but not swap space. I know that it's probably not the best
idea to run swap through a logical volume manager at all, but as this is a
small home server which shouldn't swap much at all, I don't mind, and it
keeps the disk organization simpler: everything is a Vinum device :) I
would appreciate being able to swap to Vinum devices again in the future
(...latest GEOM changes...), but that's another issue. As swap space is
usually almost unused (a few hundred KBytes), I'm currently swapping to an
md device (*cough*), hoping that this will be fixed soon.

The machine runs as a small home server and DSL gateway, so it's running
ppp, natd, an ipfw-based firewall, Squid, sendmail, SpamAssassin, Samba, an
NFS server, a DHCP server, ntpd and a few other small things.

I upgraded to 5.2.1-RELEASE a few days ago, with an update to -p1 a day
later. The only other change during that time was enabling fxp1, which
hadn't been used before and which now shares IRQ 9 with the on-board
Adaptec SCSI adapter. However, when the machine showed the described
symptoms, fxp1 wasn't under heavy use, and the SCSI adapter is completely
unused - well, there's a CD-ROM and a CD-R connected to it, but they are
never used.
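
The IRQ sharing can be read off the boot messages. The following is only a
rough sketch, assuming attach lines of the form "name: <description> ...
irq N ..." as printed by this kernel; the function name shared_irqs is
hypothetical. Devices whose attach line carries no "irq N" (uhci0, for
example, which is only mentioned in a pci_cfgintr line) are not picked up.

```python
import re
from collections import defaultdict

# Matches attach lines such as
#   "fxp1: <Intel 82559 Pro/100 Ethernet> port ... irq 9 at device 9.0 on pci0"
ATTACH = re.compile(r"^(?P<dev>[a-z]+\d+): <[^>]*>.*\birq (?P<irq>\d+)\b")


def shared_irqs(dmesg_text):
    """Return {irq: [devices]} for every IRQ claimed by more than one device."""
    by_irq = defaultdict(list)
    for line in dmesg_text.splitlines():
        match = ATTACH.match(line)
        if match:
            by_irq[int(match.group("irq"))].append(match.group("dev"))
    return {irq: devs for irq, devs in by_irq.items() if len(devs) > 1}


# Attach lines copied from the boot messages of this machine.
sample = """
ahc0: <Adaptec aic7890/91 Ultra2 SCSI adapter> port 0xd000-0xd0ff mem 0xdf800000-0xdf800fff irq 9 at device 6.0 on pci0
fxp0: <Intel 82558 Pro/100 Ethernet> port 0xb800-0xb81f mem 0xdf000000-0xdf0fffff,0xe2000000-0xe2000fff irq 5 at device 7.0 on pci0
fxp1: <Intel 82559 Pro/100 Ethernet> port 0xb400-0xb43f mem 0xde000000-0xde0fffff,0xde800000-0xde800fff irq 9 at device 9.0 on pci0
pcm0: <AudioPCI ES1370> port 0xb000-0xb03f irq 5 at device 10.0 on pci0
de0: <Digital 21041 Ethernet> port 0xa800-0xa87f mem 0xdd800000-0xdd80007f irq 12 at device 11.0 on pci0
"""

print(shared_irqs(sample))
```

For these lines it reports ahc0 and fxp1 together on IRQ 9, and fxp0 and
pcm0 together on IRQ 5.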

I have two screen shots of kernel backtraces taken at the time the error
happens - "real" screen shots; please see them at
http://www.sax.de/~mw/KIF_1374.JPG and
http://www.sax.de/~mw/KIF_1377.JPG

Again, there are no ATA disk errors at all, and if I force the machine to
read the full disks (dd if=/dev/ad[n]s1c of=/dev/null) I get no ATA errors
either; it runs through smoothly with a throughput between 25 and 41
MBytes/sec.
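
For anyone who wants to repeat the read test with timing built in, here is
a minimal sketch of the same idea as the dd command above: read a device
sequentially to the end, let any I/O error surface, and compute the
throughput. The function names and the 1 MiB chunk size are assumptions of
this sketch, and it measures MiB/s, which may differ slightly from what dd
prints.

```python
import time


def read_through(path, chunk_size=1024 * 1024):
    """Read `path` sequentially to the end; return (bytes_read, seconds).

    A read error reported by the kernel is raised as OSError and is
    deliberately not caught, so a failing disk aborts the run.
    """
    total = 0
    start = time.monotonic()
    # buffering=0 gives unbuffered reads, so each read() is one request.
    with open(path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(chunk_size)
            if not chunk:  # empty result means end of device/file
                break
            total += len(chunk)
    return total, time.monotonic() - start


def mib_per_second(total_bytes, seconds):
    """Convert a byte count and a duration into MiB per second."""
    if seconds <= 0:
        return float("inf")
    return total_bytes / (1024 * 1024) / seconds
```

Run as root, read_through("/dev/ad0s1c") would correspond to one of the dd
runs; the device path is just an example following the ad[n]s1c pattern
above.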

Thanks in advance for any help.

Regards,
	Martin

(...)
FreeBSD 5.2.1-RELEASE-p1 #3: Thu Mar  4 12:26:22 CET 2004
CPU: Pentium II/Pentium II Xeon/Celeron (350.80-MHz 686-class CPU)
(...)
real memory  = 268423168 (255 MB)
avail memory = 251105280 (239 MB)
(...)
Pentium Pro MTRR support enabled
pcibios: BIOS version 2.10
Using $PIR table, 8 entries at 0xc00f0d10
pcib0: <Intel 82443BX (440 BX) host to PCI bridge> at pcibus 0 on motherboard
pci0: <PCI bus> on pcib0
pci_cfgintr: 0:6 INTA BIOS irq 9
pci_cfgintr: 0:7 INTA BIOS irq 5
pci_cfgintr: 0:9 INTA BIOS irq 9
pci_cfgintr: 0:10 INTA BIOS irq 5
pci_cfgintr: 0:11 INTA BIOS irq 12
agp0: <Intel 82443BX (440 BX) host to PCI bridge> mem 0xe4000000-0xe7ffffff at device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci_cfgintr: 0:1 INTA routed to irq 4
pcib1: slot 0 INTA is routed to irq 4
pci1: <display, VGA> at device 0.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 4.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX4 UDMA33 controller> port 0xd800-0xd80f at device 4.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
uhci0: <Intel 82371AB/EB (PIIX4) USB controller> port 0xd400-0xd41f at device 4.2 on pci0
pci_cfgintr: 0:4 INTD routed to irq 9
usb0: <Intel 82371AB/EB (PIIX4) USB controller> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
pci0: <bridge, PCI-unknown> at device 4.3 (no driver attached)
ahc0: <Adaptec aic7890/91 Ultra2 SCSI adapter> port 0xd000-0xd0ff mem 0xdf800000-0xdf800fff irq 9 at device 6.0 on pci0
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
fxp0: <Intel 82558 Pro/100 Ethernet> port 0xb800-0xb81f mem 0xdf000000-0xdf0fffff,0xe2000000-0xe2000fff irq 5 at device 7.0 on pci0
miibus0: <MII bus> on fxp0
inphy0: <i82555 10/100 media interface> on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp1: <Intel 82559 Pro/100 Ethernet> port 0xb400-0xb43f mem 0xde000000-0xde0fffff,0xde800000-0xde800fff irq 9 at device 9.0 on pci0
miibus1: <MII bus> on fxp1
inphy1: <i82555 10/100 media interface> on miibus1
inphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pcm0: <AudioPCI ES1370> port 0xb000-0xb03f irq 5 at device 10.0 on pci0
de0: <Digital 21041 Ethernet> port 0xa800-0xa87f mem 0xdd800000-0xdd80007f irq 12 at device 11.0 on pci0
de0: 21041 [10Mb/s] pass 2.1
orm0: <Option ROMs> at iomem 0xc8000-0xc8fff,0xc0000-0xc7fff on isa0
pmtimer0 on isa0
(...)
GEOM: create disk ad0 dp=0xc2dec760
ad0: 114473MB <WDC WD1200AB-00CBA1> [232581/16/63] at ata0-master UDMA33
GEOM: create disk ad1 dp=0xc2dec660
ad1: 57241MB <ST360020A> [116301/16/63] at ata0-slave UDMA33
GEOM: create disk ad2 dp=0xc2deb760
ad2: 114473MB <WDC WD1200AB-00CBA1> [232581/16/63] at ata1-master UDMA33
Waiting 15 seconds for SCSI devices to settle
GEOM: create disk cd0 dp=0xc2e1a600
GEOM: create disk cd1 dp=0xc2e19e00
(...)

-- 
      ,,Oh, there's a lot of opportunities, if you're knowing to take them,
                  you know, there's a lot of opportunities, if there aren't
                    you can make them, make or break them!'' (Tennant/Lowe)

Received on Sun Mar 21 2004 - 21:30:15 UTC
