With recent CURRENT (at least for the last 2 days, but probably longer), two of my systems can be brought to their knees (live-lock) with a simple "dd if=/dev/zero of=test bs=128k" command. I have not tested any other systems.

I keep both servers synced running 6-CURRENT:

Server #1: dual AthlonMP 2600+, Compaq SmartArray 5302/64 hardware RAID card (ciss). The card hosts two arrays, one RAID-5 built from 4 discs that holds the system and one RAID-0 built from 14 discs. All the discs are 36GB 10krpm and I have one array on each channel on the card.

Server #2: AthlonXP 2500+ with an old Maxtor 27GB UDMA66 disc for the system.

What made me take notice was that server #2 ran through a "make installkernel; make installworld" faster than server #1 during a recent upgrade. This makes no sense given the superior I/O performance of the hardware SCSI RAID array on server #1, and I know that in the past server #1 has finished the process ahead of server #2.

After the upgrade was done I ran some simple tests with 'dd', and it only took ~1 minute for the system to live-lock. Breaking into DDB and killing the 'dd' process brought the machine back to life.

I assumed the problem was ciss-related, CAM-related or SMP-related, but I just tried doing the same thing on the UP machine (server #2), and it too live-locked within a minute.

Both systems use pretty much the same config, with the only major difference being SMP or not:

* SCHED_4BSD, PREEMPTION, ADAPTIVE_GIANT, DEVICE_POLLING, HZ=2000
* debug.mpsafenet="1", debug.mpsafevfs="1"

The problem manifests itself like this: Shortly after 'dd' is started, the machine starts to swap. The swapping makes the machine very unresponsive. After about a minute or so the machine enters some sort of live-lock where the IP stack replies to ICMP echoes, but nothing else can be done.

The last test I did was on a system compiled from sources dated 2005.04.22.01.00.00 (earlier today).
The oldest system I've tested is from 2005.04.20.14.30.00 (but I did notice the system being slightly sluggish earlier in the week too, so I think the problem is older than that).

This is a serious regression! I don't know when I last did any testing with 'dd', but I'm pretty sure it was less than 3 months ago (and back then neither system live-locked).

/Daniel Eriksson

Received on Fri Apr 22 2005 - 11:13:49 UTC