On 6 Jun, Tai-hwa Liang wrote: > On Sat, 5 Jun 2004, Don Lewis wrote: >> On 5 Jun, Tai-hwa Liang wrote: >> > On Fri, 4 Jun 2004, Don Lewis wrote: >> >> On 4 Jun, Tai-hwa Liang wrote: >> >> > On Fri, 4 Jun 2004, Daniel Lang wrote: >> >> >> Hi, >> >> >> >> >> >> I just went through my syslog and stumbled across a LOR, which is >> >> >> documented on the Zabbadoz LOR page (id 009). >> >> >> >> >> >> The reason for this mail is the following kernel messages: >> >> >> >> >> >> [..] >> >> >> Jun 1 11:08:06 atrbg11 kernel: x: 2 >> >> >> Jun 1 11:08:06 atrbg11 kernel: x: 2 >> >> >> [..] >> >> >> >> >> >> Seems some leftovers from a developer who did not want to >> >> >> bother others with the gory details. ;-)) >> >> > >> >> > I guess that came from sys/dev/sound/pcm/sound.c:872. My Thinkpad T40 also >> >> > ran into this when the disk/network(combination) load is high enough. >> >> >> >> Looks likely, but I'd expect this would only get triggered when sysctl >> >> is used to set or query the number of vchans. In any case, this marks >> >> another spot where locking is broken in the sound code. >> > >> > I use the default setting without tweaking any vchan related sysctl >> > such like hw.snd.pcm0.vchans(leave it to default 0) or hw.snd.maxautovchans >> > (also left to 0 on my T40). >> >> Maybe your sound client software (or something else on the system) is >> querying for the number of vchans. Even "sysctl -a" will invoke the >> handler. > > I'm using the stock mpg123 0.59r to play MP3 files. Since I've never read > the source before, I'm not sure whether it utilised vchan related sysctls > or not. The code that calls pcm_inprog() and prints the "x: 2" debug message appears to be an attempt at implementing a reader/writer lock. I'm pretty sure the failure that triggers the debug message is harmless, other than causing the sysctl() call to return EWOULDBLOCK. The most likely trigger for this event is if the client software is writing data to sound device and is blocked in the write() syscall when another thread calls sysctl(). You could find out which process is calling sysctl() by printing curproc->p_pid in place of or in addition to x. >> > I'm wondering about whether this "bug" is ACPI related(interrupt storm?) >> > since recently when I run "make buildworld buildkernel" on one terminal and >> > do large file transmission(FTP ISO images on using em0) at the same time, >> > the sound tends to become "broken." Meanwhile, the "x: 2" message popped >> > on my console and later on my em0 doesn't work anymore(I have to do a reboot >> > to get it back; however, the other part of the system still works while >> > em0 was dead: sound continue playing, can compile/edit programs). >> >> It is quite possible that locking collisions in the sound code might >> happen more frequently when the system is busy. >> >> Do your sound hardware and em0 share the same irq? If they do, that > > Nope. > > interrupt total rate > irq0: clk 356127 99 > irq1: atkbd0 6006 1 > irq4: cbb0 em0++ 1378364 386 > irq6: cbb1 pcm0 110211 30 > irq7: ppc0 3 0 > irq8: rtc 455852 127 > irq12: psm0 16785 4 > irq13: npx0 1 0 > irq14: ata0 322853 90 > irq15: ata1 55 0 > Total 2646257 742 In that case, the em0 problem is likely to be something else. Does em0 lock up even if you don't use the sound hardware? >> might be a clue that the problems are related. Can you boot with ACPI >> disabled, and if so, does it change the symptoms? > > Whoops! My first trial on booting without ACPI -- kernel panic in > usb_get_next_event(). Apparently I have to disconnect the USB mouse > prior to kernel booting. See final section for the backtrace. Sounds like bug #3. > And, yes, the em0 didn't hang after booting to the non-ACPI environment. > However, there're still something weird happening at that moment: > > 1. There're still a couple of "x: 2" dumped on console. Test case: > "make buildworld buildkernel -DNOCLEAN" in one xterm, mpg123 in > another xterm, and providing file downloading to another Windoze > box. > > 2. While the remote box was downloading(Intel eepro 100/VE) files, > the number of interrupt on em0 was dramatically reduced to 300 ~ 500+ > per second. It was 3000+ when ACPI was enabled(no device polling in > kernel). > > 3. The system average load is surprising low(I'm using SCHED_4BSD), > about 0.06 ~ 0.30; however, the download was awfully slooowwwww. > It took my brother's Windoze XP about 25 minutes to complete the > whole downloading(3 ISO files, about 2GB in total). I'm pretty sure > that the mediaopt on em0 was 100baseTX + full-duplex. I wonder if the em0 interrupt is not getting enabled or is getting misrouted and the em0 interrupt service routing is getting triggered by another interrupt source that is happening at a lower rate. Print out the irq mapping with ACPI disabled to see what em0 is sharing its irq with. This is probably something that jhb and njl will need to look at.Received on Sun Jun 06 2004 - 11:12:42 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:37:56 UTC