Re: page fault panic tracked down (selwakeuppri())

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Sun, 04 Jan 2004 23:35:00 -0000

On  4 Jan, Stefan Ehmann wrote:
> On Sun, 2004-01-04 at 23:24, Don Lewis wrote:
>> On  4 Jan, Stefan Ehmann wrote:
>> > I took out the debug options because it was just too slow. Put back
>> > INVARIANTS (but no WITNESS) now and speed is nice again.
>> 
>> This problem is more likely to be caught by INVARIANTS than WITNESS.
>> 
>> > Applied your suggested changes which resulted in a panic. No
>> > assertions were triggered though.
>> 
>> Bummer!
> 
> Updated to plain (= no patches/hacks) again, also put in the
> DEBUG_VFS_LOCKS.
> 
> For the first time I got a backtrace that ended in the soundcard module
> - So maybe this is the right direction (on the other hand this might be
> some newly introduced error)
> 
> panic: bad bufsize
> #0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
> #1  0xc04e5198 in boot (howto=256)
>     at /usr/src/sys/kern/kern_shutdown.c:372
> #2  0xc04e5527 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
> #3  0xc07ec648 in feed_vchan_s16 () from /boot/kernel/snd_pcm.ko
> #4  0xc07e2c6d in sndbuf_feed () from /boot/kernel/snd_pcm.ko
> #5  0xc07e3225 in chn_wrfeed () from /boot/kernel/snd_pcm.ko
> #6  0xc07e327c in chn_wrintr () from /boot/kernel/snd_pcm.ko
> #7  0xc07e3990 in chn_intr () from /boot/kernel/snd_pcm.ko
> #8  0xc07fca2f in csa_intr () from /boot/kernel/snd_csa.ko
> #9  0xc07fb724 in csa_intr () from /boot/kernel/snd_csa.ko
> #10 0xc04d1692 in ithread_loop (arg=0xc1737b00)
>     at /usr/src/sys/kern/kern_intr.c:544
> #11 0xc04d0684 in fork_exit (callout=0xc04d1500 <ithread_loop>, arg=0x0,
>     frame=0x0) at /usr/src/sys/kern/kern_fork.c:796

I think this is an important clue.  I'm guessing that the
	KASSERT(sndbuf_getsize(src) >= count, ("bad bufsize"))
in feed_vchan_s16() is getting tripped.  Notice that a bit further down
we have the following code:
	count &= ~1;
	bzero(b, count);
	[ snip ]
	tmp = (int16_t *)sndbuf_getbuf(src);
	bzero(tmp, count);

As I recall from our previous debugging efforts, the data structures
that were getting corrupted were being zeroed, which is just what these
two bzero() calls would do if handed a bad pointer or length.  I suspect
that either the source or b parameter to feed_vchan_s16() is bogus,
causing some unrelated part of the heap to get stomped on.  Because the
KASSERT() is getting triggered here, I'm more suspicious of the source
parameter.
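
To make the suspected failure mode concrete, here is a small userland
sketch, not the actual snd_pcm code.  The names mock_sndbuf,
mock_feed_zero() and mock_neighbor_bytes_zeroed() are made up for
illustration, and the "checked" flag merely stands in for a kernel built
with or without INVARIANTS.  It assumes the only thing that matters is
that the zeroing length exceeds the real buffer size:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct snd_dbuf: only what this sketch needs. */
struct mock_sndbuf {
	uint8_t *buf;   /* start of the sample buffer */
	size_t   size;  /* bytes that really belong to the buffer */
};

/*
 * Mimics the zeroing step quoted above.  With `checked` nonzero it acts
 * like a kernel with the assertion compiled in (refuse when the buffer
 * is too small); with `checked` zero the assertion is absent.
 * Returns the number of bytes zeroed, or -1 if the size check failed.
 */
static long
mock_feed_zero(struct mock_sndbuf *src, uint8_t *b, uint32_t count, int checked)
{
	assert(src != NULL && b != NULL);
	if (checked && src->size < count)
		return (-1);            /* stands in for panic("bad bufsize") */
	count &= ~1u;                   /* round down to whole 16-bit samples */
	memset(b, 0, count);
	memset(src->buf, 0, count);     /* runs past src->buf if size < count */
	return ((long)count);
}

/*
 * One arena holds a 16-byte "buffer" followed by 16 bytes of unrelated
 * data, so the overrun stays inside a single array and is well defined.
 * Returns how many of the unrelated bytes ended up zeroed.
 */
static int
mock_neighbor_bytes_zeroed(uint32_t count, int checked)
{
	uint8_t arena[32], out[32];
	struct mock_sndbuf src = { arena, 16 };
	int i, zeroed = 0;

	memset(arena, 0xAA, sizeof(arena));
	if (count > sizeof(arena))
		count = sizeof(arena);  /* keep the demo itself in bounds */
	(void)mock_feed_zero(&src, out, count, checked);
	for (i = 16; i < 32; i++)
		zeroed += (arena[i] == 0);
	return (zeroed);
}
```

With the check enabled, an oversized count is refused and the
neighbouring bytes stay intact; with it disabled, the same call silently
zeroes whatever follows the buffer, which would match the zeroed
structures seen in the earlier crashes.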