On 8 Jan, Mathew Kanner wrote: > On Jan 08, Don Lewis wrote: >> On 7 Jan, Stefan Ehmann wrote: >> > On Wed, 2004-01-07 at 21:15, Don Lewis wrote: >> >> Try the following patch. I won't guarantee that you still won't see >> >> panics, but at least it should panic cleanly instead of stomping on some >> >> innocent block of memory that corrupts data or some critical structure >> >> that will trigger a mysterious and hard to debug panic at some later >> >> time. >> >> >> >> The reason for changing the KASSERT to a test and an explicit call to >> >> panic() is to make sure that the error is always caught because the >> >> sound code is typically not compiled with INVARIANTS, so the KASSERT >> >> will typically be ignored. The vchan stuff really needs a proper fix, >> >> but I don't understand the sound code well enough. >> > >> > Nice patch - unfortunately sound doesn't work any longer with vchans >> > enabled (which is set to 4 at boot time here). >> > >> > I get "pcm0:virtual:0: play interrupt timeout, channel dead" whenever I >> > try to play anything. >> > >> > However, if I do "hw.snd.pcm0.vchans: 4 -> 0" it works nice again. >> > >> > I'll probably try later if I'm able to get a panic with vchans disabled >> > (Especially those that had - at least not in an obvious way - any sound >> > stuff in backtrace) >> > >> > If not, maybe the panic gets triggered if for some reason the dsp device >> > isn't ready again after a song played and a new virtual dsp device is >> > opened and vchan code gets in action (combined with other bad things). >> > >> > It can't be vchans alone because I just had three mpg123s open and they >> > just played fine. >> >> Some other wierdness that I noticed is that if one of the >> chn_setblocksize() is called for one of the vchans, vchan_setblocksize() >> will get called, which will call chn_notify(parent, CHN_N_BLOCKSIZE). >> When this happens, the parent will interate over all of its children >> looking for the one with the minimum bufhard blksz. It will then call >> chn_setblocksize() on itself, and chn_setblocksize() will call >> sndbuf_remalloc() on its bufsoft, which will set reallocate the buffer >> with the size (blkcnt * blksz). If this channel is the parent vchan and >> the new size of bufsoft is smaller than the size of bufhard (which never >> gets reallocated), feed_vchan_s16() will write past the end of bufsoft >> and things will go boom sometime later. >> >> Try the patch below in place of my previous patch. As you might guess, >> I'm grasping at straws. >> > > Hello Don, > I'm joining this discussion late and I dont understand the > circumstances that cause the panic. Is it reproduceable with by a regular > user? Yes. The problem is that a vchan is getting tweaked so that the size of its bufsoft is smaller that the size of its bufhard. When we finally got to the point where we could panic the system close to where the fatal problem occured (by compiling with INVARIANTS so that KASSERTS are checked), we found that the size of bufhard was 2048 and bufsoft was 688. We tripped the KASSERT when chn_intr() called chn_wrintr(), which called chn_wrfeed() which called sndbuf_feed(), which called feed_vchan_s16(). chn_wrfeed() calculates "amt" as the free space in bufhard, which ends up being 2048, then it passes bufsoft, bufhard, and amt to sndbuf_feed() as the "from", "to", and "count" parameters. sndbuf_feed() verifies that the free space in "to" aka bufhard is >= "count", which is obviously going to be true. It then feed_vchan_s16(), passing to->tmpbuf (bufsoft's temporary buffer), "count", and "from" to feed_vchan_s16() as the "b", "count", and "source" parameters. If feed_vchan_s16() has been compiled with INVARIANTS, it will panic if the size of "source" (bufsoft) is less than count, but since modules historically not been compiled with kernel options, and/or FreeBSD releases are not compiled with INVARIANTS, the KASSERT code here is typically ignored. If so, feed_vchan_s16() executes the following code: tmp = (int16_t *)sndbuf_getbuf(src); bzero(tmp, count); If src aka bufsoft is smaller than count, bzero() will write past the end. The system may run for a while, but eventually some vital kernel data structure may get stomped on and the system will panic() for some mysterious reason that is unlikely to appear related to the sound code. > Anyway, the area which you are working (vchans, feeders, etc) > is extremely confusing to me but I will spend some time with it this > weekend. You are not the only one confused ... > I think Orion might be a good person to ask about the role of > buffers and channels, I've cc'ed him with this e-mail. > > --Mat >Received on Thu Jan 08 2004 - 11:56:41 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:37:37 UTC