Re: Is anything being done re: the pcm timeout issue?

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Tue, 10 Aug 2004 01:26:36 -0700 (PDT)
On  9 Aug, Rusty Nejdl wrote:
> Conrad J. Sabatier said:
>> 
> 
>>> Sound works fine for me on my Dell laptop (mss driver).  I do get a
>>> mutex-related panic on my desktop (sb16 driver), but haven't sent in
>>> the stack trace to anyone yet.
>> 
>> My problems are occurring on an amd64 box (Athlon 64) with the nVidia
>> nForce3 chipset (snd_ich driver).
>> 
>> Sound works fine for a while, then suddenly I get a pcm play timeout,
>> and game over.  Have to reboot to get sound to work again.
>> 
>> Others have reported similar problems, but I've seen no followups
>> indicating anything is being done about it.
>> 
> 
> I've been seeing a sound issue on 5.2.1-release and I wonder if it's
> related at all to what you are seeing.  I have :
> 
> hw.snd.maxautovchans: 4
> hw.snd.pcm0.vchans: 4
> 
> And I have seen that these will eventually stop working one by one
> until I have none left.  lsof and fstat don't show any programs using
> them, but nonetheless, programs like xmms and gaim can't use them
> anymore.

The vchan code is fairly broken.  I was hoping to have some time to
work on this (and other problems in the top half of the sound code)
before 5.3, but it looks like the clock has just about run out.

> Do you have any more details on the pcm play timeout?  Are you using
> vchans?  What program are you using?

My suspicion is that there is either a problem in ich_intr() that is
causing it to stop receiving interrupts or to stop calling chn_intr(),
or there is enough interrupt latency to allow the DMA pointer to wrap
and fool chn_dmaupdate() into thinking no data was consumed.  It is
possible that the ich_intr() problem is specific to amd64.

I previously sent out these suggestions on how to debug the problem:

------ Forwarded message ------
    From: Don Lewis <truckman_at_freebsd.org>
 Subject: Re: Questionable code in sys/dev/sound/pcm/channel.c
    Date: Tue, 27 Jul 2004 15:15:06 -0700 (PDT)
      To: mat_at_cnd.mcgill.ca
      Cc: freebsd-current_at_freebsd.org

On 27 Jul, Mathew Kanner wrote:
> On Jul 26, John-Mark Gurney wrote:
>> Conrad J. Sabatier wrote this message on Mon, Jul 26, 2004 at 16:35 -0500:
>> > Why the formulaic calculation of timeout, if it's simply going to be
>> > unconditionally set to 1 immediately afterwards anyway?  What's going on
>> > here?
>> 
>> Well, if you look at the annotations, that absolute set of timeout was
>> added in rev 1.65 by cg with the comment:
>> tweaks to reduce latency/pauses in output
>> 
> 
> 
> 	I think this has been raised on the mailing list before.
> IIRC, the logic for this is to check frequently for dead channels but
> CG is the authority.

My suspicion is that this change was made to reduce the consequences of
lost wakeups from the interrupt routine.  This would have been more of a
problem when tsleep() was used in chn_sleep() and shouldn't be needed
now that the top and bottom halves of the code use the channel lock and
chn_sleep() uses msleep() to atomically release the lock and wait for
the wakeup from the interrupt code.  That said, setting timeout to 1
shouldn't hurt anything and will just waste a bit of CPU time.


>> > Also, at the end of the function:
>> > 
>> >     if (count <= 0) {
>> >         c->flags |= CHN_F_DEAD;
>> >         printf("%s: play interrupt timeout, channel dead\n", c->name);
>> >     }
>> > 
>> >     return ret;
>> > }
>> 
>> that was changed in rev1.52 (by cg also), and previously was just a check
>> for count == 0..
>> 
>> So, I'd recommend a message off to cg and ask why he made these changes...

The original version of the code always set timeout to 1 and looped on
(count > 0), so count could never go negative.  When the code was
changed to set timeout to something larger than 1, count could go
negative if (hz % timeout != 0), so the condition for setting
CHN_F_DEAD had to be modified accordingly.
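
To make the arithmetic concrete, here is a minimal sketch of the
bookkeeping described above.  It is not the code from channel.c; the
function name remaining_after_timeouts() is hypothetical, and it
assumes count starts at hz and is reduced by timeout each time a sleep
expires without progress.

```c
#include <assert.h>

/*
 * Hypothetical model of the wait-loop budget (not the real chn_write()):
 * start with hz ticks and subtract timeout per expired sleep, looping
 * while count > 0.  Returns the value count has when the loop exits.
 */
static int
remaining_after_timeouts(int hz, int timeout)
{
    int count = hz;

    assert(hz > 0 && timeout > 0);
    while (count > 0)
        count -= timeout;
    return (count);
}
```

With timeout fixed at 1 the loop always exits with count == 0.  With
hz = 100 and timeout = 3 it exits with count == -2, which a test for
(count == 0) would miss and a test for (count <= 0) catches.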

My suspicion is that there is sometimes enough latency in executing the
interrupt routine that the hardware DMA pointer is wrapping and
chn_dmaupdate() is calculating delta as zero.  This would cause
chn_wrfeed() not to consume any data from the software buffer (and skip
the wakeup()), which might be enough to cause chn_write() to time
out while waiting for space to become available in the software buffer.
It would be interesting to enable the debug code in chn_dmaupdate(), and
add (delta == 0) as a condition to trigger the device_printf().
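
The wrap scenario can be illustrated with a simplified model of a ring
buffer pointer.  This is only a sketch and not the actual
chn_dmaupdate() implementation; ring_delta() and observed_delta() are
hypothetical names, and the modular subtraction is an assumption about
how such a delta is typically computed.

```c
#include <assert.h>

/* Bytes consumed between two reads of a ring-buffer hardware pointer. */
static unsigned int
ring_delta(unsigned int old_ptr, unsigned int new_ptr, unsigned int bufsize)
{
    assert(bufsize > 0 && old_ptr < bufsize && new_ptr < bufsize);
    return ((bufsize + new_ptr - old_ptr) % bufsize);
}

/*
 * What a (possibly late) interrupt handler would compute if the hardware
 * really advanced `advanced` bytes since the pointer was last sampled.
 */
static unsigned int
observed_delta(unsigned int old_ptr, unsigned int advanced,
    unsigned int bufsize)
{
    unsigned int new_ptr = (old_ptr + advanced) % bufsize;

    return (ring_delta(old_ptr, new_ptr, bufsize));
}
```

In this model, observed_delta(1024, 512, 4096) is 512 as expected, but
observed_delta(1024, 4096, 4096) is 0: if the handler runs a full
buffer late, the pointer lands back where it started and the update
looks like no data was consumed.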

The bigger question is what is the cause of the latency ...


------ Forwarded message ------
    From: Don Lewis <truckman_at_freebsd.org>
 Subject: Re: Questionable code in sys/dev/sound/pcm/channel.c
    Date: Tue, 27 Jul 2004 15:21:57 -0700 (PDT)
      To: conrads_at_cox.net
      Cc: freebsd-current_at_freebsd.org

On 27 Jul, Conrad J. Sabatier wrote:
> 
> On 26-Jul-2004 Conrad J. Sabatier wrote:
>> 
>> On 26-Jul-2004 Conrad J. Sabatier wrote:
>>> I'm a little perplexed at the following bit of logic in chn_write()
>>> (which is where the "interrupt timeout, channel dead" messages are
>>> being generated).
> 
> [snip]
> 
>>> Also, at the end of the function:
>>> 
>>>     if (count <= 0) {
>>>         c->flags |= CHN_F_DEAD;
>>>         printf("%s: play interrupt timeout, channel dead\n",
>>> c->name);
>>>     }
>>> 
>>>     return ret;
>>> }
>>> 
>>> Could it be that the conditional test is wrong here?  Perhaps
>>> we should be using (count < 0) instead?
>> 
>> I'm now running a kernel built with this last conditional test
>> changed to "if (count < 0)" and sound is still working OK.  Have yet
>> to see if this eliminates the interrupt timeout messages.
> 
> Well, that was a failure.  :-)  Didn't see any timeout error messages,
> but the device still died eventually, nonetheless.  I've since changed
> back to the original code.

That's an interesting data point. At this point I'd start looking at the
driver code for your sound hardware.  I suspect that the driver
interrupt code is either no longer seeing interrupts, or it is no longer
calling chn_intr().
Received on Tue Aug 10 2004 - 06:26:57 UTC
