Re: pccbb crashes when detaching (unsafe interrupt handler)

From: Brian Fundakowski Feldman <green_at_FreeBSD.ORG> Date: Sat, 17 Jul 2004 14:28:41 -0400 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:02 UTC

On Sat, Jul 17, 2004 at 12:17:12PM -0600, M. Warner Losh wrote:
> In message: <20040717054014.GP1626_at_green.homeunix.org>
>             Brian Fundakowski Feldman <green_at_freebsd.org> writes:
> : On Fri, Jul 16, 2004 at 11:28:44PM -0600, M. Warner Losh wrote:
> : > In message: <20040715180854.GZ1626_at_green.homeunix.org>
> : >             Brian Fundakowski Feldman <green_at_freebsd.org> writes:
> : > : Devices that attach to a pccbb end up getting screwed if they share
> : > : interrupt handlers with anything else and one comes in at such time
> : > : that the interrupt handler has been deallocated but not torn down.
> : > 
> : > How do they get screwed by this case?  I considered these cases.  What
> : > scenario, exactly, does this cause problems for.
> : > 
> : > : There is no interrupt protection for the pccbb interrupt handler,
> : > 
> : > There doesn't need to be one because we preclude calling it when
> : > things are in a bad state.  Is there some case that we've not taken
> : > care of?
> : > 
> : > : and the cleanest method I can think of is add a "don't run" count to
> : > : the real ithread interrupt handler (parent); however, this is a fair
> : > : bit of infrastructure if it's not something very useful, and in SMP,
> : > : concurrent accesses of the local interrupt handler list in pccbb do
> : > : need to be protected, not just preventing interrupt handlers.
> : > 
> : > I don't think this sort of insfrastructure is useful.  I don't
> : > understand when the case you are trying to protect against can happen,
> : > so I'm having trouble seeing why the code is necessary.
> : > 
> : > : This is what I have so far, but I'm not happy with it because it
> : > : seems like it would be so much nicer to increment a count on the
> : > : ithread's intrhand and drain any current interrupt out of the
> : > : ithread, but possibly making each interrupt a lot more expensive.
> : > 
> : > Like I said in another note, I don't like this because I don't want to
> : > take out a lock on every interrupt to guard against such a rare case
> : > as you've discovered (I'm assuming it is rare becuase I've never seen
> : > it in 5 years of maintaining the code).
> : > 
> : > ...
> : > :  	if (sc->flags & CBB_CARD_OK) {
> : > ...
> : > 
> : > What I want to know is why the above check doesn't guard against this
> : > problem.  If the card is detached because you ejected it, the isr
> : > already guards against calling the hanlders for this card.
> : 
> : Do you want to see the driver?  This is a simple problem.  The removal
> : from the interrupt handler list in pccbb is not even remotely safe with
> : regard to SMP _or_to_interrupts_.  I am literally receiving crashes
> : with the function 0xdeadc0de() being called from the interrupts.  Do you
> : not believe me?  It's extremely easy to test.  I just play something on the
> : sound card while I kldunload, or eject, my cardbus wireless card.
> 
> I've read the driver, and I've shared interrupts and I never see this
> problem.  Please try not to use hyperbole in your mail to me and tone
> down the rheteric.  You are only succeeding in making me more angry
> without achieving your goal of getting this issue resolved.
> 
> The card_ok test generally catches these problems, and I've made a few
> changes in p4 to make it stronger, but not perfect.  I see the race.
> There's trivial ways to make it harder to happen, but completly fixing
> it without impacting the performance of every interrupt is much
> harder.  I'm not too interested in using an sx lock, unless it is the
> fastest way to deal with the problem.

It's just as unfair for you to use hyperbole and deny it as a problem
or basically call me a liar, but I'm glad at least you're admitting it
is a problem now.  You need to use atomic/SMP-safe synchronizations,
and that does not come at zero cost.  There is a race between the driver
checking if the card is still "okay" and running the interrupt handler
that is impossible to solve without at least two atomic operations.

Zero will not do.

Be pissed off all you want, Warner.  FreeBSD 5.x is not based on the spl
model anymore, and unless you want to ADD the ability to do something like
that, which is certainly quite feasible (that is, add a "blocking" count
to the interrupt handler, which the ithread has to ack before you can
return with the interrupt "blocked") there will be a specific cost inside
the pccbb driver to do things safely and it will involve some sort of
synchronization.

-- 
Brian Fundakowski Feldman                           \'[ FreeBSD ]''''''''''\
  <> green_at_FreeBSD.org                               \  The Power to Serve! \
 Opinions expressed are my own.                       \,,,,,,,,,,,,,,,,,,,,,,\