Re: syscons(4) races

From: Bruce Evans <brde_at_optusnet.com.au> Date: Sat, 22 Nov 2008 22:52:30 +1100 (EST) · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:37 UTC

On Thu, 20 Nov 2008, Maksim Yevmenkin wrote:

> [moving to -current]

I'm not subscribed, so when I send mail there it is delayed.

>>> --- head/sys/dev/syscons/syscons.c      Sun Nov 16 21:57:54 2008
>>>  (r185012)
>>> +++ head/sys/dev/syscons/syscons.c      Sun Nov 16 22:39:04 2008
>>>  (r185013)
>>> ...
>>> +    mtx_lock(&Giant);
>>>    /* assert(sc_console != NULL) */
>>>
>>>    /*
>>
>> Acquiring Giant in debugger mode is especially invalid (except deadlock
>> is not so likely for a recursive lock).
>
> ok, in this case i have a somewhat stupid question. when kbdmux(4) is
> the default keyboard, all those kbdd_xxx() calls (that sccngetch()
> makes)  will call into kbdmux(4) functions that will grab giant.
> kbdmux was changed about 2 months ago to do that. it sounds like those
> changes are completely wrong. am i correct here?

Yes, the low-level console functions are supposed to be reentrant (more
details below), so they cannot do things like that.

>> BTW, sccngetch() shouldn't exist.  It existed to demultiplex the 2
>> console driver entry points sc_cngetch() and sc_cncheckc(), but
>> sc_cncheckc() has gone away.
>
> the entry point in consdev struct is still there. also cncheckc() is
> trying to call it (if its present). so it looks like it may not be
> completely gone, just sysconst(4) no longer implements it.

Ah, this is compatibility cruft which is bogusly only used by a driver
that is newer than the cruft (xen).  Normal drivers use CONSOLE_DRIVER()
but xen uses the old interface CONS_DRIVER().  CONSOLE_DRIVER() differs
from CONS_DRIVER() only in not taking the cn_checkc() entry point.
Binary compatibility and correct naming were broken by renaming
cn_checkc() to cn_getc().  Names are still correct at the top level
(there, cncheckc() gives the non-blocking interface and cngetc() gives
the blocking interface, while in normal drivers there is now only
cn_getc() which gives the non-blocking interface).

It is cn_dbctl() that has completely gone away.

>> There is still a problem for calls to sc_cngetc() from the low-console
>> driver.  These can race with sckbdevent().  In debugger mode, no locking
>> is permitted, so syscons' getc functions must be carefully written to
>> do only harmless things when they lose races.  They are not carefully
>> written (e.g., almost the first thing they do, shown in the above, may
>> give a buffer overrun:
>>
>>    if (fkeycp < fkey.len) {
>>        mtx_unlock(&Giant);
>>        splx(s);
>>        return fkey.str[fkeycp++];
>>    }
>
> all right, so we can not use locks, i'm guessing this includes spin
> locks too. can we use atomic operations here? since, in polling mode,
> consumer is going to call getc() in a loop, can we use atomic
> reference counter to make sure there is only one caller running at a
> time? if someone already grabbed reference counter just return -1 as
> if there is no input?

It is safer to use local spinlocks and safer still to use home-made
locks based on atomic_cmpset(), but still wrong to do simply that since
it can cause deadlock, especially for debugging -- it is not possible to
make sure that there is only 1 caller running, since debugging (or NMI,
or perhaps an ordinary interrupt that cannot reasonably be masked) can
preempt any caller.  sio uses its local spinlock (I don't like this)
together with a hack to allow reentering from the debugger only:

% 	need_unlock = 0;
% 	if (!kdb_active && sio_inited == 2 && !mtx_owned(&sio_lock)) {
% 		mtx_lock_spin(&sio_lock);
% 		need_unlock = 1;
% 	}

This is only done for cn_putc().  cn_getc() (really cn_checkc()) is
completely missing locking (except in debugger mode) and needs
higher-level locking anyway, since the whole polling loop in cngetc()
needs to be locked to prevent interference.

Here sio_lock is the local spinlock (actually per-driver so it isn't very
local), sio_inited is a flag to prevent use of sio_lock here before
sio_lock is initialized (this flag requires delicate initialization
using atomic_cmpset to avoid races), and !kdb_active is the hack to
let the debugger in irrespective of the state of the lock and without
touching the lock.

Any lock used for the console must be MTX_QUIET and/or MTX_NOWITNESS
to prevent console i/o generating endless i/o about itself.  I think
it must be a spinlock too.  The lock thus reduces to a simple spinlock
which can be implemented directly using atomic_cmpset(), thus making it
clear that the lock has no interactions with other parts of the system.

sio shouldn't be using its normal-i/o driver lock for the console.  It
attempts to be reentrant and does a context switch of the normal-device
state while in console mode.  However, this only works for UP.  For
SMP, something more is needed to stop interference from other CPUs.
In debugger mode, the system provides the something by stopping other
CPUs so the old UP code works (this also provides the necessary locking
for the loop in cngetc()).  In non-debugger mode, abusing the normal-i/o
driver lock provides the something.  This is ugly but works quite well.
Analysis:
- Suppose one CPU is doing console i/o and acquires the lock.  Then
   other CPUs have be prevented from touching any device state (since
   sio console i/o is not completely reentrant -- for UP it depends
   on masking interrupts to prevent more than 1 level of reentrance
   (although it preserves the state on the stack), and stopping other
   CPUs corresponds to masking interrupts.
- Suppose this code is called with the lock held.  Then waiting until
   the lock is not held is best provided the wait isn't very long (in
   particular, provided The wait isn't infinite due to deadlock).  If
   the lock is held by another CPU, then waiting corresponds to waiting
   in the previous case (except since we are abusing the normal-i/o
   dirver lock, we may be waiting for normal i/o and not console i/o).
   Using a spinlock prevents the lock being held by the same CPU except
   in cases involving non-maskable exceptions:
   - debugger exceptions are handled by the !kdb_active hack.  Now sio's
     console i/o reentrancy can be used a couple of layers deep (for ddb
     tracing the console i/o code; ddb doesn't support debugging itself,
     so the reentrance can't go deeper than a couple).
   - other traps like SIGBUS shouldn't get here since the console i/o
     code shouldn't cause such traps.
   - NMIs.  An NMI in console i/o code probably causes deadlock if the
     NMI handler does console i/o.  I think NMIs can cause deadlock
     generally -- the implementation of spinlocks uses disabling of
     interrupts to prevent deadlock, but NMIs cannot be disabled; thus
     NMI handlers cannot call mtx_lock_spin(), but I think they do, if
     only by wandering into console i/o code to report a problem.

>> ...
>> However, you seem to have made the races very common by (ab)using cngetc()
>> for keyboard multiplexing.  cngetc() is not designed for this.  You need
>> to know syscons' internals and do the necessary Giant locking in the caller.
>
> this is what i'm not getting. are you saying that lower layer keyboard
> drivers can not use any locking whatsoever in polling mode (or rather
> debug mode)? are you saying that we need a completely different path
> for handling keyboard input in debug mode? and/or possibly polling
> mode?

I'm saying thay lower layer console drivers cannot use any normal
locking in any mode.  In non-debugger modes, at least cn_putc() must
not block on any normal lock, since interrupt handlers can call printf()
which calls cn_putc().  Before [Free]BSD used mutex or ithreads, this
almost required the console i/o routines to be reentrant.  Debugger
mode requires reentrancy even more, but is otherwise simpler.  Now
with ithreads, even non-spin locks might work (at a cost of hopefully
unimportant delays) for interrupt handlers, but debugger mode still
requires reentrancy.

Bruce