Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others

From: Bruce Evans <brde_at_optusnet.com.au> Date: Fri, 31 Mar 2017 17:05:26 +1100 (EST)

On Fri, 31 Mar 2017, Andrey Chernov wrote:

> On 30.03.2017 21:53, Bruce Evans wrote:
>> I think it was the sizing.  The non-updated mode is 80x25, so the row
>> address can be out of bounds in the teken layer.
>
> I have text 80x30 mode set at rc stage, and _after_ that may have many
> kernel messages on console, all without causing reboot. How it is
> different from shutdown stage? Syscons mode is unchanged since rc stage.

Probably just because there weren't enough messages to go past row 24.
I had no difficulty reproducing the crash today, both on entering ddb
and on reboot, starting from 80x30 with row > 24, after removing just
the window size update in the fix.  I missed seeing it the other day
because I tested with 80x60 to see the smaller console window more
clearly, but must have only tried rebooting with row <= 24.

Another recent fix for sc reduced the problem a little.  Mode changes
are supposed to clear the screen and move the cursor to home, but they
only cleared the screen.  You should have noticed the ugliness from that
after the switch to 80x30.  There were enough boot messages to
reach row 24, and messages continued from there.  Now they start at the
top of the screen again.  Clearing the messages is not ideal, but syscons
always did it.

Syscons also has new and old bugs preserving colors across mode changes:
- it never preserved changes to the palette (FBIO_SETPALETTE ioctl).
   Some mode changes should reset the palette, but some should not.
   Especially not ones for a vt switch.
- BIOSes should reset the palette for mode changes (even to the same mode).
   Some BIOSes are confused by syscons setting the DAC to 8 bit mode and
   reset to a garbage (dark) palette then.  They always switch back to
   6 bit mode.
- syscons used to maintain the current colors and didn't change them for
   mode changes.  This was slightly broken, since for a mode change from
   a mode with full color to one with less color, the interpretation of
   the color indexes might change.  The colors are now maintained by
   teken and syscons tells teken to do a full window size change which
   resets the entire teken state including colors.  This bug is normally
   hidden by vidcontrol refreshing the colors.

   vidcontrol could be held responsible for refreshing or resetting
   everything after a mode change ioctl, but I think this is backwards
   since there are many low-level details that are better handled in
   the driver.  Switching to graphics modes is already a complicated
   2-ioctl process with not enough options and poor error handling.
   Like a too-simple wrapper for fork-exec.

vt has some interesting related bugs.  It doesn't support mode switches
of course, and even changing the font seems to be unsupported in text
mode.  But in graphics mode, changing the font works and even redraws
the screen where syscons would clear it for the mode change.  But there
are bugs redrawing the screen -- often old history is redrawn.  This
should work like in xterm or a general X window refresh where the
redrawing must be done for lots of other events than resize (exposure,
etc.).

>> - sysctl debug.kdb.break_to_debugger.  This is documented in ddb(4), but
>>   only as equivalent to the unbroken BREAK_TO_DEBUGGER.
>
> Thanx. Setting debug.kdb.break_to_debugger=1 makes both Ctrl-Alt-ESC and
> Ctrl-PrtScr works in sc only mode and "c" exit don't cause all chars
> beeps like in vt. I.e. it works. But I don't understand why debugging
> via serial involved in sc case while not involved in vt case and fear
> that some serial noise may provoke break.

This is because only syscons has full conflation of serial line breaks
with entering the debugger via a breakpoint instruction.  Syscons does:

 	kdb_break();

for its KDB keys, while vt does:

 	kdb_enter(KDB_WHY_BREAK, ...)

for its KDB keys.  The latter bypasses KDB's permissions on entering
the debugger with a BREAK.  It is unclear if this is a layering violation
in vt or incorrect use of kdb_break() in syscons.  It is certainly wrong
for vt to use the KDB_WHY_BREAK code if it is avoiding using kdb_break()
to fix the conflation.

> Is there a chance to untie
> serial and sc console debuggers?

This is easy to do by copying vt's arguable layering violation.  A little
more is necessary to unconflate serial breaks:
- agree that kdb_break() and KDB_WHY_BREAK are only for serial line breaks
- don't use kdb_break() and KDB_WHY_BREAK for console KDB keys of course.
   vt already has a string saying that the entry is a "manual escape to
   debugger".  Here "to debugger" is redundant, "manual escape" means
   "DDB key hit manually by the user" and the driver that saw the key
   is left out.  "vt KDB key" would be a more useful message.  syscons
   used to print a similar message, but it now calls kdb_break() which
   produces the conflated code KDB_WHY_BREAK and the consistently
   conflated message "Break to debugger".  This is also used for serial
   line breaks.  Capitalization is also inconsistent.
- remove kdb_break().  The only correct use of it now is in 1 serial
   driver.  It saves that driver having its own enable knobs.  This
   doesn't work for multiple serial drivers or even multiple console
   devices within a single driver.  Multiple console devices within
   a single driver are not supported, but the gdb device can be separate
   and needs a separate knob.
- add a global enable for all debugger entries in kdb_enter()
- unconflate kdb_alt_break() and ALT_BREAK_TO_DEBUGGER, starting with
   their names.  The <newline>~b sequence maps to the conflated
   KDB_WHY_BREAK, but is closer to a DDB key.  The message doesn't
   say exactly which key and doesn't know it at the level after the
   keymap maps a physical key to a virtual DDB key.  It is not very
   useful to distinguish this sequence from a DDB key, but easy to
   do so in the message.
- kdb_alt_break() also does reboots and panics, with a single enable
   knob (but 3 ways to configure it) for the 3 things it does.  Console
   drivers also have keys for this, with separate enable knobs and
   2 or 3 ways to configure each.  Unconflate and unobfuscate this too.

   The conflation is mostly in the names.  Who would think that a
   knob for controlling an alternative kdb entry method to serial line
   breaks also controls rebooting and panics?  Certainly not the writers
   of its documentation.  There seems to be none: kdb is undocumented.
   sysctl -da debug.kdb says "Enable alternative break to debugger".
   ddb(4) says that it enables "The (sic) alternate (sic) sequence
   to enter the debugger".  I like the sysctl message using the English
   spelling of "alternative", but after expanding it, its name is wronger
   since "alternative" suggests a single alternative almost as much
   as "alternate".

   The existence of the reboot and panic "breaks" is a larger bug.  It
   is impossible to do a clean reboot starting from "fast" interrupt
   handler context and difficult starting from normal interrupt handler
   context.  Panicking is not so bad since it is inherently unclean.
   The existence of similar commands (and dump) in ddb is another bug.
   I never use them, but use the reset command which on x86 normally
   uses the keyboard controller.  A triple fault would be another
   good way to get a clean panic.  Neither is very clean for multiple
   CPUs which are probably still running while you are panicking.

   OTOH, kdb entry has to work here.  It has very large complications
   to give it a chance of working.  First it has to stop other CPUs
   and wait for them to stop.  Panic now does the same.  Panic is not
   as careful as kdb entry, but doesn't need to be because it is not
   restartable.  Reboot from kdb_alt_break() doesn't even know that
   the context is special.

Bruce