Re: page fault panic in scioctl and console-kit-daemon

From: Joe Marcus Clarke <marcus_at_FreeBSD.org>
Date: Wed, 23 Jan 2008 16:32:13 -0500
On Wed, 2008-01-23 at 23:11 +0200, Kostik Belousov wrote:
> On Wed, Jan 23, 2008 at 02:55:59PM -0500, Joe Marcus Clarke wrote:
> > 
> > On Wed, 2008-01-23 at 20:34 +0100, Pawel Worach wrote:
> > > Kostik Belousov wrote:
> > > > On Tue, Jan 22, 2008 at 07:26:53PM +0100, Pawel Worach wrote:
> > > >> Kostik Belousov wrote:
> > > >>> On Sun, Jan 20, 2008 at 04:42:36AM +0100, Pawel Worach wrote:
> > > >>>> Hi,
> > > >>>>
> > > >>>> While starting console-kit-daemon (sysutils/consolekit 0.2.3) during 
> > > >>>> boot or in single-user mode the system panics. If I start it post-boot 
> > > >>>> it runs fine. This is on 8.0-CURRENT from about 12 hours ago, another 
> > > >>>> user also reported the same on RELENG_7. Any other information I can 
> > > >>>> provide ?
> > > >>>>
> > > >>>> Fatal trap 12: page fault while in kernel mode
> > > >>>> fault virtual address	= 0x4
> > > >>>> fault code		= supervisor read, page not present
> > > >>>> instruction pointer	= 0x20:0xc04d2ab4
> > > >>>> stack pointer	        = 0x28:0xe6499b18
> > > >>>> frame pointer	        = 0x28:0xe6499b80
> > > >>>> code segment		= base 0x0, limit 0xfffff, type 0x1b
> > > >>>> 			= DPL 0, pres 1, def32 1, gran 1
> > > >>>> processor eflags	= interrupt enabled, resume, IOPL = 0
> > > >>>> current process		= 134 (console-kit-daemon)
> > > >>>> Physical memory: 1014 MB
> > > >>>> Dumping 43 MB: 28 12
> > > >>>>
> > > >>>> #8  0xc07a34a2 in trap (frame=0xe6499ad8) at 
> > > >>>> /usr/src/sys/i386/i386/trap.c:489
> > > >>>> #9  0xc079183b in calltrap () at /usr/src/sys/i386/i386/exception.s:146
> > > >>>> #10 0xc04d2ab4 in scioctl (dev=0xc3b20d00, cmd=537163270,
> > > >>>>    data=0xe6499c70 "\002", flag=1, td=0xc3d3c880)
> > > >>>>    at /usr/src/sys/dev/syscons/syscons.c:1073
> > > >>>> #11 0xc051ed1a in giant_ioctl (dev=0xc3b20d00, cmd=537163270,
> > > >>>>    data=0xe6499c70 "\002", fflag=1, td=0xc3d3c880)
> > > >>>>    at /usr/src/sys/kern/kern_conf.c:349
> > > >>>> #12 0xc0598194 in cnioctl (dev=0xc3b20d00, cmd=537163270,
> > > >>>>    data=0xe6499c70 "\002", flag=1, td=0xc3d3c880)
> > > >>>> ---Type <return> to continue, or q <return> to quit---
> > > >>>>    at /usr/src/sys/kern/tty_cons.c:521
> > > >>> Unless the virtual screen is opened, the screen state scr_stat structure
> > > >>> is not allocated. The following patch would fix the panic, but I do not
> > > >>> know how the console-kit would react to the ENXIO from the
> > > >>> VT_WAITACTIVE ioctl. Please, test it.
> > > >> Thanks! The patch works.
> > > > 
> > > > To clarify: do you see any problems with the console-kit after the patch ?
> > > > In particular, can you verify that the program functions correctly, esp.
> > > > on the virtual terminals 1, 2 ... , whatever this means ?
> > > 
> > > The panic is of course gone, while chatting a bit with Marcus (CCd) it 
> > > looks like console-kit does not do any error handling at all. I've not 
> > > looked at what c-k does so maybe Marcus can answer the question better.
> > 
> > It tries to install a wait thread on each available VT.  That thread
> > sets the WAITACTIVE ioctl, and waits for its VT to become active.  When
> > it does, it sets the CK active VT accordingly, and reattaches the wait.
> > 
> > When an error occurs in the ioctl, no wait is attached, and CK will not
> > know when a particular VT becomes active.  This will essentially cripple
> > CK (assuming the VT really does become available at a later point).  
> > 
> > Now, admittedly, there is no error correction in CK for this situation.
> > It would be trivial to add something that periodically attempts to
> > reestablish a failed wait.  However, I am very curious why only a few
> > users are seeing this panic (me excluded on two different machines).  It
> > seems to me that the scp should be initialized when sc_attach_unit() is
> > called during device probing.  I don't see anything special in init(8)'s
> > code that would cause these VT devices to become initialized.  I would
> > assume that if one can open(2) them, then they should be available for
> > ioctls?
> 
> From my reading of the code, scp would be non-NULL after the first open
> of the corresponding /dev/ttyvX. sc_attach_unit() creates the scp for
> the console and the consolectl devices.
> 
> VT_WAITACTIVE ioctl is performed on arbitrary syscons /dev node, and
> can wait for any other screens, in particular, the screens that are
> not opened at the moment (the reason for the reported panic).

Right, and this is where I was confused.  I had thought from an old
reading of the CK code that CK opened each ttyvX device to perform the
ioctl.  It does not.  Instead, it opens /dev/console, and performs the
ioctl for each ttyvX on that fd.  That does explain the panic, but not
exactly why I did not see it.  I'm guessing a race condition, but I
can't be sure.

> 
> The patch I posted may be improved by making the VT_WAITACTIVE ioctl
> to wait for the scp being allocated, and only then for the screen to be
> switched too. Please, test.

I really appreciate your attention to this.  Funny thing is, CK 0.2.4
was just released, and it is no longer started out of rc.d.  I've also
added error correcting in the CK code path.  The problem may disappear
depending on when CK is executed out of D-BUS.  However, it would be
good to prevent this in the kernel.  Pawel said he would try and test
this patch.

Joe

> 
> 
> diff --git a/sys/dev/syscons/syscons.c b/sys/dev/syscons/syscons.c
> index e0c9725..3127ae6 100644
> --- a/sys/dev/syscons/syscons.c
> +++ b/sys/dev/syscons/syscons.c
> _at__at_ -178,6 +178,7 _at__at_ static int scparam(struct tty *tp, struct termios *t);
>  static void scstart(struct tty *tp);
>  static void scinit(int unit, int flags);
>  static scr_stat *sc_get_stat(struct cdev *devptr);
> +static void sc_set_stat(struct cdev *devptr, scr_stat *st);
>  static void scterm(int unit, int flags);
>  static void scshutdown(void *arg, int howto);
>  static u_int scgetc(sc_softc_t *sc, u_int flags);
> _at__at_ -421,7 +422,7 _at__at_ sc_attach_unit(int unit, int flags)
>  		    UID_ROOT, GID_WHEEL, 0600, "ttyv%r", vc + unit * MAXCONS);
>  	    	sc_alloc_tty(sc->dev[vc]);
>  		if (vc == 0 && sc->dev == main_devs)
> -			SC_STAT(sc->dev[0]) = &main_console;
> +			sc_set_stat(sc->dev[0], &main_console);
>  	}
>  	/*
>  	 * The first vty already has struct tty and scr_stat initialized
> _at__at_ -434,7 +435,7 _at__at_ sc_attach_unit(int unit, int flags)
>  		   UID_ROOT, GID_WHEEL, 0600, "consolectl");
>      sc_console_tty = sc_alloc_tty(dev);
>      ttyconsolemode(sc_console_tty, 0);
> -    SC_STAT(dev) = sc_console;
> +    sc_set_stat(dev, sc_console);
>  
>      return 0;
>  }
> _at__at_ -525,7 +526,8 _at__at_ scopen(struct cdev *dev, int flag, int mode, struct thread *td)
>  
>      scp = sc_get_stat(dev);
>      if (scp == NULL) {
> -	scp = SC_STAT(dev) = alloc_scp(sc, SC_VTY(dev));
> +	scp = alloc_scp(sc, SC_VTY(dev));
> +	sc_set_stat(dev, scp);
>  	if (ISGRAPHSC(scp))
>  	    sc_set_pixel_mode(scp, NULL, COL, ROW, 16, 8);
>      }
> _at__at_ -1063,6 +1065,7 _at__at_ scioctl(struct cdev *dev, u_long cmd, caddr_t data, int flag, struct thread *td)
>  #endif
>      case VT_WAITACTIVE: 	/* wait for switch to occur */
>  	i = (*(int *)data == 0) ? scp->index : (*(int *)data - 1);
> +    wait_active:
>  	if ((i < sc->first_vty) || (i >= sc->first_vty + sc->vtys))
>  	    return EINVAL;
>  	s = spltty();
> _at__at_ -1071,6 +1074,12 _at__at_ scioctl(struct cdev *dev, u_long cmd, caddr_t data, int flag, struct thread *td)
>  	if (error)
>  	    return error;
>  	scp = sc_get_stat(SC_DEV(sc, i));
> +	if (scp == NULL) {
> +		error = tsleep(&(dev)->si_drv1, PZERO | PCATCH, "waitvt0", 0);
> +		if (error)
> +			return (error);
> +		goto wait_active;
> +	}
>  	if (scp == scp->sc->cur_scp)
>  	    return 0;
>  	error = tsleep(&scp->smode, PZERO | PCATCH, "waitvt", 0);
> _at__at_ -2769,7 +2778,7 _at__at_ scinit(int unit, int flags)
>  	        UID_ROOT, GID_WHEEL, 0600, "ttyv%r", unit * MAXCONS);
>  	    sc_alloc_tty(sc->dev[0]);
>  	    scp = alloc_scp(sc, sc->first_vty);
> -	    SC_STAT(sc->dev[0]) = scp;
> +	    sc_set_stat(sc->dev[0], scp);
>  	}
>  	sc->cur_scp = scp;
>  
> _at__at_ -3662,6 +3671,14 _at__at_ sc_get_stat(struct cdev *devptr)
>  	return (SC_STAT(devptr));
>  }
>  
> +static void
> +sc_set_stat(struct cdev *devptr, scr_stat *st)
> +{
> +
> +	SC_STAT(devptr) = st;
> +	wakeup(&devptr->si_drv1);
> +}
> +
>  /*
>   * Allocate active keyboard. Try to allocate "kbdmux" keyboard first, and,
>   * if found, add all non-busy keyboards to "kbdmux". Otherwise look for
-- 
Joe Marcus Clarke
FreeBSD GNOME Team      ::      gnome_at_FreeBSD.org
FreeNode / #freebsd-gnome
http://www.FreeBSD.org/gnome

Received on Wed Jan 23 2008 - 20:32:14 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:26 UTC