snp panic [Was: Re: panic with tcpdrop]

From: Kostik Belousov <kostikbel_at_gmail.com>
Date: Sun, 25 Nov 2007 22:30:15 +0200
On Sun, Nov 25, 2007 at 02:37:16PM -0300, Rako wrote:
> It is not reproducible, but I have had 2 such panics in 1 month.
> 
> The steps for the last panic:
> 
> watch -W /dev/ttyv0
> make buildkernel ....
> Ctrl-G
> Panic!
> 
> Kostik's explanation may be correct, but I did not try to attach 
> again, I only pressed Ctrl-G; so either that is not the path his 
> explanation describes, or I misunderstand it. Anyway, I can test the 
> changes for 1 month to see whether any panic happens.
> Thanks!
> Javier
Ctrl-G seems to be exactly the scenario that triggers the bug.
> 
> 
> >On Sat, Nov 24, 2007 at 09:19:42PM +0000, Robert Watson wrote:
> >>On Sat, 24 Nov 2007, Rako wrote:
> >>
> >>>The patch solves the problem with tcpdrop, thanks!!
> >>>
> >>>Another panic occurred, but in a different area: in the snp.ko module 
> >>>(watch -W /dev/ttyv0), and I can't get a backtrace. This panic is 
> >>>similar to the one at
> >>>
> >>>http://lists.freebsd.org/pipermail/freebsd-current/2007-March/069990.html
> >>>
> >>>The problem may be at line 164 of /usr/src/sys/dev/snp/snp.c: snp = 
> >>>ttytosnp(tp);
> >>>
> >>>where snp ends up NULL
> >>>
> >>>but I am not familiar with this code ... Any idea what I can do to solve the error?
> >>I'm having trouble reproducing this -- could you give me a detailed set 
> >>of instructions regarding the specific steps I should take to try and get 
> >>this panic, if it's reproducible for you?
> >>
> >>Thanks,
> >>
> >>Robert N M Watson
> >>Computer Laboratory
> >>University of Cambridge
> >>
> >>>Regards,
> >>>Javier
> >>>
> >>>
> >>>Fatal trap 12: page fault while in kernel mode
> >>>fault virtual address   = 0x24
> >>>fault code              = supervisor read, page not present
> >>>instruction pointer     = 0x20:0xc3e4f230
> >>>stack pointer           = 0x28:0xd66c3b34
> >>>frame pointer           = 0x28:0xd66c3b88
> >>>code segment            = base 0x0, limit 0xfffff, type 0x1b
> >>>                      = DPL 0, pres 1, def32 1, gran 1
> >>>processor eflags        = interrupt enabled, resume, IOPL = 0
> >>>current process         = 2216 (make)
> >>>trap number             = 12
> >>>panic: page fault
> >>>KDB: stack backtrace:
> >>>db_trace_self_wrapper(c0a5f1ea,d66c39d4,c078878a,c0a5d5f4,c0b5bcc0,...) at db_trace_self_wrapper+0x26
> >>>kdb_backtrace(c0a5d5f4,c0b5bcc0,c0a1fb8c,d66c39e0,d66c39e0,...) at kdb_backtrace+0x29
> >>>panic(c0a1fb8c,c0a7c54d,c3e44770,1,1,...) at panic+0xaa
> >>>trap_fatal(c3e942b8,0,1,0,c39f5630,...) at trap_fatal+0x303
> >>>trap_pfault(0,c39f5630,c39f5630,0,c,...) at trap_pfault+0x250
> >>>trap(d66c3af4) at trap+0x382
> >>>calltrap() at calltrap+0x6
> >>>--- trap 0xc, eip = 0xc3e4f230, esp = 0xd66c3b34, ebp = 0xd66c3b88 ---
> >>>snplwrite(c33bf800,d66c3c60,0,d66c3bbc,c0754bec,...) at snplwrite+0x80
> >>>ttywrite(c3389600,d66c3c60,0,c39cf5e8,c39f5630,...) at ttywrite+0x39
> >>>giant_write(c3389600,d66c3c60,0,0,c0abb080,...) at giant_write+0x6c
> >>>devfs_write_f(c39cf5e8,d66c3c60,c3de4800,0,c39f5630,...) at devfs_write_f+0x75
> >>>dofilewrite(d66c3c60,ffffffff,ffffffff,0,c39cf5e8,...) at dofilewrite+0x97
> >>>kern_writev(c39f5630,1,d66c3c60,2813c076,0,...) at kern_writev+0x58
> >>>write(c39f5630,d66c3cfc,c,110,c337e630,...) at write+0x4f
> >>>syscall(d66c3d38) at syscall+0x335
> >>>Xint0x80_syscall() at Xint0x80_syscall+0x20
> >>>--- syscall (4, FreeBSD ELF32, write), eip = 0x8083603, esp = 0xbfbfd4ec, ebp = 0xbfbfd528 ---
> >>>Uptime: 19m14s
> >>>Physical memory: 495 MB
> >>>Dumping 86 MB: 71 55 39 23 7
> >
> >I believe I have a plausible explanation for the panic. Please look
> >at the SNPSTTY command in snpioctl(). First, assume that s >= 0 and
> >that the snoop device already has a tty attached. Then snp_tty will be
> >overwritten without detaching the old tty from the snooper. In this
> >case, ttytosnp() would not find the snp for the old tty and would
> >return NULL, which would lead to the trace above. This is an old
> >kernel bug.
> >
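A minimal userspace sketch of the failure mode described above, assuming a much-simplified model of the driver. The struct layouts, the MODEL_SNOOPDISC value, the fixed-size snoopers table, and the model_* function names are all hypothetical stand-ins, not the actual snp(4) code. It only illustrates how overwriting snp_tty can leave the old tty routed through the snoop line discipline while the lookup for that tty returns NULL.

```c
#include <stddef.h>

/* Simplified stand-ins; NOT the kernel's real definitions. */
struct tty {
	int t_line;		/* current line discipline (model) */
};

struct snoop {
	struct tty *snp_tty;	/* tty this snooper is attached to */
};

#define MODEL_SNOOPDISC	7	/* hypothetical line discipline number */
#define MODEL_NSNP	2	/* hypothetical number of snoop devices */

static struct snoop snoopers[MODEL_NSNP];

/* Model of the lookup: return the snooper attached to tp, or NULL. */
static struct snoop *
model_ttytosnp(struct tty *tp)
{
	for (size_t i = 0; i < MODEL_NSNP; i++)
		if (snoopers[i].snp_tty == tp)
			return (&snoopers[i]);
	return (NULL);
}

/*
 * Model of an attach that overwrites snp_tty without first detaching
 * the previously attached tty (the situation described above).
 */
static void
model_attach_unchecked(struct snoop *snp, struct tty *tp)
{
	tp->t_line = MODEL_SNOOPDISC;
	snp->snp_tty = tp;	/* old tty, if any, is silently forgotten */
}
```

In this model, attaching the same snooper to tty A and then to tty B leaves A with t_line still set to MODEL_SNOOPDISC, yet model_ttytosnp() returns NULL for A. A write path that dereferences that result unconditionally would fault, much like the trace above.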
> >Now, I should note that watch(8) does not attach to a new tty without
> >detaching from the previous one. But since snp(4) was converted to
> >destroy_dev_sched(), the actual detach is asynchronous. Since watch(8)
> >opens the numbered snpX clone device instead of the master /dev/snp, it
> >could reopen the same device before the detach completes. The condition
> >is racy, and thus not easily reproducible.
> >
> >The patch below might help with the kernel panic.
> >
> >diff --git a/sys/dev/snp/snp.c b/sys/dev/snp/snp.c
> >index a84e90c..b8f3d63 100644
> >--- a/sys/dev/snp/snp.c
> >+++ b/sys/dev/snp/snp.c
> >@@ -491,7 +491,7 @@ snpioctl(struct cdev *dev, u_long cmd, caddr_t data, int flags,
> >     struct thread *td)
> > {
> > 	struct snoop *snp;
> >-	struct tty *tp, *tpo;
> >+	struct tty *tp;
> > 	struct cdev *tdev;
> > 	struct file *fp;
> > 	int s;
> >@@ -502,6 +502,9 @@ snpioctl(struct cdev *dev, u_long cmd, caddr_t data, int flags,
> > 		s = *(int *)data;
> > 		if (s < 0)
> > 			return (snp_down(snp));
> >+		if (snp->snp_tty != NULL)
> >+			return (EBUSY);
> >+
> > 		if (fget(td, s, &fp) != 0)
> > 			return (EINVAL);
> > 		if (fp->f_type != DTYPE_VNODE ||
> >@@ -520,13 +523,6 @@ snpioctl(struct cdev *dev, u_long cmd, caddr_t data, int flags,
> > 			return (EBUSY);
> > 
> > 		s = spltty();
> >-
> >-		if (snp->snp_target == NULL) {
> >-			tpo = snp->snp_tty;
> >-			if (tpo)
> >-				tpo->t_state &= ~TS_SNOOP;
> >-		}
> >-
> > 		tp->t_state |= TS_SNOOP;
> > 		snp->snp_olddisc = tp->t_line;
> > 		tp->t_line = snooplinedisc;
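
The effect of the guard in the patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the struct layouts, MODEL_SNOOPDISC, and the model_attach_checked()/model_detach() names are hypothetical; only the "snp_tty != NULL returns EBUSY" check and the saving of the old line discipline are taken from the diff.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins; NOT the kernel's real definitions. */
struct tty {
	int t_line;		/* current line discipline (model) */
};

struct snoop {
	struct tty *snp_tty;	/* attached tty, NULL when detached */
	int snp_olddisc;	/* line discipline to restore on detach */
};

#define MODEL_SNOOPDISC	7	/* hypothetical line discipline number */

/* Attach only when no tty is attached yet, mirroring the patch's guard. */
static int
model_attach_checked(struct snoop *snp, struct tty *tp)
{
	if (snp->snp_tty != NULL)
		return (EBUSY);	/* refuse instead of overwriting snp_tty */
	snp->snp_olddisc = tp->t_line;
	tp->t_line = MODEL_SNOOPDISC;
	snp->snp_tty = tp;
	return (0);
}

/* Detach: restore the saved line discipline and clear the attachment. */
static void
model_detach(struct snoop *snp)
{
	if (snp->snp_tty != NULL) {
		snp->snp_tty->t_line = snp->snp_olddisc;
		snp->snp_tty = NULL;
	}
}
```

With this guard, a second attach to a still-attached snooper fails with EBUSY and leaves the first tty attached, so in the model no tty is left in the snoop line discipline without a matching snooper.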

Received on Sun Nov 25 2007 - 19:30:28 UTC
