Lawrence Stewart <lstewart_at_freebsd.org> wrote: > On 06/20/10 03:58, Fabian Keil wrote: > > Lawrence Stewart<lstewart_at_freebsd.org> wrote: > > > >> On 06/13/10 18:12, Lawrence Stewart wrote: > > > >>> The time has come to solicit some external testing for my SIFTR tool. > >>> I'm hoping to commit it within a week or so unless problems are discovered. > > > >>> I'm interested in all feedback and reports of success/failure, along > >>> with details of the architecture tested and number of CPUs if you would > >>> be so kind. > > > > I got the following hand-transcribed panic maybe a second after > > sysctl net.inet.siftr.enabled=1 > > > > Fatal trap 12: page fault while in kernel mode > > cpuid = 1; apic id = 01 > > [...] > > current process = 12 (swi4: clock) > > [ thread pid 12 tid 100006 ] > > Stopped at siftr_chkpkt+0xd0: addq $0x1,0x8(%r14) > > db> where > > Tracing pid 12 tid 100006 td 0xffffff00034037e0 > > siftr_chkpt() at siftr_chkpkt+0xd0 > > pfil_run_hooks() at pfil_run_hooks+0xb4 > > ip_output() at ip_output+0x382 > > tcp_output() tcp_output+0xa41 > > tcp_timer_rexmt() at tcp_timer_rexmt+0x251 > > softclock() at softclock+0x291 > > intr_event_execute_handlers() at intr_event_execute_handlers+0x66 > > ithread_loop at ithread_loop+0x8e > > fork_exit() at fork_exit+0x112 > > fork_trampoline() at fork_trampoline+0xe > > --- trap 0, rip = 0, rsp = 0xffffff800003ad30, rbp = 0 --- > > So I've tracked down the line of code where the page fault is occurring: > > if (dir == PFIL_IN) > ss->n_in++; > else > ss->n_out++; > > ss is a DPCPU (dynamic per-cpu) variable used to keep a set of stats > per-cpu and is initialised at the start of the function like so: > > ss = DPCPU_PTR(ss); > > So for ss to be NULL, that implies DPCPU_PTR() is returning NULL on your > machine. I know very little about the inner workings of the DPCPU_* > macros, but I'm pretty sure the way I use them in SIFTR is correct or at > least as intended. siftr_chkpkt() passes ss to siftr_chkreinject() before dereferencing it itself. I think if ss was NULL, the panic should already occur in siftr_chkreinject(). To be sure I added: diff --git a/sys/netinet/siftr.c b/sys/netinet/siftr.c index 8bc3498..b9fdfe4 100644 --- a/sys/netinet/siftr.c +++ b/sys/netinet/siftr.c _at__at_ -788,6 +788,16 _at__at_ siftr_chkpkt(void *arg, struct mbuf **m, struct ifnet *ifp, int dir, if (siftr_chkreinject(*m, dir, ss)) goto ret; + if (ss == NULL) { + printf("ss is NULL"); + ss = DPCPU_PTR(ss); + if (ss == NULL) { + printf("ss is still NULL"); + goto ret; + } + } + + if (dir == PFIL_IN) ss->n_in++; else which doesn't seem to affect the problem. > Could you please go ahead and retest using a GENERIC kernel and see if > you can reproduce? There could be something in your custom kernel > causing the offsets or linker set magic used by the DPCPU bits to break > which in turn is triggering this panic in SIFTR. I'll retry without pf first, and with GENERIC afterwards. Fabian
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:04 UTC