Re: [CFT] SIFTR - Statistical Information For TCP Research: Uncle Lawrence needs YOU!

From: Lawrence Stewart <lstewart_at_freebsd.org> Date: Sun, 20 Jun 2010 22:48:20 +1000 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:04 UTC

On 06/20/10 22:28, Fabian Keil wrote:
> Lawrence Stewart<lstewart_at_freebsd.org>  wrote:
>
>> On 06/20/10 21:15, Fabian Keil wrote:
>>> Lawrence Stewart<lstewart_at_freebsd.org>   wrote:
>>>
>>>> On 06/20/10 03:58, Fabian Keil wrote:
>>>>> Lawrence Stewart<lstewart_at_freebsd.org>    wrote:
>>>>>
>>>>>> On 06/13/10 18:12, Lawrence Stewart wrote:
>>>>>
>>>>>>> The time has come to solicit some external testing for my SIFTR tool.
>>>>>>> I'm hoping to commit it within a week or so unless problems are discovered.
>>>>>
>>>>>>> I'm interested in all feedback and reports of success/failure, along
>>>>>>> with details of the architecture tested and number of CPUs if you would
>>>>>>> be so kind.
>>>>>
>>>>> I got the following hand-transcribed panic maybe a second after
>>>>> sysctl net.inet.siftr.enabled=1
>>>>>
>>>>> Fatal trap 12: page fault while in kernel mode
>>>>> cpuid = 1; apic id = 01
>>>>> [...]
>>>>> current process = 12 (swi4: clock)
>>>>> [ thread pid 12 tid 100006 ]
>>>>> Stopped at	siftr_chkpkt+0xd0:	addq	$0x1,0x8(%r14)
>>>>> db>    where
>>>>> Tracing pid 12 tid 100006 td 0xffffff00034037e0
>>>>> siftr_chkpt() at siftr_chkpkt+0xd0
>>>>> pfil_run_hooks() at pfil_run_hooks+0xb4
>>>>> ip_output() at ip_output+0x382
>>>>> tcp_output() tcp_output+0xa41
>>>>> tcp_timer_rexmt() at tcp_timer_rexmt+0x251
>>>>> softclock() at softclock+0x291
>>>>> intr_event_execute_handlers() at intr_event_execute_handlers+0x66
>>>>> ithread_loop at ithread_loop+0x8e
>>>>> fork_exit() at fork_exit+0x112
>>>>> fork_trampoline() at fork_trampoline+0xe
>>>>> --- trap 0, rip = 0, rsp = 0xffffff800003ad30, rbp = 0 ---
>>>>
>>>> So I've tracked down the line of code where the page fault is occurring:
>>>>
>>>>            if (dir == PFIL_IN)
>>>>                    ss->n_in++;
>>>>            else
>>>>                    ss->n_out++;
>>>>
>>>> ss is a DPCPU (dynamic per-cpu) variable used to keep a set of stats
>>>> per-cpu and is initialised at the start of the function like so:
>>>>
>>>>            ss = DPCPU_PTR(ss);
>>>>
>>>> So for ss to be NULL, that implies DPCPU_PTR() is returning NULL on your
>>>> machine. I know very little about the inner workings of the DPCPU_*
>>>> macros, but I'm pretty sure the way I use them in SIFTR is correct or at
>>>> least as intended.
>>>
>>> siftr_chkpkt() passes ss to siftr_chkreinject() before dereferencing
>>> it itself. I think if ss was NULL, the panic should already occur in
>>> siftr_chkreinject().
>>
>> Yes but siftr_chkreinject() only dereferences ss in the exceptional case
>> of a malloc failure or duplicate pkt. It's unlikely either case happens
>> for you and so wouldn't trigger the panic.
>>
>>> To be sure I added:
>>>
>>> diff --git a/sys/netinet/siftr.c b/sys/netinet/siftr.c
>>> index 8bc3498..b9fdfe4 100644
>>> --- a/sys/netinet/siftr.c
>>> +++ b/sys/netinet/siftr.c
>>> _at__at_ -788,6 +788,16 _at__at_ siftr_chkpkt(void *arg, struct mbuf **m, struct ifnet *ifp, int dir,
>>>           if (siftr_chkreinject(*m, dir, ss))
>>>                   goto ret;
>>>
>>> +       if (ss == NULL) {
>>> +           printf("ss is NULL");
>>> +           ss = DPCPU_PTR(ss);
>>> +           if (ss == NULL) {
>>> +              printf("ss is still NULL");
>>> +              goto ret;
>>> +           }
>>> +        }
>>> +
>>> +
>>>           if (dir == PFIL_IN)
>>>                   ss->n_in++;
>>>           else
>>>
>>> which doesn't seem to affect the problem.
>>
>> As in it still panics and the "ss is NULL" message is not printed? I
>> would have expected to at least see "ss is NULL" printed if my
>> hypothesis was correct... hmm.
>
> Yes, it still panics, but no message is printed.

It was just pointed out to me that ss doesn't have to be NULL in order 
to cause the page fault (duh). It could also just be a garbage ptr which 
is why your print statement isn't firing.

Can you trigger the panic again and look for some information along the 
lines of "fault virtual address = ..." as part of the panic info. 
Knowing the faulting address would be useful and may help further diagnosis.

>> Perhaps the way I discovered the line number at which the panic occurred
>> was wrong. I compiled SIFTR on my amd64 dev server with "CFLAGS+=-g" in
>> the SIFTR Makefile to get debug symbols, ran "objdump -Sd siftr.ko | vim
>> -", searched for the instruction reported in the panic message i.e.
>> "addq $0x1,0x8(%r14)" and then with a bit of trial and error, recompiled
>> SIFTR with the line of code "volatile int blah = 0; blah = 2;" at
>> various points in the function and looking at the change in the objdump
>> output to pinpoint which line of C code corresponded with the "addq"
>> instruction.
>>
>> The "volatile int blah = 0; blah = 2;" compiles to "movl
>> $0x0,0xffffffffffffffd4(%rbp)" followed immediately by "movl
>> $0x2,0xffffffffffffffd4(%rbp)". When I put that code above the "if (dir
>> == PFIL_IN)" statement I see the objdump output show the assembly code
>> before the "addq" instruction and when I move it after the if statement
>> the assembly code moves after the "addq" instruction.
>
> That's a neat trick.

Indeed, and I thank phk_at_ for suggesting it to me.

>> Perhaps you could reproduce the above procedure and see if you identify
>> the same point in the siftr_chkpkt function I did for the instruction
>> referenced by the panic message?
>
> I do. Using:
>
> diff --git a/sys/netinet/siftr.c b/sys/netinet/siftr.c
> index b9fdfe4..fc6bd9a 100644
> --- a/sys/netinet/siftr.c
> +++ b/sys/netinet/siftr.c
> _at__at_ -797,12 +797,15 _at__at_ siftr_chkpkt(void *arg, struct mbuf **m, struct ifnet *ifp, int dir,
>              }
>           }
>
> +        volatile int blah = 0; blah = 2;
>
>          if (dir == PFIL_IN)
>                  ss->n_in++;
>          else
>                  ss->n_out++;
>
> +        volatile int foo = 0; foo = 3;
> +
>          /*
>           * Create a tcphdr struct starting at the correct offset
>           * in the IP packet. ip->ip_hl gives the ip header length
>
> I get:
>
>       803:       c7 45 d4 00 00 00 00    movl   $0x0,0xffffffffffffffd4(%rbp)
>       80a:       c7 45 d4 02 00 00 00    movl   $0x2,0xffffffffffffffd4(%rbp)
>       811:       0f 84 8a 02 00 00       je     aa1<siftr_chkpkt+0x371>
>       817:       49 83 46 08 01          addq   $0x1,0x8(%r14)
>       81c:       c7 45 d0 00 00 00 00    movl   $0x0,0xffffffffffffffd0(%rbp)
>       823:       c7 45 d0 03 00 00 00    movl   $0x3,0xffffffffffffffd0(%rbp)
>

Ok cool, nice to be assured I didn't botch the detective work up.

>>>> Could you please go ahead and retest using a GENERIC kernel and see if
>>>> you can reproduce? There could be something in your custom kernel
>>>> causing the offsets or linker set magic used by the DPCPU bits to break
>>>> which in turn is triggering this panic in SIFTR.
>>>
>>> I'll retry without pf first, and with GENERIC afterwards.
>>
>> Sounds good, thanks.
>
> Taking pf (and altq) out of the picture doesn't seem to make
> a difference.

Wouldn't have expected it to. Will be very curious to know if the panic 
is triggered in GENERIC.

Cheers,
Lawrence