Re: [markiyan.kushnir_at_gmail.com: Re: 11.0-CURRENT panic (nfsd?)]

From: Alexander Motin <mav_at_FreeBSD.org>
Date: Mon, 06 Jan 2014 14:42:13 +0200

Thank you for the report. Bug fixed at r260367.

> ----- Forwarded message from Markiyan Kushnir <markiyan.kushnir_at_gmail.com> -----
>
> Date: Sun, 5 Jan 2014 19:47:37 +0200
> Subject: Re: 11.0-CURRENT panic (nfsd?)
> From: Markiyan Kushnir <markiyan.kushnir_at_gmail.com>
> To: Markiyan Kushnir <markiyan.kushnir_at_gmail.com>, freebsd-current_at_freebsd.org
>
> $ nm /boot/kernel/kernel | grep svc_run_internal
> ffffffff80714db0 t svc_run_internal
> $ addr2line -e /boot/kernel/kernel 0xffffffff80715779
> /usr/src.svnup/sys/rpc/svc.c:971
>
>     949  static void
>     950  svc_executereq(struct svc_req *rqstp)
>     951  {
>     952          SVCXPRT *xprt = rqstp->rq_xprt;
>     953          SVCPOOL *pool = xprt->xp_pool;
>     954          int prog_found;
>     955          rpcvers_t low_vers;
>     956          rpcvers_t high_vers;
>     957          struct svc_callout *s;
>     958
>     959          /* now match message with a registered service*/
>     960          prog_found = FALSE;
>     961          low_vers = (rpcvers_t) -1L;
>     962          high_vers = (rpcvers_t) 0L;
>     963          TAILQ_FOREACH(s, &pool->sp_callouts, sc_link) {
>     964                  if (s->sc_prog == rqstp->rq_prog) {
>     965                          if (s->sc_vers == rqstp->rq_vers) {
>     966                                  /*
>     967                                   * We hand ownership of r to the
>     968                                   * dispatch method - they must call
>     969                                   * svc_freereq.
>     970                                   */
>     971                                  (*s->sc_dispatch)(rqstp, xprt);
>     972                                  return;
>     973                          }  /* found correct version */
>     974                          prog_found = TRUE;
>     975                          if (s->sc_vers < low_vers)
>     976                                  low_vers = s->sc_vers;
>     977                          if (s->sc_vers > high_vers)
>     978                                  high_vers = s->sc_vers;
>     979                  }   /* found correct program */
>     980          }
>     981
>     982          /*
>     983           * if we got here, the program or version
>     984           * is not served ...
>     985           */
>     986          if (prog_found)
>     987                  svcerr_progvers(rqstp, low_vers, high_vers);
>     988          else
>     989                  svcerr_noprog(rqstp);
>     990
>     991          svc_freereq(rqstp);
>     992  }
>     993
>
> 2014/1/5 John-Mark Gurney <jmg_at_funkthat.com>:
>> Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 11:06 +0200:
>>> 2014/1/5 John-Mark Gurney <jmg_at_funkthat.com>:
>>>> Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 10:57 +0200:
>>>>> I started to see a reliable panic on a recent CURRENT:
>>>>>
>>>>> $ uname -a
>>>>> FreeBSD mkushnir.mooo.com 11.0-CURRENT FreeBSD 11.0-CURRENT #0
>>>>> r260296: Sun Jan  5 07:14:50 EET 2014
>>>>> root_at_vm.mkushnir.mooo.com:/usr/obj/usr/src.svnup/sys/MAREK  amd64
>>>>>
>>>>> The panic is always triggered by the first request to the nfs service
>>>>> (this machine runs a PXE server).
>>>>>
>>>>> The core.txt is attached. Please let me know if I can help more.
>>>>
>>>> Apparently the mime-type on the attachment was bad and got scrubbed...
>>>>
>>>> Maybe include it inline if it isn't too long?
>>>>
>>>
>>> It's 144KB long. I will share it via Google Drive:
>>>
>>> https://drive.google.com/file/d/0B9Q-zpUXxqCnNVhBY0M5ZzU4d1k/edit?usp=sharing
>>
>> Looks like a NULL function pointer was called:
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address   = 0x0
>> fault code              = supervisor read instruction, page not present
>> instruction pointer     = 0x20:0x0
>> stack pointer           = 0x28:0xfffffe00d9a2bea0
>> frame pointer           = 0x28:0xfffffe00d9a2c010
>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>                          = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags        = interrupt enabled, resume, IOPL = 0
>> current process         = 1323 (nfsd: master)
>> trap number             = 12
>> panic: page fault
>>
>> --- trap 0xc, rip = 0, rsp = 0xfffffe00d9a2bea0, rbp = 0xfffffe00d9a2c010 ---
>> uart_sab82532_class() at 0/frame 0xfffffe00d9a2c010
>> svc_run_internal() at svc_run_internal+0x9c9/frame 0xfffffe00d9a2c1b0
>> svc_run() at svc_run+0xed/frame 0xfffffe00d9a2c1f0
>> nfsrvd_nfsd() at nfsrvd_nfsd+0x19a/frame 0xfffffe00d9a2c350
>> nfssvc_nfsd() at nfssvc_nfsd+0x11a/frame 0xfffffe00d9a2c970
>> sys_nfssvc() at sys_nfssvc+0xd2/frame 0xfffffe00d9a2c9a0
>> amd64_syscall() at amd64_syscall+0x265/frame 0xfffffe00d9a2cab0
>> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe00d9a2cab0
>> --- syscall (155, FreeBSD ELF64, sys_nfssvc), rip = 0x80088c13a, rsp = 0x7fffffffd438, rbp = 0x7fffffffd6e0 ---
>>
>> The uart_sab82532_class is just the closest symbol to 0, so it's in
>> svc_run_internal that's the problem...  Could you run:
>> nm /boot/kernel/kernel | grep svc_run_internal
>>
>> This should return a line w/ a large hex number at the front, then run:
>> addr2line -e /boot/kernel/kernel $( expr 0x<largehexnumber>+0x9c9)
>>
>> This will give you a file name and line number, and can you copy/paste
>> the lines around and including that line number?  This will help make
>> sure we get the correct code...

-- 
Alexander Motin
Received on Mon Jan 06 2014 - 11:42:19 UTC
