Re: kernel page fault with nfs

From: Marcelo Araujo <araujobsdport_at_gmail.com>
Date: Tue, 21 Oct 2014 15:45:24 +0800

Hello Tobias,

That sounds good; at least you haven't had any crashes so far.
I agree with you, it seems to be a bug. I'm going to take a look at that.

Could you share your testbed, or describe how you reproduce the issue?
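
For illustration, here is a minimal sketch of the two clipping rules Rick
mentions in the quoted mail below: rounding down to a multiple of 512 (what
he suspects the client does) and rounding down to a power of 2 (what he
suggests it might do instead). The function names are made up for this
example, this is not the kernel code, and the 512 comes from Rick's guess
rather than from the source.

```python
def clip_to_multiple(size, unit=512):
    """Round size down to the nearest multiple of unit (hypothetical helper)."""
    return (size // unit) * unit


def clip_to_power_of_two(size):
    """Round size down to the largest power of 2 that is <= size."""
    if size < 1:
        raise ValueError("size must be positive")
    # bit_length() is the position of the highest set bit, counted from 1.
    return 1 << (size.bit_length() - 1)


# The two values discussed in this thread.
for requested in (32767, 32768):
    print(requested, clip_to_multiple(requested), clip_to_power_of_two(requested))
```

For 32767 the first rule gives 32256, which matches Rick's estimate and is
not a power of 2; the second rule gives 16384. For 32768 both rules leave
the value unchanged.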

Best Regards,

2014-10-21 15:36 GMT+08:00 T.C.Berner <tcberner_at_gmail.com>:

> The system now has an uptime of >24h while using NFS heavily.
>
> So wsize/rsize=2^15-1 seems to have been the problem... which imho makes
> it a bug.
>
>
> Best regards, Tobias
>
> 2014-10-21 5:11 GMT+02:00 Marcelo Araujo <araujobsdport_at_gmail.com>:
>
>> Hello Tobias,
>>
>> Any news?
>>
>>
>> Best Regards,
>>
>> 2014-10-20 20:55 GMT+08:00 Rick Macklem <rmacklem_at_uoguelph.ca>:
>>
>>> Tobias C. Berner wrote:
>>> > Now that I posted it, 32767 should of course be 2^15=32768. Let me
>>> > recheck if it still
>>> > hangs with the correct value.
>>> >
>>> > On Monday 20 October 2014 09.15:39 Tobias C. Berner wrote:
>>> > > Hi Marcelo
>>> > >
>>> > > Yes, I'm using readahead.
>>> > > The mount options are
>>> > > "readahead=4,soft,intr,rw,tcp,wsize=32767,rsize=32767,late"
>>> > >
>>> If you type "nfsstat -m", you will see what is actually getting used.
>>> (I suspect the above rsize/wsize got clipped to 32256 or something like
>>>  that. I think it clips it to a multiple of 512.)
>>>
>>> If rsize/wsize are not a power of 2, there are issues, although I've never
>>> been able to see why it is broken. Maybe it should clip it to the power of
>>> 2 below the value, since it causes unexplained problems otherwise.
>>>
>>> rick
>>>
>>> > >
>>> > > Best regards, Tobias
>>> > >
>>> > > On Monday 20 October 2014 10.41:30 Marcelo Araujo wrote:
>>> > > > Hello Tobias,
>>> > > >
>>> > > > Could you show how you are mounting the NFS share?
>>> > > > Are you using the 'readahead' option?
>>> > > >
>>> > > > Best Regards,
>>> > > >
>>> > > > 2014-10-19 17:40 GMT+08:00 Tobias C. Berner <tcberner_at_gmail.com>:
>>> > > > > both are at 1100038.
>>> > > > >
>>> > > > > On Sunday 19 October 2014 11.12:36 Marcelo Araujo wrote:
>>> > > > > > It is still strange. Could you do what Allan said and send us
>>> > > > > > the result, in case you are not sure you have world and kernel
>>> > > > > > at the same revision?
>>> > > > > >
>>> > > > > > On Oct 19, 2014 6:48 AM, "Tobias C. Berner"
>>> > > > > > <tcberner_at_gmail.com> wrote:
>>> > > > > > > Hi
>>> > > > > > >
>>> > > > > > > World is from October 16; world and kernel were installed then.
>>> > > > > > >
>>> > > > > > > The kernel was later rebuilt with debug options.
>>> > > > > > >
>>> > > > > > >
>>> > > > > > >
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > Is the following more sensible?
>>> > > > > > >
>>> > > > > > >
>>> > > > > > > ##################################################
>>> > > > > > >
>>> > > > > > > # kgdb NOXON/kernel.debug vmcore.1
>>> > > > > > >
>>> > > > > > > Fatal trap 12: page fault while in kernel mode
>>> > > > > > > cpuid = 5; apic id = 05
>>> > > > > > > fault virtual address = 0xfffffe07d1744000
>>> > > > > > > fault code = supervisor write data, page not present
>>> > > > > > > instruction pointer = 0x20:0xffffffff80d4d58a
>>> > > > > > > stack pointer = 0x28:0xfffffe086057f240
>>> > > > > > > frame pointer = 0x28:0xfffffe086057f2f0
>>> > > > > > > code segment = base 0x0, limit 0xfffff, type 0x1b
>>> > > > > > > = DPL 0, pres 1, long 1, def32 0, gran 1
>>> > > > > > > processor eflags = interrupt enabled, resume, IOPL = 0
>>> > > > > > > current process = 6524 (python2.7)
>>> > > > > > >
>>> > > > > > > (kgdb) bt
>>> > > > > > > #0 doadump (textdump=1) at pcpu.h:219
>>> > > > > > > #1 0xffffffff80926b6d in kern_reboot (howto=260)
>>> > > > > > >     at /usr/src/sys/kern/kern_shutdown.c:447
>>> > > > > > > #2 0xffffffff809270c0 in panic (fmt=<value optimized out>)
>>> > > > > > >     at /usr/src/sys/kern/kern_shutdown.c:746
>>> > > > > > > #3 0xffffffff8035f167 in db_panic (addr=<value optimized out>,
>>> > > > > > >     have_addr=2, count=0, modif=0x0)
>>> > > > > > >     at /usr/src/sys/ddb/db_command.c:473
>>> > > > > > > #4 0xffffffff8035ed7d in db_command (cmd_table=0x0)
>>> > > > > > >     at /usr/src/sys/ddb/db_command.c:440
>>> > > > > > > #5 0xffffffff8035eaf4 in db_command_loop ()
>>> > > > > > >     at /usr/src/sys/ddb/db_command.c:493
>>> > > > > > > #6 0xffffffff80361600 in db_trap (type=<value optimized out>,
>>> > > > > > >     code=0) at /usr/src/sys/ddb/db_main.c:251
>>> > > > > > > #7 0xffffffff80966f01 in kdb_trap (type=12, code=0,
>>> > > > > > >     tf=<value optimized out>)
>>> > > > > > >     at /usr/src/sys/kern/subr_kdb.c:654
>>> > > > > > > #8 0xffffffff80d4fa7c in trap_fatal (frame=0xfffffe086057f190,
>>> > > > > > >     eva=<value optimized out>)
>>> > > > > > >     at /usr/src/sys/amd64/amd64/trap.c:861
>>> > > > > > > #9 0xffffffff80d4fe0c in trap_pfault (frame=0xfffffe086057f190,
>>> > > > > > >     usermode=<value optimized out>)
>>> > > > > > >     at /usr/src/sys/amd64/amd64/trap.c:677
>>> > > > > > > #10 0xffffffff80d4f42e in trap (frame=0xfffffe086057f190)
>>> > > > > > >     at /usr/src/sys/amd64/amd64/trap.c:426
>>> > > > > > > #11 0xffffffff80d33972 in calltrap ()
>>> > > > > > >     at /usr/src/sys/amd64/amd64/exception.S:231
>>> > > > > > > #12 0xffffffff80d4d58a in bzero ()
>>> > > > > > >     at /usr/src/sys/amd64/amd64/support.S:53
>>> > > > > > > #13 0xffffffff80830463 in ncl_doio (vp=0xfffff801e7f99938,
>>> > > > > > >     bp=0xfffffe07c5a168e8, cr=<value optimized out>,
>>> > > > > > >     td=<value optimized out>,
>>> > > > > > >     called_from_strategy=<value optimized out>)
>>> > > > > > >     at /usr/src/sys/fs/nfsclient/nfs_clbio.c:1648
>>> > _______________________________________________
>>> > freebsd-current_at_freebsd.org mailing list
>>> > http://lists.freebsd.org/mailman/listinfo/freebsd-current
>>> > To unsubscribe, send any mail to
>>> > "freebsd-current-unsubscribe_at_freebsd.org"
>>> >
>>>
>>
>


-- 
Marcelo Araujo            (__)
araujo_at_FreeBSD.org     \\\'',)
http://www.FreeBSD.org   \/  \ ^
Power To Server.         .\. /_)
Received on Tue Oct 21 2014 - 05:45:28 UTC