Re: Panic in sys_fstatat()

From: Andriy Gapon <avg_at_FreeBSD.org>
Date: Fri, 15 Feb 2019 09:50:55 +0200
On 14/02/2019 22:26, John Baldwin wrote:
> On 2/13/19 6:47 PM, Steve Kargl wrote:
...
>> panic: vm_fault_hold: fault on nofault entry, addr: 0x202000

What's very suspicious here is that the fault address looks a lot like
LK_SHARED | LK_NODDLKTREAT, which would be the 'flags' passed to vn_lock() and
which should never be used as an address.
In a later email Steve reported that cn_lkflags = 2097152, and that's 0x200000,
i.e. LK_SHARED.  compute_cn_lkflags() adds LK_NODDLKTREAT.
However, LK_RETRY is missing from the fault address.
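
The flag arithmetic above can be checked with a small sketch.  This is only
an illustration in Python, not kernel code.  The three flag values are
assumed to match sys/lockmgr.h (LK_SHARED is 0x200000 as noted above; the
other two should be verified against your own source tree), and
decode_lk_flags() is a hypothetical helper name.

```python
# Lock flag values assumed to match sys/lockmgr.h; verify against your tree.
LK_FLAGS = {
    "LK_RETRY": 0x000400,
    "LK_NODDLKTREAT": 0x002000,
    "LK_SHARED": 0x200000,
}


def decode_lk_flags(value):
    """Return (names of known flags set in value, bits not explained by them)."""
    names = [name for name, bit in LK_FLAGS.items() if value & bit]
    leftover = value
    for bit in LK_FLAGS.values():
        leftover &= ~bit
    return names, leftover


# cn_lkflags as reported, the fault address, and the word seen in the
# VOP_LOCK1_APV frame of the backtrace.
for v in (2097152, 0x202000, 0x202400):
    names, leftover = decode_lk_flags(v)
    print(f"{v:#x}: {' | '.join(names)} (unexplained bits: {leftover:#x})")
```

If the assumed values are right, each of the three numbers decomposes
entirely into these flags, and only the last one contains LK_RETRY.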

>> cpuid = 1
>> time = 1550111772
>> KDB: stack backtrace:
>> db_trace_self_wrapper(10b42f3,8c96000,1,9341bd0,2e7b6590,...) at db_trace_self_wrapper+0x2a/frame 0x2e7b6560
>> kdb_backtrace(109973a,5c64d41c,0,2e7b661c,1,...) at kdb_backtrace+0x2d/frame 0x2e7b65c8
>> vpanic(108d309,2e7b661c,2e7b661c,2e7b6700,f734a9,...) at vpanic+0x141/frame 0x2e7b65fc
>> panic(108d309,103dfa3,202000,2e7b6664,2e7b6654,...) at panic+0x1b/frame 0x2e7b6610
>> vm_fault_hold(1ea5000,202000,1,0,0,...) at vm_fault_hold+0x29e9/frame 0x2e7b6700
>> vm_fault(1ea5000,202000,1,0,0,...) at vm_fault+0x5e/frame 0x2e7b6728
>> trap_pfault(202462,40,109e2f2,316d3480,2e7b67c0,...) at trap_pfault+0xb2/frame 0x2e7b6770
>> trap(2e7b6880,8,28,28,1836a120,...) at trap+0x3cb/frame 0x2e7b6874
>> calltrap() at PTDpde+0x4165/frame 0x2e7b6874
>> --- trap 0xc, eip = 0x1027fb8, esp = 0x2e7b68c0, ebp = 0x2e7b68f8 ---
>> VOP_LOCK1_APV(1836a120,202400,1099cc5,2c8,2e7b6ab0,...) at VOP_LOCK1_APV+0x8/frame 0x2e7b68f8

And [0x]202400 here confirms the above observations.
[0x]2c8 is 712, the line number in vfs_lookup.c.
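
For anyone who wants to pull such values out of a pasted backtrace
mechanically, here is a minimal sketch in Python.  It assumes the frame
format shown in this thread, "func(hex,hex,...) at func+off/frame addr", and
treats the parenthesised values simply as hex words without claiming they
are the function's real arguments.  parse_frame() is a hypothetical helper
name.

```python
import re

# Matches an optional quote prefix (">> "), the function name, and the
# parenthesised word list of one DDB backtrace line.
FRAME_RE = re.compile(r"^(?:>+\s*)?(\w+)\(([^)]*)\) at ")


def parse_frame(line):
    """Return (function name, list of ints) or None if line is not a frame."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    func, raw = m.groups()
    # DDB prints the words in hex without a 0x prefix; skip the "..." marker.
    words = [int(w, 16) for w in raw.split(",") if w and w != "..."]
    return func, words


line = (">> VOP_LOCK1_APV(1836a120,202400,1099cc5,2c8,2e7b6ab0,...) "
        "at VOP_LOCK1_APV+0x8/frame 0x2e7b68f8")
func, words = parse_frame(line)
print(func, [hex(w) for w in words])
print("fourth word in decimal:", words[3])
```

The fourth word of that frame is 0x2c8, which is the 712 mentioned above.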

>> lookup(2e7b6a50,0,400,2e7b6aa0,2e7b6a18,...) at lookup+0xc4/frame 0x2e7b6960
>> namei(2e7b6a50,0,4000144,0,2cced08e,...) at namei+0x4f3/frame 0x2e7b6a20
>> kern_statat(3c5dc700,0,ffffff9c,2cced08e,0,...) at kern_statat+0x85/frame 0x2e7b6af0
>> sys_fstatat(3c5dc700,3c5dc988,1384bb0,3c5dc700,0,...) at sys_fstatat+0x49/frame 0x2e7b6c00
>> syscall(2e7b6ce8,3b,3b,3b,fbafbbc8,...) at syscall+0x3ea/frame 0x2e7b6cdc
>> Xint0x80_syscall() at PTDpde+0x43af/frame 0x2e7b6cdc
> 
> Frame 18 is probably the root problem, though it doesn't look like kgdb is
> able to unwind it correctly.  Looking at frame 19 might help though.  It
> seems like a NULL pointer dereference when invoking VOP_LOCK.
> 

So, I suspect something exotic, like some sort of stack alignment issue, a
CPU bug, a mismatch between object files, some local experiment, etc.


-- 
Andriy Gapon
Received on Fri Feb 15 2019 - 06:51:05 UTC
