Re: some [big] changes to ZPL (ZFS<->VFS )

From: Andriy Gapon <avg_at_FreeBSD.org>
Date: Sat, 6 Aug 2016 14:05:53 +0300
On 05/08/2016 23:31, Alan Somers wrote:
> I'm not certain it's related, but on a head build at r303767 I see a
> LOR and a reproducible panic that involve the snapdir code.

Alan,

thank you very much for the clear report and for the very easy
reproduction scenario.  I am not sure how I missed this simple and
severe bug.
Please try r303791, it should fix the problem.

I believe that the LOR is not new and been there since we started using
distinct tags for .zfs special vnodes.

> First, the LOR:
> $ zpool destroy foo
> 
> lock order reversal:
>  1st 0xfffff800404c8b78 zfs (zfs) _at_
> /usr/home/alans/freebsd/head/sys/kern/vfs_mount.c:1244
>  2nd 0xfffff800404c85f0 zfs_gfs (zfs_gfs) _at_
> /usr/home/alans/freebsd/head/sys/cddl/contrib/opensolaris/uts/common/fs/gfs.c:484
> stack backtrace:
> #0 0xffffffff80aa90b0 at witness_debugger+0x70
> #1 0xffffffff80aa8fa4 at witness_checkorder+0xe54
> #2 0xffffffff80a22072 at __lockmgr_args+0x4c2
> #3 0xffffffff80af8e7c at vop_stdlock+0x3c
> #4 0xffffffff81018880 at VOP_LOCK1_APV+0xe0
> #5 0xffffffff80b19f2a at _vn_lock+0x9a
> #6 0xffffffff821b9c53 at gfs_file_create+0x73
> #7 0xffffffff821b9cfd at gfs_dir_create+0x1d
> #8 0xffffffff8228aa07 at zfsctl_mknode_snapdir+0x47
> #9 0xffffffff821ba1a5 at gfs_dir_lookup+0x185
> #10 0xffffffff821ba68d at gfs_vop_lookup+0x1d
> #11 0xffffffff82289a42 at zfsctl_root_lookup+0xf2
> #12 0xffffffff8228a8c3 at zfsctl_umount_snapshots+0x83
> #13 0xffffffff822a1d2b at zfs_umount+0x7b
> #14 0xffffffff80b02a14 at dounmount+0x6f4
> #15 0xffffffff80b0228d at sys_unmount+0x35d
> #16 0xffffffff80ebbb7b at amd64_syscall+0x2db
> #17 0xffffffff80e9b72b at Xfast_syscall+0xfb
> 
> 
> Here's the panic:
> $ zpool create testpool da0
> $ touch /testpool/testfile
> $ zfs snapshot testpool_at_testsnap
> $ cd /testpool/.zfs/snapshots
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 2; apic id = 04
> fault virtual address   = 0x8
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff80b19f1c
> stack pointer           = 0x28:0xfffffe0b54bf7430
> frame pointer           = 0x28:0xfffffe0b54bf74a0
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 966 (bash)
> trap number             = 12
> panic: page fault
> cpuid = 2
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0b54bf6fc0
> vpanic() at vpanic+0x182/frame 0xfffffe0b54bf7040
> panic() at panic+0x43/frame 0xfffffe0b54bf70a0
> trap_fatal() at trap_fatal+0x351/frame 0xfffffe0b54bf7100
> trap_pfault() at trap_pfault+0x1fd/frame 0xfffffe0b54bf7160
> trap() at trap+0x284/frame 0xfffffe0b54bf7370
> calltrap() at calltrap+0x8/frame 0xfffffe0b54bf7370
> --- trap 0xc, rip = 0xffffffff80b19f1c, rsp = 0xfffffe0b54bf7440, rbp
> = 0xfffffe0b54bf74a0 ---
> _vn_lock() at _vn_lock+0x8c/frame 0xfffffe0b54bf74a0
> zfs_lookup() at zfs_lookup+0x50d/frame 0xfffffe0b54bf7540
> zfs_freebsd_lookup() at zfs_freebsd_lookup+0x91/frame 0xfffffe0b54bf7680
> VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0xda/frame 0xfffffe0b54bf76b0
> vfs_cache_lookup() at vfs_cache_lookup+0xd6/frame 0xfffffe0b54bf7710
> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xda/frame 0xfffffe0b54bf7740
> lookup() at lookup+0x5a2/frame 0xfffffe0b54bf77d0
> namei() at namei+0x5b2/frame 0xfffffe0b54bf7890
> kern_statat() at kern_statat+0xa8/frame 0xfffffe0b54bf7a40
> sys_stat() at sys_stat+0x2d/frame 0xfffffe0b54bf7ae0
> amd64_syscall() at amd64_syscall+0x2db/frame 0xfffffe0b54bf7bf0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0b54bf7bf0
> 
> 
> I can provide core files, test scripts, whatever you need.  Thanks for
> tackling this difficult problem.
> 
> -Alan
> 
> On Fri, Aug 5, 2016 at 12:36 AM, Andriy Gapon <avg_at_freebsd.org> wrote:
>> On 03/08/2016 17:25, Andriy Gapon wrote:
>>> Another change that was not strictly required and which is probably too
>>> intrusive is killing the support for case insensitive operations.   My
>>> thinking was that FreeBSD VFS does not provide support for those anyway.
>>>  But I'll probably restore the code, at least in the bottom half of the
>>> ZPL, before committing the change.
>>
>> It turned out that most of the removed code was dead anyway and it took
>> just a few lines of code to restore support for case-insensitive
>> filesystems.  Filesystems with mixed case sensitivity behave exactly the
>> same as case-sensitive filesystem as it has always been the case on FreeBSD.
>>
>> Anyway the big change has just been committed:
>> https://svnweb.freebsd.org/changeset/base/303763
>> Please test away.
>>
>> Another note is that the filesystem name cache is now disabled for case
>> insensitive filesystems and filesystems with normalization other than
>> none.  That may hurt the lookup performance, but should ensure
>> correctness of operations.
>>
>> --
>> Andriy Gapon
>> _______________________________________________
>> freebsd-current_at_freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"


-- 
Andriy Gapon
Received on Sat Aug 06 2016 - 09:08:45 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:07 UTC