On 12 Nov, Mateusz Guzik wrote:
> On Fri, Nov 11, 2016 at 11:29:48PM -0800, Don Lewis wrote:
>> On 11 Nov, To: freebsd-current_at_FreeBSD.org wrote:
>> > Yesterday I upgraded my ports builder box from r306349 to r308477. It
>> > crashed overnight during a poudriere run.
>> >
>> > panic: mutex ncnegl not owned at /usr/src/sys/kern/vfs_cache.c:743
>>
>> This problem is very reproducible for me, so I added some assertions to
>> the code to try to track it down:
>>
>> panic: Bad link elm 0xfffff802a59cb690 prev->next != elm
>> #13 0xffffffff80b0d55a in cache_negative_remove (ncp=0xfffff802a59cb690,
>>     neg_locked=<optimized out>) at /usr/src/sys/kern/vfs_cache.c:747
>> #14 0xffffffff80b0d84a in cache_zap_locked (ncp=0xfffff802a59cb690,
>>     neg_locked=false) at /usr/src/sys/kern/vfs_cache.c:890
>> #15 0xffffffff80b0ab98 in cache_zap_rlocked_bucket (ncp=0xfffff802a59cb690,
>>     blp=<optimized out>) at /usr/src/sys/kern/vfs_cache.c:1021
>> #16 cache_lookup (dvp=<optimized out>, vpp=<optimized out>,
>>     cnp=<optimized out>, tsp=<optimized out>, ticksp=<optimized out>)
>>     at /usr/src/sys/kern/vfs_cache.c:1308
>> #17 0xffffffff80b0e97c in vfs_cache_lookup (ap=<optimized out>)
>>     at /usr/src/sys/kern/vfs_cache.c:2024
>> #18 0xffffffff8103fcea in VOP_LOOKUP_APV (vop=<optimized out>,
>>     a=<optimized out>) at vnode_if.c:127
>> #19 0xffffffff80b17ed2 in VOP_LOOKUP (vpp=0xfffffe085bfa86e8,
>>     cnp=0xfffffe085bfa8710, dvp=<optimized out>) at ./vnode_if.h:54
>> #20 lookup (ndp=<optimized out>) at /usr/src/sys/kern/vfs_lookup.c:830
>> #21 0xffffffff80b1748e in namei (ndp=<optimized out>)
>>     at /usr/src/sys/kern/vfs_lookup.c:397
>> #22 0xffffffff80b30bc4 in kern_renameat (td=<optimized out>, oldfd=10,
>>     old=0x28e72a59 <error: Cannot access memory at address 0x28e72a59>,
>>     newfd=10,
>>     new=0x28e72601 <error: Cannot access memory at address 0x28e72601>,
>>     pathseg=<optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:3515
>> #23 0xffffffff80ff98a0 in syscallenter (td=0xfffff801ebfd8500,
>>     sa=<optimized out>)
>>     at /usr/src/sys/amd64/ia32/../../kern/subr_syscall.c:135
>> #24 ia32_syscall (frame=0xfffffe085bfa8a40)
>>     at /usr/src/sys/amd64/ia32/ia32_syscall.c:187
>> #25 <signal handler called>
>>
>>
>> #13 0xffffffff80b0d55a in cache_negative_remove (ncp=0xfffff802a59cb690,
>>     neg_locked=<optimized out>) at /usr/src/sys/kern/vfs_cache.c:747
>> 747                     TAILQ_REMOVE(&ncneg_hot.nl_list, ncp, nc_dst);
>> (kgdb) list
>> 742             } else {
>> 743                     mtx_assert(&neglist->nl_lock, MA_OWNED);
>> 744                     mtx_assert(&ncneg_hot.nl_lock, MA_OWNED);
>> 745             }
>> 746             if (ncp->nc_flag & NCF_HOTNEGATIVE) {
>> 747                     TAILQ_REMOVE(&ncneg_hot.nl_list, ncp, nc_dst);
>> 748             } else {
>> 749                     TAILQ_REMOVE(&neglist->nl_list, ncp, nc_dst);
>> 750             }
>> 751             if (list_locked)
>>
>> (kgdb) print ncp
>> $1 = (struct namecache *) 0xfffff802a59cb690
>>
>> (kgdb) print *ncp
>> $2 = {nc_hash = {le_next = 0x0, le_prev = 0xfffffe0001814178}, nc_src = {
>>     le_next = 0xfffff80415e22290, le_prev = 0xfffff8048e5597a8}, nc_dst = {
>>     tqe_next = 0xfffff80421744a10, tqe_prev = 0xfffff80421744a30},
>>   nc_dvp = 0xfffff8048e559760, n_un = {nu_vp = 0x0, nu_neghits = 0},
>>   nc_flag = 32 ' ', nc_nlen = 25 '\031',
>>   nc_name = 0xfffff802a59cb6d2 "gtk3-icon-browser.desktop"}
>> (kgdb) print ncp->nc_dst->tqe_prev
>> $8 = (struct namecache **) 0xfffff80421744a30
>> (kgdb) print *(ncp->nc_dst->tqe_prev)
>> $9 = (struct namecache *) 0x0
>>
>>
>> The line number for the failed assertion doesn't make sense.
>> The value of nc_flag is 32 (0x20), but NCF_HOTNEGATIVE is defined as
>> 0x40 and NCF_NEGATIVE is defined as 0x20. In any case, the list seems
>> to be messed up.
>>
>
> Can you check if the problem persists after reverting r307650 and
> r307685?
>
> I got another report of trouble with poudriere, but so far can neither
> reproduce nor diagnose what happens here.
>
> For instance with this crash, in principle there could be a race where
> the entry is demoted and loses the HOTNEGATIVE. But the demotion code
> has the hot list locked and so does this code if it spots the flag.
> What's more, the race is handled by re-checking the flag after the lock
> was taken.
>
> That said, there is something seriously fishy going on here.

This !neg_locked code in cache_negative_remove() looks suspicious to me:

        if (!neg_locked) {
                if (ncp->nc_flag & NCF_HOTNEGATIVE) {
                        hot_locked = true;
                        mtx_lock(&ncneg_hot.nl_lock);
                        if (!(ncp->nc_flag & NCF_HOTNEGATIVE)) {
                                list_locked = true;
                                mtx_lock(&neglist->nl_lock);
                        }
                } else {
                        list_locked = true;
                        mtx_lock(&neglist->nl_lock);
                }

It looks like you handle NCF_HOTNEGATIVE going away while waiting for
the ncneg_hot.nl_lock, but don't handle the possible appearance of
NCF_HOTNEGATIVE while waiting for neglist->nl_lock. What protects
nc_flag, the lock for the list that it resides on? I think the only
safe thing to do here is to always grab both locks (a rough sketch of
what I mean is at the end of this message).

My package builder is now up to 18 1/2 hours of runtime building
packages with the same kernel that previously crashed in a few hours.
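As an aside on the kgdb output quoted above: decoding the printed
nc_flag value against the flag values stated in the message
(NCF_NEGATIVE = 0x20, NCF_HOTNEGATIVE = 0x40) makes the inconsistency
concrete. The little program below is only an illustration; the defines
are restated from the message, not copied from sys/kern/vfs_cache.c.

        #include <stdio.h>

        #define NCF_NEGATIVE    0x20    /* value as stated in the message */
        #define NCF_HOTNEGATIVE 0x40    /* value as stated in the message */

        int
        main(void)
        {
                unsigned char nc_flag = 32;     /* the value kgdb printed */

                printf("NCF_NEGATIVE    %s\n",
                    (nc_flag & NCF_NEGATIVE) ? "set" : "clear");
                printf("NCF_HOTNEGATIVE %s\n",
                    (nc_flag & NCF_HOTNEGATIVE) ? "set" : "clear");
                return (0);
        }

It reports NCF_NEGATIVE set and NCF_HOTNEGATIVE clear, yet the panic
fired at line 747, the TAILQ_REMOVE from ncneg_hot.nl_list, which
according to the listing quoted above is only reached when the test at
line 746 saw NCF_HOTNEGATIVE set. So either the flag changed underneath
the caller or the entry is corrupted, which is presumably what "the line
number for the failed assertion doesn't make sense" refers to.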
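For what it's worth, here is a rough sketch of the "always grab both
locks" suggestion for the !neg_locked path, written only against the
lock, flag, and variable names visible in the quoted snippet and kgdb
listing. It is meant as an illustration of the idea, not the change
that actually went into the tree, and the real function carries more
bookkeeping than is shown here:

        if (!neg_locked) {
                /*
                 * Take the hot-list lock and the per-chain neglist lock
                 * unconditionally, in the same order the existing code
                 * uses when it ends up taking both, so that a concurrent
                 * promotion or demotion cannot flip NCF_HOTNEGATIVE
                 * between the flag test and the TAILQ_REMOVE below.
                 */
                hot_locked = true;
                mtx_lock(&ncneg_hot.nl_lock);
                list_locked = true;
                mtx_lock(&neglist->nl_lock);
        } else {
                mtx_assert(&neglist->nl_lock, MA_OWNED);
                mtx_assert(&ncneg_hot.nl_lock, MA_OWNED);
        }
        if (ncp->nc_flag & NCF_HOTNEGATIVE) {
                TAILQ_REMOVE(&ncneg_hot.nl_list, ncp, nc_dst);
        } else {
                TAILQ_REMOVE(&neglist->nl_list, ncp, nc_dst);
        }

The obvious cost is that removing even a cold negative entry now takes
the global ncneg_hot.nl_lock, presumably the contention the hot/cold
negative list split was meant to avoid, so a real fix may want something
finer grained; but it does close the window in which NCF_HOTNEGATIVE can
appear after only neglist->nl_lock has been taken.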