Re: FreeBSD-HEAD gets stuck on vnode operations

From: Roger Pau Monné <roger.pau@citrix.com>
Date: Sun, 26 May 2013 21:28:05 +0200
On 25/05/13 19:52, Roger Pau Monné wrote:
> On 20/05/13 20:34, John Baldwin wrote:
>> On Tuesday, May 14, 2013 1:15:47 pm Roger Pau Monné wrote:
>>> On 14/05/13 18:31, Konstantin Belousov wrote:
>>>> On Tue, May 14, 2013 at 06:08:45PM +0200, Roger Pau Monné wrote:
>>>>> On 13/05/13 17:00, Konstantin Belousov wrote:
>>>>>> On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
>>>>>>> On 13/05/13 13:18, Roger Pau Monné wrote:
>>>>>
>>>>> Thanks for taking a look,
>>>>>
>>>>>>> To explain this a little bit more: the syncer process doesn't get
>>>>>>> blocked on the _mtx_trylock_flags_ call, it just keeps spinning in
>>>>>>> what seems to be an endless loop around
>>>>>>> mnt_vnode_next_active/ffs_sync. Also, while in this state there is
>>>>>>> no noticeable disk activity, so I'm unsure what is happening.
>>>>>> How many CPUs does your VM have?
>>>>>
>>>>> 7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.
>>>>>
>>>>>>
>>>>>> The loop you are describing means that another thread owns the vnode
>>>>>> interlock. Can you track what this thread does? E.g. look at
>>>>>> vp->v_interlock.mtx_lock, which is basically a pointer to the struct
>>>>>> thread owning the mutex, clearing the low bits as needed. Then you can
>>>>>> inspect the thread and get a backtrace.
>>>>>
>>>>> There are no other threads running; only the syncer is running, on CPU 1
>>>>> (see the ps output in the previous email). All other CPUs are idle, and
>>>>> as seen from the ps output quite a lot of threads are blocked in
>>>>> vnode-related operations, either "*Name Cac", "*vnode_fr" or "*vnode in".
>>>>> I've also attached the output of alllocks in the previous email.
>>>> This is not useful.  You need to look at the mutex which fails the
>>>> trylock operation in the mnt_vnode_next_active(), see who owns it,
>>>> and then 'unwind' the locking dependencies from there.
>>>
>>> Sorry, now I get it; let's see if I can find the locked vnodes and the
>>> thread that owns them...
>>
>> You can use 'show lock <address of vp->v_interlock>' to find an owning
>> thread and then use 'show sleepchain <thread>'.  If you are using kgdb on
>> the live system (probably easier) then you can grab my scripts at
>> www.freebsd.org/~jhb/gdb/ (do 'cd /path/to/scripts; source gdb6').  You can
>> then find the offending thread and do 'mtx_owner &vp->v_interlock' and then
>> 'sleepchain <tid>'.
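
For reference, the owner lookup described above boils down to masking the
flag bits out of the mutex lock word. A minimal sketch, assuming the
standard FreeBSD struct mtx layout (the helper name is illustrative, not
part of the kernel):

    #include <sys/param.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>
    #include <sys/proc.h>
    #include <sys/vnode.h>

    /* Illustrative helper: return the thread owning vp's interlock. */
    static struct thread *
    v_interlock_owner(struct vnode *vp)
    {
            uintptr_t x = vp->v_interlock.mtx_lock;

            if (x == MTX_UNOWNED)
                    return (NULL);          /* nobody holds it */
            return ((struct thread *)(x & ~MTX_FLAGMASK));
    }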

Hello,

I've been looking into this issue a little bit more, and the lock
dependencies look right to me. The lockup happens when the thread owning
the v_interlock mutex tries to acquire the vnode_free_list_mtx mutex,
which is already owned by the syncer thread. At that point the thread
owning the v_interlock mutex goes to sleep, and the syncer process starts
repeating the following sequence:

VI_TRYLOCK -> mtx_unlock vnode_free_list_mtx -> kern_yield -> mtx_lock
vnode_free_list_mtx ...
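
In other words, the syncer ends up doing something like this (a
simplified sketch of the retry path around mnt_vnode_next_active, not
verbatim sys/kern/vfs_subr.c):

    /* Simplified sketch, not verbatim kernel source. */
    while (!VI_TRYLOCK(vp)) {
            /*
             * Drop the list lock and yield so the interlock owner can
             * run, then retake the list lock and retry.  If the owner
             * is asleep waiting for vnode_free_list_mtx and nothing
             * else preempts us, this never makes progress.
             */
            mtx_unlock(&vnode_free_list_mtx);
            kern_yield(PRI_UNCHANGED);  /* priority argument illustrative */
            mtx_lock(&vnode_free_list_mtx);
    }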

It seems like kern_yield, which I assume is placed there to give the
thread owning v_interlock a chance to also lock vnode_free_list_mtx,
doesn't open a window big enough for the waiting thread to wake up and
take the vnode_free_list_mtx mutex. Since the syncer is the only runnable
process on that CPU there is no context switch, and the syncer just keeps
running.

Relying on kern_yield to provide a window big enough for any other thread
waiting on vnode_free_list_mtx to run doesn't seem like a good idea on
SMP systems. I haven't tested this on bare metal, but waking up an idle
CPU in a virtualized environment might well be more expensive than doing
so on bare metal.

Bear in mind that I'm not familiar with either the scheduler or the UFS
code. My naive proposed fix is to replace the kern_yield call with a
pause; that gives any other thread waiting on vnode_free_list_mtx a
chance to lock the vnode_free_list_mtx mutex, finish whatever it is doing
and release the v_interlock mutex, so the syncer thread can also finish
its work. I've tested the patch for a couple of hours and it seems fine;
I haven't been able to reproduce the issue anymore.
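
Roughly, the change looks like this (a sketch only, not the exact
attached patch; the wmesg string and the one-tick timeout are
illustrative):

    mtx_unlock(&vnode_free_list_mtx);
    /*
     * Sleep for a tick instead of yielding, so the thread holding the
     * v_interlock can take vnode_free_list_mtx and make progress
     * before we retry.
     */
    pause("vnwait", 1);         /* was: kern_yield(...) */
    mtx_lock(&vnode_free_list_mtx);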


Received on Sun May 26 2013 - 17:28:19 UTC
