Re: FreeBSD-HEAD gets stuck on vnode operations

From: John Baldwin <jhb_at_freebsd.org>
Date: Mon, 20 May 2013 14:34:55 -0400
On Tuesday, May 14, 2013 1:15:47 pm Roger Pau Monné wrote:
> On 14/05/13 18:31, Konstantin Belousov wrote:
> > On Tue, May 14, 2013 at 06:08:45PM +0200, Roger Pau Monné wrote:
> >> On 13/05/13 17:00, Konstantin Belousov wrote:
> >>> On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
> >>>> On 13/05/13 13:18, Roger Pau Monné wrote:
> >>
> >> Thanks for taking a look,
> >>
> >>>> I would like to explain this a little bit more, the syncer process
> >>>> doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
> >>>> looping forever in what seems to be an endless loop around
> >>>> mnt_vnode_next_active/ffs_sync. Also while in this state there is no
> >>>> noticeable disk activity, so I'm unsure of what is happening.
> >>> How many CPUs does your VM have ?
> >>
> >> 7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.
> >>
> >>>
> >>> The loop you are describing means that another thread owns the vnode
> >>> interlock. Can you track what this thread does ? E.g. look at the
> >>> vp->v_interlock.mtx_lock, which is basically a pointer to the struct
> >>> thread owning the mutex, clear low bits as needed. Then you can
> >>> inspect the thread and get a backtrace.
> >>
> >> There are no other threads running, only syncer is running on CPU 1 (see
> >> ps in previous email). All other CPUs are idle, and as seen from the ps
> >> quite a lot of threads are blocked in vnode related operations, either
> >> "*Name Cac", "*vnode_fr" or "*vnode in". I've also attached the output
> >> of alllocks in the previous email.
> > This is not useful.  You need to look at the mutex which fails the
> > trylock operation in the mnt_vnode_next_active(), see who owns it,
> > and then 'unwind' the locking dependencies from there.
> 
> Sorry, now I get it, let's see if I can find the locked vnodes and the
> thread that owns them...

You can use 'show lock <address of vp->v_interlock>' in DDB to find the
owning thread and then use 'show sleepchain <thread>'.  If you are using
kgdb on the live system (probably easier), you can grab my scripts at
www.freebsd.org/~jhb/gdb/ (do 'cd /path/to/scripts; source gdb6').  You can
then find the offending thread, do 'mtx_owner &vp->v_interlock', and then
'sleepchain <tid>'.
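
For reference, the manual lookup Konstantin described above (read
vp->v_interlock.mtx_lock and clear the low bits to get the owning struct
thread pointer) can be sketched in plain C.  This is only an illustration,
not kernel code: the SK_* names and sk_* helpers are hypothetical, and the
flag values are assumed to match the MTX_RECURSED, MTX_CONTESTED and
MTX_UNOWNED bits in sys/mutex.h, so check them against your own tree.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed low flag bits of the mutex lock word, modeled on sys/mutex.h.
 * The SK_ prefix marks these as illustrative, not the kernel's own names.
 */
#define SK_MTX_RECURSED   ((uintptr_t)0x1)
#define SK_MTX_CONTESTED  ((uintptr_t)0x2)
#define SK_MTX_UNOWNED    ((uintptr_t)0x4)
#define SK_MTX_FLAGMASK   (SK_MTX_RECURSED | SK_MTX_CONTESTED | SK_MTX_UNOWNED)

/* Strip the flag bits; what remains is the owner's thread address (0 if none). */
static uintptr_t
sk_mtx_owner(uintptr_t lock_word)
{
	return (lock_word & ~SK_MTX_FLAGMASK);
}

/* Nonzero if the contested bit is set, i.e. other threads are waiting. */
static int
sk_mtx_contested(uintptr_t lock_word)
{
	return ((lock_word & SK_MTX_CONTESTED) != 0);
}

/* Small self-check using a made-up thread address. */
static void
sk_selfcheck(void)
{
	uintptr_t td = (uintptr_t)0x1000;

	assert(sk_mtx_owner(td | SK_MTX_CONTESTED) == td);
	assert(sk_mtx_contested(td | SK_MTX_CONTESTED));
	assert(sk_mtx_owner(SK_MTX_UNOWNED) == 0);
}
```

In practice you would take the value printed for vp->v_interlock.mtx_lock,
mask it the same way, and treat the result as a struct thread pointer to
inspect.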

-- 
John Baldwin
Received on Mon May 20 2013 - 18:42:18 UTC
