On Tue, May 14, 2013 at 06:08:45PM +0200, Roger Pau Monné wrote:
> On 13/05/13 17:00, Konstantin Belousov wrote:
> > On Mon, May 13, 2013 at 04:33:04PM +0200, Roger Pau Monné wrote:
> >> On 13/05/13 13:18, Roger Pau Monné wrote:
>
> Thanks for taking a look,
>
> >> I would like to explain this a little bit more, the syncer process
> >> doesn't get blocked on the _mtx_trylock_flags_ call, it just continues
> >> looping forever in what seems to be an endless loop around
> >> mnt_vnode_next_active/ffs_sync. Also while in this state there is no
> >> noticeable disk activity, so I'm unsure of what is happening.
> > How many CPUs does your VM have ?
>
> 7 vCPUs, but I've also seen this issue with 4 and 16 vCPUs.
>
> >
> > The loop you describing means that other thread owns the vnode
> > interlock. Can you track what this thread does ? E.g. look at the
> > vp->v_interlock.mtx_lock, which is basically a pointer to the struct
> > thread owning the mutex, clear low bits as needed. Then you can
> > inspect the thread and get a backtrace.
>
> There are no other threads running, only syncer is running on CPU 1 (see
> ps in previous email). All other CPUs are idle, and as seen from the ps
> quite a lot of threads are blocked in vnode related operations, either
> "*Name Cac", "*vnode_fr" or "*vnode in". I've also attached the output
> of alllocks in the previous email.
This is not useful. You need to look at the mutex which fails the
trylock operation in the mnt_vnode_next_active(), see who owns it,
and then 'unwind' the locking dependencies from there. I described
the procedure above.

>
> >
> > Does the loop you described stuck on the same vnode during the whole
> > lock-step time, or is the progress made, possibly slowly ?
>
> I'm not sure how to measure "progress", but indeed the syncer process is
> not locked, it is iterating over mnt_vnode_next_active.
Progress means that iteration moves from vnode to vnode, instead of
looping over the same vnode continuously. I did read what you said
about system being un-stuck in some time, but I am asking about change
of the iterator during the stuck time.

>
> >
> > I suppose that your HEAD is recent.
>
> Last commit in my local repository is:
>
> Date: Tue, 7 May 2013 12:39:14 +0000
> Subject: [PATCH] By request, add an arrow from NetBSD-0.8 to FreeBSD-1.0.
>
> While here, add a few more NetBSD versions to the tree itself.
>
> Submitted by: Alan Barrett <apb_at_cequrux.com>
> Submitted by: Thomas Klausner <wiz_at_netbsd.org>
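
For readers following the owner lookup described above (read
vp->v_interlock.mtx_lock, clear the low bits, treat the rest as the
owning thread pointer), here is a minimal sketch of the masking step.
It is an illustration only, not code from the kernel: the mask value
and the names SKETCH_FLAGMASK, sketch_mtx_owner and sketch_selfcheck
are assumptions made for this example, and the real flag mask should
be taken from sys/sys/mutex.h in the tree being debugged.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ASSUMPTION: the low three bits of the lock word carry flag bits.
 * This value is illustrative; check the flag mask defined in
 * sys/sys/mutex.h of the tree being debugged.
 */
#define SKETCH_FLAGMASK ((uintptr_t)0x7)

/*
 * Hypothetical helper: clear the flag bits from a mutex lock word,
 * leaving the value that would be the owning thread's address.
 */
uintptr_t
sketch_mtx_owner(uintptr_t lock_word)
{
	return (lock_word & ~SKETCH_FLAGMASK);
}

/* Usage example with a made-up lock word (not taken from the thread). */
void
sketch_selfcheck(void)
{
	uintptr_t lock_word = (uintptr_t)0x12345002u;

	/* Only the low flag bits are cleared; the upper bits survive. */
	assert(sketch_mtx_owner(lock_word) == (uintptr_t)0x12345000u);
}
```

In a kernel debugger the masked value would then be cast to a
struct thread pointer so that the owner can be inspected and
backtraced, as suggested in the message above.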