2011/5/4 Kostik Belousov <kostikbel_at_gmail.com>:
> On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
>> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper <yanegomi_at_gmail.com> wrote:
>> > On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick <mckusick_at_mckusick.com> wrote:
>> >>> Date: Tue, 3 May 2011 22:40:26 -0700
>> >>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>> >>> partition when filesystem full
>> >>> From: Garrett Cooper <yanegomi_at_gmail.com>
>> >>> To: Jeff Roberson <jeff_at_freebsd.org>,
>> >>> Marshall Kirk McKusick <mckusick_at_mckusick.com>
>> >>> Cc: FreeBSD Current <freebsd-current_at_freebsd.org>
>> >>>
>> >>> Hi Jeff and Dr. McKusick,
>> >>> Ran into this panic when /usr ran out of space doing a make
>> >>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
>> >>> after the filesystem ran out of space -- wasn't quite sure what it was
>> >>> doing at the time):
>> >>>
>> >>> ...
>> >>>
>> >>> Let me know what other commands you would like for me to run in kgdb.
>> >>> Thanks,
>> >>> -Garrett
>> >>
>> >> You did not indicate whether you are running an 8.X system or a 9-current
>> >> system. It would be helpful to know that.
>> >
>> > I've actually been running CURRENT for a few years now, but you're right --
>> > I didn't mention that part.
>> >
>> >> Jeff thinks that there may be a potential race in the locking code for
>> >> softdep_request_cleanup. If so, this patch for 9-current should fix it:
>> >>
>> >> Index: ffs_softdep.c
>> >> ===================================================================
>> >> --- ffs_softdep.c (revision 221385)
>> >> +++ ffs_softdep.c (working copy)
>> >> @@ -11380,7 +11380,8 @@
>> >>                         continue;
>> >>                 }
>> >>                 MNT_IUNLOCK(mp);
>> >> -               if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
>> >> +               if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
>> >> +                   curthread)) {
>> >>                         MNT_ILOCK(mp);
>> >>                         continue;
>> >>                 }
>> >>
>> >> If you are running an 8.X system, hopefully you will be able to apply it.
>> >
>> > I've applied it, rebuilt and installed the kernel, and trying to
>> > repro the case again. Will let you know how things go!
>>
>> Happened again with the change. It's really easy to repro:
>>
>> 1. Get a filesystem with UFS+SU
>> 2. Execute something that does a large number of small writes to a partition.
>> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
>>
>> The kernel will panic with the issue I discussed above.
>> Thanks!
>
> Jeff's change is required to avoid LORs, but it is not sufficient to
> prevent recursion. We must skip the vnode supplied as a parameter to
> softdep_request_cleanup(). Theoretically, other vnodes might be also
> locked by curthread, thus I think the change below is needed. Try this.
>
> diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
> index a6d4441..25fa5d6 100644
> --- a/sys/ufs/ffs/ffs_softdep.c
> +++ b/sys/ufs/ffs/ffs_softdep.c
> @@ -11380,7 +11380,9 @@ retry:
>                         continue;
>                 }
>                 MNT_IUNLOCK(mp);
> -               if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
> +               if (VOP_ISLOCKED(lvp) ||
> +                   vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
> +                   curthread)) {
>                         MNT_ILOCK(mp);
>                         continue;
>                 }

Ok. I'll let the make universe I have going run to completion, and
once I get back home later on, I'll take a look at repro'ing this
again with the above patch applied.
Thanks!
-Garrett

Received on Wed May 04 2011 - 14:41:58 UTC
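
Editor's sketch, not part of the thread: the second patch skips any vnode that is already locked rather than trying to take its non-recursive lock again. The userland toy below only illustrates that skip-if-locked pattern under simplifying assumptions. The names `toy_vnode`, `toy_trylock`, `toy_unlock` and `toy_cleanup` are hypothetical and are not kernel APIs; the model is single-threaded, so "locked" stands in for "held by curthread", and `toy_trylock` merely mimics the spirit of a non-blocking (`LK_NOWAIT`-style) acquire.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode guarded by a non-recursive lock. */
struct toy_vnode {
    bool locked;   /* true while the lock is held (by the caller, in this toy) */
    bool cleaned;  /* set once the cleanup pass has processed this entry */
};

/*
 * Non-blocking acquire: fail immediately if the lock is held,
 * instead of sleeping or attempting a recursive acquisition.
 */
bool toy_trylock(struct toy_vnode *vp)
{
    if (vp->locked)
        return false;
    vp->locked = true;
    return true;
}

void toy_unlock(struct toy_vnode *vp)
{
    vp->locked = false;
}

/*
 * Walk every entry and process only those whose lock can be taken
 * right away; entries that are already locked are skipped.
 * Returns the number of entries processed.
 */
size_t toy_cleanup(struct toy_vnode *vns, size_t n)
{
    size_t cleaned = 0;

    for (size_t i = 0; i < n; i++) {
        if (!toy_trylock(&vns[i]))
            continue;          /* already held: skip, do not recurse */
        vns[i].cleaned = true;
        cleaned++;
        toy_unlock(&vns[i]);
    }
    return cleaned;
}
```

With three entries of which the middle one is already locked, `toy_cleanup` processes the other two and leaves the locked one untouched, which is the behavior the patch aims for with the vnode passed to softdep_request_cleanup().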