On Wed, May 05, 2010 at 12:54:07PM -1000, Jeff Roberson wrote:
> On Mon, 3 May 2010, Fabien Thomas wrote:
>>>> I'm with r207548 now and since some days i've system deadlock.
>>>> It seems related to SUJ with process waiting on suspfs or ppwait.
>>>
>>> I've also seen it stalled in suspfs, but this information is way better
>>> than what I was able to garner. I was only able to tell via ctrl-t on
>>> a stalled 'ls' process in a terminal before hard booting.
[..]
> Can anyone who has experienced this hang test this patch:
>
> Thanks,
> Jeff
> Index: ffs_softdep.c
> ===================================================================
> --- ffs_softdep.c	(revision 207480)
> +++ ffs_softdep.c	(working copy)
> @@ -9301,7 +9301,7 @@
>  			hadchanges = 1;
>  		}
>  		/* Leave this inodeblock dirty until it's in the list. */
> -		if ((inodedep->id_state & (UNLINKED | DEPCOMPLETE)) == UNLINKED)
> +		if ((inodedep->id_state & (UNLINKED | UNLINKONLIST)) == UNLINKED)

Hi Jeff,
I didn't seem to experience this problem back in May, but I'm now
experiencing it on a regular basis. I seem to trigger it about every
2nd or 3rd day during the daily run. I wind up with cvsup or svnsync
stalled and any 'ls' of my sources partition waiting on suspfs.
(Note: I am also running diskcheckd from ports.)

My kernel sources are at:
    Last Changed Author: davidxu
    Last Changed Rev: 211534
    Last Changed Date: 2010-08-20 16:51:34 -0700 (Fri, 20 Aug 2010)

I have also experienced it back to at least:
    Last Changed Author: yongari
    Last Changed Rev: 210152
    Last Changed Date: 2010-07-15 16:34:58 -0700 (Thu, 15 Jul 2010)

Weird thing is - I can still access this partition across NFS without
problems.

    dragon$ cd /src/fbsd
    Filesystem     Size    Used   Avail Capacity  Mounted on
    /dev/da31s1f   271G    119G    130G    48%    /src
    dragon$ ls
    load: 0.12  cmd: ls 77901 [suspfs] 2.26r 0.00u 0.00s 0% 1212k

    quynh$ cd /src/fbsd
    quynh$ df .
    Filesystem     Size    Used   Avail Capacity  Mounted on
    dragon:/src    271G    119G    130G    48%    /src
    quynh$ ls
    .svn/          lib/
    COPYRIGHT      libexec/
    ..snip..

Processes also have a tendency to complete quite slowly at times,
waiting in vlruwk.

When I reboot, usually / and /src (but not the 3 other partitions) give
a "Bad cg number {negative number}" error from fsck, so a full fsck is
run. This results in what seems like tens of thousands of iterations of:

    UNREF FILE I=[..snip..]
    RECONNECT? yes
    SORRY no space in lost+found directory
    unexpected soft update inconsistency
    CLEAR? yes

Thoughts?

-- 
-- David  (obrien_at_FreeBSD.org)

Received on Fri Sep 03 2010 - 21:41:22 UTC