Re: Running linux ldconfig on tmpfs results in unkillable process

From: Kostik Belousov <kostikbel_at_gmail.com>
Date: Wed, 19 Jan 2011 14:24:18 +0200
On Tue, Jan 18, 2011 at 05:40:14PM +0100, Beat Gätzi wrote:
> On 18.01.2011 17:13, Kostik Belousov wrote:
> > On Tue, Jan 18, 2011 at 04:34:10PM +0100, Beat Gätzi wrote:
> >> On 18.01.2011 15:46, Kostik Belousov wrote:
> >>> On Tue, Jan 18, 2011 at 03:16:27PM +0100, Beat Gätzi wrote:
> >>>> Hi,
> >>>>
> >>>> I have a tinderbox which uses tmpfs to build ports. Every time I build a
> >>>> port which executes Linux ldconfig, it results in an unkillable process
> >>>> which uses 100% CPU. The problem is reproducible without tinderbox:
> >>>>
> >>>> # uname -a
> >>>> FreeBSD daedalus.network.local 9.0-CURRENT FreeBSD 9.0-CURRENT #3
> >>>> r216761: Tue Dec 28 15:32:26 CET 2010
> >>>> root@daedalus.network.local:/usr/obj/usr/src/sys/GENERIC  i386
> >>>> # mkdir /compat/test
> >>>> # mount -t tmpfs tmpfs /compat/test
> >>>> # cp -Rp /compat/linux/* /compat/test/
> >>>> # mount -t linprocfs linprocfs /compat/test/proc
> >>>> # /compat/linux/sbin/ldconfig -r /compat/test/
> >>>> # pgrep ldconfig
> >>>> 3449
> >>>> # procstat -i 3449 | grep KILL
> >>>>  3449 ldconfig         KILL     ---
> >>>> # kill -9 3449
> >>>> # procstat -i 3449 | grep KILL
> >>>>  3449 ldconfig         KILL     P--
> >>>>
> >>>> From top(1):
> >>>> PID USERNAME THR PRI NICE  SIZE   RES STATE    C  TIME   WCPU COMMAND
> >>>> 3449 root     1  44    0   992K   712K CPU1    1  10:06 100.00% ldconfig
> >>>>
> >>>> When I reboot the machine it hangs after "All buffers synced.".
> >>>>
> >>>> I've uploaded some additional output of procstat and ktrace here:
> >>>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs.txt
> >>>>
> >>>> Does anyone know how to fix this?
> >>> kdump output for a trace of a Linux binary is garbage. You need to
> >>> use linux_kdump (from ports).
> >>>
> >>> I think your process is looping in the kernel; you can confirm this
> >>> by dropping into ddb and running "bt <pid>".
> >>
> >> I've uploaded a screenshot of the output of bt <pid> in ddb:
> >> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs-bt.jpg
> > 
> > Please try this.
> > 
> > diff --git a/sys/compat/linux/linux_file.c b/sys/compat/linux/linux_file.c
> > index 9ff1cf0..44ad193 100644
> > --- a/sys/compat/linux/linux_file.c
> > +++ b/sys/compat/linux/linux_file.c
> > @@ -369,7 +369,6 @@ getdents_common(struct thread *td, struct linux_getdents64_args *args,
> >  	lbuf = malloc(LINUX_MAXRECLEN, M_TEMP, M_WAITOK | M_ZERO);
> >  	vn_lock(vp, LK_SHARED | LK_RETRY);
> >  
> > -again:
> >  	aiov.iov_base = buf;
> >  	aiov.iov_len = buflen;
> >  	auio.uio_iov = &aiov;
> > @@ -506,8 +505,10 @@ again:
> >  			break;
> >  	}
> >  
> > -	if (outp == (caddr_t)args->dirent)
> > -		goto again;
> > +	if (outp == (caddr_t)args->dirent) {
> > +		nbytes = resid;
> > +		goto eof;
> > +	}
> >  
> >  	fp->f_offset = off;
> >  	if (justone)
> > diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
> > index 84a2038..62dd0bf 100644
> > --- a/sys/fs/tmpfs/tmpfs_subr.c
> > +++ b/sys/fs/tmpfs/tmpfs_subr.c
> > @@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio *uio, off_t *cntp)
> >  		/* Copy the new dirent structure into the output buffer and
> >  		 * advance pointers. */
> >  		error = uiomove(&d, d.d_reclen, uio);
> > -
> > -		(*cntp)++;
> > -		de = TAILQ_NEXT(de, td_entries);
> > +		if (error == 0) {
> > +			(*cntp)++;
> > +			de = TAILQ_NEXT(de, td_entries);
> > +		}
> >  	} while (error == 0 && uio->uio_resid > 0 && de != NULL);
> >  
> >  	/* Update the offset and cache. */
> 
> This patch solves the problem.
> 
Thank you, but apparently this is not the end of the story.

I committed the linuxolator part of the change, but I think the tmpfs
change is still incomplete. Strictly following getdirentries(2), tmpfs
must return EINVAL when the buffer is too small to return even a single
record. Currently, it indicates EOF instead. I think returning EINVAL
could be the complete solution, but it might break e.g. Linux
ldconfig(8), since that is what exposed the linuxolator problem.
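
To make the intended semantics concrete, here is a minimal userland sketch,
not kernel code. The names struct rec, model_getdents and model_count are
hypothetical and exist only for illustration. The sketch assumes a directory
read that either returns whole records, reports EOF with a zero byte count,
or fails with EINVAL when entries remain but the buffer cannot hold one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record standing in for struct dirent. */
struct rec {
	size_t	reclen;		/* bytes this record takes in the buffer */
	char	name[16];
};

/*
 * Hypothetical model of a directory read. Copies whole records starting
 * at dir[*pos] into buf and advances *pos only for records actually copied.
 * Returns 0 with *nread == 0 at EOF, or EINVAL if entries remain but not
 * even one of them fits into buf.
 */
static int
model_getdents(const struct rec *dir, size_t n, size_t *pos,
    char *buf, size_t buflen, size_t *nread)
{
	size_t used = 0;

	assert(pos != NULL && buf != NULL && nread != NULL);
	*nread = 0;
	while (*pos < n && dir[*pos].reclen <= buflen - used) {
		assert(dir[*pos].reclen <= sizeof(struct rec));
		memcpy(buf + used, &dir[*pos], dir[*pos].reclen);
		used += dir[*pos].reclen;
		(*pos)++;
	}
	if (used == 0 && *pos < n)
		return (EINVAL);	/* too small for a single record */
	*nread = used;
	return (0);
}

/*
 * Hypothetical caller: reads until EOF and returns the number of entries
 * consumed, or -1 on error. It stops on error instead of retrying, so a
 * too-small buffer cannot make it spin.
 */
static int
model_count(const struct rec *dir, size_t n, size_t buflen)
{
	char buf[256];
	size_t pos = 0, nread;

	if (buflen > sizeof(buf))
		return (-1);
	for (;;) {
		if (model_getdents(dir, n, &pos, buf, buflen, &nread) != 0)
			return (-1);
		if (nread == 0)
			return ((int)pos);
	}
}
```

If model_getdents returned 0 bytes with no error in the too-small case, a
caller that retries on an empty result would loop without making progress,
which is the kind of behaviour the report describes.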

Can you apply the patch below on top of the latest HEAD, with r217578
included, and retest? Thanks.

diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
index 84a2038..62dd0bf 100644
--- a/sys/fs/tmpfs/tmpfs_subr.c
+++ b/sys/fs/tmpfs/tmpfs_subr.c
@@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio *uio, off_t *cntp)
 		/* Copy the new dirent structure into the output buffer and
 		 * advance pointers. */
 		error = uiomove(&d, d.d_reclen, uio);
-
-		(*cntp)++;
-		de = TAILQ_NEXT(de, td_entries);
+		if (error == 0) {
+			(*cntp)++;
+			de = TAILQ_NEXT(de, td_entries);
+		}
 	} while (error == 0 && uio->uio_resid > 0 && de != NULL);
 
 	/* Update the offset and cache. */
diff --git a/sys/fs/tmpfs/tmpfs_vnops.c b/sys/fs/tmpfs/tmpfs_vnops.c
index 059a790..a57c1f2 100644
--- a/sys/fs/tmpfs/tmpfs_vnops.c
+++ b/sys/fs/tmpfs/tmpfs_vnops.c
@@ -1349,7 +1349,7 @@ outok:
 	MPASS(error >= -1);
 
 	if (error == -1)
-		error = 0;
+		error = (cnt != 0) ? 0 : EINVAL;
 
 	if (eofflag != NULL)
 		*eofflag =

Received on Wed Jan 19 2011 - 11:24:26 UTC
