On 19.01.2011 13:24, Kostik Belousov wrote:
> On Tue, Jan 18, 2011 at 05:40:14PM +0100, Beat Gätzi wrote:
>> On 18.01.2011 17:13, Kostik Belousov wrote:
>>> On Tue, Jan 18, 2011 at 04:34:10PM +0100, Beat Gätzi wrote:
>>>> On 18.01.2011 15:46, Kostik Belousov wrote:
>>>>> On Tue, Jan 18, 2011 at 03:16:27PM +0100, Beat Gätzi wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've a tinderbox which uses tmpfs to build ports. Every time I build a
>>>>>> port which executes linux ldconfig it results in an unkillable process
>>>>>> which uses 100% CPU. The problem is reproducible without tinderbox:
>>>>>>
>>>>>> # uname -a
>>>>>> FreeBSD daedalus.network.local 9.0-CURRENT FreeBSD 9.0-CURRENT #3
>>>>>> r216761: Tue Dec 28 15:32:26 CET 2010
>>>>>> root@daedalus.network.local:/usr/obj/usr/src/sys/GENERIC i386
>>>>>> # mkdir /compat/test
>>>>>> # mount -t tmpfs tmpfs /compat/test
>>>>>> # cp -Rp /compat/linux/* /compat/test/
>>>>>> # mount -t linprocfs linprocfs /compat/test/proc
>>>>>> # /compat/linux/sbin/ldconfig -r /compat/test/
>>>>>> # pgrep ldconfig
>>>>>> 3449
>>>>>> # procstat -i 3449 | grep KILL
>>>>>> 3449 ldconfig KILL ---
>>>>>> # kill -9 3449
>>>>>> # procstat -i 3449 | grep KILL
>>>>>> 3449 ldconfig KILL P--
>>>>>>
>>>>>> From top(1):
>>>>>> PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
>>>>>> 3449 root 1 44 0 992K 712K CPU1 1 10:06 100.00% ldconfig
>>>>>>
>>>>>> When I reboot the machine it hangs after "All buffers synced.".
>>>>>>
>>>>>> I've uploaded some additional output of procstat and ktrace here:
>>>>>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs.txt
>>>>>>
>>>>>> Does anyone know how to fix this?
>>>>> kdump for the trace of the linux binary is garbage. You need to
>>>>> use linux_kdump (from ports).
>>>>>
>>>>> I think that your process is looping in the kernel, you can confirm this
>>>>> by dropping in the ddb and doing "bt <pid>".
>>>>
>>>> I've uploaded a screenshot from the output of bt <pid> in ddb:
>>>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs-bt.jpg
>>>
>>> Please try this.
>>>
>>> diff --git a/sys/compat/linux/linux_file.c b/sys/compat/linux/linux_file.c
>>> index 9ff1cf0..44ad193 100644
>>> --- a/sys/compat/linux/linux_file.c
>>> +++ b/sys/compat/linux/linux_file.c
>>> @@ -369,7 +369,6 @@ getdents_common(struct thread *td, struct linux_getdents64_args *args,
>>>  	lbuf = malloc(LINUX_MAXRECLEN, M_TEMP, M_WAITOK | M_ZERO);
>>>  	vn_lock(vp, LK_SHARED | LK_RETRY);
>>>
>>> -again:
>>>  	aiov.iov_base = buf;
>>>  	aiov.iov_len = buflen;
>>>  	auio.uio_iov = &aiov;
>>> @@ -506,8 +505,10 @@ again:
>>>  			break;
>>>  	}
>>>
>>> -	if (outp == (caddr_t)args->dirent)
>>> -		goto again;
>>> +	if (outp == (caddr_t)args->dirent) {
>>> +		nbytes = resid;
>>> +		goto eof;
>>> +	}
>>>
>>>  	fp->f_offset = off;
>>>  	if (justone)
>>> diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
>>> index 84a2038..62dd0bf 100644
>>> --- a/sys/fs/tmpfs/tmpfs_subr.c
>>> +++ b/sys/fs/tmpfs/tmpfs_subr.c
>>> @@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio *uio, off_t *cntp)
>>>  		/* Copy the new dirent structure into the output buffer and
>>>  		 * advance pointers. */
>>>  		error = uiomove(&d, d.d_reclen, uio);
>>> -
>>> -		(*cntp)++;
>>> -		de = TAILQ_NEXT(de, td_entries);
>>> +		if (error == 0) {
>>> +			(*cntp)++;
>>> +			de = TAILQ_NEXT(de, td_entries);
>>> +		}
>>>  	} while (error == 0 && uio->uio_resid > 0 && de != NULL);
>>>
>>>  	/* Update the offset and cache. */
>>
>> This patch solves the problem.
>>
> Thank you, but apparently this is not the end of the story.
>
> I committed the linuxolator part of the change, but I think that the tmpfs
> change is still incomplete. Strictly following getdirentries(2), tmpfs
> must return EINVAL in the case when no single record can be returned.
> Currently, it indicates EOF instead.
> I think this could be a complete
> solution, but it might break e.g. Linux ldconfig(8) since it exposed
> the linuxolator situation.
>
> Can you apply the patch below over the latest HEAD with r217578 included
> and retest? Thanks.
>
> diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
> index 84a2038..62dd0bf 100644
> --- a/sys/fs/tmpfs/tmpfs_subr.c
> +++ b/sys/fs/tmpfs/tmpfs_subr.c
> @@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio *uio, off_t *cntp)
>  		/* Copy the new dirent structure into the output buffer and
>  		 * advance pointers. */
>  		error = uiomove(&d, d.d_reclen, uio);
> -
> -		(*cntp)++;
> -		de = TAILQ_NEXT(de, td_entries);
> +		if (error == 0) {
> +			(*cntp)++;
> +			de = TAILQ_NEXT(de, td_entries);
> +		}
>  	} while (error == 0 && uio->uio_resid > 0 && de != NULL);
>
>  	/* Update the offset and cache. */
> diff --git a/sys/fs/tmpfs/tmpfs_vnops.c b/sys/fs/tmpfs/tmpfs_vnops.c
> index 059a790..a57c1f2 100644
> --- a/sys/fs/tmpfs/tmpfs_vnops.c
> +++ b/sys/fs/tmpfs/tmpfs_vnops.c
> @@ -1349,7 +1349,7 @@ outok:
>  	MPASS(error >= -1);
>
>  	if (error == -1)
> -		error = 0;
> +		error = (cnt != 0) ? 0 : EINVAL;
>
>  	if (eofflag != NULL)
>  		*eofflag =

I've applied the new patch on top of r217615 and was not able to reproduce
the problem.

Thanks again,
Beat

Received on Wed Jan 19 2011 - 23:35:11 UTC
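
For readers following the thread, the rule discussed above (advance the cursor only after a successful copy, and report EINVAL rather than EOF when entries remain but not a single record fits) can be illustrated with a small userland sketch. This is not the kernel code from the patches: the names toy_dirent, toy_dir and toy_readdir are hypothetical, and the fixed-size record is an assumption made to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size record standing in for struct dirent. */
struct toy_dirent {
	char name[16];
};

/* Hypothetical in-memory directory with a read cursor. */
struct toy_dir {
	const struct toy_dirent *entries;
	size_t count;	/* total number of entries */
	size_t pos;	/* index of the next entry to return */
};

/*
 * Copy as many whole records as fit into buf.
 * Returns 0 on success (including a genuine end of directory) and
 * EINVAL when entries remain but the buffer cannot hold even one.
 * *nread receives the number of bytes written to buf.
 */
int
toy_readdir(struct toy_dir *dir, void *buf, size_t buflen, size_t *nread)
{
	size_t cnt = 0;
	char *out = buf;

	assert(dir != NULL && buf != NULL && nread != NULL);

	while (dir->pos < dir->count && buflen >= sizeof(struct toy_dirent)) {
		memcpy(out, &dir->entries[dir->pos], sizeof(struct toy_dirent));
		out += sizeof(struct toy_dirent);
		buflen -= sizeof(struct toy_dirent);
		dir->pos++;	/* advance only after the copy succeeded */
		cnt++;
	}
	*nread = cnt * sizeof(struct toy_dirent);

	/* Entries remain but nothing fit: an error, not a fake EOF. */
	if (cnt == 0 && dir->pos < dir->count)
		return (EINVAL);
	return (0);
}
```

A caller that sees 0 with *nread == 0 can treat it as a real end of directory, while EINVAL tells it to retry with a larger buffer instead of re-reading in a loop with the same one.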