On Sun, Aug 30, 2009 at 09:34:54AM +1000, Peter Jeremy wrote:
> [Redirected to amd64 because this is an amd64 kernel bug]
>
> On 2009-Aug-25 05:33:44 +1000, Peter Jeremy <peterjeremy_at_optushome.com.au> wrote:
> >I am attempting to build an i386 jail on an amd64 box to build
> >packages for my netbook.  The host is running -current from just over
> >two weeks ago and the jail is -current from early June.  The jail was
> >built by doing a dump|restore of my netbook and then tweaking various
> >config files to give it a new identity.  The jail's devfs is using
> >"devfsrules_jail" from /etc/default/devfs.rules.
> >
> >The jail starts OK but when I attempt to ssh into it, I just get
> >"Connection closed by <jail IP address>".
>
> Turns out this is a bug in the 32-bit select(2) wrapper on 64-bit
> kernels.  The userland fd_set arguments are not wrapped but passed
> directly to kern_select().  Unfortunately, fd_set is (effectively) an
> array of longs which means kern_select() assumes fd_set is a multiple
> of 8-bytes whilst userland assumes it is a multiple of 4 bytes.  As a
> result, the kernel can over-write an extra 4 bytes of user memory.  In
> the case of sshd, this causes part of the RSA host key to be trashed
> when privilege separation mode is enabled.
>
> This bug also affects linux emulation on amd64 and potentially affects
> any other 64-bit kernels with 32-bit emulation modes.  I have raised
> amd64/138318 to cover it.

I do not think that we can go the proposed route, since changing the
type of __fd_mask changes the type of fd_set. The latter would not
affect the kernel ABI, but it definitely changes the ABI of any code
that passes fd_sets.

Also, looking closely at the issue you found, I think that copyin is
just as problematic as copyout, since we can end up reading one more
word than userspace supplied. This does not cause problems in practice
only because most user code keeps fd_sets on the stack.

Could you test whether the patch below fixes the real sshd issue?
At least, it passes your select test from the PR.

diff --git a/sys/compat/freebsd32/freebsd32_misc.c b/sys/compat/freebsd32/freebsd32_misc.c
index 466aab4..71b22aa 100644
--- a/sys/compat/freebsd32/freebsd32_misc.c
+++ b/sys/compat/freebsd32/freebsd32_misc.c
@@ -589,7 +589,8 @@ freebsd32_select(struct thread *td, struct freebsd32_select_args *uap)
 	 * XXX big-endian needs to convert the fd_sets too.
 	 * XXX Do pointers need PTRIN()?
 	 */
-	return (kern_select(td, uap->nd, uap->in, uap->ou, uap->ex, tvp));
+	return (kern_select(td, uap->nd, uap->in, uap->ou, uap->ex, tvp,
+	    sizeof(int32_t) * 8));
 }
 
 /*
diff --git a/sys/compat/linux/linux_misc.c b/sys/compat/linux/linux_misc.c
index 267da07..1d5eaf8 100644
--- a/sys/compat/linux/linux_misc.c
+++ b/sys/compat/linux/linux_misc.c
@@ -522,7 +522,7 @@ linux_select(struct thread *td, struct linux_select_args *args)
 		tvp = NULL;
 
 	error = kern_select(td, args->nfds, args->readfds, args->writefds,
-	    args->exceptfds, tvp);
+	    args->exceptfds, tvp, sizeof(l_int) * 8);
 
 #ifdef DEBUG
 	if (ldebug(select))
diff --git a/sys/kern/sys_generic.c b/sys/kern/sys_generic.c
index bd0f279..6831fe8 100644
--- a/sys/kern/sys_generic.c
+++ b/sys/kern/sys_generic.c
@@ -774,12 +774,13 @@ select(td, uap)
 	} else
 		tvp = NULL;
 
-	return (kern_select(td, uap->nd, uap->in, uap->ou, uap->ex, tvp));
+	return (kern_select(td, uap->nd, uap->in, uap->ou, uap->ex, tvp,
+	    NFDBITS));
 }
 
 int
 kern_select(struct thread *td, int nd, fd_set *fd_in, fd_set *fd_ou,
-    fd_set *fd_ex, struct timeval *tvp)
+    fd_set *fd_ex, struct timeval *tvp, int abi_nfdbits)
 {
 	struct filedesc *fdp;
 	/*
@@ -792,7 +793,7 @@ kern_select(struct thread *td, int nd, fd_set *fd_in, fd_set *fd_ou,
 	fd_mask *ibits[3], *obits[3], *selbits, *sbp;
 	struct timeval atv, rtv, ttv;
 	int error, timo;
-	u_int nbufbytes, ncpbytes, nfdbits;
+	u_int nbufbytes, ncpbytes, ncpubytes, nfdbits;
 
 	if (nd < 0)
 		return (EINVAL);
@@ -806,6 +807,7 @@ kern_select(struct thread *td, int nd, fd_set *fd_in, fd_set *fd_ou,
 	 */
 	nfdbits = roundup(nd, NFDBITS);
 	ncpbytes = nfdbits / NBBY;
+	ncpubytes = roundup(nd, abi_nfdbits) / NBBY;
 	nbufbytes = 0;
 	if (fd_in != NULL)
 		nbufbytes += 2 * ncpbytes;
@@ -832,9 +834,11 @@ kern_select(struct thread *td, int nd, fd_set *fd_in, fd_set *fd_ou,
 			ibits[x] = sbp + nbufbytes / 2 / sizeof *sbp; \
 			obits[x] = sbp; \
 			sbp += ncpbytes / sizeof *sbp; \
-			error = copyin(name, ibits[x], ncpbytes); \
+			error = copyin(name, ibits[x], ncpubytes); \
 			if (error != 0) \
 				goto done; \
+			bzero((char *)ibits[x] + ncpubytes, \
+			    ncpbytes - ncpubytes); \
 		} \
 	} while (0)
 	getbits(fd_in, 0);
@@ -888,7 +892,7 @@ done:
 	if (error == EWOULDBLOCK)
 		error = 0;
 #define	putbits(name, x) \
-	if (name && (error2 = copyout(obits[x], name, ncpbytes))) \
+	if (name && (error2 = copyout(obits[x], name, ncpubytes))) \
 		error = error2;
 	if (error == 0) {
 		int error2;
diff --git a/sys/sys/syscallsubr.h b/sys/sys/syscallsubr.h
index d0f209c..e1c83cc 100644
--- a/sys/sys/syscallsubr.h
+++ b/sys/sys/syscallsubr.h
@@ -170,7 +170,7 @@ int	kern_sched_rr_get_interval(struct thread *td, pid_t pid,
 int	kern_semctl(struct thread *td, int semid, int semnum, int cmd,
 	    union semun *arg, register_t *rval);
 int	kern_select(struct thread *td, int nd, fd_set *fd_in, fd_set *fd_ou,
-	    fd_set *fd_ex, struct timeval *tvp);
+	    fd_set *fd_ex, struct timeval *tvp, int abi_nfdbits);
 int	kern_sendfile(struct thread *td, struct sendfile_args *uap,
 	    struct uio *hdr_uio, struct uio *trl_uio, int compat);
 int	kern_sendit(struct thread *td, int s, struct msghdr *mp, int flags,
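
For illustration only, and not part of the patch: the size mismatch that
ncpubytes is meant to account for can be sketched in plain userland C.
The helper names below (roundup_to, fdset_bytes, overrun_bytes) and the
BITS_PER_BYTE constant are hypothetical stand-ins for the kernel's
roundup() and NBBY. The sketch assumes a 64-bit kernel mask word and a
32-bit userland mask word, as described in the PR.

```c
#include <stdio.h>

/* Hypothetical stand-in for NBBY (number of bits in a byte). */
#define BITS_PER_BYTE 8u

/* Round x up to the next multiple of y; same idea as the kernel's roundup(). */
unsigned
roundup_to(unsigned x, unsigned y)
{
	return (((x + y - 1) / y) * y);
}

/*
 * Bytes occupied by the first nd descriptors of an fd_set whose
 * mask word is mask_bits wide.
 */
unsigned
fdset_bytes(unsigned nd, unsigned mask_bits)
{
	return (roundup_to(nd, mask_bits) / BITS_PER_BYTE);
}

/*
 * Extra bytes touched when the kernel sizes the copy with its own
 * mask width instead of the width used by the calling ABI.
 */
unsigned
overrun_bytes(unsigned nd, unsigned kernel_bits, unsigned abi_bits)
{
	return (fdset_bytes(nd, kernel_bits) - fdset_bytes(nd, abi_bits));
}

/* Print the two sizes and their difference for one value of nd. */
void
print_sizes(unsigned nd)
{
	printf("nd=%u kernel=%u abi=%u overrun=%u\n", nd,
	    fdset_bytes(nd, 64), fdset_bytes(nd, 32),
	    overrun_bytes(nd, 64, 32));
}
```

With nd = 20, the 64-bit rounding gives 8 bytes while the 32-bit rounding
gives 4 bytes, so a copy sized by the kernel's width touches 4 bytes
beyond the caller's fd_set. With nd = 40 both roundings give 8 bytes and
there is no difference, which is why the problem only shows up for some
values of nd.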