On 8/8/05, Don Lewis <truckman_at_freebsd.org> wrote: > > On 7 Aug, Antoine Pelisse wrote: > > http://people.freebsd.org/~pho/stress/log/cons149.html > > http://people.freebsd.org/~pho/stress/log/cons130.html > > > > I've been working on this panic today (the two are obviously > > the same) and here is a patch to fix it: > > --- sys/kern/kern_proc.c.orig Mon Apr 18 04:10:36 2005 > > +++ sys/kern/kern_proc.c Sun Aug 7 21:18:03 2005 > > _at__at_ -884,10 +884,8 _at__at_ > > _PHOLD(p); > > FOREACH_THREAD_IN_PROC(p, td) { > > fill_kinfo_thread(td, &kinfo_proc); > > - PROC_UNLOCK(p); > > error = SYSCTL_OUT(req, (caddr_t)&kinfo_proc, > > sizeof(kinfo_proc)); > > - PROC_LOCK(p); > > if (error) > > break; > > } > > > > As a matter of fact, if td is removed from the list through > thread_unlink > > while > > the mutex is released and the next thread is removed just after, the > FOREACH > > > > is looping through an unlinked list where the td_ksegrp has been set to > NULL > > > > by thread_exit. > > If we absolutely have to release the lock, then it's probably safer to > check > > if > > td_ksegroup != NULL in the fill_kinfo_thread function. > > Calling SYSCTL_OUT(), which calls copyout(), is not allowed while > holding a mutex unless the userland buffer is wired. The reason is that > if any of the userland buffer is not resident in RAM, the thread can > block while the buffer is paged in, and this is not legal while holding > a mutex. > > There are two ways to fix this: > > Allocate a temporary buffer of the appropriate size, grab the > mutex, traverse the list and copy the data into the temporary > buffer, release the mutex, call SYSCTL_OUT() to copy the > contents of the temporary buffer to the user buffer, and free > the temporary buffer. > > Wire the appropriate amount of the userland buffer using > sysctl_wire_old_buffer(), grab the mutex, traverse the list, > copying each item to userland with SYSCTL_OUT(), release the > mutex, and unwire the userland buffer. Is that what you meant ? --- sys/kern/kern_proc.c.orig Mon Apr 18 04:10:36 2005 +++ sys/kern/kern_proc.c Mon Aug 8 15:09:50 2005 _at__at_ -868,7 +868,7 _at__at_ { struct thread *td; struct kinfo_proc kinfo_proc; - int error = 0; + int error, buffersize = 0; struct proc *np; pid_t pid = p->p_pid; _at__at_ -883,11 +883,23 _at__at_ } else { _PHOLD(p); FOREACH_THREAD_IN_PROC(p, td) { - fill_kinfo_thread(td, &kinfo_proc); - PROC_UNLOCK(p); + buffersize += sizeof(struct kinfo_proc); + } + PROC_UNLOCK(p); + + error = sysctl_wire_old_buffer(req, buffersize); + if (error) + { + _PRELE(p); + return error; + } + + PROC_LOCK(p); + FOREACH_THREAD_IN_PROC(p, td) { + fill_kinfo_thread(td, &kinfo_proc); error = SYSCTL_OUT(req, (caddr_t)&kinfo_proc, sizeof(kinfo_proc)); - PROC_LOCK(p); + if (error) break; } In either case, the appropriate buffer size must be calculated (possibly > requiring the mutex to be grabbed and released) before executing an > operation that could block, grabbing the mutex, and traversing the list. > It is quite possible for the pre-calculated size to not match the actual > size needed when the list is traversed to actually copy the data. > > How can this issue be addressed ? Maybe some extra memory has to be wired ?Received on Mon Aug 08 2005 - 11:24:04 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:40 UTC