Re: Exactly that commit (was Re: Latest -current 100% hang at the late boot stage)

From: Justin T. Gibbs <gibbs_at_FreeBSD.org>
Date: Fri, 24 Jun 2011 21:09:08 -0400
On 6/24/11 6:26 PM, Andrey Chernov wrote:
>  On Fri, Jun 24, 2011 at 04:20:24PM -0400, Justin T. Gibbs wrote:
> > Instead, I believe that either one of the GEOM taste methods is leaking an
> > access reference (so cdclose() is not called), or the CD driver is failing
> > to release the hold semaphore during probing. Setting kern.geom.debugflags
> > to '4' will trace the access calls and allow the GEOM side to be ruled out.
> > If GEOM is exonerated, we can add tracing to cam_periph_(un)hold to track
> > this down further.
>
>  No problem. I just set kern.geom.debugflags=4 in loader.conf and here is
>  a new photo (with a recent kernel, no patches):
>  http://img803.imageshack.us/img803/4679/25062011006.jpg
>  I skipped all the noisy parts related to the ada0 and ada1 partition probes.
>  As you can see, only 3 cd0-related geom calls are issued, right before the
>  cd1 probe is shown. The strange thing is that I see not a single cd1-related
>  geom call, but that may be because of the hang.

GEOM processing is serialized, so that is not unexpected.  What your
logs are telling me is that the probe for cd0 is hanging.  I don't know
why.

Are you positive it is this specific SVN revision that prevents cd0
from probing properly and not one of my previous CAM commits?  Just
getting to multi-user doesn't mean we're ok here.  My GEOM changes may
make the system hang earlier, but you'll need to test access to cd0
even if you get to multi-user mode to be sure that the device is
functioning correctly.  I just want to be positive that we're barking
up the right tree.
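
A minimal sketch of the kind of access test meant above, assuming the
drive shows up as /dev/cd0, a data disc is loaded, and Python is
available on the test machine.  The probe_device helper and the
2048-byte read size are illustrative choices, not part of any FreeBSD
tool:

```python
import os

SECTOR_SIZE = 2048  # assumption: typical data-CD sector size


def probe_device(path, nbytes=SECTOR_SIZE):
    """Open path read-only and read nbytes from its start.

    Returns (True, bytes_read) on success or (False, error_text) on failure.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return False, "open failed: %s" % exc.strerror
    try:
        data = os.read(fd, nbytes)
    except OSError as exc:
        return False, "read failed: %s" % exc.strerror
    finally:
        # Always release the descriptor so the device can be closed again.
        os.close(fd)
    return True, len(data)
```

Calling probe_device("/dev/cd0") after reaching multi-user should
return promptly either way.  If the call itself never returns, that
would suggest the device is still wedged even though the boot
completed.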

--
Justin
Received on Fri Jun 24 2011 - 23:09:18 UTC
