Re: CAM breaks USB [was Re: USB causing boot to hang]

From: Hans Petter Selasky <hps_at_selasky.org>
Date: Sat, 7 Dec 2019 01:16:13 +0100

On 2019-12-07 01:09, Alexander Motin wrote:
> On 06.12.2019 18:41, Steve Kargl wrote:
>> On Fri, Dec 06, 2019 at 06:15:32PM -0500, Alexander Motin wrote:
>>> On 06.12.2019 17:52, Steve Kargl wrote:
>>>> On Fri, Dec 06, 2019 at 03:33:09PM -0700, Warner Losh wrote:
>>>>> On Fri, Dec 6, 2019 at 3:31 PM Steve Kargl <sgk_at_troutmask.apl.washington.edu>
>>>>> wrote:
>>>>>> The problem seems to be caused by 355010.  This is a commit to
>>>>>> fix CAM, which seems to break USB.
>>>>>>
>>>>> Yes. mav_at_ made this change...
>>>>>
>>>> src/UPDATING seems to be missing an entry about CAM breaking USB.
>>>
>>> And also that the moon is made of cheese. :-\
>>
>> Not sure what you mean.
> 
> I mean that if we are going to write random fairy tales there, then I
> prefer my moon.
> 
> Seriously, my change did not change the semantics of any existing
> tunables, only the way some of them are implemented, so there was
> nothing to write in UPDATING.
> 
>> You made a change, and the commit log
>> even notes that there could be an issue.  Yet, you want a user
>> to waste half a day finding the root cause of the problem.
> 
> I am sorry that you wasted your time, but quick and ungrounded blame is
> the last thing I want to read on a Friday evening after a long day.
> 
>>>> The commit message for 355010 states:
>>>>
>>>>     Devices appearing on USB bus later may still require setting
>>>>     kern.cam.boot_delay, but hopefully those are minority.
>>>>
>>>> There is no statement about "where" kern.cam.boot_delay should be set.
>>>> There is no statement about "what"  value(s) kern.cam.boot_delay should be.
>>>
>>> If you never needed it before, you still don't need it.
>>
>> Prior to 355010 the system just boots up.  After 355010
>> the system hangs.  Will  kern.cam.boot_delay paper over
>> whatever (latent?) bug you've exposed?
> 
> My change affected the timing of the system boot process, allowing the
> system to continue booting further without waiting for CAM to scan its
> buses and disks.  If the problem is reproducible even without USB
> storage, then CAM probably does not wait for it, so it is not the
> problem I first thought of.
> 
>>> If the system hangs even without any USB disk attached, then I don't
>>> see a relation between CAM and USB here.  My change could affect some
>>> timings of the boot process, but without closer debugging it is hard
>>> to guess anything.  To be sure whether USB is related I would try to
>>> disable all USB controllers, either in the BIOS or with a set of
>>> loader tunables like hint.ehci.0.disabled=1, hint.ohci.0.disabled=1,
>>> hint.xhci.0.disabled=1, ...
>>
>> Yep.  Completely disabling USB allows the system to boot.  I don't
>> see how this would be unexpected, as umass uses CAM.
> 
> umass uses CAM, but you said the problem happens even without umass,
> which is why I said that I don't see any relation.  Does disabling
> _all_ USB fix the problem?  Have you tried to narrow it down to a
> specific controller or device?
> 
> Is there anything special about your system?  Are you running a
> GENERIC kernel?  If not, what have you changed?
> 
> If your kernel includes VERBOSE_SYSINIT as GENERIC does, I would try to
> set debug.verbose_sysinit=1 and see how far the boot process goes and at
> which stage it may be hanging (assuming the hang is tied to a particular
> stage and is not asynchronous).
> 

Hi,

There is an option you can compile into the kernel which will allow the 
keyboard to enter the debugger.

options	ALT_BREAK_TO_DEBUGGER

Sounds to me like either a leaked refcount, or one thread spinning and
blocking execution of other threads.
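
For reference, the loader tunables mentioned earlier in the thread could
be collected in /boot/loader.conf roughly as below.  This is only a
sketch: unit number 0 and the boot_delay value are assumptions for
illustration, and a machine with several controllers would need one hint
per unit.

```
# /boot/loader.conf (illustrative sketch, not a recommended configuration)

# Print each SYSINIT as it runs, to see how far the boot gets.
# Needs a kernel built with "options VERBOSE_SYSINIT".
debug.verbose_sysinit=1

# Disable USB host controllers per driver to narrow down the culprit.
# Unit 0 is an assumption; the real unit numbers may differ.
hint.ehci.0.disabled=1
hint.ohci.0.disabled=1
hint.xhci.0.disabled=1

# Only for devices that appear late on the USB bus: delay in
# milliseconds.  The value is an example, not a recommendation.
kern.cam.boot_delay=10000
```

Enabling the hints one at a time, rather than all together, should show
which controller matters.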

--HPS
Received on Fri Dec 06 2019 - 23:17:40 UTC
