Re: "panic: malloc(M_WAITOK) with sleeping prohibited" at main-n245363-b3dac3913dc9

From: David Wolfskill <david_at_catwhisker.org>
Date: Tue, 9 Mar 2021 13:46:09 -0800
On Tue, Mar 09, 2021 at 01:23:16PM -0800, David Wolfskill wrote:
> On Tue, Mar 09, 2021 at 01:53:37PM -0700, Warner Losh wrote:
> > ...
> > The following reviews should fix this. It introduces a no-wait variant for
> > disk_alloc(), provides a way to free allocated, but not created, disks and
> > changes CAM to use the new routines and take some care for not leaking when
> > an allocation fails.
> > 
> > https://reviews.freebsd.org/D29161
> > https://reviews.freebsd.org/D29162
> > https://reviews.freebsd.org/D29163
> > 
> > Maybe you can try it? I got similar tracebacks when I booted w/o these
> > changes, but not a peep with them...
> > ...
> 
> Thanks!
> 
> They applied cleanly; building now -- both on the build machine (which
> failed earlier) and on the newer laptop (which did not fail earlier, as
> it's good to find out if a change has broken something that had been
> working).
> ....

The laptop still works:

FreeBSD g1-48.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #169 main-n245338-221622ec0c8e-dirty: Mon Mar  8 03:50:50 PST 2021     root_at_g1-48.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY  amd64 1400005 1400005

FreeBSD g1-48.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #170 main-n245363-b3dac3913dc9-dirty: Tue Mar  9 05:06:34 PST 2021     root_at_g1-48.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY  amd64 1400005 1400005

FreeBSD g1-48.catwhisker.org 14.0-CURRENT FreeBSD 14.0-CURRENT #170 main-n245363-b3dac3913dc9-dirty: Tue Mar  9 05:06:34 PST 2021     root_at_g1-48.catwhisker.org:/common/S4/obj/usr/src/amd64.amd64/sys/CANARY  amd64 1400005 1400005


The build machine (still?) panics:

...
pass3: 600.000MB/s transfers (SATA 3.x, UDMA5, PIO 8192bytes)
pass3: Command Queueing enabled
pass4 at ahcich4 bus 0 scbus4 target 0 lun 0
pass4: <M4-CT512M4SSD2 0309> ACS-2 ATA SATA 3.x device
pass4: Serial Number 000000001242091982C2
pass4: 600.000MB/s transfers (SATA 3.x, ugen2.2: <vendor 0x8087 product 0x8000> at usbus2
UDMA5, PIO 8192bytes)
pass4: Command Queueing enabled
uhub4 on uhub1
pass5 at ahciem0 bus 0 scbus5 target 0 lun 0
uhub4: <vendor 0x8087 product 0x8000, class 9/0, rev 2.00/0.05, addr 2> on usbus2
Root mount waiting for:pass5: <AHCI SGPIO Enclosure 2.00 0001> usbus0 usbus1 usbus2ugen0.2: <Generic USB2.0-CRW> at usbus0
 CAM SEMB S-E-S 2.00 device

uma_zalloc_debug: zone "malloc-1024"umass0 on uhub2
 with the following non-sleepable locks held:
umass0: <Bulk-In, Bulk-Out, Interface> on usbus0
exclusive sleep mutex CAM device lockumass0:  SCSI over Bulk-Only; quirks = 0x4000
 (CAM device lock) r = 0 (0xfffff800122c9cd0) locked _at_ /usr/src/sys/cam/cam_xpt.c:2333
umass0:6:0: Attached to scbus6
stack backtrace:
(probe0:umass-sim0:0:0:0): Down reving Protocol Version from 2 to 0?
#0 0xffffffff80c7cce1 at witnesuhub3: 6 ports with 6 removable, self powered
s_debugger+0x71
pass6 at umass-sim0 bus 0 scbus6 target 0 lun 0
#1 0xffffffff80pass6: uhub4: 8 ports with 8 removable, self powered
c7ddfd at witness_warn+0x40d
#2<Generic- Compact Flash 1.00> Removable Direct Access SCSI device
 0xffffffff80f42fe6 at uma_zallpass6: Serial Number 20100818841300000
oc_arg+0x46
#3 0xffffffff80be34pass6: 40.000MB/s transfers
panic: malloc(M_WAITOK) with sleeping prohibited
cpuid = 1
time = 22
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e157a2d0
vpanic() at vpanic+0x181/frame 0xfffffe00e157a320
panic() at panic+0x43/frame 0xfffffe00e157a380
malloc_dbg() at malloc_dbg+0xd4/frame 0xfffffe00e157a3a0
malloc() at malloc+0x34/frame 0xfffffe00e157a400
g_post_event_x() at g_post_event_x+0x5a/frame 0xfffffe00e157a450
g_post_event() at g_post_event+0x48/frame 0xfffffe00e157a4b0
disk_create() at disk_create+0x16f/frame 0xfffffe00e157a600
daregister() at daregister+0x70a/frame 0xfffffe00e157a880
cam_periph_alloc() at cam_periph_alloc+0x57b/frame 0xfffffe00e157a950
daasync() at daasync+0x2c2/frame 0xfffffe00e157a9c0
xpt_async_process_dev() at xpt_async_process_dev+0x152/frame 0xfffffe00e157aa10
xpt_async_process() at xpt_async_process+0x334/frame 0xfffffe00e157ab20
xpt_done_process() at xpt_done_process+0x3a3/frame 0xfffffe00e157ab60
xpt_done_td() at xpt_done_td+0xf5/frame 0xfffffe00e157abb0
fork_exit() at fork_exit+0x80/frame 0xfffffe00e157abf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e157abf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 17 tid 100095 ]
Stopped at      kdb_enter+0x37: movq    $0,0x128b8ce(%rip)
db> 
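
For readers following the trace: witness reports the CAM device lock (a
non-sleepable mutex, taken at cam_xpt.c:2333) as held, while daregister()
calls disk_create(), which reaches malloc() with M_WAITOK through
g_post_event(). The general pattern behind a "no-wait variant plus a free
routine for allocated-but-not-created disks" can be sketched in userspace
C as below. This is only an illustrative toy, not the code in the reviews
above: every toy_* name, the flag values, and the lock counter are
hypothetical stand-ins, and the real kernel interfaces differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's M_NOWAIT / M_WAITOK flags. */
enum { TOY_NOWAIT = 1, TOY_WAITOK = 2 };

/* Models "a non-sleepable lock (e.g. a mutex) is currently held". */
static int nonsleepable_locks_held;
/* Set when the toy allocator detects what the kernel would panic on. */
static int toy_panicked;

/* Toy allocator: a may-sleep request under a non-sleepable lock is a bug. */
static void *toy_malloc(size_t size, int flags)
{
    if ((flags & TOY_WAITOK) && nonsleepable_locks_held > 0) {
        toy_panicked = 1;   /* the real kernel panics at this point */
        return NULL;
    }
    return malloc(size);    /* a no-wait request may legitimately fail */
}

struct toy_disk { int created; };

/* Allocation that honours the caller's flags, so it can be used no-wait. */
static struct toy_disk *toy_disk_alloc(int flags)
{
    struct toy_disk *dp = toy_malloc(sizeof(*dp), flags);
    if (dp != NULL)
        dp->created = 0;
    return dp;
}

/* Release a disk that was allocated but never made it to "created". */
static void toy_disk_free(struct toy_disk *dp)
{
    free(dp);
}

/* Registration path that runs with the lock held and must not leak. */
static int toy_register(int flags)
{
    struct toy_disk *dp;
    void *event;

    nonsleepable_locks_held++;              /* "take" the mutex */
    dp = toy_disk_alloc(flags);
    if (dp == NULL) {
        nonsleepable_locks_held--;
        return -1;
    }
    event = toy_malloc(64, flags);          /* stand-in for event posting */
    if (event == NULL) {
        toy_disk_free(dp);                  /* undo the earlier allocation */
        nonsleepable_locks_held--;
        return -1;
    }
    dp->created = 1;
    free(event);
    free(dp);                               /* toy only: tear down at once */
    nonsleepable_locks_held--;
    assert(nonsleepable_locks_held >= 0);
    return 0;
}
```

In this toy, toy_register(TOY_NOWAIT) completes without tripping the
check, while toy_register(TOY_WAITOK) hits the "sleeping prohibited"
condition on its first allocation. The point of the sketch is only that
every allocation reachable with the lock held needs the no-wait flag and
a failure path that frees what was already allocated.
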

I'm willing to "poke at it" a bit, given a hint or two...

Peace,
david
-- 
David H. Wolfskill                              david_at_catwhisker.org
It is supremely disingenuous to claim a lack of jurisdiction, then     
proceed to participate in a decision on the same matter.

See https://www.catwhisker.org/~david/publickey.gpg for my public key.

Received on Tue Mar 09 2021 - 20:46:14 UTC
