Re: LOR in mpr(4)

From: geoffroy desvernay <dgeo_at_centrale-marseille.fr> Date: Wed, 19 Oct 2016 17:10:31 +0200

On 11/17/2015 21:43, Pete Wright wrote:
> 
> 
> On 11/12/15 09:44, Pete Wright wrote:
>> Hi All,
>> Just wanted a sanity check before filing a PR.  I am running r290688 and
>> am seeing a LOR being triggered in the mpr(4) device:
>>
>> $ uname -ar
>> FreeBSD srd0013 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r290688: Wed Nov 11
>> 21:28:26 PST 2015     root@srd0013:/usr/obj/usr/src/sys/GENERIC  amd64
>>
>> <dmesg snip>
>> lock order reversal:
>>  1st 0xfffff8000d26bc60 CAM device lock (CAM device lock) @
>> /usr/src/sys/cam/cam_xpt.c:784
>>  2nd 0xfffffe00012811c0 MPR lock (MPR lock) @
>> /usr/src/sys/cam/cam_xpt.c:2620
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>> 0xfffffe04608ee890
>> witness_checkorder() at witness_checkorder+0xe79/frame 0xfffffe04608ee910
>> __mtx_lock_flags() at __mtx_lock_flags+0xa4/frame 0xfffffe04608ee960
>> xpt_action_default() at xpt_action_default+0xb6c/frame 0xfffffe04608ee9b0
>> scsi_scan_bus() at scsi_scan_bus+0x1d5/frame 0xfffffe04608eea20
>> xpt_scanner_thread() at xpt_scanner_thread+0x15c/frame 0xfffffe04608eea70
>> fork_exit() at fork_exit+0x84/frame 0xfffffe04608eeab0
>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe04608eeab0
>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>> <snip>
> 
> FWIW I filed the following PR as I can still reproduce this on boot:
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204614
> 
> cheers,
> -pete
> 
Hi all,

Sorry for cross-posting. Please let me know where this should go, as I
couldn't figure it out :(

I'm on 11-RELEASE-p1 here (but replying on current@, where I found this
thread about mpr(4)).

I'm not sure if it's related, but this is a brand-new machine with an
Avago SAS3008 and a 24-disk enclosure (single attached).

I see a bunch of:

mpr0: Found device <401<SspTarg>,End Device> <12.0Gbps> handle<0x001b>
enclosureHandle<0x0002> slot 8
(da0:mpr0:0:8:0): UNMAPPED
(da0:mpr0:0:8:0): CAM status: SCSI Status Error
(da0:mpr0:0:8:0): SCSI status: Check Condition
(da0:mpr0:0:8:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command
operation code)
(da0:mpr0:0:8:0): Error 22, Unretryable error
10:0): UNMAPPED
(da0:mpr0:0:8:0): READ(10). CDB: 28 00 e8 e0 88 71 00 00 04 00
(da0:mpr0:0:8:0): CAM status: SCSI Status Error
(da0:mpr0:0:8:0): SCSI status: Check Condition
(da0:mpr0:0:8:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command
operation code)
(da0:mpr0:0:8:0): Error 22, Unretryable error
ses0: da0: Element descriptor: 'Drive Slot 0'
ses0: da0: SAS Device Slot Element: 2 Phys at Slot 0
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 520474729974b57f addr 5000c50097ce8215
ses0:  phy 1: SAS device type 1 id 1
ses0:  phy 1: protocols: Initiator( None ) Target( SSP )
ses0:  phy 1: parent 520474729974b5ff addr 5000c50097ce8216

(more complete dmesg.boot here: http://dgeo.perso.ec-m.fr/dmesg.boot )
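
For anyone reading the log above, the failing READ(10) CDB can be decoded
with a short sketch like the one below. It assumes the standard 10-byte
READ(10) layout (opcode, flags, 4-byte big-endian LBA, group number,
2-byte transfer length, control). The function name decode_read10 is
only illustrative and is not part of any FreeBSD tool.

```python
import struct


def decode_read10(cdb_hex: str) -> dict:
    """Decode a 10-byte SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    # Layout: opcode, flags, LBA (32-bit BE), group, length (16-bit BE), control
    opcode, flags, lba, group, length, control = struct.unpack(">BBIBHB", cdb)
    return {
        "opcode": opcode,
        "flags": flags,
        "lba": lba,
        "group": group,
        "transfer_length": length,
        "control": control,
    }


if __name__ == "__main__":
    # CDB copied from the dmesg line "(da0:mpr0:0:8:0): READ(10). CDB: ..."
    fields = decode_read10("28 00 e8 e0 88 71 00 00 04 00")
    print("LBA: 0x%08x, blocks: %d" % (fields["lba"], fields["transfer_length"]))
```

This only shows which block address and length the rejected read asked
for; it says nothing about why the drive answered ILLEGAL REQUEST.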

Later, there is no way to use these disks with ZFS:
# zpool create tank da0
cannot create 'tank': invalid argument for this pool operation

I can run dd if=/dev/zero of=/dev/da0, though I have not let it run until
the disk is full.

Could this be related? Should I open a PR? How can I help debug this?

I'm not a kernel/driver hacker, but I'd like to help get this figured out :)

Yours,
-- 
*geoffroy desvernay*
C.R.I - Administration systèmes et réseaux
Ecole Centrale de Marseille