Hi,

We have observed a lock order reversal that shows up when the pageout daemon is running. I could not find it on the FreeBSD LOR page. It started to occur after the latest changes in uma_core.c. Below is the backtrace of the LOR (I supplemented the trace with the names of functions that were not resolved by the debugger).

lock order reversal:
 1st 0xc1089288 mbuf_cluster (UMA zone) @ /home/gjb/git/marvell/sys/vm/uma_core.c:2703
 2nd 0xc0ea56a4 pmap (pmap) @ /home/gjb/git/marvell/sys/arm/arm/pmap.c:3628
KDB: stack backtrace:
db_trace_thread() at db_trace_thread+0x10
        scp=0xc0b46f7c rlv=0xc091e8d0 ($a+0x34)
        rsp=0xd115eb04 rfp=0xd115ec20
        r10=0xc36e54b8 r9=0x00000e2c r8=0xc0e60270 r7=0xc36e52b0
        r6=0xffffffff r5=0xc0ba1fc4 r4=0xd115eb0c
$a() at $a+0x10
        scp=0xc091e8ac rlv=0xc09e5698 (kdb_backtrace+0x3c)
        rsp=0xd115ec24 rfp=0xd115ec34
        r4=0xc0d315d8
kdb_backtrace() at kdb_backtrace+0x10
        scp=0xc09e566c rlv=0xc09f32d4 (_witness_debugger+0x5c)
        rsp=0xd115ec38 rfp=0xd115ec4c
        r4=0x00000001
_witness_debugger() at _witness_debugger+0x14
        scp=0xc09f328c rlv=0xc09f3fe4 (witness_checkorder+0x664)
        rsp=0xd115ec50 rfp=0xd115ecf0
        r5=0xc0ea56a4 r4=0x00000000
witness_checkorder() at witness_checkorder+0x10
        scp=0xc09f3990 rlv=0xc09b1870 (_mtx_lock_flags+0x34)
        rsp=0xd115ecf4 rfp=0xd115ed1c
        r10=0xc0ea75d0 r9=0xc0ea75d0 r8=0x00000e2c r7=0xc0bdf97c
        r6=0x00000000 r5=0x00000000 r4=0xc0ea56a4
_mtx_lock_flags() at _mtx_lock_flags+0x10
        scp=0xc09b184c rlv=0xc0b4c660 (pmap_extract+0x24)
        rsp=0xd115ed20 rfp=0xd115ed38
        r10=0xc0e9bd60 r8=0xc3b43800 r7=0x00000000 r6=0xc0ea56a4
        r5=0xc3b43800 r4=0xc108db7c
pmap_extract() at pmap_extract+0x10
        scp=0xc0b4c64c rlv=0xc0b25660 ($a+0x340)
        rsp=0xd115ed3c rfp=0xd115ed5c
        r6=0xc1092640 r5=0x00000000 r4=0xc108db7c
$a() at $a+0x10    <======= bucket_drain()
        scp=0xc0b25330 rlv=0xc0b25ab0 ($a+0x60)
        rsp=0xd115ed60 rfp=0xd115ed78
        r8=0x00000000 r7=0xc37a5828 r6=0xc0b25dc4 r5=0xc1092640
        r4=0xc108db7c
$a() at $a+0x10
        scp=0xc0b25a60 rlv=0xc0b25c2c (bucket_cache_drain+0x54)
        rsp=0xd115ed7c rfp=0xd115ed90
        r5=0xc1092640 r4=0xc108db7c
bucket_cache_drain() at bucket_cache_drain+0x10
        scp=0xc0b25be8 rlv=0xc0b25d3c ($a+0xa0)
        rsp=0xd115ed94 rfp=0xd115edac
        r5=0x00000001 r4=0xc1092640
$a() at $a+0x10    <====== zone_drain_wait()
        scp=0xc0b25cac rlv=0xc0b23f5c ($a+0x48)
        rsp=0xd115edb0 rfp=0xd115edc8
        r5=0xc1089280 r4=0xc1092640
$a() at $a+0x10    <====== zone_foreach()
        scp=0xc0b23f24 rlv=0xc0b25de4 (uma_reclaim+0x18)
        rsp=0xd115edcc rfp=0xd115eddc
        r6=0xc36e1180 r5=0xc36e118c r4=0x00000000
uma_reclaim() at uma_reclaim+0x10
        scp=0xc0b25ddc rlv=0xc0b39828 ($a+0x314)
        rsp=0xd115ede0 rfp=0xd115ee80
        r4=0x00000000
$a() at $a+0x10    <==== vm_pageout()
        scp=0xc0b39524 rlv=0xc09a1f80 (fork_exit+0x64)
        rsp=0xd115ee84 rfp=0xd115eea8
        r10=0xc0b39514 r9=0xc0ea75d0 r8=0x00000000 r7=0xc37a5828
        r6=0xd115eeac r5=0xc0ea75d0 r4=0xc3840be0
fork_exit() at fork_exit+0x10
        scp=0xc09a1f2c rlv=0xc0b56bdc (fork_trampoline+0x14)
        rsp=0xd115eeac rfp=0x00000000
        r10=0xc0ea75d0 r8=0x00000104 r7=0xfeedfeed r6=0xfeedfeed
        r5=0x00000000 r4=0xc0b39514

The initial order between the UMA zone and pmap locks is established when pmap allocates a pv_entry. Below is the function call sequence:

fork_exit()
  start_init()
    data_abort_handler()
      vm_fault()
        pmap_enter()            <== lock pmap (pmap) 1st
          pmap_enter_locked()
            uma_zalloc_arg()    <== lock UMA zone (PV ENTRY) 2nd

This LOR seems to be harmless because it does not involve the same mutexes. However, I am not sure whether it is possible for some thread to hold the kernel pmap mutex while trying to take the PV ENTRY zone lock at the same moment that the pageout daemon holds the PV ENTRY zone lock (taken in zone_free_item()) and tries to take the kernel pmap mutex by calling pmap_kextract() via vtoslab().

I've seen it on -current on an arm machine with kernel configuration DB-78XXX.

regards,
Grzesiek

Received on Tue Mar 31 2009 - 06:30:03 UTC