Re: "panic: malloc(M_WAITOK) in non-sleepable context" after r364296 -> r364341

From: Mateusz Guzik <mjguzik_at_gmail.com>
Date: Tue, 18 Aug 2020 16:13:30 +0200
On 8/18/20, Mark Johnston <markj_at_freebsd.org> wrote:
> On Tue, Aug 18, 2020 at 03:55:15PM +0200, Mateusz Guzik wrote:
>> On 8/18/20, Mark Johnston <markj_at_freebsd.org> wrote:
>> > On Tue, Aug 18, 2020 at 03:24:52PM +0200, Mateusz Guzik wrote:
>> >> Try this:
>> >>
>> >> diff --git a/sys/kern/kern_malloc.c b/sys/kern/kern_malloc.c
>> >> index 46eb1c4347c..7b94ca7b880 100644
>> >> --- a/sys/kern/kern_malloc.c
>> >> +++ b/sys/kern/kern_malloc.c
>> >> _at__at_ -618,8 +618,8 _at__at_ void *
>> >>         unsigned long osize = size;
>> >>  #endif
>> >>
>> >> -       KASSERT((flags & M_WAITOK) == 0 || THREAD_CAN_SLEEP(),
>> >> -           ("malloc(M_WAITOK) in non-sleepable context"));
>> >> +       if ((flags & M_WAITOK) != 0)
>> >> +               WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL,
>> >> __func__);
>> >>
>> >>  #ifdef MALLOC_DEBUG
>> >>         va = NULL;
>> >> diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
>> >> index 37d78354200..2e1267ec02f 100644
>> >> --- a/sys/vm/uma_core.c
>> >> +++ b/sys/vm/uma_core.c
>> >> _at__at_ -3355,8 +3355,8 _at__at_ uma_zalloc_arg(uma_zone_t zone, void *udata, int
>> >> flags)
>> >>         uma_cache_bucket_t bucket;
>> >>         uma_cache_t cache;
>> >>
>> >> -       KASSERT((flags & M_WAITOK) == 0 || THREAD_CAN_SLEEP(),
>> >> -           ("uma_zalloc(M_WAITOK) in non-sleepable context"));
>> >> +       if ((flags & M_WAITOK) != 0)
>> >> +               WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL,
>> >> __func__);
>> >>
>> >>         /* Enable entropy collection for RANDOM_ENABLE_UMA kernel
>> >> option
>> >> */
>> >>         random_harvest_fast_uma(&zone, sizeof(zone), RANDOM_UMA);
>> >
>> > Doesn't it only print a warning if non-sleepable locks are held?
>> > THREAD_CAN_SLEEP() catches other cases, like epoch sections and worker
>> > threads that are not allowed to sleep (CAM doneq threads in this case).
>> > Otherwise uma_zalloc_debug() already checks with WITNESS.
>> >
>> > I'm inclined to simply revert the commit for now.  Alternately,
>> > disk_alloc() could be modified to handle M_NOWAIT/M_WAITOK flags, and
>> > that could be used in daregister() and other CAM periph drivers.
>> > daregister() already uses M_NOWAIT to allocate the driver softc itself.
>> >
>>
>> If WITNESS_WARN(.., WARN_SLEEPOK) does not check for all possible
>> blockers for going off CPU that's a bug. I do support reverting the
>> patch for now until the dust settles. I don't propose the above hack
>> as the final fix.
>
> Well, IMO the bug is that we have no subroutine to perform all of these
> checks (which includes those done by WITNESS_WARN(..., WARN_SLEEPOK)).
> WITNESS' responsibilities are more narrow.  I just meant that your patch
> effectively reverts Gleb's commit.
>

not exactly, as it still adds something for malloc.

I argued for a dedicated routine to perform all the relevant checks.
Note a dedicated routine instead of a handrolled call to
witness/panic/whatever can also report the exact blockers (if it knows
them), in this case what lock was taken and where. I would not mind
any extras (e.g., critnest).

>> As for the culrpit at hand, given the backtrace:
>> devfs_alloc() at devfs_alloc+0x28/frame 0xffffffff8218d570
>> make_dev_sv() at make_dev_sv+0x97/frame 0xffffffff8218d600
>> make_dev_s() at make_dev_s+0x3b/frame 0xffffffff8218d660
>> passregister() at passregister+0x3e7/frame 0xffffffff8218d8b0
>>
>> I think this is missing wait flags, resulting in M_WAITOK later, i.e.:
>
> Ah, I was looking at a different report.  All the more reason to revert
> for now.
>


-- 
Mateusz Guzik <mjguzik gmail.com>
Received on Tue Aug 18 2020 - 12:13:33 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:24 UTC