Re: Early drop to debugger with DEBUG_MEMGUARD

From: Davide Italiano <davide_at_freebsd.org>
Date: Mon, 12 Aug 2013 08:30:15 -0700
On Mon, Aug 12, 2013 at 8:13 AM, David Wolfskill <david_at_catwhisker.org> wrote:
> I first noticed this on my laptop on 08 Aug, after having built & booted
>
> FreeBSD 10.0-CURRENT #975  r253985M/253985:1000041: Tue Aug  6 05:28:39 PDT 2013     root_at_localhost:/common/S4/obj/usr/src/sys/CANARY  i386
>
> OK.  I'm away from home, and Internet access is a bit flaky, so
> initially, I suspected that something may have gone wrong with my
> source update; I later determined that disabling "options DEBUG_MEMGUARD"
> would avoid the panic.
>
> That said, I had been running a kernel with DEBUG_MEMGUARD for quite
> a while without issues; I suspect that this drop to debugger either
> reflects a real problem that disabling DEBUG_MEMGUARD merely hides
> or that the assert in src/sys/kern/subr_vmem.c:1050 isn't actually
> correct in all cases.
>
> So I finally(!) had a chance to try to reproduce the error on a
> machine with a serial console; here's a cut/paste from that:
>
> ...
>  |  7. Boot [V]erbose: NO                  |    `:`                  `:`
>  |                                         |      .--             `--.
>  |                                         |         .---.....----.
>  +-----------------------------------------+
>
>
> Booting...
> GDB: no debug ports present
> KDB: debugger backends: ddb
> KDB: current backend: ddb
> Copyright (c) 1992-2013 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>         The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 10.0-CURRENT #0  r254245M/254246:1000042: Mon Aug 12 07:20:47 PDT 2013
>     root_at_freebeast.catwhisker.org:/common/S3/obj/usr/src/sys/MEMGUARD i386
> FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
> WARNING: WITNESS option enabled, expect reduced performance.
> panic: Assertion strat == M_BESTFIT || strat == M_FIRSTFIT failed at /usr/src/sys/kern/subr_vmem.c:1050
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper(c116fcdc,73752f20,72732f72,79732f63,656b2f73,...) at db_trace_self_wrapper+0x2d/frame 0xc1820ba0
> kdb_backtrace(c11c4b23,0,c0f8a835,c1820c74,c0f8a835,...) at kdb_backtrace+0x30/frame 0xc1820c08
> vpanic(c12eea08,100,c0f8a835,c1820c74,c1820c74,...) at vpanic+0x11f/frame 0xc1820c44
> kassert_panic(c0f8a835,c1172e98,c1172e39,41a,8,...) at kassert_panic+0xea/frame 0xc1820c68
> vmem_alloc(c130d680,6681000,2,c1820cc0,3b5,...) at vmem_alloc+0x53/frame 0xc1820ca0
> memguard_init(c130d680,c0a9fa50,c6800000,20281000,1000,10000,0) at memguard_init+0x29/frame 0xc1820cc4
> kmeminit(c14b9fd4,c10efc89,0,0,c1820d30,...) at kmeminit+0x171/frame 0xc1820cf0
> mallocinit(0,0,2,0,c11d3728,...) at mallocinit+0x32/frame 0xc1820d30
> mi_startup() at mi_startup+0xf7/frame 0xc1820d58
> begin() at begin+0x2c
> KDB: enter: panic
> [ thread pid 0 tid 0 ]
> Stopped at      kdb_enter+0x3d: movl    $0,kdb_why
> db>
>
> As you can see, this is well before any device probes or much of
> anything else.  Thus, I suspect that it's fairly possible that the
> assertion may well be OK after a certain point in the boot sequence,
> but decidedly *not* OK in this specific instance.  Or perhaps the
> assertion just doesn't play well with DEBUG_MEMGUARD.
>
> I'm not about to pretend that I have anywhere near enough familiarity
> with what's going on to even suggest a fix, but it seems to me that
> Something Is Wrong Here.
>
> The kernel config (in this case) is:
>
> include GENERIC
>
> ident           MEMGUARD
>
> options         DEBUG_MEMGUARD
>
>
> The system was running a copy of:
>
> FreeBSD 10.0-CURRENT #1243  r254245M/254246:1000042: Mon Aug 12 05:39:42 PDT 2013     root_at_freebeast.catwhisker.org:/common/S4/obj/usr/src/sys/GENERIC  i386
>
> but with a newly-built MEMGUARD kernel (as above), built from the same
> sources.
>
> I have some time to poke at it for the next few hours; subject to
> my Internet access & available time, I'm happy to do that, try
> patches, or whatever, but I could use a bit of guidance.
>
> Since it's been completely reproducible for me, I suspect that
> anyone with sufficiently recent sources running head can reproduce
> it merely by enabling "options DEBUG_MEMGUARD", rebuilding the
> kernel, and booting it.
>
> Peace,
> david
> --
> David H. Wolfskill                              david_at_catwhisker.org
> Taliban: Evil men with guns afraid of truth from a 14-year old girl.
>
> See http://www.catwhisker.org/~david/publickey.gpg for my public key.

The vmem_alloc() KPI requires the consumer to specify exactly one
allocation strategy, either M_FIRSTFIT or M_BESTFIT (fast allocation
vs. low fragmentation), and that is the assertion being violated
here:

1050	        MPASS(strat == M_BESTFIT || strat == M_FIRSTFIT);

It looks like memguard_init() doesn't specify either of these two strategies:

209	        vmem_alloc(parent, memguard_mapsize, M_WAITOK, &base);

My guess is that you need to OR one of M_BESTFIT/M_FIRSTFIT with
M_WAITOK to get your kernel to boot. Which of the two is better will
probably need some measurements, but either one should at least get
your kernel booting.
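
To make the check concrete, here is a minimal userland sketch of what the
assertion tests. Everything prefixed with SK_/sk_ is hypothetical and only
stands in for the kernel names; the numeric flag values and the idea that
the strategy is extracted by masking the flags are assumptions for
illustration, not copied from the tree. The one thing taken from the
backtrace is that the third argument to vmem_alloc() is 2, i.e. no
strategy bit appears to be set.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumed flag values for this sketch only; the real definitions
 * live in the kernel headers and may differ.
 */
#define SK_M_WAITOK	0x0002
#define SK_M_FIRSTFIT	0x1000
#define SK_M_BESTFIT	0x2000
#define SK_FITMASK	(SK_M_FIRSTFIT | SK_M_BESTFIT)

/* Returns true only when exactly one fit strategy is present in flags. */
bool
sk_strategy_ok(int flags)
{
	int strat = flags & SK_FITMASK;

	return (strat == SK_M_BESTFIT || strat == SK_M_FIRSTFIT);
}

/*
 * Hypothetical stand-in for the allocator entry point: it only
 * performs the strategy check and reports success.
 */
int
sk_vmem_alloc(int flags)
{
	assert(sk_strategy_ok(flags));
	return (0);
}
```

With these assumed values, sk_strategy_ok(SK_M_WAITOK) is false, which
matches the panic, while SK_M_WAITOK | SK_M_BESTFIT passes. The call in
memguard_init() would then presumably read something like
vmem_alloc(parent, memguard_mapsize, M_BESTFIT | M_WAITOK, &base), but
that is untested.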

I cannot test this change myself, as I'm out of town until tomorrow
afternoon as well, but I will take a further look when I get back.
If you want to try this change in the meantime, be my guest.

Thanks,

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare
Received on Mon Aug 12 2013 - 13:30:17 UTC