Re: Enabling NUMA in BIOS stop booting FreeBSD

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Mon, 12 Dec 2016 19:24:18 +0200

On Mon, Dec 12, 2016 at 08:16:34PM +0300, Slawa Olhovchenkov wrote:
> On Mon, Dec 12, 2016 at 06:54:57PM +0200, Konstantin Belousov wrote:
> 
> > On Mon, Dec 12, 2016 at 07:21:53PM +0300, Slawa Olhovchenkov wrote:
> > > On Mon, Dec 12, 2016 at 04:54:18PM +0200, Konstantin Belousov wrote:
> > > 
> > > > On Sun, Dec 11, 2016 at 11:47:09PM +0300, Slawa Olhovchenkov wrote:
> > > > > Booting...
> > > > > ESC[01;00H8+0x8+0xe9bdc]                                                                  KDB: debugger backends: ddb
> > > > > KDB: current backend: ddb
> > > > > exit from kdb_init
> > > > > KDB: enter: Boot flags requested debugger
> > > > > [ thread pid 0 tid 0 ]
> > > > - remove any video consoles from the HEAD kernel config, i.e. sc/vt and
> > > >   vga/efi,
> > > > - do not use boot -d,
> > > > - use serial console (IPMI SOL qualifies),
> > > > - set late console to 0,
> > > > and show me the verbose dmesg of such boot with the BIOS options
> > > > which cause troubles.
> > > 
> > > Booting...
> > > KDB: debugger backends: ddb
> > > KDB: current backend: ddb
> > > SMAP type=01 base=0000000000000000 len=0000000000099c00
> > > SMAP type=02 base=0000000000099c00 len=0000000000006400
> > > SMAP type=02 base=00000000000e0000 len=0000000000020000
> > > SMAP type=01 base=0000000000100000 len=000000007906b000
> > > SMAP type=02 base=000000007916b000 len=0000000000936000
> > > SMAP type=04 base=0000000079aa1000 len=0000000000509000
> > > SMAP type=02 base=0000000079faa000 len=0000000002056000
> > > SMAP type=01 base=0000000100000000 len=0000001f80000000
> > > SMAP type=02 base=000000007c000000 len=0000000014000000
> > > SMAP type=02 base=00000000fed1c000 len=0000000000029000
> > > SMAP type=02 base=00000000ff000000 len=0000000001000000
> > > 
> > > This is all. No more.
> > When you switch between variations of the NUMA enablement options, do
> > you just reboot the machine or do you sometimes physically turn it off ?
> 
> just reboot and 'power reset' via KVM client (memory preserved, I mean)
> 
> > Try to enable memtest, with the hw.memtest.tests=1 loader variable.
> > Does it change things ?
> 
> System booted, dmesg is http://zxy.spb.ru/dmesg.numa

I suspect that the reverse situation could now take place: the
non-interleaved option would cause the hang.

My current guess is that the memory content is preserved but swizzled in
cache-line-sized chunks, so that the msgbuf header, left over from the
previous boot, looks correct while the real buffer content is shuffled.

Try the debugging patch below, which unconditionally disables the import
of the previous buffer.  To test, you would need to boot, frob the options
in the BIOS, reboot, frob again, and so on.

diff --git a/sys/kern/subr_msgbuf.c b/sys/kern/subr_msgbuf.c
index f275aef3b4f..d45ef502204 100644
--- a/sys/kern/subr_msgbuf.c
+++ b/sys/kern/subr_msgbuf.c
@@ -85,7 +85,7 @@ msgbuf_reinit(struct msgbuf *mbp, void *ptr, int size)
 {
 	u_int cksum;
 
-	if (mbp->msg_magic != MSG_MAGIC || mbp->msg_size != size) {
+	if (1 || mbp->msg_magic != MSG_MAGIC || mbp->msg_size != size) {
 		msgbuf_init(mbp, ptr, size);
 		return;
 	}
Received on Mon Dec 12 2016 - 16:24:27 UTC
