Re: Enabling NUMA in BIOS stop booting FreeBSD

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Mon, 12 Dec 2016 20:36:47 +0200
On Mon, Dec 12, 2016 at 08:43:11PM +0300, Slawa Olhovchenkov wrote:
> On Mon, Dec 12, 2016 at 07:24:18PM +0200, Konstantin Belousov wrote:
> 
> > On Mon, Dec 12, 2016 at 08:16:34PM +0300, Slawa Olhovchenkov wrote:
> > > On Mon, Dec 12, 2016 at 06:54:57PM +0200, Konstantin Belousov wrote:
> > > 
> > > > On Mon, Dec 12, 2016 at 07:21:53PM +0300, Slawa Olhovchenkov wrote:
> > > > > On Mon, Dec 12, 2016 at 04:54:18PM +0200, Konstantin Belousov wrote:
> > > > > 
> > > > > > On Sun, Dec 11, 2016 at 11:47:09PM +0300, Slawa Olhovchenkov wrote:
> > > > > > > Booting...
> > > > > > > > ESC[01;00H8+0x8+0xe9bdc]
> > > > > > > > KDB: debugger backends: ddb
> > > > > > > KDB: current backend: ddb
> > > > > > > exit from kdb_init
> > > > > > > KDB: enter: Boot flags requested debugger
> > > > > > > [ thread pid 0 tid 0 ]
> > > > > > - remove any video consoles from the HEAD kernel config, i.e. sc/vt and
> > > > > >   vga/efi,
> > > > > > - do not use boot -d,
> > > > > > - use serial console (IPMI SOL qualifies),
> > > > > > - set late console to 0,
> > > > > > and show me the verbose dmesg of such boot with the BIOS options
> > > > > > which cause troubles.
> > > > > 
> > > > > Booting...
> > > > > KDB: debugger backends: ddb
> > > > > KDB: current backend: ddb
> > > > > SMAP type=01 base=0000000000000000 len=0000000000099c00
> > > > > SMAP type=02 base=0000000000099c00 len=0000000000006400
> > > > > SMAP type=02 base=00000000000e0000 len=0000000000020000
> > > > > SMAP type=01 base=0000000000100000 len=000000007906b000
> > > > > SMAP type=02 base=000000007916b000 len=0000000000936000
> > > > > SMAP type=04 base=0000000079aa1000 len=0000000000509000
> > > > > SMAP type=02 base=0000000079faa000 len=0000000002056000
> > > > > SMAP type=01 base=0000000100000000 len=0000001f80000000
> > > > > SMAP type=02 base=000000007c000000 len=0000000014000000
> > > > > SMAP type=02 base=00000000fed1c000 len=0000000000029000
> > > > > SMAP type=02 base=00000000ff000000 len=0000000001000000
> > > > > 
> > > > > This is all. No more.
> > > > When you switch between variations of the NUMA enablement options, do
> > > > you just reboot the machine, or do you sometimes physically turn it off?
> > > 
> > > Just reboot and 'power reset' via the KVM client (memory is preserved, I mean).
> > > 
> > > > Try enabling memtest with the hw.memtest.tests=1 loader variable.
> > > > Does it change things?
> > > 
> > > System booted, dmesg is http://zxy.spb.ru/dmesg.numa
> > I now suspect the reverse situation could occur: the non-interleaved
> > option would cause the hang.
> 
> No, also booted, dmesg http://zxy.spb.ru/dmesg.numa-ninter
I mean, it could hang if memory testing is not enabled.

> 
> > My current guess is that the memory content is preserved but swizzled in
> > cache-line-sized chunks, so that the msgbuf header left over from the
> > previous boot looks correct while the real buffer content is shuffled.
> > 
> > Try the debugging patch below, which unconditionally disables import of
> > the previous buffer.  To test, you would need to boot, then frob the
> > options in the BIOS, reboot, frob them again, and so on.
> 
> Do you still need me to test the patch? If yes, with which BIOS options?
Yes, please test the patch.  I explained the procedure above.

> 
> > diff --git a/sys/kern/subr_msgbuf.c b/sys/kern/subr_msgbuf.c
> > index f275aef3b4f..d45ef502204 100644
> > --- a/sys/kern/subr_msgbuf.c
> > +++ b/sys/kern/subr_msgbuf.c
> > @@ -85,7 +85,7 @@ msgbuf_reinit(struct msgbuf *mbp, void *ptr, int size)
> >  {
> >  	u_int cksum;
> >  
> > -	if (mbp->msg_magic != MSG_MAGIC || mbp->msg_size != size) {
> > +	if (1 || mbp->msg_magic != MSG_MAGIC || mbp->msg_size != size) {
> >  		msgbuf_init(mbp, ptr, size);
> >  		return;
> >  	}
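
To illustrate the guess about a valid-looking header in front of shuffled
content, here is a minimal user-space sketch.  It is not FreeBSD code:
struct sketch_msgbuf, SKETCH_MAGIC, SKETCH_CHUNK, and both functions are
hypothetical names made up for this example, and the 64-byte chunk size is
an assumption.  It only shows that a check on magic and size alone cannot
notice that the data was reordered in cache-line-sized pieces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAGIC 0x12345678u /* arbitrary stand-in, not MSG_MAGIC */
#define SKETCH_CHUNK 64          /* assumed cache line size in bytes */

/* Simplified, hypothetical header; not the kernel's struct msgbuf. */
struct sketch_msgbuf {
	uint32_t magic;
	uint32_t size;
	char *data;
};

/* Header-only validation, analogous to the magic/size test in the patch. */
static int
sketch_header_ok(const struct sketch_msgbuf *mbp, uint32_t size)
{
	return (mbp->magic == SKETCH_MAGIC && mbp->size == size);
}

/* Swap each adjacent pair of chunks to imitate a changed interleave. */
static void
sketch_swizzle(char *buf, size_t len)
{
	char tmp[SKETCH_CHUNK];
	size_t off;

	/* The sketch only handles whole pairs of chunks. */
	assert(len % (2 * SKETCH_CHUNK) == 0);
	for (off = 0; off < len; off += 2 * SKETCH_CHUNK) {
		memcpy(tmp, buf + off, SKETCH_CHUNK);
		memcpy(buf + off, buf + off + SKETCH_CHUNK, SKETCH_CHUNK);
		memcpy(buf + off + SKETCH_CHUNK, tmp, SKETCH_CHUNK);
	}
}
```

Calling sketch_swizzle() on the data and then sketch_header_ok() on the
untouched header still reports success, although the data no longer matches
what was written.  The "1 ||" in the patch sidesteps exactly this kind of
trust in the header by always reinitializing the buffer.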
Received on Mon Dec 12 2016 - 17:36:53 UTC
