Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Sun, 25 Aug 2019 17:30:34 +0300

On Sun, Aug 25, 2019 at 07:17:20AM -0600, Rebecca Cran wrote:
> On 2019-08-25 00:24, Konstantin Belousov wrote:
> > What are the panic messages ?
> 
> Fatal trap 18: integer divide fault while in kernel mode
> 
> instruction pointer = 0x20:0xffffffff80f1027c
> 
> stack pointer = 0x28:0xffffffff845809f0
> 
> frame pointer = 0x28:0xffffffff84580a00
> 
> code segment = base 0x0, limit 0xffffff, type 0x1b
> 
>     = DPL 0, pres 1, long 1, def32 0, gran 1
> 
> processor eflags = resume, IOPL = 0
> 
> current process = 0 ()
> 
> trap number = 18
> 
> panic: integer divide fault
> 
> cpuid = 0
> 
> time = 1
> 
> 
> > What is the source line ?
> 
> (gdb) info line *0xffffffff80f1027c
> Line 102 of "/usr/src/sys/vm/vm_domainset.c" starts at address
> 0xffffffff80f10267 <vm_domainset_iter_first+151>
>    and ends at 0xffffffff80f1027f <vm_domainset_iter_first+175>.

There was one more source line I asked about.

So what happens, IMO, is that for memory-less domains ds_cnt is zero
because ds_mask is zero, which causes the exception on divide.  You
can try the following combined patch, but I really dislike the fact
that I cannot safely use DOMAINSET_FIXED (if my diagnosis is correct).
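
To illustrate the suspected failure mode, here is a minimal userland
sketch. It is not the code from vm_domainset.c; the struct and the
model_* names are hypothetical stand-ins. It only shows that a
round-robin pick of the form "iter % cnt" divides by zero when the mask
has no bits set, and what a guard against that would look like.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for a domain set.
 * This is NOT the kernel's struct domainset.
 */
struct model_domainset {
	unsigned mask;      /* bit i set => domain i is in the set */
	int      cnt;       /* number of bits set in mask */
	int      order[32]; /* member domain ids, lowest first */
};

/* Derive cnt and order[] from the mask; an empty mask gives cnt == 0. */
static void
model_domainset_init(struct model_domainset *ds, unsigned mask)
{
	ds->mask = mask;
	ds->cnt = 0;
	for (int i = 0; i < 32; i++)
		if (mask & (1u << i))
			ds->order[ds->cnt++] = i;
}

/*
 * Round-robin pick.  Without the cnt == 0 check, the modulo below
 * would be an integer divide by zero for an empty set.
 * Returns -1 when the set is empty.
 */
static int
model_domainset_pick(const struct model_domainset *ds, unsigned *iter)
{
	assert(ds != NULL && iter != NULL);
	if (ds->cnt == 0)
		return (-1);
	return (ds->order[(*iter)++ % (unsigned)ds->cnt]);
}
```

Whether returning an error for an empty set is the right behaviour for
the real iterator is a separate question; the sketch only makes the
arithmetic visible.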

I would prefer for kmem_malloc_domainset(DOMAINSET_FIXED(unpopulated domain))
to fail with a NULL result, and then I would manually fall back to
DOMAINSET_PREF().
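
The caller-side pattern I have in mind can be sketched as follows. This
is a userland model only: the model_* functions and the populated-domain
table are hypothetical and do not correspond to existing kernel
interfaces. The strict allocator returns NULL for a memory-less domain,
and the caller then retries with the relaxed policy.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_NDOMAINS 4

/* Hypothetical topology: domain 1 has no memory. */
static const int model_populated[MODEL_NDOMAINS] = { 1, 0, 1, 1 };

/* Strict policy: fail with NULL if the requested domain has no memory. */
static void *
model_alloc_fixed(int domain, size_t size, int *got)
{
	if (domain < 0 || domain >= MODEL_NDOMAINS ||
	    !model_populated[domain])
		return (NULL);
	*got = domain;
	return (calloc(1, size));
}

/* Relaxed policy: try the requested domain, then any populated one. */
static void *
model_alloc_pref(int domain, size_t size, int *got)
{
	void *p = model_alloc_fixed(domain, size, got);

	for (int d = 0; p == NULL && d < MODEL_NDOMAINS; d++)
		p = model_alloc_fixed(d, size, got);
	return (p);
}

/* Caller: ask for the exact domain first, fall back manually on NULL. */
static void *
model_alloc_stack(int domain, size_t size, int *got)
{
	void *p = model_alloc_fixed(domain, size, got);

	if (p == NULL)
		p = model_alloc_pref(domain, size, got);
	return (p);
}
```

With this shape the caller can tell that the strict request failed and
decide explicitly to relax it, instead of the allocator faulting.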

OTOH, I think the chunk for mp_realloc_pcpu() is the final fix.

diff --git a/sys/amd64/amd64/mp_machdep.c b/sys/amd64/amd64/mp_machdep.c
index b38c688f8b4..2c3dc8744f6 100644
--- a/sys/amd64/amd64/mp_machdep.c
+++ b/sys/amd64/amd64/mp_machdep.c
@@ -402,6 +402,8 @@ mp_realloc_pcpu(int cpuid, int domain)
 		return;
 	m = vm_page_alloc_domain(NULL, 0, domain,
 	    VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ);
+	if (m == NULL)
+		return;
 	na = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m));
 	pagecopy((void *)oa, (void *)na);
 	pmap_qenter((vm_offset_t)&__pcpu[cpuid], &m, 1);
@@ -481,10 +483,10 @@ native_start_all_aps(void)
 		    M_ZERO);
 		mce_stack = (char *)kmem_malloc(PAGE_SIZE, M_WAITOK | M_ZERO);
 		nmi_stack = (char *)kmem_malloc_domainset(
-		    DOMAINSET_FIXED(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
+		    DOMAINSET_PREF(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
 		dbg_stack = (char *)kmem_malloc_domainset(
-		    DOMAINSET_FIXED(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
-		dpcpu = (void *)kmem_malloc_domainset(DOMAINSET_FIXED(domain),
+		    DOMAINSET_PREF(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
+		dpcpu = (void *)kmem_malloc_domainset(DOMAINSET_PREF(domain),
 		    DPCPU_SIZE, M_WAITOK | M_ZERO);
 
 		bootSTK = (char *)bootstacks[cpu] +
Received on Sun Aug 25 2019 - 12:30:48 UTC
