On Sun, Jun 03, 2018 at 09:50:20PM +0200, Michael Gmelin wrote:
>
>
> On Sun, 3 Jun 2018 18:04:23 +0300
> Konstantin Belousov <kostikbel_at_gmail.com> wrote:
>
> > On Sun, Jun 03, 2018 at 04:55:00PM +0200, Michael Gmelin wrote:
> > >
> > >
> > > On Sun, 3 Jun 2018 16:21:10 +0300
> > > Konstantin Belousov <kostikbel_at_gmail.com> wrote:
> > >
> > > > On Sun, Jun 03, 2018 at 02:48:40PM +0200, Michael Gmelin wrote:
> > > > > Hi,
> > > > >
> > > > > After upgrading CURRENT to r333992 (from something at least a
> > > > > year old, quite some changes in mp_machdep.c since), this
> > > > > machine crashes on boot:
> > > > >
> > > > > Copyright (c) 1992-2018 The FreeBSD Project.
> > > > > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992,
> > > > > 1993, 1994 The Regents of the University of California. All
> > > > > rights reserved. FreeBSD is a registered trademark of The
> > > > > FreeBSD Foundation. FreeBSD 12.0-CURRENT #1 r333992: Tue May 22
> > > > > 00:31:04 CEST 2018
> > > > > root_at_flimsy:/usr/obj/usr/src/amd64.amd64/sys/flimsy amd64
> > > > > FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565)
> > > > > (based on LLVM 6.0.0) WARNING: WITNESS option enabled, expect
> > > > > reduced performance.
> > > > > VT(vga): resolution 640x480 CPU: Intel(R)
> > > > > Celeron(R) 2955U _at_ 1.40GHz (1396.80-MHz K8-class CPU)
> > > > > Origin="GenuineIntel" Id=0x40651 Family=0x6 Model=0x45
> > > > > Stepping=1
> > > > > Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,
> > > > > CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> > > > > Features2=0x4ddaebbf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,SDBG,CX16,
> > > > > xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,XSAVE,OSXSAVE,RDRAND>
> > > > > AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM> AMD
> > > > > Features2=0x21<LAHF,ABM> Structured Extended
> > > > > Features=0x2603<FSGSBASE,TSCADJ,ERMS,INVPCID,NFPUSG> XSAVE
> > > > > Features=0x1<XSAVEOPT> VT-x: (disabled in BIOS)
> > > > > PAT,HLT,MTF,PAUSE,EPT,UG,VPID TSC: P-state invariant,
> > > > > performance statistics real memory = 4301258752 (4102 MB)
> > > > > avail memory = 1907572736 (1819 MB) Event timer "LAPIC" quality
> > > > > 600 ACPI APIC Table: <CORE COREBOOT>
> > > > What does this mean ? Did you flashed coreboot ?
> > >
> > > This machine comes with it by default (my model was delivered with
> > > SeaBIOS 20131018_145217-build121-m2). So I didn't flash anything
> > > (didn't feel like bricking it).
> > >
> > > >
> > > > > kernel trap 12 with interrupts disabled
> > > > >
> > > > > Fatal trap 12: page fault while in kernel mode
> > > > > cpuid = 0; apic id = 00
> > > > > fault virtual address = 0xfffff80001000000
> > > > > fault code = supervisor write data, protection violation
> > > > > instruction pointer = 0x20:0xffffffff8102955f
> > > > > stack pointer = 0x28:0xffffffff82a79be0
> > > > > frame pointer = 0x28:0xffffffff82a79c10
> > > > > code segment = base 0x0, limit 0xfffff, type 0x1b
> > > > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > > > processor eflags = resume, IOPL = 0
> > > > > current process = 0 ()
> > > > > [ thread pid 0 tid 0 ]
> > > > > Stopped at native_start_all_aps+0x08f: movq %rax,(%rsi)
> > > > Look up the source line number for this address.
> > > >
> > >
> > > I guess that's sys/amd64/amd64/support.S line 854 (in rdmsr),
> > > called by native_start_all_aps. Any additional hints how I can
> > > track it down?
> > Why did you decided that this is rdmsr_safe() ? First,
> > native_start_all_aps() does not call rdmsr, second the ddb
> > report clearly indicates that the fault occurred accessing DMAP in
> > native_start_all_aps().
> >
> > Just look up the source line by the address
> > native_start_all_aps+0x08f.
>
> Okay, according to kgdb this should be here:
>
> https://svnweb.freebsd.org/base/head/sys/amd64/amd64/mp_machdep.c?revision=333368&view=markup#l369
>
> 364
> 365     /* Create the initial 1GB replicated page tables */
> 366     for (i = 0; i < 512; i++) {
> 367         /* Each slot of the level 4 pages points to the same level 3 page */
> 368         pt4[i] = (u_int64_t)(uintptr_t)(mptramp_pagetables + PAGE_SIZE);
> 369         pt4[i] |= PG_V | PG_RW | PG_U;
> 370
> 371         /* Each slot of the level 3 pages points to the same level 2 page */
> 372         pt3[i] = (u_int64_t)(uintptr_t)(mptramp_pagetables + (2 * PAGE_SIZE));
> 373         pt3[i] |= PG_V | PG_RW | PG_U;
> 374
> 375         /* The level 2 page slots are mapped with 2MB pages for 1GB. */
> 376         pt2[i] = i * (2 * 1024 * 1024);
> 377         pt2[i] |= PG_V | PG_RW | PG_PS | PG_U;
> 378     }
>
> -m

You have a fault on write due to the read-only mapping of the portion of
the direct map that maps the kernel text. This is consistent with the
faulting address. It is not clear whether this is something new on your
machine, or whether the kernel text was being silently corrupted before,
since the read-only protection is fairly recent.

It seems that mp_bootaddress() selected a bad place for the bootstrap
page tables. Moreover, we do not include the kernel text in the
physmem[] array, so it is not clear how this happened. This code was
also changed recently.

Can you add a print of the physmap[] array somewhere before the panic,
to see what the kernel's idea of the available memory is? This should
already be printed if you have a serial console and set the
debug.late_console tunable to 0.

>
> p.s. This machine uses quirks in biosmem.c, see
>
> Type '?' for a list of command, 'help' for more detailed
> help.
> OK biosmem
> bios_basemem: 0x9e400
> bios_extmem: 0x3ff00000
> memtop: 0x3c000000
> high_heap_base: 0x3c000000
> high_heap_size: 0x4000000
> bios_quirks: 0x01 BQ_DISTRUST_820_EXTMEM
> b_bios_probed: 0x0a B_BASEMEM_12 B_EXTMEM_E801
>
> --
> Michael Gmelin

Received on Sun Jun 03 2018 - 18:53:52 UTC
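
The remark that the faulting address is consistent with a write through
the direct map can be checked with a little arithmetic. The sketch below
is only an illustration, not code from the kernel tree. It assumes the
amd64 direct map starts at 0xfffff80000000000 (check DMAP_MIN_ADDRESS in
your own tree), and the names DMAP_BASE, dmap_to_phys() and
range_contains() are hypothetical helpers made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: base of the amd64 direct map (DMAP). Not stated in the
 * thread; verify against DMAP_MIN_ADDRESS for your kernel.
 */
#define DMAP_BASE 0xfffff80000000000ULL

/* Hypothetical helper: translate a DMAP virtual address to physical. */
static inline uint64_t
dmap_to_phys(uint64_t va)
{
	/* Only addresses at or above the DMAP base are meaningful here. */
	assert(va >= DMAP_BASE);
	return (va - DMAP_BASE);
}

/* Hypothetical helper: is pa inside the half-open range [start, end)? */
static inline bool
range_contains(uint64_t start, uint64_t end, uint64_t pa)
{
	return (pa >= start && pa < end);
}
```

Under that assumption, dmap_to_phys(0xfffff80001000000) yields 0x1000000,
i.e. the write went to physical address 16 MB. Passing that value to
range_contains() together with the physical range the kernel text
occupies on the affected machine (not given in this thread) would show
whether the bootstrap page tables were placed on top of the kernel text.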