Re: Xen HVM Panic, HEAD

From: Sean Bruno <sbruno_at_ignoranthack.me>
Date: Wed, 18 Feb 2015 21:03:03 -0800
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 02/17/15 12:30, Sean Bruno wrote:
> On 02/17/15 12:26, Konstantin Belousov wrote:
>> On Tue, Feb 17, 2015 at 12:00:04PM -0800, Sean Bruno wrote:
>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512
>>> 
>>> On 02/17/15 00:56, Konstantin Belousov wrote:
>>>> On Mon, Feb 16, 2015 at 08:10:06PM -0800, Sean Bruno wrote:
>>>>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512
>>>>> 
>>>>> https://people.freebsd.org/~sbruno/Xen_APIC_panic.png
>>>>> 
>>>>> I suspect that there may be one or two more lines above
>>>>> this that are relevant to this panic, but XENHVM kernel's
>>>>> now panic booting on Xen server.  The working kernel output
>>>>> looks like this:
>>>>> 
>>>>> FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 
>>>>> 208032) 20140512 XEN: Hypervisor version 4.2 detected.
>>>>> CPU: Intel(R) Xeon(R) CPU           E5620  _at_ 2.40GHz
>>>>> (2400.05-MHz K8-class CPU) Origin="GenuineIntel"
>>>>> Id=0x206c2  Family=0x6 Model=0x2c Stepping=2 
>>>>> Features=0x1783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2,HTT>
>>>>>
>>>>>
>>>
>>>>>
>
>>>>> 
Features2=0x81ba2201<SSE3,SSSE3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,HV>
>>>>> AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM> AMD 
>>>>> Features2=0x1<LAHF> Hypervisor: Origin = "XenVMMXenVMM"
>>>>> real memory  = 1434451968 (1368 MB) avail memory =
>>>>> 1353293824 (1290 MB) Event timer "LAPIC" quality 400 ACPI
>>>>> APIC Table: <Xen HVM> FreeBSD/SMP: Multiprocessor System
>>>>> Detected: 2 CPUs FreeBSD/SMP: 1 package(s) x 2 core(s) cpu0
>>>>> (BSP): APIC ID:  0 cpu1 (AP): APIC ID:  2 ioapic0: Changing
>>>>> APIC ID to 1 MADT: Forcing active-low polarity and level
>>>>> trigger for SCI
>>>> I am not sure why your machine uses native lapic instead of 
>>>> xen lapic, and should it be other way, or not.
>>>> 
>>>> Regardless, show the line number for the ipi_startup+0x56.
>>>> Did you performed clean kernel build ?
>>>> 
>>>> 
>>> 
>>> I have rebuilt a kernel/world based on head at svn r276627.  I 
>>> have delete /usr/obj completely and started from scratch.
>>> 
>>> Updated kernelpanic image at 
>>> https://people.freebsd.org/~sbruno/Xen_APIC_panic.png
>>> 
>>> /usr/src/sys/x86/include # kgdb /boot/kernel/kernel GNU gdb
>>> 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc.
>>> GDB is free software, covered by the GNU General Public
>>> License, and you are welcome to change it and/or distribute
>>> copies of it under certain conditions. Type "show copying" to
>>> see the conditions. There is absolutely no warranty for GDB.
>>> Type "show warranty" for details. This GDB was configured as 
>>> "amd64-marcel-freebsd"... (kgdb) list *(ipi_startup+0x56) 
>>> 0xffffffff80e088c6 is in ipi_startup (apicvar.h:383). 378 379 
>>> static inline int 380	lapic_ipi_wait(int delay) 381	{ 382 383 
>>> return (apic_ops.ipi_wait(delay)); 384	} 385 386	static inline 
>>> int 387	lapic_set_lvt_mask(u_int apic_id, u_int lvt, u_char 
>>> masked)
>>> 
> 
>> Please disassemble your ipi_startup, also please do 'p
>> *apic_ops'.
> 
> 
> 
> 
> (kgdb) disassemble ipi_startup
> 
> 
> 
> Dump of assembler code for function ipi_startup: 0xffffffff80df3900
> <ipi_startup+0>:     push   %rbp 0xffffffff80df3901
> <ipi_startup+1>:     mov    %rsp,%rbp 0xffffffff80df3904
> <ipi_startup+4>:     push   %r14 0xffffffff80df3906
> <ipi_startup+6>:     push   %rbx 0xffffffff80df3907
> <ipi_startup+7>:     mov    %esi,%ebx 0xffffffff80df3909
> <ipi_startup+9>:     mov    %edi,%r14d 0xffffffff80df390c
> <ipi_startup+12>:    mov    $0xc500,%edi 0xffffffff80df3911
> <ipi_startup+17>:    mov    %r14d,%esi 0xffffffff80df3914
> <ipi_startup+20>:    callq  *0xffffffff815ac428 0xffffffff80df391b
> <ipi_startup+27>:    mov    $0x14,%edi 0xffffffff80df3920
> <ipi_startup+32>:    callq  *0xffffffff815ac438 0xffffffff80df3927
> <ipi_startup+39>:    mov    $0x8500,%edi 0xffffffff80df392c
> <ipi_startup+44>:    mov    %r14d,%esi 0xffffffff80df392f
> <ipi_startup+47>:    callq  *0xffffffff815ac428 0xffffffff80df3936
> <ipi_startup+54>:    mov    $0x2710,%edi 0xffffffff80df393b
> <ipi_startup+59>:    callq  0xffffffff80f39c10 <DELAY> 
> 0xffffffff80df3940 <ipi_startup+64>:    or     $0x4600,%ebx 
> 0xffffffff80df3946 <ipi_startup+70>:    movslq %ebx,%rbx 
> 0xffffffff80df3949 <ipi_startup+73>:    mov    %rbx,%rdi 
> 0xffffffff80df394c <ipi_startup+76>:    mov    %r14d,%esi 
> 0xffffffff80df394f <ipi_startup+79>:    callq  *0xffffffff815ac428 
> 0xffffffff80df3956 <ipi_startup+86>:    mov    $0x14,%edi 
> 0xffffffff80df395b <ipi_startup+91>:    callq  *0xffffffff815ac438 
> 0xffffffff80df3962 <ipi_startup+98>:    test   %eax,%eax 
> 0xffffffff80df3964 <ipi_startup+100>:   je     0xffffffff80df399b 
> <ipi_startup+155> 0xffffffff80df3966 <ipi_startup+102>:   mov
> $0xc8,%edi 0xffffffff80df396b <ipi_startup+107>:   callq
> 0xffffffff80f39c10 <DELAY> 0xffffffff80df3970 <ipi_startup+112>:
> mov    %rbx,%rdi 0xffffffff80df3973 <ipi_startup+115>:   mov
> %r14d,%esi 0xffffffff80df3976 <ipi_startup+118>:   callq
> *0xffffffff815ac428 0xffffffff80df397d <ipi_startup+125>:   mov
> $0x14,%edi 0xffffffff80df3982 <ipi_startup+130>:   callq
> *0xffffffff815ac438 0xffffffff80df3989 <ipi_startup+137>:   test
> %eax,%eax 0xffffffff80df398b <ipi_startup+139>:   je
> 0xffffffff80df39a4 <ipi_startup+164> 0xffffffff80df398d
> <ipi_startup+141>:   mov    $0xc8,%edi 0xffffffff80df3992
> <ipi_startup+146>:   pop    %rbx 0xffffffff80df3993
> <ipi_startup+147>:   pop    %r14 0xffffffff80df3995
> <ipi_startup+149>:   pop    %rbp 0xffffffff80df3996
> <ipi_startup+150>:   jmpq   0xffffffff80f39c10 <DELAY> 
> 0xffffffff80df399b <ipi_startup+155>:   mov
> $0xffffffff810cb5c4,%rdi 0xffffffff80df39a2 <ipi_startup+162>:
> jmp    0xffffffff80df39ab <ipi_startup+171> 0xffffffff80df39a4
> <ipi_startup+164>:   mov    $0xffffffff810cb5f3,%rdi 
> 0xffffffff80df39ab <ipi_startup+171>:   xor    %eax,%eax 
> 0xffffffff80df39ad <ipi_startup+173>:   mov    %r14d,%esi 
> 0xffffffff80df39b0 <ipi_startup+176>:   callq  0xffffffff809b40d0
> <panic> End of assembler dump.
> 
> 
> (kgdb) p apic_ops $2 = {create = 0xffffffff80f32e10
> <native_lapic_create>, init = 0xffffffff80f32f70
> <native_lapic_init>, xapic_mode = 0xffffffff80f331b0
> <native_lapic_xapic_mode>, setup = 0xffffffff80f331e0
> <native_lapic_setup>, dump = 0xffffffff80f336c0
> <native_lapic_dump>, disable = 0xffffffff80f339a0
> <native_lapic_disable>, eoi = 0xffffffff80f33a00 
> <native_lapic_eoi>, id = 0xffffffff80f33a40 <native_lapic_id>, 
> intr_pending = 0xffffffff80f33a80 <native_lapic_intr_pending>, 
> set_logical_id = 0xffffffff80f33ad0 <native_lapic_set_logical_id>, 
> cpuid = 0xffffffff80f33b10 <native_apic_cpuid>, alloc_vector =
> 0xffffffff80f33b20 <native_apic_alloc_vector>, alloc_vectors =
> 0xffffffff80f33c90 <native_apic_alloc_vectors>, enable_vector =
> 0xffffffff80f33ec0 <native_apic_enable_vector>, disable_vector =
> 0xffffffff80f33ef0 <native_apic_disable_vector>, free_vector =
> 0xffffffff80f33f00 <native_apic_free_vector>, enable_pmc =
> 0xffffffff80f34110 <native_lapic_enable_pmc>, disable_pmc =
> 0xffffffff80f34200 <native_lapic_disable_pmc>, reenable_pmc =
> 0xffffffff80f34260 <native_lapic_reenable_pmc>, enable_cmc =
> 0xffffffff80f342c0 <native_lapic_enable_cmc>, ipi_raw =
> 0xffffffff80f34310 <native_lapic_ipi_raw>, ipi_vectored = 
> 0xffffffff80f343d0 <native_lapic_ipi_vectored>, ipi_wait = 
> 0xffffffff80f34470 <native_lapic_ipi_wait>, set_lvt_mask = 
> 0xffffffff80f34550 <native_lapic_set_lvt_mask>, set_lvt_mode =
> 0xffffffff80f34650 <native_lapic_set_lvt_mode>, set_lvt_polarity =
> 0xffffffff80f347e0 <native_lapic_set_lvt_polarity>, 
> set_lvt_triggermode = 0xffffffff80f348f0 
> <native_lapic_set_lvt_triggermode>}
> 
> 
> 
> 

This panic starts at:
https://svnweb.freebsd.org/base?view=revision&revision=278473

If I use 278472, I can boot the Xen VM normally.

If I use head and set hw.x2apic_enable="0" in loader.conf and boot
head (278970), it boots normally.

Second issue:

However, the UFS disk access is SO slow on this that it took 2+hours
to do an installworld, which I couldn't abort because it had already
started.  I'm not sure if the UFS disk access is related at all or
not.  Once things are read from disk into memory they are fast and
responsive (e.g. sshd, tmux, shells, etc).

sean

bcc royger
Received on Thu Feb 19 2015 - 04:03:09 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:55 UTC