Re: ppp triggers GPF panic

From: Lawrence Stewart <lstewart_at_freebsd.org> Date: Sun, 12 Jul 2009 11:29:37 +0100

Stefan Bethke wrote:
> On 12.07.2009 at 09:16, Lawrence Stewart wrote:
> 
>> Stefan Bethke wrote:
>>> On 11.07.2009 at 20:44, Lawrence Stewart wrote:
>>>> Stefan Bethke wrote:
>>>>> Yesterday's -current, amd64, C2D, 4 GB RAM. Full dmesg below.
>>>>> Fatal trap 9: general protection fault while in kernel mode
>>>>> cpuid = 0; apic id = 00
>>>>> instruction pointer    = 0x20:0xffffffff802fc2ce
>>>>> stack pointer            = 0x28:0xffffff8000037b10
>>>>> frame pointer            = 0x28:0xffffff8000037b30
>>>>> code segment        = base 0x0, limit 0xfffff, type 0x1b
>>>>>           = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>> processor eflags    = interrupt enabled, resume, IOPL = 0
>>>>> current process        = 12 (swi1: netisr 0)
>>>>> [thread pid 12 tid 100007 ]
>>>>> Stopped at      _mtx_lock_sleep+0x4e:   movl    0x288(%rcx),%esi
>>>>> Didn't capture anything else there.  This happened when my ADSL 
>>>>> link was forced down (24h connection reset).
>>>>> After fixing the file system (UFS2 + softupdates on /), I got 
>>>>> another "panic: spin lock held too long" on rebooting.
>>>>> Then, the GPF panic happened again as ppp was trying to establish 
>>>>> the connection:
>>>>
>>>> 1. Do you have a crash dump?
>>> Unfortunately not.
>>>> 2. Can you try to find a sequence of events to deterministically 
>>>> reproduce this?
>>> Not if I can help it, this is my main gateway at home.  Sorry.  But 
>>> I'll try to collect as much info as possible if and when it happens again.
>>
>> You can set debug.debugger_on_panic=0 in /etc/sysctl.conf which will 
>> make the system automatically dump core and reset instead of sitting 
>> at the ddb prompt. Alternatively, run "call doadump" from the ddb 
>> prompt followed by "reset" and that should also get you a usable core 
>> file. I'd suggest the first option for you though given you don't like 
>> the machine being down. Let us know if/when it happens again, but 
>> without a core file there's not much we can help with.
> 
> Happened again when ppp tried to reestablish the connection. 
> Unfortunately, the dump wasn't good enough for savecore:
> 
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 1; apic id = 01
> instruction pointer    = 0x20:0xffffffff802fc2ce
> stack pointer            = 0x28:0xffffff807512c540
> frame pointer            = 0x28:0xffffff807512c560
> code segment        = base 0x0, limit 0xfffff, type 0x1b
>             = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags    = interrupt enabled, resume, IOPL = 0
> current process        = 9451 (ifconfig)
> [thread pid 9451 tid 100126 ]
> Stopped at      _mtx_lock_sleep+0x4e:   movl    0x288(%rcx),%esi
> db> bt
> Tracing pid 9451 tid 100126 td 0xffffff0002771390
> _mtx_lock_sleep() at _mtx_lock_sleep+0x4e
> _mtx_lock_flags() at _mtx_lock_flags+0x43
> netisr_queue_internal() at netisr_queue_internal+0x4f
> netisr_queue_src() at netisr_queue_src+0x3c
> rt_newaddrmsg() at rt_newaddrmsg+0x1d1
> rtinit() at rtinit+0x3c0
> in_ifinit() at in_ifinit+0x2f0
> in_control() at in_control+0xf12
> ifioctl() at ifioctl+0xfc1
> kern_ioctl() at kern_ioctl+0xf6
> ioctl() at ioctl+0xfd
> syscall() at syscall+0x19e
> Xfast_syscall() at Xfast_syscall+0xe1
> --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800b7df5c, rsp = 
> 0x7fffffffe058, rbp = 0x7fffffffec2d ---
> db> call doadump
> Physical memory: 3983 MB
> Dumping 2351 MB: 2336 2320 2304 2288 2272 2256 2240 2224 2208 2192 2176 
> 2160 2144 2128 2112 2096 2080 2064 2048 2032 2016 2000 1984 1968 1952 
> 1936 1920 1904 1888 1872 1856 1840 1824 1808 1792 1776 1760 1744 1728 
> 1712 1696 1680 1664 1648 1632 1616 1600 1584 1568 1552 1536 1520 1504 
> 1488 1472 1456 1440 1424 1408 1392 1376 1360 1344 1328 1312 1296 1280 
> 1264 1248 1232 1216 1200 1184 1168 1152 1136 1120 1104 1088 1072 1056 
> 1040 1024 1008 992 976 960 944 928 912 896 880 864 848 832 816 800 784 
> 768 752 736 720 704 688 672 656 640 624 608 592 576 560 544 528 512 496 
> 480 464 448 432 416 400 384 368 352 336 320 304 288 272 256 240 224 208 
> 192 176 160 144 128 112 96 80 64 48 32 16
> Dump complete
> = 0
> db> reset
> /boot.config: -DhS38400
> Consoles: internal video/keyboard  serial port
> BIOS drive A: is disk0
> ...
> 
> savecore: first and last dump headers disagree on /dev/mirror/diesel_swap
> savecore: unsaved dumps found but not saved
> savecore: first and last dump headers disagree on /dev/mirror/diesel_swap
> savecore: unsaved dumps found but not saved
> No crash dumps in /var/crash.
> 
> 
> I'll reconfigure swap to use a raw disk instead of a mirror.

Yeah, dump not working with mirrored disks is a huge PITA. Please make 
the change so we can get a usable crash dump.
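
Something along these lines in /etc/rc.conf should do it. Treat it as a 
rough sketch: the device name below is only a placeholder, since I don't 
know your disk layout, so substitute whatever raw swap partition you end 
up with. dumpdir is set to the /var/crash directory that savecore was 
already checking in your output.

```shell
# /etc/rc.conf fragment (sketch; the device name is a placeholder)
# Point the kernel dump at a raw swap partition rather than the gmirror
# provider (/dev/mirror/diesel_swap in the savecore output above).
dumpdev="/dev/ad4s1b"   # hypothetical raw swap partition, adjust to your disks
dumpdir="/var/crash"    # where savecore(8) writes the extracted core at boot
```

Running dumpon(8) by hand with the same device should apply the change 
without a reboot. The debug.debugger_on_panic=0 line still belongs in 
/etc/sysctl.conf as described earlier.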

Kamigishi has suggested to me that the panic isn't occurring (as much?) 
with an r195617 world/kernel. Could you perhaps try updating to r195617 and 
let us know if you continue to observe the panic?

Cheers,
Lawrence