Re: Panic on kldload/kldunload in/near callout

From: Alexander V. Chernikov <melifaro_at_freebsd.org>
Date: Sat, 12 Sep 2015 15:32:30 +0300

12.09.2015, 02:22, "hiren panchasara" <hiren_at_strugglingcoder.info>:
> On 09/11/15 at 09:06P, Hans Petter Selasky wrote:
>> On 09/10/15 21:23, hiren panchasara wrote:
>> > I am on 11.0-CURRENT FreeBSD 11.0-CURRENT #4 r286760M: Thu Sep 10
>> > 08:15:43 MST 2015
>> >
>> > I get random (1 out of 10 tries) panics when I do:
>> > # kldunload dummynet ; kldunload ipfw ;kldload ipfw ; kldload dummynet
>> >
>> > I used to get panics on a couple months old -head also.
>> >
>> > kernel trap 12 with interrupts disabled
>> >
>> > Fatal trap 12: page fault while in kernel mode
>> > cpuid = 0; apic id = 00
>> > fault virtual address = 0xffffffff8225cf58
>> > fault code = supervisor read data, page not present
>> > instruction pointer = 0x20:0xffffffff80aad500
>> > stack pointer = 0x28:0xfffffe1f9d588700
>> > frame pointer = 0x28:0xfffffe1f9d588790
>> > code segment = base 0x0, limit 0xfffff, type 0x1b
>> > = DPL 0, pres 1, long 1, def32 0, gran 1
>> >
>> > Following https://www.freebsd.org/doc/faq/advanced.html, I did:
>> > # nm -n /boot/kernel/kernel | grep ffffffff80aad500
>> > # nm -n /boot/kernel/kernel | grep ffffffff80aad50
>> > # nm -n /boot/kernel/kernel | grep ffffffff80aad5
>> > # nm -n /boot/kernel/kernel | grep ffffffff80aad
>> > ffffffff80aad030 t itimers_event_hook_exec
>> > ffffffff80aad040 t realtimer_expire
>> > ffffffff80aad360 T callout_process
>> > ffffffff80aad6b0 t softclock_call_cc
>> > ffffffff80aadc10 T softclock
>> > ffffffff80aadd20 T timeout
>> > ffffffff80aade90 T callout_reset_sbt_on
>> >
>> > So I guess " ffffffff80aad360 T callout_process" is the closest match?
>> >
>> > I'll try to get real dump to get more information but that may take a
>> > while.
>> >
>> > ccing jch and hans who've been playing in this area.
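
On the nm lookup quoted above: the instruction pointer rarely equals a symbol address, so the function of interest is the last symbol at or below it. Grepping for ever shorter prefixes gets there by hand; the same search can be sketched as a small filter. This is only a sketch, assuming `nm -n` output sorted by address with fixed-width lowercase hex addresses (as in the listing above). `closest_symbol` is a hypothetical helper name, not an existing tool.

```shell
# closest_symbol ADDR: read `nm -n` style lines on stdin and print the
# last symbol whose address is <= ADDR.
# Assumes ADDR is given as 16 lowercase hex digits without a 0x prefix,
# so a plain string comparison orders addresses correctly.
closest_symbol() {
    awk -v addr="$1" '
        # Only consider "address type name" lines; force string comparison.
        NF == 3 && ($1 "") <= (addr "") { best = $0 }
        # Input is sorted, so stop at the first symbol past the address.
        NF == 3 && ($1 "") >  (addr "") { exit }
        END { if (best != "") print best }
    '
}

# Demo on three lines from the listing in this thread.
closest_symbol ffffffff80aad500 <<'EOF'
ffffffff80aad040 t realtimer_expire
ffffffff80aad360 T callout_process
ffffffff80aad6b0 t softclock_call_cc
EOF
# prints: ffffffff80aad360 T callout_process
```

On a live system the input would presumably come from `nm -n /boot/kernel/kernel | closest_symbol ffffffff80aad500`. For the quoted listing this agrees with the guess of callout_process. It only names the function containing the address, so the full backtrace is still needed.
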
>>
>> Hi,
>>
>> Possibly it means some timer was not drained before the module was
>> unloaded. It is not enough to only stop timers before freeing its
>> memory. Or maybe a timer was restarted after drain.
>>
>> Can you get the full backtrace and put debugging symbols into the kernel?
>
> I'll try to get it. Meanwhile I am getting another panic on idle box:
> http://pastebin.com/9qJTFMik
The easiest explanation could be a missing check of the lla_create() result, fixed in r286945.
This panic is triggered by a fast interface down-up (or just up), when an ARP packet is received but there is no (matching) IPv4 prefix on the interface.
If this is not the case (e.g. it panicked without any interface changes and there were no other subnets in the given L2 segment), I'd be happy to debug this further.
>
> This "looks" similar to
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=156026 which got fixed
> via https://svnweb.freebsd.org/base?view=revision&revision=r214675
> "Don't leak the LLE lock if the arptimer callout is pending or
> inactive."
>
> Is what I am seeing similar to this?
>
> I'll try and get more info.
>
> Cheers,
> Hiren
Received on Sat Sep 12 2015 - 10:32:37 UTC