Re: panic in arptimer in r289937

From: Randall Stewart <rrs_at_netflix.com>
Date: Thu, 10 Dec 2015 07:25:56 -0800
For callout_stop/drain you get

-1 == Callout as already gone off or is not running (usually the latter) else the caller iks
          not using locking properly or did not lock and check the active() value (which would have returned not active so no stop
          would have been needed);
0 == we could not stop the callout, it was in process..
1 == it was stopped successfully 


For callout_reset() you get

0 == it was started
1 == it was restarted.

There is no -1 since it is either started or restarted never left in a not-running state...

R
On Dec 4, 2015, at 11:52 AM, Hans Petter Selasky <hps_at_selasky.org> wrote:

> On 12/04/15 20:34, Hans Petter Selasky wrote:
>> Hi Adrian,
>> 
>> On 10/31/15 16:01, Alexander V. Chernikov wrote:
>>> 
>>> 
>>> 31.10.2015, 16:46, "Adrian Chadd" <adrian_at_freebsd.org>:
>>>> On 31 October 2015 at 09:34, Alexander V. Chernikov
>>>> <melifaro_at_freebsd.org> wrote:
>>>>>  31.10.2015, 05:32, "Adrian Chadd" <adrian_at_freebsd.org>:
>>>>>>  Hiya,
>>>>>> 
>>>>>>  Here's a panic from arptimer:
>>>>>  Hi Adrian,
>>>>> 
>>>>>  As far as I see, line 205 in if_ether.c is IF_AFDATA_LOCK(ifp)
>>>>> which happens after LLE_WUNLOCK().
>>>>>  So, it looks like (pre-cached) ifp had been freed before locking
>>>>> ifdata.
>>>>>  Do you have any more details on that? (e.g. was some interface
>>>>> detached at that moment, is it reproducible, etc..)
>>>>> 
>>>>>  From a quick glance, potential use-after-free has been possible
>>>>> for quite a long time, but I wonder why it hasn't been observed before.
>>>>>  Probably lltable_free() changes might have triggered that.
>>>>> 
>>>>>  I'll take a deeper look on that and reply.
>>>> 
>> 
>> Observed on an idle box with projects/hps_head too:
>> 
>>> panic: bogus refcnt 0 on lle 0xfffff8016508ca00
>>> cpuid = 7
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>>> 0xfffffe03e4e8c7e0
>>> vpanic() at vpanic+0x182/frame 0xfffffe03e4e8c860
>>> kassert_panic() at kassert_panic+0x126/frame 0xfffffe03e4e8c8d0
>>> llentry_free() at llentry_free+0x136/frame 0xfffffe03e4e8c900
>>> arptimer() at arptimer+0x20e/frame 0xfffffe03e4e8c950
>>> softclock_call_cc() at softclock_call_cc+0x170/frame 0xfffffe03e4e8c9c0
>>> softclock() at softclock+0x47/frame 0xfffffe03e4e8c9e0
>>> intr_event_execute_handlers() at
>>> intr_event_execute_handlers+0x96/frame 0xfffffe03e4e8ca20
>>> ithread_loop() at ithread_loop+0xa6/frame 0xfffffe03e4e8ca70
>>> fork_exit() at fork_exit+0x84/frame 0xfffffe03e4e8cab0
>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe03e4e8cab0
>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>> 
>> Looks like callout_reset() must be examined too, and was missed by:
>> 
>> https://svnweb.freebsd.org/changeset/base/290805
>> 
>> Can you try the attached patch?
>> 
>> Randall: Can you fix this ASAP?
>> 
>> --HPS
>> 
> 
> Hi,
> 
> Randall:
> 
> I see for 11-current, callout_reset() doesn't return -1 when the callout is stopped like with callout_stop(). Is this a bug or a feature? Why can't the callout_reset() and callout_stop() functions use the same return values?
> 
> In nd6_llinfo_settimer_locked() the return value of both callout_reset() and callout_stop() is checked for positive values, but not in the other places mentioned by my patch.
> 
> --HPS

--------
Randall Stewart
rrs_at_netflix.com
803-317-4952
Received on Thu Dec 10 2015 - 14:26:00 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:01 UTC