Re: reproducible panic in netisr

From: Julian Elischer <julian_at_elischer.org>
Date: Mon, 10 Aug 2009 10:52:27 -0700
Bjoern A. Zeeb wrote:

> 
> Looking further at the problem you may have already noticed that
> the section for the 'master copy' starts at 0xffffffff8168f900
> and that the __start_set_pcpu is outside of that section at
> 0xffffffff8168f8e9.
> Looking at a section dump from `readelf -S kernel` you would
> notice that the __start_set_pcpu directly follows the end of the
> previous section.
> 
> The reasons for this are twofold:
> (1) ld places the __start_ symbol at '.' (the location counter),
> which at that point is at the end of the old section as the new
> (set_pcpu) is not yet added with __start_set_pcpu = ALIGN(0).
> (2) because we manually define the section, so that it is
> writeable, ld at the point of writing the __start_ symbol does
> not yet have any possible section alignment information. That
> is the reason for the ALIGN(0) in (1).
> 
> An expected behaviour would be for ld to put the *ABS* at the
> address where the section begins, once known or fixup the address.
> This could arguably be a bug in ld we should fix post-8.0.


This is getting closer to the root cause, and thus closer to
the "root fix".


> 
> One possible workaround would be to force the __start_ symbol
> and the section to be equally aligned and thus on the same address
> using linker scripts.  The drawbacks are that we need to touch
> the fragile linker scripts for each of the sections we add and
> for all architectures individually.  As the enforcement of alignment
> would be at a different place to the actual set creation, putting
> the alignment in might be easily forgotten.

personally I'd see if there is a way to align the section on a page 
boundary..

> The advantage would be that we can always be sure that __start_
> would be on the same address where the section starts.

that sounds like the preferable answer to me.

> 
> Another solution is to put minimum alignment on the objects inside the
> section in a way that it is only in a single place in the source code.
> The section directive in the respective header file, that will
> be included by each implementation file, is the ideal place for this.
> While cache line alignment seems to be the widest alignment
> restriction currently in use, one drawback, like with above ldscript
> solution, is that a symbol could possibly enforce a wider alignment
> restriction onto the section making the __start_ symbol and the
> section beginning to diverge again. Example:
>    0xffffffff8168f700  __start_set_pcpu
>    0xffffffff8168f800  set_pcpu
>    0xffffffff8168f800  pcpu_entry_sched_switch_stats
>    ..
> if we would put an alignment of 1024 on pcpu_entry_sched_switch_stats.
> This is unlikely to happen.
> 
> With the minimum alignment, ld, at the time of placing the
> __start_ symbol, already knows about the section alignment
> and will place it correctly on the section beginning doing:
> __start_set_pcpu = ALIGN(CACHE_LINE_SHIFT) at ".".
> 
> 
> Summary: The minimum alignment seems to be the least-intrusive
> solution and is believed to work for the moment. In addition
> documenting that the dpcpu and similar sections will not support
> super-cache line alignment.
> The long term solution would be to fix ld to DTRT.
> 
> 
> ------------------------------------------------------------------------
> !
> ! Put minimum alignment on the dpcpu and vnet section so that ld
> ! when adding the __start_ symbol knows the expected section alignment
> ! and can place the __start_ symbol correctly.
> !
> ! These sections will not support symbols with super-cache line alignment
> ! requirements.
> !
> ! See posting <Msg-ID>, 2009-08-10, to freebsd-current for full
> ! details.
> !
> ! Debugging and testing patches by:
> !    Kamigishi Rei (spambox haruhiism.net), np, lstewart, jhb,
> !    kib, rwatson
> ! Tested by:    Kamigishi Rei, lstewart
> ! Reviewed by:    kib
> ! Approved by: ! Index: sys/sys/pcpu.h
> ===================================================================
> --- sys/sys/pcpu.h    (revision 196086)
> +++ sys/sys/pcpu.h    (working copy)
> _at__at_ -56,12 +56,14 _at__at_
>  extern uintptr_t *__start_set_pcpu;
>  extern uintptr_t *__stop_set_pcpu;
> 
> +__asm__(
>  #if defined(__arm__)
> -__asm__(".section set_pcpu, \"aw\", %progbits");
> +    ".section set_pcpu, \"aw\", %progbits\n"
>  #else
> -__asm__(".section set_pcpu, \"aw\", _at_progbits");
> +    ".section set_pcpu, \"aw\", _at_progbits\n"
>  #endif
> -__asm__(".previous");
> +    "\t.p2align " __XSTRING(CACHE_LINE_SHIFT) "\n"
> +    "\t.previous");
> 
>  /*
>   * Array of dynamic pcpu base offsets.  Indexed by id.
> Index: sys/net/vnet.h
> ===================================================================
> --- sys/net/vnet.h    (revision 196086)
> +++ sys/net/vnet.h    (working copy)
> _at__at_ -185,12 +185,14 _at__at_
>   * Virtual network stack memory allocator, which allows global 
> variables to
>   * be automatically instantiated for each network stack instance.
>   */
> +__asm__(
>  #if defined(__arm__)
> -__asm__(".section " VNET_SETNAME ", \"aw\", %progbits");
> +    ".section " VNET_SETNAME ", \"aw\", %progbits\n"
>  #else
> -__asm__(".section " VNET_SETNAME ", \"aw\", _at_progbits");
> +    ".section " VNET_SETNAME ", \"aw\", _at_progbits\n"
>  #endif

I may be visually impaired but I'm not seeing a reason for the
ifdef arm..


> -__asm__(".previous");
> +    "\t.p2align " __XSTRING(CACHE_LINE_SHIFT) "\n"
> +    "\t.previous");
> 
>  #define    VNET_NAME(n)        vnet_entry_##n
>  #define    VNET
> 
> ------------------------------------------------------------------------
> 
Received on Mon Aug 10 2009 - 15:52:28 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:53 UTC