On Thursday 06 August 2009 10:11:26 am Robert Watson wrote:
> On Thu, 6 Aug 2009, Larry Rosenman wrote:
>
> > On Thu, 6 Aug 2009, Robert Watson wrote:
> >
> >> On Tue, 4 Aug 2009, Navdeep Parhar wrote:
> >>
> >>>>> This occurs on today's HEAD + some unrelated patches. That makes it
> >>>>> 8.0BETA2+ code. I haven't tried older builds.
> >>>>
> >>>> We have finally been able to reproduce this ourselves yesterday and
> >>>
> >>> Well, it happens every single time on all of my amd64 machines. After I'd
> >>> already sent my email I noticed that the netisr mutex has an odd address
> >>> (pun intended :-))
> >>>
> >>> m=0xffffffff8144d867
> >>
> >> Heh, indeed. We just spotted the same result here. In this case it's
> >> causing a panic because it leads to a non-atomic read due to mtx_lock
> >> spanning a cache line boundary, followed shortly by a panic because it's
> >> not a valid thread pointer when it's dereferenced, as we get a fractional
> >> pointer.
> > [snip]
> >
> > Do we have an ETA for a testable patch?
>
> RSN, I'm afraid. We can eliminate the effect by reverting the use of DPCPU in
> netisr.c (basically reverting to pre-r195019 of netisr.c). The interesting
> question is where the problem originates -- is gcc/ld/etc not laying out the
> elf section properly, or are the MD parts not providing an aligned base?
> There are also probably issues in the DPCPU handling of modules along similar
> lines, but first things first.

No, gcc/ld/etc is doing the right thing. However, the DPCPU and VNET code
implicitly assumes that the dpcpu/vnet sets start off with a specific
alignment and that assumption is false (as it turns out).

-- 
John Baldwin

Received on Fri Aug 07 2009 - 10:44:44 UTC
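
To make the alignment point above concrete, here is a minimal sketch in C of the checks involved. It is not the DPCPU/VNET code and not the fix that went into the tree. The helper names (is_aligned, spans_cache_line, round_up) are hypothetical, and the 64-byte cache line size is an assumption that the thread does not state. Whether a given field such as mtx_lock actually straddles a line also depends on its offset inside the structure, which the thread does not give.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size (typical for amd64, not stated in the thread). */
#define CACHE_LINE_SIZE 64u

/* True if addr is a multiple of align; align must be a power of two. */
static bool
is_aligned(uint64_t addr, size_t align)
{
	return ((addr & ((uint64_t)align - 1)) == 0);
}

/* True if the bytes [addr, addr + size) touch more than one cache line. */
static bool
spans_cache_line(uint64_t addr, size_t size)
{
	uint64_t first = addr / CACHE_LINE_SIZE;
	uint64_t last = (addr + size - 1) / CACHE_LINE_SIZE;

	return (first != last);
}

/* Round addr up to the next multiple of align; align must be a power of two. */
static uint64_t
round_up(uint64_t addr, size_t align)
{
	return ((addr + align - 1) & ~((uint64_t)align - 1));
}
```

With the address reported in the thread, is_aligned(0xffffffff8144d867, 8) is false, so a pointer-sized word at or inside that object cannot be assumed to sit on its natural boundary. A word that starts at byte 63 of a line, for example spans_cache_line(0x3f, 8), crosses into the next line, which is the situation in which a read is no longer guaranteed to be atomic.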