On Thu, May 17, 2018 at 11:06:42AM +0300, Andriy Gapon wrote:
> On 17/05/2018 10:56, Johannes Lundberg wrote:
> > On Thu, May 17, 2018 at 8:46 AM, Johannes Lundberg <johalun0_at_gmail.com> wrote:
> >
> > On Thu, May 17, 2018 at 7:43 AM, Andriy Gapon <avg_at_freebsd.org> wrote:
> >
> > On 17/05/2018 02:07, Johannes Lundberg wrote:
> > > https://github.com/freebsd/freebsd/commit/66f063557f257baa9c8aeab9f933171eaa6e1cfa
> > > x86 cpususpend_handler: call wbinvd after setting suspend state bits
> >
> > That's very interesting and surprising.
> > That commit changes something that happens before suspend, it should not
> > have any effect on the system state after resume.
> >
> > Does anyone have a theory of what could be wrong?
> >
> > Nope but moving
> > CPU_CLR_ATOMIC(cpu, &suspended_cpus);
> > back to the end of that scope fixes it.
> >
> > I did some further testing.
> > Calling
> > CPU_CLR_ATOMIC(cpu, &suspended_cpus);
> > before
> > pmap_init_pat();
> > is what "breaks" resume.
> >
> > Is this Intel only or does it happen on AMD as well (which this patch was
> > intended for)?
>
> Not sure about the PAT part, but fpuresume/npxresume would affect all platforms.
> It's a bit puzzling that doing PAT manipulations on one AP while another AP is
> being brought up is problematic. Probably there is something that I am missing.

Manipulating PAT might affect cache consistency, since contradicting caching
attributes are applied to the line of the suspended_cpus variable, which is
already cached. It might not be the variable itself that causes the final
mis-operation, but some other data sharing the line.

Received on Thu May 17 2018 - 07:20:09 UTC