Re: 8.0RC2 amd64 - kernel panic running make buildworld

From: Andriy Gapon <avg_at_icyb.net.ua>
Date: Fri, 13 Nov 2009 10:08:45 +0200
on 12/11/2009 20:59 Kai Gallasch said the following:
> sonnenkraft:~ # MCA: CPU 4 UNCOR PCC OVER DTLB L1 error

Kai,

Very interesting info; it matches what Serguey reported too. Thank you for the test!
So in all cases where MCE information was captured, it seems to be an L1 data TLB error.
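
For anyone reading along, here is a rough sketch of how a line like "UNCOR PCC OVER DTLB L1 error" can be derived from a raw MCi_STATUS value. The flag bit positions and the TLB error-code layout (0000 0000 0001 TTLL) are the architectural ones; the helper name mca_describe and the exact output wording are my own assumptions, modelled on the console line quoted above, and this is not the kernel's actual decoder.

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural MCi_STATUS flag bits. */
#define MC_STATUS_VAL	(1ULL << 63)	/* register holds a valid error */
#define MC_STATUS_OVER	(1ULL << 62)	/* an earlier error was overwritten */
#define MC_STATUS_UC	(1ULL << 61)	/* uncorrected error */
#define MC_STATUS_PCC	(1ULL << 57)	/* processor context corrupt */

/*
 * Hypothetical helper (illustration only): format the flags and, for
 * TLB errors, the transaction type and cache level into buf.
 */
static void
mca_describe(uint64_t status, char *buf, size_t len)
{
	static const char *const tt[] = { "I", "D", "G", "?" };
	static const char *const ll[] = { "L0", "L1", "L2", "LG" };
	unsigned code = (unsigned)(status & 0xffff);
	const char *uc = (status & MC_STATUS_UC) ? "UNCOR " : "COR ";
	const char *pcc = (status & MC_STATUS_PCC) ? "PCC " : "";
	const char *over = (status & MC_STATUS_OVER) ? "OVER " : "";

	if ((status & MC_STATUS_VAL) == 0) {
		snprintf(buf, len, "no valid error");
		return;
	}
	if ((code & 0xfff0) == 0x0010) {
		/* TLB error: bits 3:2 are the type, bits 1:0 the level. */
		snprintf(buf, len, "%s%s%s%sTLB %s error", uc, pcc, over,
		    tt[(code >> 2) & 3], ll[code & 3]);
	} else {
		snprintf(buf, len, "%s%s%sother error (code 0x%04x)",
		    uc, pcc, over, code);
	}
}
```

With VAL, OVER, UC and PCC set and an error code of 0x0015 (TT = 01, data; LL = 01, L1), this produces the same string as in Kai's report.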

John,
BTW, OVER may be incorrectly reported by hardware in this case, see erratum 60:
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/41322.pdf

Kai,
I have a hunch. Could you please try the following _sledgehammer_ patch (only a
kernel build/install is needed):
diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 44b71f3..a456609 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -2981,6 +2981,7 @@ setpte:
 	 * Map the superpage.
 	 */
 	pde_store(pde, PG_PS | newpde);
+	pmap_invalidate_all(pmap);

 	pmap_pde_promotions++;
 	CTR2(KTR_PMAP, "pmap_promote_pde: success for va %#lx"

This will slow down each promotion to a superpage, but should not have any
visible impact on overall performance.

Serguey,

your problem does not seem to be limited to superpages, so I am not sure this
patch would be of much help to you.
-- 
Andriy Gapon
Received on Fri Nov 13 2009 - 07:09:41 UTC