Re: Any successful installs on a Broadcom HT1000 chipset?

From: John Baldwin <jhb_at_freebsd.org>
Date: Wed, 28 Nov 2007 09:38:37 -0500
On Wednesday 28 November 2007 08:51:38 am Søren Schmidt wrote:
> John Baldwin wrote:
> > On Wednesday 28 November 2007 02:45:16 am Søren Schmidt wrote:
> >   
> >> John Baldwin wrote:
> >>     
> >>> FYI, I've seen weird in-memory corruption with machines with the HT1000_S1 
> >>> atapci device.  In all the cases I've seen so far, a single page is corrupted 
> >>> with garbage and the page happens to be used by UMA to hold credentials 
> >>> including proc0's credentials.  I've seen this corruption (trashed creds for 
> >>> proc0 and other creds in that page) on many of the same boxes (Dell 1435's 
> >>> IIRC) running on 6.2.  I've tried switching the HT1000_S1 to use SWKSMIO 
> >>> rather than SWKS100 as I mentioned to you in an earlier e-mail (the Linux driver 
> >>> uses the equivalent of SWKSMIO FWIW) but don't have any conclusive tests on that.
> >>>
> >>>   
> >>>       
> >> OK, it seems the chipset has some real problems. I have dug through all 
> >> the (very little) docs and info I got from ServerWorks back when, and 
> >> the only thing I can find is that the chip doesn't support MSI in any 
> >> shape or fashion, or it will do really strange things.
> >> Now on my system it seems to be disabled, but I'm not sure yet how it's 
> >> determined to be that way. It would be worth checking what the 
> >> sysctls "hw.pci.enable_msi" and "hw.pci.enable_msix" are set to.
> >> I haven't looked into this yet, but I'm pretty sure we added MSI support 
> >> in the 6.2 -> 7.0 timeframe, so that might have uncovered this chipset 
> >> bug, and possibly the Promise data corruption one as well.
> >>     
> >
> > The ata driver doesn't use MSI (no calls to pci_msi_count or pci_msi_alloc,
> > etc.), so this isn't an issue.  Also, the boxes I've seen the corruption on
> > already have MSI disabled (it's still disabled by default in 6.x).
> >   
> OK, it must be *totally* disabled, not just for ATA but for everything 
> on those chipsets, or they'll barf all over the place.
> If we do that already, we need to look into other places.
> However, if we are dealing with in-memory corruption this is going to 
> get "interesting"....
> Does that also happen if nothing uses DMA?

Again, on the machines I'm seeing this on, MSI was totally disabled.  I don't think
I can totally disable DMA on those machines (NICs etc. must use DMA).  Since they
are in production and I only see the corruption as an after-effect when the boxes
panic or deadlock for another reason, I'm not easily able to reproduce this.  Also,
we do disable MSI for devices behind HT2000 chipsets because of a chip bug, but
not on HT1000 currently.  However, MSI isn't enabled on 6.x anyway.
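
For anyone who wants to check the two sysctls Søren mentions on their own
box, here is a minimal sketch.  It assumes a FreeBSD release that exposes
hw.pci.enable_msi and hw.pci.enable_msix and the usual "name: value" output
of sysctl(8).  The helper names are made up for illustration, and on a
system without those tunables it simply returns an empty result.

```python
import subprocess

# The two tunables named in the thread.
MSI_SYSCTLS = ("hw.pci.enable_msi", "hw.pci.enable_msix")


def parse_sysctl_output(text):
    """Parse 'name: value' lines (FreeBSD sysctl(8) style) into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        try:
            values[name.strip()] = int(value.strip())
        except ValueError:
            continue  # non-integer value, ignore it
    return values


def read_msi_sysctls():
    """Return the MSI/MSI-X tunables, or {} if sysctl is missing or fails."""
    try:
        result = subprocess.run(
            ["sysctl", *MSI_SYSCTLS],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return {}
    return parse_sysctl_output(result.stdout)


if __name__ == "__main__":
    for name, value in read_msi_sysctls().items():
        state = "enabled" if value else "disabled"
        print(f"{name} = {value} ({state})")
```

A value of 0 for both would match the "totally disabled" case described
above.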

-- 
John Baldwin
Received on Wed Nov 28 2007 - 14:37:41 UTC
