Re: AMD errata 169

From: Ian J Hart <ianjhart_at_ntlworld.com>
Date: Fri, 26 Jun 2009 20:34:59 +0100
Quoting Stanislav Sedov <stas_at_freebsd.org>:

> On Fri, 26 Jun 2009 12:37:27 +0100
> Ian J Hart <ianjhart_at_ntlworld.com> mentioned:
>
>> I know I asked this before but I figure the long post may have put
>> some people off.
>>
>> #169
>> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf
>>
>> I'd like to eliminate this as a cause of my problem
>>
>> It appears I can read the value.
>>
>> #kldload cpuctl
>> #cpucontrol -m 0xc001001f /dev/cpuctl0
>> MSR 0xc001001f: 0x00400000 0x00100008
>>
>> #cpucontrol -m 0xc001001f=0x0040000000100008 /dev/cpuctl0
>>
>> Causes an nfe0 watchdog timeout and a powerdown failed, so that's
>> clearly a dumb thing to do.
>>
>> Would I be better off asking somewhere else?
>
> It looks like it is my fault, in fact.  Due to a bug in the cpuctl
> code, the value written to MSR registers was always zero.  Could you
> please try the following patch?  Thanks!
>
> Index: sys/dev/cpuctl/cpuctl.c
> ===================================================================
> --- sys/dev/cpuctl/cpuctl.c	(revision 195052)
> +++ sys/dev/cpuctl/cpuctl.c	(working copy)
> @@ -222,14 +222,17 @@
>  	 * Explicitly clear cpuid data to avoid returning stale
>  	 * info
>  	 */
> -	data->data = 0;
>  	DPRINTF("[cpuctl,%d]: operating on MSR %#0x for %d cpu\n", __LINE__,
>  	    data->msr, cpu);
>  	oldcpu = td->td_oncpu;
>  	is_bound = cpu_sched_is_bound(td);
>  	set_cpu(cpu, td);
> -	ret = cmd == CPUCTL_RDMSR ? rdmsr_safe(data->msr, &data->data) :
> -	    wrmsr_safe(data->msr, data->data);
> +	if (cmd == CPUCTL_RDMSR) {
> +		data->data = 0;
> +		ret = rdmsr_safe(data->msr, &data->data);
> +	} else {
> +		ret = wrmsr_safe(data->msr, data->data);
> +	}
>  	restore_cpu(oldcpu, is_bound, td);
>  	return (ret);
>  }
> @@ -368,7 +371,7 @@
>  	/*
>  	 * Perform update.
>  	 */
> -	wrmsr_safe(MSR_K8_UCODE_UPDATE, (uintptr_t)args->data);
> +	wrmsr_safe(MSR_K8_UCODE_UPDATE, (uintptr_t)ptr);
>
>  	/*
>  	 * Serialize instruction flow.
>
> --
> Stanislav Sedov
> ST4096-RIPE
>
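For anyone following along, here is a minimal user-space sketch of what the first hunk changes. It is not the kernel code: fake_msr, fake_rdmsr_safe, fake_wrmsr_safe, struct msr_args, msr_op_old and msr_op_patched are all hypothetical stand-ins made up for illustration, and a single variable plays the role of the MSR. It only shows why clearing the data field before the read/write dispatch makes every write store zero.

```c
#include <stdint.h>

/* Hypothetical stand-in: one fake MSR instead of real hardware. */
static uint64_t fake_msr;

enum msr_cmd { CMD_RDMSR, CMD_WRMSR };

struct msr_args {
	uint32_t msr;	/* register number, e.g. 0xc001001f */
	uint64_t data;	/* value read back, or value to write */
};

/* Stub for a safe MSR read; always succeeds in this sketch. */
static int
fake_rdmsr_safe(uint32_t msr, uint64_t *val)
{
	(void)msr;
	*val = fake_msr;
	return (0);
}

/* Stub for a safe MSR write; always succeeds in this sketch. */
static int
fake_wrmsr_safe(uint32_t msr, uint64_t val)
{
	(void)msr;
	fake_msr = val;
	return (0);
}

/* Pre-patch flow: data is cleared before the command is examined. */
static int
msr_op_old(enum msr_cmd cmd, struct msr_args *a)
{
	a->data = 0;
	return (cmd == CMD_RDMSR ? fake_rdmsr_safe(a->msr, &a->data) :
	    fake_wrmsr_safe(a->msr, a->data));
}

/* Patched flow: data is cleared only on the read path. */
static int
msr_op_patched(enum msr_cmd cmd, struct msr_args *a)
{
	if (cmd == CMD_RDMSR) {
		a->data = 0;
		return (fake_rdmsr_safe(a->msr, &a->data));
	}
	return (fake_wrmsr_safe(a->msr, a->data));
}
```

With fake_msr preset to some nonzero value, a CMD_WRMSR of 0x0040000000100008 through msr_op_old leaves fake_msr at 0, while the same request through msr_op_patched stores the requested value.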

I only have cheesy KVM access. If it locks up I'll have to wait until
Monday to power cycle, so I might wait until Sunday night.

OTOH I just opened a bottle of wine, so anything could happen. Cheers!

Looking at the block diagram for the motherboard (Tyan S2895), it looks
like the PCI-X slots have their own tunnel chip and it's the
PCI-Express slots which run through the northbridge chips, so it's
looking unlikely that this is the cause of the card errors.

In any case I learned something and found a bug. That's why they call  
me the bugmeister. Actually that's just me. Hey, this wine is good...

Thanks again

-- 
ian j hart


Received on Fri Jun 26 2009 - 17:35:18 UTC
