Re: unknown mtx_assert at /usr/src/sys/x86/x86/io_apic.c:161

From: Michael Jung <mikej_at_paymentallianceintl.com>
Date: Tue, 18 Jan 2011 17:02:30 -0500
On 1/14/11 8:55 PM, "Michael Jung" <mikej_at_paymentallianceintl.com> wrote:

> John:
> 
> Thanks, I actually didn't see the MCA errors on the screen as the system has
> reloaded but noted them in the ddb.txt file last night.
> 
> The Motherboard, CPU, Memory and PS were replaced today.  I'll post back if
> this has or not corrected the problem but I suspect you are on target in
> that the hardware was defective.  This machine was remote and I found the
> fan in the power supply not working, so I'm suspecting that the CPU was or
> other logic was damaged.
> 
> Thanks for your reply.
> 
> --mikej
> 
> 
> On 1/14/11 4:13 PM, "John Baldwin" <jhb_at_freebsd.org> wrote:
> 
>> On Thursday, January 13, 2011 11:26:46 am Michael Jung wrote:
>>> Links to crash info below.
>>> http://216.26.153.6/msgbuf.txt
>>
>> This might be a hardware problem.  The panic you got is a "should never
>> happen" panic.  Note that in the code line sourced, the second argument to
>> mtx_assert() is MA_OWNED.  The panic is saying that it is some invalid value
>> (i.e. something other than MA_OWNED).  Given that is a constant, that's not
>> very likely at all barring some hardware glitch.
>>
>> You do have a somewhat scary looking machine check logged before your panic:
>>
>> MCA: Bank 1, Status 0xd000000000000171
>> MCA: Global Cap 0x0000000000000105, Status 0x0000000000000000
>> MCA: Vendor "AuthenticAMD", ID 0x20fc2, APIC ID 0
>> MCA: CPU 0 COR OVER ICACHE L1 EVICT error
>>
>> It is a correctable error, but given the nature of the panic I'd suspect a
>> hardware problem.
>>
>> mcelog doesn't provide many more details:
>>
>> HARDWARE ERROR. This is *NOT* a software problem!
>> Please contact your hardware vendor
>> CPU 0 1 instruction cache
>>        bit62 = error overflow (multiple errors)
>>   memory/cache error 'evict mem transaction, instruction transaction, level 1'
>> STATUS d000000000000171 MCGSTATUS 0
>> MCGCAP 105 APICID 0 SOCKETID 0
>> CPUID Vendor AMD Family 15 Model 44
>>
>> --
>> John Baldwin
>>
> 
> The box has run fine since hardware was replaced.  Thanks for your help.
> 
> ---mikej
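
For anyone reading this thread later, the reasoning in the quoted reply
about the "unknown mtx_assert" panic can be sketched as follows. This is a
simplified model in Python, not the kernel's C code. The flag values and
the helper name check_assert_flag are assumptions made for illustration and
are not copied from the FreeBSD headers; only the message format is taken
from the subject line of this thread. The point is that a compile-time
constant always matches a known case, so reaching the "unknown" branch
suggests the value was corrupted on the way, for example by a flipped bit.

```python
# Illustrative flag values only (assumed, not taken from kernel headers).
MA_NOTOWNED = 0x00
MA_OWNED = 0x04
MA_RECURSED = 0x08
MA_NOTRECURSED = 0x10

# Combinations the assertion is assumed to accept.
KNOWN_FLAGS = {
    MA_OWNED,
    MA_OWNED | MA_RECURSED,
    MA_OWNED | MA_NOTRECURSED,
    MA_NOTOWNED,
}


def check_assert_flag(what, file, line):
    """Return a panic-style message if `what` is not a recognised flag,
    otherwise None. Hypothetical helper modelling only the flag check."""
    if what not in KNOWN_FLAGS:
        return f"unknown mtx_assert at {file}:{line}"
    return None


# The constant as written in the source is accepted.
ok = check_assert_flag(MA_OWNED, "/usr/src/sys/x86/x86/io_apic.c", 161)

# The same constant with one bit flipped falls through to the panic case.
corrupted = MA_OWNED ^ 0x40
msg = check_assert_flag(corrupted, "/usr/src/sys/x86/x86/io_apic.c", 161)
print(ok, msg)
```

Under these assumptions a single flipped bit in the argument is enough to
produce the message seen in the subject line, which is consistent with the
hardware explanation given above.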


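
The machine check status word quoted in this thread (0xd000000000000171)
can be decoded by hand. The sketch below assumes the usual x86 MCi_STATUS
layout (flag bits 63 down to 57, error code in the low 16 bits) and the
"memory hierarchy" error code format 0000 0001 RRRR TTLL. The function
name decode_mci_status and the table names are hypothetical, and only this
one error class is handled. It is meant to show how the "COR OVER ICACHE
L1 EVICT" summary in the log follows from the raw value, not to replace
mcelog.

```python
# Flag bits in the upper part of MCi_STATUS (assumed standard x86 layout).
STATUS_FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
                59: "MISCV", 58: "ADDRV", 57: "PCC"}

# Field encodings of the memory hierarchy error code (0000 0001 RRRR TTLL).
LEVELS = ["L0", "L1", "L2", "LG"]
TYPES = ["instruction", "data", "generic", "reserved"]
REQUESTS = ["GEN", "RD", "WR", "DRD", "DWR", "IRD",
            "PREFETCH", "EVICT", "SNOOP"]


def decode_mci_status(status):
    """Decode the flag bits and a memory hierarchy error code.
    Hypothetical helper; other error classes are reported as 'other'."""
    flags = [name for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
             if status & (1 << bit)]
    code = status & 0xFFFF
    result = {
        "flags": flags,
        # An error is correctable when the UC (uncorrected) bit is clear.
        "corrected": "UC" not in flags,
        "code": code,
    }
    if code & 0xFF00 == 0x0100:
        rrrr = (code >> 4) & 0xF
        result["class"] = "memory hierarchy"
        result["request"] = REQUESTS[rrrr] if rrrr < len(REQUESTS) else "unknown"
        result["type"] = TYPES[(code >> 2) & 0x3]
        result["level"] = LEVELS[code & 0x3]
    else:
        result["class"] = "other"
    return result


info = decode_mci_status(0xD000000000000171)
print(info)
```

For the value from the log this reports the VAL, OVER and EN flags with UC
clear (a corrected error with overflow), and an EVICT request on the L1
instruction cache, matching the one-line summary printed by the kernel.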
Received on Tue Jan 18 2011 - 21:02:35 UTC
