Re: Alpha problems (though maybe not just Alpha...)

From: Andrew Gallatin <gallatin_at_cs.duke.edu>
Date: Thu, 8 Jul 2004 10:54:03 -0400 (EDT)

Ken Smith writes:
 > 
 > FYI, the Alpha reference machine in the cluster has had problems the
 > past few days when anything significant (e.g. 'make buildworld'...)
 > gets run on it.  It usually locks up solid right after printing
 > this on the console:
 > 
 > panic() at panic+0x200
 > _mtx_assert() at _mtx_assert+0xb4
 > vrele() at vr 16 32 48 64 80

A mutex assert failed in vrele().  The first one that comes to mind is
the GIANT_REQUIRED; right at the top of the function.  It's really too
bad that we don't have the rest of the stack.

The remainder of the output is quite strange.  It almost looks like
it stopped printing the stack trace and started taking a dump, and
then panicked while taking the dump (the ahc intr thread might support
that):

 > fatal kernel trap:
 > 
 >     trap entry     = 0x2 (memory management fault)
 >     cpuid          = 0
 >     faulting va    = 0x0
 >     type           = access violation
 >     cause          = instruction fetch
 >     pc             = 0x0
 >     ra             = 0x0
 >     sp             = 0xfffffe00317bfc70
 >     curthread      = 0xfffffc007d772000
 >         pid = 23, comm = intr: ahc1
 > 
 > spin lock sched lock held by 0xfffffc007d772000 for > 5 seconds

A va and ra of 0 mean that it either followed a null function pointer,
or the stack got corrupted and it tried to return to the wrong
place...

Can you try disabling dumps and kern.sync_on_panic, and see if you
can get a decent stack trace?
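
For reference, a minimal sketch of what that might look like as
persistent settings.  It assumes a stock rc.conf/sysctl.conf setup; the
file locations and the dumpdev knob are my assumptions about the
machine's configuration, so adjust as needed:

```
# /etc/rc.conf  (assumed location): do not configure a crash dump device
dumpdev="NO"

# /etc/sysctl.conf  (assumed location): skip the filesystem sync on panic
kern.sync_on_panic=0
```

The sysctl can also be changed on the running system without a reboot.
The idea is only to keep the dump and sync paths from running after the
first panic, so that the original stack trace has a chance to print in
full.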

Thanks,

Drew
Received on Thu Jul 08 2004 - 12:54:10 UTC
