Re: A stuck system

From: Andre Guibert de Bruet <andy_at_siliconlandmark.com>
Date: Wed, 20 Dec 2006 10:30:16 -0500

On Dec 20, 2006, at 8:49 AM, Randall Stewart wrote:

> Ok, I was wrong on this... I recreated it.. hooked up
> my em0 card to my laptop (right now it's isolated
> running the mpi tests and uses the loopback only).
>
> I do a ping
>
> And ta-da  the system comes back to life after
> being hung for 15 minutes.
>
> This time I did not see any of the usual syslog messages
> either... of course it was only "stuck" for 15 minutes or
> so...
>
> I will leave the thing running and get it stuck again and
> validate that the msk and usb will also cause the machine
> to come back to life..
>
> Is there any way this could be a lost interrupt type problem (remember
> the scheduler is appearing to "stop" scheduling things). OR
> is this a problem with my hardware... somehow failing to
> deliver interrupts maybe???

I am seeing something similar on my dual Xeon system. It appears that  
a kernel from December 13th did not exhibit this behavior whereas one  
from the 16th does. I am able to "revive" the machine by pushing
traffic on the msk0 interface.

Kernel config: http://bling.properkernel.com/freebsd/BLING

Andy

/*  Andre Guibert de Bruet  * 6f43 6564 7020 656f 2e74 4220 7469 6a20 */
/*   Code poet / Sysadmin   * 636f 656b 2e79 5320 7379 6461 696d 2e6e */
/*   GSM: +1 734 846 8758   * 5520 494e 2058 6c73 7565 6874 002e 0000 */
/* WWW: siliconlandmark.com * C/C++, Java, Perl, PHP, SQL, XHTML, XML */
Received on Wed Dec 20 2006 - 14:43:22 UTC