Re: A stuck system

From: Randall Stewart <rrs_at_cisco.com>
Date: Fri, 29 Dec 2006 05:48:34 -0500
John Baldwin wrote:
> On Wednesday 20 December 2006 11:12, Randall Stewart wrote:
>> Interesting.. I have actually been having
>> this problem for a while... can't remember
>> when I last updated.. it's related to pounding
>> the network.. at least mine seems to be... (I
>> am pounding the loopback).. and it appears
>> that everything just "freezes".
>>
>> Is your machine a Gig-a-Byte motherboard?
> 
> Do you have a dual-port msk0 device?

Nope... it's just a single port, on-motherboard msk0.

It does wake up though if I ping any interface...

I suspect it might be a hardware problem.. not sure
yet :-0

R
> 
>> R
>>
>> Andre Guibert de Bruet wrote:
>>>
>>> On Dec 20, 2006, at 8:49 AM, Randall Stewart wrote:
>>>
>>>> Ok, I was wrong on this... I recreated it.. hooked up
>>>> my em0 card to my laptop (right now it's isolated
>>>> running the mpi tests and uses the loopback only).
>>>>
>>>> I do a ping
>>>>
>>>> And ta-da, the system comes back to life after
>>>> being hung for 15 minutes.
>>>>
>>>> This time I did not see any of the usual syslog messages
>>>> either... of course it was only "stuck" for 15 minutes or
>>>> so...
>>>>
>>>> I will leave the thing running and get it stuck again and
>>>> validate that the msk and usb will also cause the machine
>>>> to come back to life..
>>>>
>>>> Is there any way this could be a lost interrupt type problem (remember
>>>> the scheduler is appearing to "stop" scheduling things). OR
>>>> is this a problem with my hardware... somehow failing to
>>>> deliver interrupts maybe???
>>>
>>> I am seeing something similar on my dual Xeon system. It appears that a
>>> kernel from December 13th did not exhibit this behavior whereas one
>>> from the 16th does. I am able to "revive" the machine by pushing traffic
>>> on the msk0 interface.
>>>
>>> Kernel config: http://bling.properkernel.com/freebsd/BLING
>>>
>>> Andy
>>>
>>> /*  Andre Guibert de Bruet  * 6f43 6564 7020 656f 2e74 4220 7469 6a20 */
>>> /*   Code poet / Sysadmin   * 636f 656b 2e79 5320 7379 6461 696d 2e6e */
>>> /*   GSM: +1 734 846 8758   * 5520 494e 2058 6c73 7565 6874 002e 0000 */
>>> /* WWW: siliconlandmark.com * C/C++, Java, Perl, PHP, SQL, XHTML, XML */
>>>
>>>
>>>
>>
> 


-- 
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 <or> 803-317-4952 (cell)
Received on Fri Dec 29 2006 - 11:31:07 UTC
