Re: Problem remains with FreeBSD 6.0-RC1 as seen in RELENG_5

From: Chad Leigh -- Shire.Net LLC <chad_at_shire.net>
Date: Sun, 30 Oct 2005 01:39:14 -0600
On Oct 19, 2005, at 10:23 AM, Philip Kizer wrote:

> I have a problem I reported on freebsd-stable several weeks ago in:
>
>   http://www.FreeBSD.org/cgi/getmsg.cgi?fetch=47770+187449+/usr/local/www/db/text/2005/freebsd-stable/20051009.freebsd-stable
>
>
> I upgraded a test box to see if all of the reports were true that threading
> and most other major problems were better in the 6.x branch, but I have had
> the same kind of hangs with 6.0-RC1 that I was having with RELENG_5.
>
> I get notified that some of my services are unavailable and I verify that
> new connection attempts from remote just hang.  Attempts at issuing
> commands from my existing ssh connections will let me send a <return> and
> see a new prompt generated, but any attempt at execing/etc will then hang
> that process.
>
>
> Moving to the console, I get the same behaviour such that if it is already
> logged in, I can hit return in the shell and get a new shell prompt, but
> any command I try (i.e.  'uptime' or 'uname') then results in the same lack
> of any further response.  [Though, obviously from below, sending a break to
> activate DDB still works.]
>
> If the console is not already logged in and I hit return at "login:", I get
> a new "login:" prompt; but, as soon as I enter a username+^M or even Ctrl-D
> it, I cease to receive any feedback besides terminal echo from my input.
>

I won't be much help in debugging this, but one of my main production
servers was having this exact set of symptoms (all of them), about
once a month.  It started with 5.3-REL (it happened once with 5.3-REL,
right as I was getting ready to upgrade to 5.4-REL after several
months of running, during a bunch of "dump" commands to files of md
based filesystems).  It then happened several times, about every 2-6
weeks, until I upgraded to 5.4-STABLE a few weeks ago.  It has not
happened since, but I am not sure it was actually "fixed."

One interesting thing:  When it happened, I would have to do a hard
reset by pushing the reset button.  The problem would then happen 2 or
3 more times within 10-20 minutes of finishing the boot (and the
machine would have to be hard reset each time), after which it would
not happen again for 2-6 weeks.

The machine is a Tyan 2882 dual Opteron running the i386 version of  
FreeBSD and has an Adaptec 2200S controller (aac).  One time I saw an  
aac error on the console but usually not.

It has lots of md based file systems mounted (disk backed).  It has
many jails, almost all of them running out of the md filesystems.
Except for ssh, it has no services running on the base install;
everything happens inside a jail.

I will try to contribute, but this is a major production machine and
cannot be screwed around with until such time as an upgrade or
planned reboot happens, or the problem happens and we have to reboot
anyway.
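
One low-impact way to collect timing data on a box like this, without
touching the running services, might be a small probe that periodically
tries to exec a trivial command and records whether it returns.  The
sketch below is only an illustration and not part of the original
report: the name exec_probe, the probed command, and the timeout are all
assumptions.  On a machine in the hung state described above, the fork
itself may block, in which case the probe simply going silent is the
signal.

```python
import subprocess
import time


def exec_probe(cmd=("true",), timeout=5.0):
    """Try to run cmd; return (ok, seconds_taken).

    ok is False if the command could not be started, exited non-zero,
    or did not finish within `timeout` seconds.
    """
    start = time.monotonic()
    try:
        proc = subprocess.Popen(
            list(cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The command could not be executed at all.
        return False, time.monotonic() - start
    try:
        proc.wait(timeout=timeout)
        ok = proc.returncode == 0
    except subprocess.TimeoutExpired:
        # Best effort only: do not wait for the child afterwards, since a
        # process stuck inside the kernel may never be reaped.
        proc.kill()
        ok = False
    return ok, time.monotonic() - start
```

Calling exec_probe() once a minute from an already-running process and
logging the result with a timestamp would show when exec stopped
working relative to the last boot.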

# uname -a
FreeBSD bywater.shire.net 5.4-STABLE FreeBSD 5.4-STABLE #7: Sun Oct  2 13:27:37 MDT 2005     chad_at_bywater.shire.net:/usr/obj/usr/src/sys/BYWATER-SMP  i386

Thanks
Chad


---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad_at_shire.net
Received on Sun Oct 30 2005 - 06:39:18 UTC
