5.1-CURRENT hangs on disk i/o? sysctl_old_user() non-sleepable locks

From: Chris Shenton <chris_at_shenton.org>
Date: 16 Jun 2003 09:14:00 -0400
(I don't know if this has any relation to the problems I reported
yesterday with qmail-send consuming 100% cpu after 5.0 to 5.1 upgrade.)

After booting 5.1-CURRENT the system runs fine for a while.  Later,
most disk i/o related actions seem to hang.  E.g., the system works,
but when cron kicks off a glimpseindex in the middle of the night, the
system is useless by morning.  If I log in on the console as myself, it
takes my username and password, then hangs (trying to run
/usr/local/bin/bash?).  If I do this as root, I do get a shell
(/bin/csh).  After a point, running "top" will hang, even as root.
Even a "reboot" hung this morning, with nothing in the logs.

The system has become almost unusable because of this, requiring
frequent reboots or hardware resets.

Sometimes when I do something as simple as "ps" I see this ominous
message on the console:

  sysctl_old_user() with the following non-sleepablelocks held:
  exclusive sleep mutex process lock r = 0 (0xc50bc9e0) locked @ /usr/src/sys/kern/kern_proc.c:258

which gets into /var/log/messages as:

  Jun 16 08:33:48 PECTOPAH kernel: exclusive sleep mutex process lock r = 0 (0xc50c7618) locked @ /usr/src/sys/kern/kern_proc.c:258

There are a bunch of these.

That file is version:

  $FreeBSD: src/sys/kern/kern_proc.c,v 1.189 2003/06/14 06:20:25 alc Exp $

and the line is the PROC_LOCK() portion of:

  struct proc *
  pfind(pid)
          register pid_t pid;
  {
          register struct proc *p;

          sx_slock(&allproc_lock);
          LIST_FOREACH(p, PIDHASH(pid), p_hash)
                  if (p->p_pid == pid) {
                          PROC_LOCK(p);
                          break;
                  }
          sx_sunlock(&allproc_lock);
          return (p);
  }

Any thoughts? Thanks.
Received on Mon Jun 16 2003 - 04:14:04 UTC
