mutex-bug in recent releng_[67]?

From: Arno J. Klaassen <arno_at_heho.snv.jussieu.fr>
Date: 17 Nov 2007 17:15:12 +0100

Hello,

I have serious problems with Runtime.getRuntime().exec(): either
it hangs, or it ends up spinning on a mutex at 99% CPU.

Attached is a simple code example which shows the problem: it basically
just launches, in a loop, a process running "/bin/ls -lRr /var/" and
then waits for it to exit. (The command seems to matter: e.g. ps(1)
works fine, whereas ls(1), tar(1) and cpio(1), all of which do file I/O,
fail more or less easily.)
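
As a rough illustration of the kind of test described above (this is
not the attached file itself), a minimal sketch might look like the
following. The class name ExecLoop, the helper runOnce() and the
iteration count are hypothetical. The sketch also makes two assumptions
the original test may not share: it uses ProcessBuilder instead of
calling Runtime.getRuntime().exec() directly, only so that stderr can
be merged into stdout, and it drains the child's output before waiting,
so that a full pipe buffer cannot block the child on its own.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical reproducer sketch; not the file attached to this mail.
class ExecLoop {

    // Runs one command, discards its output, and returns its exit code.
    static int runOnce(String[] cmd) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true);   // assumption: merge stderr into stdout
        Process p = pb.start();

        // Drain the merged output so the child never blocks on a full pipe.
        InputStream in = p.getInputStream();
        try {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // output is intentionally discarded
            }
        } finally {
            in.close();
        }

        // Blocks until the child has exited.
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        String[] cmd = {"/bin/ls", "-lRr", "/var/"};
        int iterations = 100;           // hypothetical count
        for (int i = 0; i < iterations; i++) {
            int rc = runOnce(cmd);
            System.out.println("iteration " + i + ": exit code " + rc);
        }
    }
}
```

If a loop like this still hangs in waitFor() even though the output is
being drained, that would point away from a simple pipe-buffer stall in
the test program itself.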

This works OK with linux-sun-jdk15, but it fails (most often it just
hangs; sometimes the process ends with exit code 127) on all boxes I
could test on:

 i686-releng_6-UP / jdk-1.5.0.13p7,1
 i686-releng_7-UP / jdk-1.5.0.12p6_2,1
 amd64-releng_6-SMP / jdk-1.5.0.12p6_2,1 and jdk-1.5.0.13p7,1
 amd64-releng_7-SMP / jdk-1.5.0.13p7,1


I somehow doubt this is really (only) a jdk-problem : it fails (hangs)
as well if I compile it with gcj to an executable (tested both on
 i686-releng_6-UP and amd64-releng_7-SMP).

Attached is a gdb log (for releng_7) which shows three threads: two of
them blocking in _umtx_op() (called from pthread_cond_init()), the third
in sigsuspend() (called from pthread_getprio()?).

If I create a core dump with "gcore -s", all sixteen threads
block in the following (log attached for the first two threads):

  #0  0x00000008008cabfc in wait4 () from /lib/libc.so.7
  #1  0x000000080075616e in waitpid () from /lib/libthr.so.3
  #2  0x0000000801e43030 in Java_java_lang_UNIXProcess_waitForProcessExit (
      env=0x82c1a2998, junk=0x7ffffeef1798, pid=906)


I hope someone can help me with this, or should I file a PR?

Thanks very much in advance.

Arno





-- 

  Arno J. Klaassen

  SCITO S.A.
  8 rue des Haies
  F-75020 Paris, France
  http://scito.com


Received on Sat Nov 17 2007 - 15:15:25 UTC
