Unkillable and runaway processes

From: Kenneth Vestergaard Schmidt <kvs_at_pil.dk>
Date: Tue, 04 Sep 2007 15:08:20 +0200
Hello.

Our ZFS testbed is experiencing some weird problems with rsync. We run a
nightly backup of about 1.6 TB of data (that's how much is stored, not how
much is transferred), but after the initial sync I haven't been able to
get the machine through one full cycle.

After many hours of rsyncing data from 50+ machines, one rsync process
will suddenly hang, spinning on the CPU.

It switches state between CPU0, CPU1, RUN and 'zfs:(&', but doesn't
really do anything. It can't be killed, and you can't reboot the machine
- it'll get past syncing disks, but won't shut down or reboot.

I can't do an 'ls' in the directory that rsync is running in - it'll
just hang, too.

The machine is running -CURRENT from August 29th.

I could use some pointers on what to do - is there some way I can debug
this better, maybe give some better info?

-- 
Kenneth Schmidt
pil.dk
Received on Tue Sep 04 2007 - 11:33:55 UTC
