> In message <E1CuXry-000JJd-OV_at_cs1.cs.huji.ac.il>, Danny Braniss writes:
> >> In message <20050128144733.GA91982_at_green.homeunix.org>, Brian Fundakowski Feldman writes:
> >> >On Fri, Jan 28, 2005 at 09:08:54AM +0200, Danny Braniss wrote:
> >> >> hi,
> >> >> while running 'dump 0f - /dist | restore rf -'
> >> >> the dump proc. got stuck, it seems it's waiting on some lock:
> >> >>
> >> >> UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
> >> >>
> >> >> 0 30924 30922 0 4 0 3396 2852 sbwait T p1 1:00.88 dump:
> >> >> /dev/amrd0s3h: ...
> >> >> 0 30925 30924 1 -8 0 3268 2784 physrd TL p1 0:53.84 dump 0f -
> >> >> /dist (dump)
> >> >> 0 30926 30924 1 20 0 3268 2784 pause T p1 0:53.69 dump 0f -
> >> >> /dist (dump)
> >> >> 0 30927 30924 1 20 0 3268 2784 pause T p1 0:54.12 dump 0f -
> >> >> /dist (dump)
> >> >>
> >> >> (this is 5.3-STABLE, cvs'ed about a week ago, and it's a SMP system).
> >> >> how can i find which lock? or who is holding it?
> >> >
> >> >Is the one in physrd not actually reading anything from the disk right
> >> >now? I would suspect that should be how you really determine if it's
> >> >hung or not. You should be able to see how long it's been waiting
> >> >and how long it's due to wait still, using kgdb.
> >>
> >> Check also with gstat(8) if there is I/O activity going on and/or if any
> >> I/O requests are stuck.
> >
> >it's stuck. i.e. not doing anything. i've been monitoring it via
> >iostat, and nothing is moving, nada, the machine is very idle :-(
>
> Please use gstat(8) and look for stuck I/O requests.

ok, I had to run it again, this time it worked several times, but it
finally got stuck, btw, it's been stuck for several hours.

gstat:
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0    0.0| fd0
    2      0      0      0    0.0      0      0    0.0    0.0| amrd0
    0      0      0      0    0.0      0      0    0.0    0.0| amrd0s1
    0      0      0      0    0.0      0      0    0.0    0.0| amrd0s2
    2      0      0      0    0.0      0      0    0.0    0.0| amrd0s3
the rest is all zero.
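The "look for stuck I/O requests" check on the gstat output above can be sketched as a small script: a provider whose queue length L(q) is non-zero while ops/s stays at zero has requests queued that are not completing. This is only a minimal sketch, assuming the gstat output has been captured as text in the column layout shown above; `find_stuck_providers` and `SAMPLE` are hypothetical names used for illustration, not part of gstat.

```python
def find_stuck_providers(gstat_text):
    """Return provider names that have queued requests but zero ops/s.

    Assumes each data row looks like the gstat rows in this message:
    numeric columns, then '|', then the provider name.
    """
    stuck = []
    for line in gstat_text.splitlines():
        if "|" not in line:
            continue  # header or blank line, no provider on it
        stats, name = line.rsplit("|", 1)
        fields = stats.split()
        if len(fields) < 2:
            continue
        try:
            queue_len = int(fields[0])      # L(q) column
            ops_per_sec = float(fields[1])  # ops/s column
        except ValueError:
            continue  # row did not start with numeric columns
        if queue_len > 0 and ops_per_sec == 0:
            stuck.append(name.strip())
    return stuck


# Sample rows copied from the gstat output quoted in this message.
SAMPLE = """\
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0    0.0| fd0
    2      0      0      0    0.0      0      0    0.0    0.0| amrd0
    0      0      0      0    0.0      0      0    0.0    0.0| amrd0s1
    0      0      0      0    0.0      0      0    0.0    0.0| amrd0s2
    2      0      0      0    0.0      0      0    0.0    0.0| amrd0s3
"""

if __name__ == "__main__":
    print(find_stuck_providers(SAMPLE))  # → ['amrd0', 'amrd0s3']
```

A single snapshot cannot tell a momentarily busy queue from a hung one, so a check like this is only meaningful when the same providers show up across several samples taken some time apart.
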
top says:
last pid:  1678;  load averages:  0.00,  0.00,  0.00   up 0+05:22:18  22:28:00
42 processes:  1 running, 41 sleeping
CPU states:  0.0% user,  0.0% nice,  0.1% system,  0.3% interrupt, 99.6% idle
Mem: 231M Active, 3121M Inact, 203M Wired, 132M Cache, 112M Buf, 73M Free
Swap: 4096M Total, 4096M Free

  PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
 1165 root      -8    0 50420K 49924K piperd 3   4:51  0.00%  0.00% restore
 1168 root       4    0 36168K 35624K sbwait 3   1:20  0.00%  0.00% dump
 1170 root      -8    0 36040K 35552K physrd 2   0:51  0.00%  0.00% dump
 1169 root      20    0 36040K 35552K pause  1   0:51  0.00%  0.00% dump
 1171 root      20    0 36040K 35552K pause  3   0:51  0.00%  0.00% dump

waiting further instructions :-)

	danny

Received on Fri Jan 28 2005 - 19:30:18 UTC