On 10/15/16 18:18, Ulrich Spörlein wrote:
> Hey all, while 11.x is -STABLE now, this happens to my machine ever
> since I upgraded it to 11-CURRENT years ago. I have no idea when this
> started, actually, but what always happens is this:
>
> - System and X11 are up and running, I keep it running overnight as I'm
>   too lazy to reboot and restart everything.
> - There's a bunch of xterms, Chrome, Clementine-Player and some other
>   programs running
> - Coming back to the machine the next day (or the day after) it will
>   exit the screensaver just fine and then either I can use it for a couple
>   of seconds before it freezes, or it's pretty much dead already. The
>   mouse cursor still moves for a bit, but then also freezes (so is this a
>   GPU problem??)
>
> Now what I currently see on the screen is a clock widget stuck at 18:04
> but conky itself has last updated at 18:00:18 ...
>
> This time I had some SSH sessions from another machine to see some more
> useful things. There was nothing in various logs under /var/log (I also
> can't run dmesg anymore ...)
> I had top(1) running in a loop, this is the last output:
>
> last pid: 25633;  load averages:  0.27,  0.39,  0.36  up 1+23:03:28  18:00:12
> 202 processes: 2 running, 188 sleeping, 11 zombie, 1 waiting
>
> Mem: 8873M Active, 1783M Inact, 5072M Wired, 567M Buf, 132M Free
> ARC: 1844M Total, 469M MFU, 268M MRU, 16K Anon, 96M Header, 1012M Other
> Swap: 4096M Total, 2395M Used, 1701M Free, 58% Inuse
>
>
>   PID USERNAME THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
>    11 root       8 155 ki31     0K   128K CPU0    0 364.6H 772.95% idle
>  3122 uqs       15  28    0  7113M  5861M uwait   0  94:44  13.96% chrome
>  2887 uqs       28  22    0  1394M   237M select  2 172:53   6.98% chrome
>  2890 uqs       11  21    0  1034M   178M select  5 231:21   1.95% chrome
>  1062 root       9  21    0   440M 47220K select  0  67:09   0.98% Xorg
>  3002 uqs       15  25    5  1159M   172M uwait   2  19:09   0.00% chrome
>  3139 uqs       17  25    5  1163M   156M uwait   2  16:15   0.00% chrome
>  3001 uqs       18  25    5  1639M   575M uwait   0  16:05   0.00% chrome
>    12 root      24 -64    -     0K   384K WAIT   -1  10:53   0.00% intr
>  3129 uqs       12  20    0  2820M  1746M uwait   6   8:36   0.00% chrome
>  2822 uqs        9  20    0   217M 81300K select  0   5:10   0.00% conky
>  3174 root       1  20    0 21532K  3188K select  0   4:20   0.00% systat
>  3130 uqs       16  20    0  1058M   131M uwait   4   3:03   0.00% chrome
>  2998 uqs       16  20    0  1110M   123M uwait   2   2:53   0.00% chrome
>  3165 uqs       10  20    0  1209M   215M uwait   6   2:52   0.00% chrome
>  3142 uqs       11  25    5  1344M   195M uwait   2   2:46   0.00% chrome
>  2876 uqs       19  20    0   580M 37164K select  3   2:42   0.00% clementine-player
>    20 root       2 -16    -     0K    32K psleep  6   2:25   0.00% pagedaemon
>
> I also had systat -vm running and it continued to update its screen ...
> for a short while, this is the last update before SSH died:
>
>
> Mem usage: 0k%Phy 5%Kmem
> Mem: KB REAL VIRTUAL VN PAGER SWAP PAGER
> Tot Share Tot Share Free in out in out
> Act 11051k 67868 71051992 255448 61840 count
> All 11051k 67924 71058776 262100 pages
> Proc: Interrupts
> r p d s w Csw Trp Sys Int Sof Flt ioflt 224 total
> 25 730 11 724 109 404 101 13 cow 2 ehci0 16
> zfod 3 ehci1 23
> 0.0%Sys 0.1%Intr 0.0%User 0.0%Nice 99.9%Idle ozfod 16 cpu0:timer
> | | | | | | | | | | %ozfod xhci0 264
> daefr 3 em0 265
> 50 dtbuf prcfr 94 hdac1 266
> Namei Name-cache Dir-cache 349167 desvn totfr ahci0 270
> Calls hits % hits % 349155 numvn react 5 cpu1:timer
> 121 121 100 253501 frevn pdwak 1 cpu2:timer
> pdpgs 29 cpu7:timer
> Disks   md0  ada0  ada1 pass0 pass1 pass2 intrn 12 cpu3:timer
> KB/t   0.00  0.00  0.00  0.00  0.00  0.00 5318892 wire 41 cpu6:timer
> tps       0     0     0     0     0     0 9261404 act 12 cpu5:timer
> MB/s   0.00  0.00  0.00  0.00  0.00  0.00 1598184 inact 6 cpu4:timer
> %busy     0     0     0     0     0     0 cache vgapci0
> 61840 free
> 712304 buf
>
>
> Why do I have a Chrome tab using about 6G? What other sort of debugging
> output can be helpful to get to the bottom of this? The machine still
> responds to pings just fine, TCP connections get set up but the SSH
> handshake never completes.
>
> This always happens between 30-50h and is super annoying and has been
> going on for >1year. Help?
>
> Note, I cut the power to the monitor overnight to save electricity, can
> this mess up something in the Radeon card or X server? What combinations
> would be most useful to try next?
>

Hi,

Sounds like a memory leak. Can you track the memory use over time? Did you look at the output from:

vmstat -m

?

--HPS

Received on Sat Oct 15 2016 - 14:21:28 UTC
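
One possible way to follow up on the suggestion to track memory use over time is to save `vmstat -m` output at intervals and compare two snapshots. The sketch below is not from the thread. It assumes rows of the form `Type InUse MemUse ...` with MemUse printed in kilobytes and a trailing `K`; the helper names `parse_vmstat_m` and `top_growth` are hypothetical, and the sample figures are made up purely to show the input shape.

```python
import re

# Assumed row layout: "<type name> <InUse> <MemUse>K ...".
# Type names may contain spaces, so the name is matched lazily.
ROW = re.compile(r"^\s*(?P<name>\S.*?)\s+(?P<inuse>\d+)\s+(?P<memk>\d+)K\b")


def parse_vmstat_m(text):
    """Map malloc type name -> MemUse in kilobytes for one snapshot."""
    usage = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:  # the header line has no numeric columns and is skipped
            usage[m.group("name")] = int(m.group("memk"))
    return usage


def top_growth(before, after, limit=5):
    """Return (name, delta_kb) pairs for types that grew, largest first."""
    deltas = [(name, kb - before.get(name, 0)) for name, kb in after.items()]
    grown = [d for d in deltas if d[1] > 0]
    return sorted(grown, key=lambda d: d[1], reverse=True)[:limit]


# Made-up snapshots (illustrative only, not data from the affected machine).
SNAP_A = """\
         Type InUse MemUse HighUse Requests  Size(s)
       devbuf  1200  4000K       -     1300  16,32,64
   CAM periph    10     3K       -       40  16,32
      solaris  5000 90000K       -   800000  16,32,64,128
"""
SNAP_B = """\
         Type InUse MemUse HighUse Requests  Size(s)
       devbuf  1210  4100K       -     1500  16,32,64
   CAM periph    10     3K       -       44  16,32
      solaris  9000 250000K       -  1900000  16,32,64,128
"""

if __name__ == "__main__":
    growth = top_growth(parse_vmstat_m(SNAP_A), parse_vmstat_m(SNAP_B))
    for name, delta in growth:
        print(f"{name}: +{delta}K")
```

In practice the two snapshots would be saved `vmstat -m` outputs taken some hours apart. Note that `vmstat -m` reports kernel malloc types only, so a comparison like this would not by itself explain a large userland process such as the Chrome tab mentioned above.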