Re: CURRENT slow and shaky network stability

From: O. Hartmann <ohartman_at_zedat.fu-berlin.de>
Date: Sat, 9 Apr 2016 10:54:44 +0200
On Mon, 04 Apr 2016 23:46:08 -0700,
Cy Schubert <Cy.Schubert_at_komquats.com> wrote:

> In message <20160405082047.670d7241_at_freyja.zeit4.iv.bundesimmobilien.de>,
> "O. Hartmann" writes:
> > On Sat, 02 Apr 2016 16:14:57 -0700
> > Cy Schubert <Cy.Schubert_at_komquats.com> wrote:
> >
> > > In message <20160402231955.41b05526.ohartman_at_zedat.fu-berlin.de>,
> > > "O. Hartmann" writes:
> > > > 
> > > > On Sat, 2 Apr 2016 11:39:10 +0200
> > > > "O. Hartmann" <ohartman_at_zedat.fu-berlin.de> wrote:
> > > >
> > > > > On Sat, 2 Apr 2016 10:55:03 +0200
> > > > > "O. Hartmann" <ohartman_at_zedat.fu-berlin.de> wrote:
> > > > >
> > > > > > On Sat, 02 Apr 2016 01:07:55 -0700
> > > > > > Cy Schubert <Cy.Schubert_at_komquats.com> wrote:
> > > > > >
> > > > > > > In message <56F6C6B0.6010103_at_protected-networks.net>, Michael Butler
> > > > > > > writes:
> > > > > > > > -current is not great for interactive use at all. The strategy of
> > > > > > > > pre-emptively dropping idle processes to swap is hurting .. big time.
> > > > > > >
> > > > > > > FreeBSD doesn't "preemptively" or arbitrarily push pages out to disk.
> > > > > > > LRU doesn't do this.
> > > > > > >
> > > > > > > >
> > > > > > > > Compare inactive memory to swap in this example ..
> > > > > > > >
> > > > > > > > 110 processes: 1 running, 108 sleeping, 1 zombie
> > > > > > > > CPU:  1.2% user,  0.0% nice,  4.3% system,  0.0% interrupt, 94.5% idle
> > > > > > > > Mem: 474M Active, 1609M Inact, 764M Wired, 281M Buf, 119M Free
> > > > > > > > Swap: 4096M Total, 917M Used, 3178M Free, 22% Inuse
> > > > > > >
> > > > > > > To analyze this you need to capture vmstat output. You'll see the free
> > > > > > > pool dip below a threshold and pages go out to disk in response. If you
> > > > > > > have daemons with small working sets, pages that are not part of the
> > > > > > > working sets for daemons or applications will eventually be paged out.
> > > > > > > This is not a bad thing. In your example above, the 281 MB of UFS
> > > > > > > buffers are more active than the 917 MB paged out. If it's paged out and
> > > > > > > never used again, then it doesn't hurt. However the 281 MB of buffers
> > > > > > > saves you I/O. The inactive pages are part of your free pool that were
> > > > > > > active at one time but now are not. They may be reclaimed and if they
> > > > > > > are, you've just saved more I/O.
> > > > > > >
> > > > > > > Top is a poor tool to analyze memory use. Vmstat is the better tool to
> > > > > > > help understand memory use. Inactive memory isn't a bad thing per se.
> > > > > > > Monitor page outs, scan rate and page reclaims.
> > > > > > >
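
[Interjecting in my own reply here: what I take from this advice is that the sampling
should run on the box itself, detached from the ssh session, so the data survives a
broken pipe. A minimal sketch of what I intend to run next time (csh syntax, log paths
only as examples):

    nohup vmstat 5 >& ~/vmstat.log &
    nohup iostat 5 >& ~/iostat.log &

The quoted output below is what I had captured over ssh before the session died.]
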
> > > > > >
> > > > > > I give up! Tried to check via ssh/vmstat what is going on. Last lines
> > > > > > before broken pipe:
> > > > > >
> > > > > > [...]
> > > > > > procs  memory       page                    disks     faults         cpu
> > > > > > r b w  avm   fre   flt  re  pi  po    fr   sr ad0 ad1   in     sy     cs us sy id
> > > > > > 22 0 22 5.8G  1.0G 46319   0   0   0 55721 1297   0   4  219  23907   5400 95  5  0
> > > > > > 22 0 22 5.4G  1.3G 51733   0   0   0 72436 1162   0   0  108  40869   3459 93  7  0
> > > > > > 15 0 22  12G  1.2G 54400   0  27   0 52188 1160   0  42  148  52192   4366 91  9  0
> > > > > > 14 0 22  12G  1.0G 44954   0  37   0 37550 1179   0  39  141  86209   4368 88 12  0
> > > > > > 26 0 22  12G  1.1G 60258   0  81   0 69459 1119   0  27  123 779569 704359 87 13  0
> > > > > > 29 3 22  13G  774M 50576   0  68   0 32204 1304   0   2  102 507337 484861 93  7  0
> > > > > > 27 0 22  13G  937M 47477   0  48   0 59458 1264   3   2  112  68131  44407 95  5  0
> > > > > > 36 0 22  13G  829M 83164   0   2   0 82575 1225   1   0  126  99366  38060 89 11  0
> > > > > > 35 0 22 6.2G  1.1G 98803   0  13   0 121375 1217   2   8  112  99371   4999 85 15  0
> > > > > > 34 0 22  13G  723M 54436   0  20   0 36952 1276   0  17  153  29142   4431 95  5  0
> > > > > > Fssh_packet_write_wait: Connection to 192.168.0.1 port 22: Broken pipe
> > > > > >
> > > > > > This makes this crap system completely unusable. The server (FreeBSD
> > > > > > 11.0-CURRENT #20 r297503: Sat Apr  2 09:02:41 CEST 2016 amd64) in question
> > > > > > did a poudriere bulk job. I can not even determine which terminal goes
> > > > > > down first - another one, much more time idle than the one showing the
> > > > > > "vmstat 5" output, is still alive!
> > > > > >
> > > > > > I consider this a serious bug and it is no benefit what happened since
> > > > > > this "fancy" update. :-(
> > > > >
> > > > > By the way - it might be of interest and some hint.
> > > > >
> > > > > One of my boxes is acting as server and gateway. It utilises NAT and IPFW.
> > > > > When it is under high load, as it was today, passing the network flow from
> > > > > the ISP into the network for the clients is sometimes extremely slow. I do
> > > > > not consider this the reason for the collapsing ssh sessions, since the
> > > > > incident happens also under no load, but in the overall view of the problem
> > > > > this could be a hint - I hope.
> > > >
> > > > I just checked on one box that "broke the pipe" very quickly after I started
> > > > poudriere, while it had done well for a couple of hours before the pipe
> > > > broke. It seems to be load dependent when the ssh session gets wrecked, but
> > > > more importantly: after the long-haul poudriere run I rebooted the box and
> > > > tried again, with the mentioned broken pipe a couple of minutes after
> > > > poudriere started. Then I left the box for several hours, logged in again and
> > > > checked the swap. Although there had been no load or other pressure for
> > > > hours, 31% of swap was still in use (the box has 16 GB of RAM and is
> > > > propelled by a XEON E3-1245 V2).
> > > >
> > > 
> > > 31%! Is it *actively* paging or is the 31% previously paged out and no
> > > paging is *currently* being experienced? 31% of how much swap space in total?
> > > 
> > > Also, what does ps aumx or ps aumxww say? Pipe it to head -40 or similar.
> > > 
> > >   
> > 
> > On FreeBSD 11.0-CURRENT #4 r297573: Tue Apr  5 07:01:19 CEST 2016 amd64, local
> > network, no NAT. Stuck ssh session in the middle of administering and leaving
> > the console/ssh session for a couple of minutes:
> > 
> > root        2064   0.0  0.1  91416  8492  -  Is   07:18     0:00.03 sshd: hartmann [priv] (sshd)
> > hartmann    2108   0.0  0.1  91416  8664  -  I    07:18     0:07.33 sshd: hartmann_at_pts/0 (sshd)
> > root       72961   0.0  0.1  91416  8496  -  Is   08:11     0:00.03 sshd: hartmann [priv] (sshd)
> > hartmann   72970   0.0  0.1  91416  8564  -  S    08:11     0:00.02 sshd: hartmann_at_pts/1 (sshd)
> > 
> > The situation is worse and I consider this a serious bug.
> >   
> 
> There's not a lot to go on here. Do you have physical access to the machine 
> to pop into DDB and take a look? You did say you're using a lot of swap. 
> IIRC 30%. You didn't answer how much that 30% was of. Without more data I can't
> help you. At best I can take wild guesses, but that won't help you. Try
> to answer the questions I asked last week and we can go further. Until then 
> all we can do is wildly guess.
> 
> 

Apologies for the late answer; I have been busy.

Well, The "homebox" is physical accessible as well as the systems at work, but at work
they are heavily used right now.

As you stated in your previous email, I "overload" the boxes. Yes, I do this
intentionally, and FreeBSD CURRENT withstood those attacks - until approximately 3 or 4
weeks ago, when these problems occurred.

The 30% of swap was what remained after I had started poudriere; poudriere "died" due to
the lost/broken ssh pipe, and the swap usage did not relax even after hours! The box did
not do anything in that time after the pipe broke, which is why I mentioned it.
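
To answer the question of whether it is *actively* paging, I will capture the following
right after the next such run (a minimal sketch of what I mean, nothing more):

    swapinfo -h                  # total swap size and how much of it is in use
    vmstat -s | grep -i page     # cumulative paging counters since boot
    ps aumxww | head -40         # the listing you asked for

That should also answer, in absolute terms, how much the 30%/31% is of.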

You also mentioned UFS and ZFS concurrency. Yes, I use a mixed setup: UFS for the
system's partitions and ZFS for the data volumes. UFS on SSDs feels "faster", but this is
only a subjective impression of mine. Having /usr/ports on both UFS and ZFS, with enough
memory (32 GB RAM), shows significant differences on the very same HDD: while the update
of a "matured" svn tree on UFS has already finished, the ZFS-based tree can take up to 5
or 6 minutes to finish. I think this is due to the growing .svn folder, but on ZFS this
occurs only the first time /usr/ports is updated.
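
For what it is worth, that "comparison" was nothing more than updating both trees and
looking at the wall-clock time (the ZFS path is only an example of how such a tree might
be mounted, not my actual layout):

    time svn update /usr/ports        # tree on UFS
    time svn update /tank/ports       # tree on ZFS, same HDD

So please take the 5 or 6 minutes as a rough observation, not a measurement.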

Just to say: if UFS and ZFS coexistence is critical, this definitely belongs in the
Handbook!

But on the other hand, what I complain about is a dramatic change in the stability of
CURRENT since the reported problems first occurred. Before that, the very same hardware,
the very same setup and the very same jobs performed well. I pushed the boxes to their
limits with poudriere and several scientific jobs, and they took it like a German tank.

By the way, I use csh in all scenarios - I do not know whether this helps.

So, at this moment I am quite unfamiliar with deeper investigation of the FreeBSD OS
using debugging tools, but this has high priority on my to-do list. If someone could
point me towards the right tools and literature (man pages, and maybe the relevant
sections of the FreeBSD development documentation), it would be highly appreciated.
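
So far the only things I know to run are the obvious ones, all locally on the box rather
than over ssh:

    vmstat 5             # watch the re/pi/po/fr/sr columns for paging activity
    systat -vmstat 5     # the same data, continuously updated
    top -SH              # per-thread view including kernel threads
    gstat                # per-provider disk load
    netstat -m           # mbuf usage, in case the network side is starved

Pointers beyond that (DDB, dtrace, ...) are very welcome.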

And another piece of information that just came to my mind:

I use tmpfs for /tmp and /var/run. I also have a GELI-encrypted swap partition (on a
UFS-based SSD).
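
For completeness, the relevant /etc/fstab entries look roughly like this (the swap
device name is only a placeholder, not my actual one):

    tmpfs            /tmp       tmpfs   rw,mode=1777   0   0
    tmpfs            /var/run   tmpfs   rw             0   0
    /dev/ada0p3.eli  none       swap    sw             0   0

The .eli suffix is what makes the system attach the swap partition with a one-time GELI
key at boot.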

Kind regards,

Oliver
