In message <20160405092712.131ee52c_at_freyja.zeit4.iv.bundesimmobilien.de>, "O. Hartmann" writes:
> On Mon, 04 Apr 2016 23:46:08 -0700
> Cy Schubert <Cy.Schubert_at_komquats.com> wrote:
>
> > In message <20160405082047.670d7241_at_freyja.zeit4.iv.bundesimmobilien.de>, "O. Hartmann" writes:
> > > On Sat, 02 Apr 2016 16:14:57 -0700
> > > Cy Schubert <Cy.Schubert_at_komquats.com> wrote:
> > >
> > > > In message <20160402231955.41b05526.ohartman_at_zedat.fu-berlin.de>, "O. Hartmann" writes:
> > > > > On Sat, 2 Apr 2016 11:39:10 +0200
> > > > > "O. Hartmann" <ohartman_at_zedat.fu-berlin.de> wrote:
> > > > >
> > > > > > On Sat, 2 Apr 2016 10:55:03 +0200
> > > > > > "O. Hartmann" <ohartman_at_zedat.fu-berlin.de> wrote:
> > > > > >
> > > > > > > On Sat, 02 Apr 2016 01:07:55 -0700
> > > > > > > Cy Schubert <Cy.Schubert_at_komquats.com> wrote:
> > > > > > >
> > > > > > > > In message <56F6C6B0.6010103_at_protected-networks.net>, Michael Butler writes:
> > > > > > > > > -current is not great for interactive use at all. The strategy of
> > > > > > > > > pre-emptively dropping idle processes to swap is hurting .. big time.
> > > > > > > >
> > > > > > > > FreeBSD doesn't "preemptively" or arbitrarily push pages out to disk. LRU
> > > > > > > > doesn't do this.
> > > > > > > >
> > > > > > > > > Compare inactive memory to swap in this example ..
> > > > > > > > >
> > > > > > > > > 110 processes: 1 running, 108 sleeping, 1 zombie
> > > > > > > > > CPU:  1.2% user,  0.0% nice,  4.3% system,  0.0% interrupt, 94.5% idle
> > > > > > > > > Mem: 474M Active, 1609M Inact, 764M Wired, 281M Buf, 119M Free
> > > > > > > > > Swap: 4096M Total, 917M Used, 3178M Free, 22% Inuse
> > > > > > > >
> > > > > > > > To analyze this you need to capture vmstat output. You'll see the free
> > > > > > > > pool dip below a threshold and pages go out to disk in response. If you
> > > > > > > > have daemons with small working sets, pages that are not part of the
> > > > > > > > working sets for daemons or applications will eventually be paged out.
> > > > > > > > This is not a bad thing. In your example above, the 281 MB of UFS buffers
> > > > > > > > are more active than the 917 MB paged out. If it's paged out and never
> > > > > > > > used again, then it doesn't hurt. However, the 281 MB of buffers saves
> > > > > > > > you I/O. The inactive pages are part of your free pool that were active
> > > > > > > > at one time but now are not. They may be reclaimed, and if they are,
> > > > > > > > you've just saved more I/O.
> > > > > > > >
> > > > > > > > Top is a poor tool to analyze memory use. Vmstat is the better tool to
> > > > > > > > help understand memory use. Inactive memory isn't a bad thing per se.
> > > > > > > > Monitor page outs, scan rate and page reclaims.
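For what it's worth, a minimal way to capture those numbers (only a sketch,
assuming nothing beyond the stock base-system tools) is something like:

  # sample VM activity every 5 seconds and keep a copy for later analysis;
  # watch the fre, po (page-out) and sr (scan rate) columns
  vmstat 5 | tee /var/tmp/vmstat.log

  # cumulative counters since boot; run it twice a few minutes apart and
  # compare the paged-out / reactivated / page daemon lines
  vmstat -s

  # the same counters via sysctl, if you prefer to diff two snapshots
  sysctl vm.stats.vm

The log file name is arbitrary; the point is to have a record of the scan
rate and page-out columns at the moment a session drops, rather than
eyeballing top.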
> > > > > > >
> > > > > > > I give up! Tried to check via ssh/vmstat what is going on. Last lines
> > > > > > > before the broken pipe:
> > > > > > >
> > > > > > > [...]
> > > > > > > procs    memory      page                      disks     faults        cpu
> > > > > > >  r b w   avm   fre   flt re pi po     fr   sr ad0 ad1   in     sy     cs us sy id
> > > > > > > 22 0 22 5.8G  1.0G 46319  0  0  0  55721 1297   0   4  219  23907   5400 95  5  0
> > > > > > > 22 0 22 5.4G  1.3G 51733  0  0  0  72436 1162   0   0  108  40869   3459 93  7  0
> > > > > > > 15 0 22  12G  1.2G 54400  0 27  0  52188 1160   0  42  148  52192   4366 91  9  0
> > > > > > > 14 0 22  12G  1.0G 44954  0 37  0  37550 1179   0  39  141  86209   4368 88 12  0
> > > > > > > 26 0 22  12G  1.1G 60258  0 81  0  69459 1119   0  27  123 779569 704359 87 13  0
> > > > > > > 29 3 22  13G  774M 50576  0 68  0  32204 1304   0   2  102 507337 484861 93  7  0
> > > > > > > 27 0 22  13G  937M 47477  0 48  0  59458 1264   3   2  112  68131  44407 95  5  0
> > > > > > > 36 0 22  13G  829M 83164  0  2  0  82575 1225   1   0  126  99366  38060 89 11  0
> > > > > > > 35 0 22 6.2G  1.1G 98803  0 13  0 121375 1217   2   8  112  99371   4999 85 15  0
> > > > > > > 34 0 22  13G  723M 54436  0 20  0  36952 1276   0  17  153  29142   4431 95  5  0
> > > > > > > Fssh_packet_write_wait: Connection to 192.168.0.1 port 22: Broken pipe
> > > > > > >
> > > > > > > This makes this crap system completely unusable. The server (FreeBSD
> > > > > > > 11.0-CURRENT #20 r297503: Sat Apr 2 09:02:41 CEST 2016 amd64) in question
> > > > > > > was doing a poudriere bulk job. I cannot even determine which terminal
> > > > > > > goes down first - another one, idle for much longer than the one showing
> > > > > > > the "vmstat 5" output, is still alive!
> > > > > > >
> > > > > > > I consider this a serious bug, and whatever happened since this "fancy"
> > > > > > > update is no benefit. :-(
> > > > > >
> > > > > > By the way - it might be of interest and some hint.
> > > > > >
> > > > > > One of my boxes is acting as server and gateway. It utilises NAT and IPFW.
> > > > > > When it is under high load, as it was today, passing the network flow from
> > > > > > the ISP to the clients on the inside is sometimes extremely slow. I do not
> > > > > > consider this the reason for the collapsing ssh sessions, since the
> > > > > > incident also happens under no load, but in the overall view of the
> > > > > > problem this could be a hint - I hope.
> > > > >
> > > > > I just checked on one box that "broke the pipe" very quickly after I
> > > > > started poudriere, while it had done well for a couple of hours before the
> > > > > pipe broke. It seems to be load-dependent when the ssh session gets
> > > > > wrecked, but more importantly, after the long-haul poudriere run I
> > > > > rebooted the box, tried again, and got the mentioned broken pipe a couple
> > > > > of minutes after poudriere ran.
> > > > > Then I left the box alone for several hours, logged in again and checked
> > > > > the swap. Although there had been no load or other pressure for hours, 31%
> > > > > of swap was still in use (the box has 16 GB of RAM and is propelled by a
> > > > > XEON E3-1245 V2).
> > > >
> > > > 31%! Is it *actively* paging, or was the 31% previously paged out and no
> > > > paging is *currently* being experienced? 31% of how much swap space in
> > > > total?
> > > >
> > > > Also, what does ps aumx or ps aumxww say? Pipe it to head -40 or similar.
> > >
> > > On FreeBSD 11.0-CURRENT #4 r297573: Tue Apr 5 07:01:19 CEST 2016 amd64, local
> > > network, no NAT. The ssh session got stuck in the middle of administering,
> > > after leaving the console/ssh session alone for a couple of minutes:
> > >
> > > root      2064  0.0  0.1 91416 8492  -  Is  07:18  0:00.03 sshd: hartmann [priv] (sshd)
> > > hartmann  2108  0.0  0.1 91416 8664  -  I   07:18  0:07.33 sshd: hartmann_at_pts/0 (sshd)
> > > root     72961  0.0  0.1 91416 8496  -  Is  08:11  0:00.03 sshd: hartmann [priv] (sshd)
> > > hartmann 72970  0.0  0.1 91416 8564  -  S   08:11  0:00.02 sshd: hartmann_at_pts/1 (sshd)
> > >
> > > The situation is worse, and I consider this a serious bug.
> >
> > There's not a lot to go on here. Do you have physical access to the machine
> > to pop into DDB and take a look? You did say you're using a lot of swap -
> > IIRC 30%. You didn't answer how much 30% was of. Without more data I can't
> > help you. At best I can take wild guesses, but that won't help you. Try to
> > answer the questions I asked last week and we can go further. Until then
> > all we can do is wildly guess.
>
> Hello Cy, sorry for the lack of information.
>
> The machine in question is not accessible at this very moment. The box has 16
> GB of physical RAM, 32 GB of swap (on SSD) and a 4-core/8-thread CPU (I think

So that's 10 GB of swap used. Hmmm. Memory leak? What? We need to investigate
this avenue.

4-core/8-thread: a total of 8 hardware threads. It had 22-29 active processes
in the run queue (based on your vmstat output). It would appear your box is
simply overloaded.

> that is also important due to the allocation of arbitrary memory). The
> problem I described arose when using poudriere. The box uses 6 builders, but
> each builder can, as I understand, spawn several instances of jobs for
> compiling/linking etc.

I'm thinking you have too much loaded onto the box (22-29 active processes in
the run queue and 10 GB of swap used). I'm currently running two poudriere
builds on two separate machines, each dual-core with no hyperthreading (AMD X2
5200+ and 5000+), with three builders, each single-threaded: load average of
about 3-4.

> But - that box is only a placeholder for the weirdness that is going on
> (despite the fact that it is using NAT, since it is attached to a DSL line).
>
> In contrast, the system I face today at work is not(!) behind NAT and doesn't
> have the "toy" network attachment. The box I'm accessing now has 16 GB of
> physical RAM and two sockets, each populated with an oldish 4-core XEON 5XXX
> from the Core2Duo age (no SMT). That box does not run poudriere, only
> postgresql and some other services.

What are the performance stats there? CPU, swap, free memory (memory used),
scan rate, reclaim rate, and also load average?
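Roughly, something like the following would cover that - just a sketch using
base-system commands, captured while the problem is actually happening; use
whatever equivalents you prefer:

  uptime                  # load average
  swapinfo                # swap devices and how much is currently in use
  vmstat 5                # free memory, page-outs (po) and scan rate (sr)
  vmstat -s               # cumulative paging / reclaim counters since boot
  iostat -x 5             # per-disk activity, to rule out a sick disk
  ps aumxww | head -40    # the biggest memory consumers, as asked earlier

A couple of samples taken a few minutes apart are far more useful than a
single snapshot.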
Do you use ZFS, UFS or both? (I have both, and if I use UFS then swap gets
used because the UFS buffer cache displaces infrequently used pages, though
heavy use of the ZFS cache will push some VM out to swap too. Not a big deal,
as I'm using memory I've paid for rather than letting it sit idle. I wouldn't
be this cheap at $JOB though.)

> In February, I was able to push the other box in question (behind NAT, as a
> remark) to its limits using poudriere with 8 builders. The network became
> slow, since the box also acts as gateway, but it never failed, broke or
> dropped the ssh session due to "broken pipe". Without changing the config,
> except the base system's sources for CURRENT, for about two or at most three
> weeks now I have been getting these weird drops. And this is why I started
> "whining" - there is also a drop in performance when compiling world, which
> lengthens the compile time by ~ 5 - 10 minutes on the NATed box.
>
> I'm fully aware of being on CURRENT, but I think it is quite reasonable to
> report such weird things happening now. I did not receive any word about
> dramatic changes that could trigger such behaviour. And as I understand the
> thread we are in here, a change has been made that results in more
> aggressive swapping of inactive processes.
>
> I tried to stop all services on the boxes (postgresql, icinga2, http etc.) to
> check whether those could force the kernel to swap a process. But the loss of

Swapping isn't really performed until memory is critical. (Swapping is:
swapping whole processes out to disk.) Paging, OTOH, is what you're seeing.
That's classic LRU. Scan rate is the determining factor with LRU (or
unreferenced interval count, if you will -- the amount of time since the page
was last referenced).

> the ssh connection, and the very strange behaviour of the ssh connection
> becoming unresponsive, is erratic. That means it sometimes comes very fast,
> after seconds of not touching the xterm/ssh to the remote box, and sometimes
> it takes up to 30 minutes, even under load. So there is probably a problem
> with me understanding this new feature ...

Which new feature?

Erratic and unresponsive sessions could be anything. Your stats look way off,
indicating the one box is overloaded CPU-wise and memory-wise. Erratic
behavior could be due to a number of other factors. What about the other box?
CPU, memory stats, swap... What kind of workload runs on it? And what kind of
NICs do the two boxes have?

It's probably buried in another email, but do you use UFS, ZFS or both? IIRC
only UFS. (UFS and ZFS don't mix. They're like oil and water. The UFS buffer
cache and the ZFS ARC compete for memory. Throw applications into the mix and,
obviously, minor to moderate paging will result.)

I'll cut to the chase. Without precise information, the areas of possible
investigation are too much load (CPU and memory), a possible memory leak (?),
or a possible NIC or network issue causing the disconnects. Also look at
similarities between the two systems -- better to examine both and discover
the similarities here than through a narrow lens.

-- 
Cheers,
Cy Schubert <Cy.Schubert_at_komquats.com> or <Cy.Schubert_at_cschubert.com>
FreeBSD UNIX:  <cy_at_FreeBSD.org>   Web:  http://www.FreeBSD.org

	The need of the many outweighs the greed of the few.