Re: FreeBSD 8.0 - network stack crashes?

From: Adam Vande More <amvandemore_at_gmail.com>
Date: Tue, 3 Nov 2009 08:18:52 -0600
On Tue, Nov 3, 2009 at 7:32 AM, Weldon S Godfrey 3 <weldon_at_excelsusphoto.com> wrote:

>
>
> If memory serves me right, sometime around Yesterday, Gavin Atkinson told
> me:
>
> Gavin, thank you A LOT for helping us with this.  I have answered as much as
> I can from the most recent crash below.  We did hit max mbufs.  It is at
> 25K clusters, which is the default.  I have upped it to 32K because a rather
> old article mentioned that as the top end, and I need to get into work so
> that I am not trying to go higher over a remote console.  I have already
> set it to reboot next with 64K clusters.  I already have kmem maxed to what
> is bootable (or at least was at one time) in 8.0, 4GB.  How high can I
> safely go?  This is an NFS server running ZFS with sustained 5-minute
> averages of 120-200Mb/s, serving as a store for a mail system.
>
>
>  Some things that would be useful:
>>
>> - Does "arp -da" fix things?
>>
>
> no, it hangs like ssh, route add, etc
>
>
>  - What's the output of "netstat -m" while the networking is broken?
>>
> Tue Nov  3 07:02:11 CST 2009
> 36971/2033/39004 mbufs in use (current/cache/total)
> 24869/731/25600/25600 mbuf clusters in use (current/cache/total/max)
> 24314/731 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/35/35/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 58980K/2110K/61091K bytes allocated to network (current/cache/total)
> 0/201276/90662 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
>
>
>
>  - What does CTRL-T show for the hung SSH or route processes?
>>
>
> of the arp:
> load: 0.01  cmd: arp 6144 [zonelimit] 0.00u 0.00s 0% 996k
>
>
>  - What does "procstat -kk" on the same processes show?
>>
> sorry I couldn't get this to run this time, remote  console issues
>
>
>  - Does going to single user mode ("init 1" and killing off any leftover
>> processes) cause the machine to start working again?  If so, what's the
>> output of "netstat -m" afterwards?
>>
>
> no, mbuf was still maxed out
>
>
> below is the last vmstat -m:
>
>         Type InUse MemUse HighUse Requests  Size(s)
>  ntfs_nthash     1   512K       -        1
>    pfs_nodes    20     5K       -       20  256
>         GEOM   262    52K       -     4551 16,32,64,128,256,512,1024,2048
>       isadev     9     2K       -        9  128
>         cdev    13     4K       -       13  256
>        sigio     1     1K       -        1  64
>     filedesc   127    64K       -     6412  512,1024
>         kenv    75    11K       -       80  16,32,64,128
>       kqueue     0     0K       -      188  256,2048
>    proc-args    41     2K       -     5647  16,32,64,128
>      scsi_cd     0     0K       -      333  16
>      ithread   119    21K       -      119  32,128,256
>       acpica   888    78K       -   121045  16,32,64,128,256,512,1024
>       KTRACE   100    13K       -      100  128
>     acpitask     0     0K       -        1  64
>       linker   139   596K       -      181 16,32,64,128,256,512,1024,2048
>        lockf    11     2K       -      399  64,128
> CAM dev queue     4     1K       -        4  128
>       ip6ndp     5     1K       -        5  64,128
>         temp    48   562K       - 14544952  16,32,64,128,256,512,1024,2048,4096
>       devbuf 17105 36341K       -    24988 16,32,64,128,512,1024,2048,4096
>       module   420    53K       -      420  128
>     mtx_pool     1     8K       -        1
>          osd     2     1K       -        2  16
>    CAM queue    62    52K       -     2211 16,32,64,128,256,512,1024,2048
>      subproc   562   722K       -     6851  512,4096
>         proc     2    16K       -        2
>      session    33     5K       -      127  128
>         pgrp    37     5K       -      190  128
>         cred    62    16K       - 29192756  256
>      uidinfo     4     3K       -       99  64,2048
>       plimit    17     5K       -      910  256
>      acpisem    15     1K       -       15  64
>    sysctltmp     0     0K       -    13867  16,32,64,128,256,512,1024,2048,4096
>    sysctloid  5400   270K       -     5782  16,32,64,128
>       sysctl     0     0K       -    11423  16,32,64
>      callout     7  3584K       -        7
>         umtx   780    98K       -      780  128
>     p1003.1b     1     1K       -        1  16
>         SWAP     2  3281K       -        2  64
>       kbdmux     8     9K       -        8  16,256,512,2048,4096
>       bus-sc   103   188K       -     4558  16,32,64,128,256,512,1024,2048,4096
>          bus  1174    93K       -    57792  16,32,64,128,256,512,1024
>        clist    54     7K       -       54  128
>      devstat    32    65K       -       32  32,4096
>  eventhandler    64     6K       -       64  64,128
>         kobj   276  1104K       -      387  4096
>         rman   144    18K       -      601  16,32,128
>       mfibuf     3    21K       -       12  32,256,512,2048,4096
>         sbuf     0     0K       -    14350  16,32,64,128,256,512,1024,2048,4096
>      scsi_da     0     0K       -      504  16
>      CAM SIM     4     1K       -        4  256
>        stack     0     0K       -      194  256
>    taskqueue    13     2K       -       13  16,32,128
>       Unitno    11     1K       -     4759  32,64
>          iov     0     0K       -     1193  16,64,256,512
>       select    98    13K       -       98  128
>     ioctlops     0     0K       -    14716 16,32,64,128,256,512,1024,4096
>          msg     4    30K       -        4  2048,4096
>          sem     4     8K       -        4  512,1024,2048,4096
>          shm     1    16K       -        1
>          tty    25    25K       -       25  1024
>          pts     3     1K       -        3  256
>     mbuf_tag     0     0K       -        2  32
>        shmfd     1     8K       -        1
>   CAM periph    54    14K       -      371  16,32,64,128,256
>          pcb    28   157K       -      148  16,32,128,1024,2048,4096
>       soname     5     1K       -    18699  16,32,128
>       biobuf     4     8K       -        6  2048
>     vfscache     1  1024K       -        1
>   cl_savebuf     0     0K       -        7  64,128
>  export_host     5     3K       -        5  512
>     vfs_hash     1   512K       -        1
>       vnodes     2     1K       -        2  256
>  vnodemarker     0     0K       -     4832  512
>        mount   222    15K       -      807  16,32,64,128,256,1024
>  ata_generic     1     1K       -        1  1024
>          BPF     4     1K       -        4  128
>  ether_multi    22     2K       -       24  16,32,64
>       ifaddr    54    14K       -       54  32,64,128,256,512,4096
>        ifnet     5     9K       -        5  256,2048
>        clone     5    20K       -        5  4096
>       arpcom     3     1K       -        3  16
>     routetbl    65    11K       -      949  32,64,128,256,512
>     in_multi     3     1K       -        3  64
>    sctp_iter     0     0K       -        3  256
>     sctp_ifn     3     1K       -        3  128
>     sctp_ifa     4     1K       -        4  128
>     sctp_vrf     1     1K       -        1  64
>    sctp_a_it     0     0K       -        3  16
>    hostcache     1    28K       -        1
>   acd_driver     1     2K       -        1  2048
>     syncache     1    92K       -        1
>    in6_multi    19     2K       -       19  32,64,128
>  ip6_moptions     1     1K       -        1  32
>      NFS FHA    13     3K       - 18480347  64,2048
>          rpc  1381   716K       - 82214178  32,64,128,256,512,2048
> audit_evclass   168     6K       -      205  32
>       newblk     1     1K       -        1  512
>     inodedep     1   512K       -        1
>      pagedep     1   128K       -        1
>  ufs_dirhash    45     9K       -       45  16,32,64,128,512
>    ufs_mount     3    11K       -        3  512,2048
>      UMAHash     3   130K       -       12  512,1024,2048,4096
>      acpidev    56     4K       -       56  64
>    vm_pgdata     2   129K       -        2  128
>      CAM XPT   589   369K       -     2047  32,64,128,256,1024
>      io_apic     2     4K       -        2  2048
>     pci_link    16     2K       -       16  32,128
>      memdesc     1     4K       -        1  4096
>          msi     3     1K       -        3  128
>     nexusdev     3     1K       -        3  16
>      entropy  1024    64K       -     1024  64
>  twa_commands     2   104K       -      101  256
>     atkbddev     2     1K       -        2  64
>         UART     6     4K       -        6  16,512,1024
>        USBHC     1     1K       -        1  128
>       USBdev    30    11K       -       30  16,32,64,128,256,512
>          USB   157    54K       -      190  16,32,64,128,256,1024
>       DEVFS1   152    76K       -      153  512
>       DEVFS3   165    42K       -      167  256
>        DEVFS    16     1K       -       17  16,128
>      solaris 822038 707024K       - 235790398  16,32,64,128,256,512,1024,2048,4096
>   kstat_data     2     1K       -        2  64
>
>
> _______________________________________________
> freebsd-current_at_freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
>
From the tuning(7) man page:

     kern.ipc.nmbclusters may be adjusted to increase the number of network
     mbufs the system is willing to allocate.  Each cluster represents
     approximately 2K of memory, so a value of 1024 represents 2M of kernel
     memory reserved for network buffers.  You can do a simple calculation
     to figure out how many you need.  If you have a web server which maxes
     out at 1000 simultaneous connections, and each connection eats a 16K
     receive and 16K send buffer, you need approximately 32MB worth of
     network buffers to deal with it.  A good rule of thumb is to multiply
     by 2, so 32MBx2 = 64MB/2K = 32768.  So for this case you would want to
     set kern.ipc.nmbclusters to 32768.  We recommend values between 1024
     and 4096 for machines with moderate amounts of memory, and between 4096
     and 32768 for machines with greater amounts of memory.  Under no
     circumstances should you specify an arbitrarily high value for this
     parameter, it could lead to a boot-time crash.  The -m option to
     netstat(1) may be used to observe network cluster use.  Older versions
     of FreeBSD do not have this tunable and require that the kernel
     config(8) option NMBCLUSTERS be set instead.
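
As a rough illustration of the sizing rule quoted above, the arithmetic can be sketched in Python. The function name, its parameters, and the defaults are my own assumptions for the example and are not part of FreeBSD. Note that the man page rounds 1000 connections x 32K up to "approximately 32MB", which is how it arrives at 32768; the exact arithmetic for 1000 connections gives 32000, and 1024 connections gives 32768.

```python
# Sketch of the tuning(7) rule of thumb for kern.ipc.nmbclusters.
# estimate_nmbclusters and its defaults are illustrative assumptions,
# not a FreeBSD interface.

CLUSTER_BYTES = 2 * 1024  # each mbuf cluster is approximately 2K


def estimate_nmbclusters(connections, recv_buf=16 * 1024,
                         send_buf=16 * 1024, safety_factor=2):
    """Estimate the cluster count needed for a peak connection load."""
    # Buffer memory needed if every connection fills both socket buffers.
    buffer_bytes = connections * (recv_buf + send_buf)
    # Apply the "multiply by 2" rule of thumb, then convert to clusters.
    return (buffer_bytes * safety_factor) // CLUSTER_BYTES


exact = estimate_nmbclusters(1000)    # exact arithmetic for 1000 connections
rounded = estimate_nmbclusters(1024)  # matches the man page's rounded figure
print(exact, rounded)
```

For an NFS server the connection count and buffer sizes will differ from the web server example, so the inputs would need to be measured rather than taken from these defaults.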

     More and more programs are using the sendfile(2) system call to
     transmit files over the network.  The kern.ipc.nsfbufs sysctl controls
     the number of file system buffers sendfile(2) is allowed to use to
     perform its work.  This parameter nominally scales with kern.maxusers
     so you should not need to modify this parameter except under extreme
     circumstances.  See the TUNING section in the sendfile(2) manual page
     for details.
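
Since the man page points at netstat -m for observing cluster use, here is a minimal Python sketch of checking whether the cluster zone is at its limit. It parses only the "mbuf clusters in use" line, using the line from the netstat -m output quoted earlier in this message as sample input. The function names are hypothetical, and the line format is assumed to match that quoted output.

```python
import re

# Sketch only: parse_cluster_line and clusters_exhausted are hypothetical
# helpers, and the pattern assumes the line format shown in the quoted
# netstat -m output (current/cache/total/max).


def parse_cluster_line(line):
    """Parse the 'mbuf clusters in use' line of netstat -m into a dict."""
    m = re.match(r"\s*(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", line)
    if m is None:
        raise ValueError("not an mbuf cluster line")
    current, cache, total, maximum = map(int, m.groups())
    return {"current": current, "cache": cache,
            "total": total, "max": maximum}


def clusters_exhausted(stats):
    """True when the total allocated clusters have reached the limit."""
    return stats["total"] >= stats["max"]


line = "24869/731/25600/25600 mbuf clusters in use (current/cache/total/max)"
stats = parse_cluster_line(line)
print(stats, clusters_exhausted(stats))
```

With the quoted figures the total equals the max of 25600, which is consistent with the nonzero "requests for mbufs denied" counters in the same output.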



-- 
Adam Vande More
Received on Tue Nov 03 2009 - 13:18:53 UTC
