Re: FreeBSD 8.0 - network stack crashes?

From: Weldon S Godfrey 3 <weldon_at_excelsusphoto.com>
Date: Tue, 3 Nov 2009 08:32:24 -0500 (EST)
If memory serves me right, sometime around Yesterday, Gavin Atkinson told me:

Gavin, thank you A LOT for helping us with this. I have answered as much
as I can below, from the most recent crash.  We did hit max mbufs.  It is
at 25K clusters (25600), which is the default.  I have upped it to 32K
because a rather old article mentioned that as the top end, and I need to
get into work so that I am not trying to go higher over a remote console.
I have already set it to use 64K clusters on the next reboot.  I already
have kmem maxed at 4GB, which is the most that is bootable in 8.0 (or at
least was at one time).  How high can I safely go?  This is an NFS server
running ZFS, acting as the store for a mail system, with sustained
5-minute averages of 120-200Mb/s.

> Some things that would be useful:
>
> - Does "arp -da" fix things?

no, it hangs like ssh, route add, etc

> - What's the output of "netstat -m" while the networking is broken?
Tue Nov  3 07:02:11 CST 2009
36971/2033/39004 mbufs in use (current/cache/total)
24869/731/25600/25600 mbuf clusters in use (current/cache/total/max)
24314/731 mbuf+clusters out of packet secondary zone in use (current/cache)
0/35/35/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
58980K/2110K/61091K bytes allocated to network (current/cache/total)
0/201276/90662 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
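
For reference, the two lines in this output that show the exhaustion are the
"mbuf clusters in use" line (total equals max) and the "requests for mbufs
denied" line (non-zero cluster denials). A minimal Python sketch of pulling
those numbers out of saved `netstat -m` output is below. The function names
are hypothetical, and the line formats are assumed to match the FreeBSD 8.0
output shown above; other releases may word these lines differently.

```python
import re


def cluster_usage(text):
    """Return (current, cache, total, max) from the 'mbuf clusters in use' line."""
    m = re.search(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", text)
    if m is None:
        raise ValueError("no 'mbuf clusters in use' line found")
    return tuple(int(g) for g in m.groups())


def denied_requests(text):
    """Return denied (mbufs, clusters, mbuf+clusters) request counts."""
    m = re.search(r"(\d+)/(\d+)/(\d+) requests for mbufs denied", text)
    if m is None:
        raise ValueError("no 'requests for mbufs denied' line found")
    return tuple(int(g) for g in m.groups())


# Two lines copied from the netstat -m output in this message.
sample = (
    "24869/731/25600/25600 mbuf clusters in use (current/cache/total/max)\n"
    "0/201276/90662 requests for mbufs denied (mbufs/clusters/mbuf+clusters)\n"
)

current, cache, total, limit = cluster_usage(sample)
at_limit = total >= limit  # the cluster zone has grown to its configured max
print("clusters at limit:", at_limit)
print("denied (mbufs, clusters, mbuf+clusters):", denied_requests(sample))
```

Running this against output captured at intervals would show whether the
cluster count climbs steadily toward the limit or jumps there suddenly.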


> - What does CTRL-T show for the hung SSH or route processes?

of the arp:
load: 0.01  cmd: arp 6144 [zonelimit] 0.00u 0.00s 0% 996k

> - What does "procstat -kk" on the same processes show?
Sorry, I couldn't get this to run this time because of remote console issues.

> - Does going to single user mode ("init 1" and killing off any leftover
> processes) cause the machine to start working again?  If so, what's the
> output of "netstat -m" afterwards?

no, mbuf was still maxed out


Below is the last vmstat -m:

          Type InUse MemUse HighUse Requests  Size(s)
   ntfs_nthash     1   512K       -        1
     pfs_nodes    20     5K       -       20  256
          GEOM   262    52K       -     4551  16,32,64,128,256,512,1024,2048
        isadev     9     2K       -        9  128
          cdev    13     4K       -       13  256
         sigio     1     1K       -        1  64
      filedesc   127    64K       -     6412  512,1024
          kenv    75    11K       -       80  16,32,64,128
        kqueue     0     0K       -      188  256,2048
     proc-args    41     2K       -     5647  16,32,64,128
       scsi_cd     0     0K       -      333  16
       ithread   119    21K       -      119  32,128,256
        acpica   888    78K       -   121045  16,32,64,128,256,512,1024
        KTRACE   100    13K       -      100  128
      acpitask     0     0K       -        1  64
        linker   139   596K       -      181  16,32,64,128,256,512,1024,2048
         lockf    11     2K       -      399  64,128
CAM dev queue     4     1K       -        4  128
        ip6ndp     5     1K       -        5  64,128
          temp    48   562K       - 14544952  16,32,64,128,256,512,1024,2048,4096
        devbuf 17105 36341K       -    24988  16,32,64,128,512,1024,2048,4096
        module   420    53K       -      420  128
      mtx_pool     1     8K       -        1
           osd     2     1K       -        2  16
     CAM queue    62    52K       -     2211  16,32,64,128,256,512,1024,2048
       subproc   562   722K       -     6851  512,4096
          proc     2    16K       -        2
       session    33     5K       -      127  128
          pgrp    37     5K       -      190  128
          cred    62    16K       - 29192756  256
       uidinfo     4     3K       -       99  64,2048
        plimit    17     5K       -      910  256
       acpisem    15     1K       -       15  64
     sysctltmp     0     0K       -    13867  16,32,64,128,256,512,1024,2048,4096
     sysctloid  5400   270K       -     5782  16,32,64,128
        sysctl     0     0K       -    11423  16,32,64
       callout     7  3584K       -        7
          umtx   780    98K       -      780  128
      p1003.1b     1     1K       -        1  16
          SWAP     2  3281K       -        2  64
        kbdmux     8     9K       -        8  16,256,512,2048,4096
        bus-sc   103   188K       -     4558  16,32,64,128,256,512,1024,2048,4096
           bus  1174    93K       -    57792  16,32,64,128,256,512,1024
         clist    54     7K       -       54  128
       devstat    32    65K       -       32  32,4096
  eventhandler    64     6K       -       64  64,128
          kobj   276  1104K       -      387  4096
          rman   144    18K       -      601  16,32,128
        mfibuf     3    21K       -       12  32,256,512,2048,4096
          sbuf     0     0K       -    14350  16,32,64,128,256,512,1024,2048,4096
       scsi_da     0     0K       -      504  16
       CAM SIM     4     1K       -        4  256
         stack     0     0K       -      194  256
     taskqueue    13     2K       -       13  16,32,128
        Unitno    11     1K       -     4759  32,64
           iov     0     0K       -     1193  16,64,256,512
        select    98    13K       -       98  128
      ioctlops     0     0K       -    14716  16,32,64,128,256,512,1024,4096
           msg     4    30K       -        4  2048,4096
           sem     4     8K       -        4  512,1024,2048,4096
           shm     1    16K       -        1
           tty    25    25K       -       25  1024
           pts     3     1K       -        3  256
      mbuf_tag     0     0K       -        2  32
         shmfd     1     8K       -        1
    CAM periph    54    14K       -      371  16,32,64,128,256
           pcb    28   157K       -      148  16,32,128,1024,2048,4096
        soname     5     1K       -    18699  16,32,128
        biobuf     4     8K       -        6  2048
      vfscache     1  1024K       -        1
    cl_savebuf     0     0K       -        7  64,128
   export_host     5     3K       -        5  512
      vfs_hash     1   512K       -        1
        vnodes     2     1K       -        2  256
   vnodemarker     0     0K       -     4832  512
         mount   222    15K       -      807  16,32,64,128,256,1024
   ata_generic     1     1K       -        1  1024
           BPF     4     1K       -        4  128
   ether_multi    22     2K       -       24  16,32,64
        ifaddr    54    14K       -       54  32,64,128,256,512,4096
         ifnet     5     9K       -        5  256,2048
         clone     5    20K       -        5  4096
        arpcom     3     1K       -        3  16
      routetbl    65    11K       -      949  32,64,128,256,512
      in_multi     3     1K       -        3  64
     sctp_iter     0     0K       -        3  256
      sctp_ifn     3     1K       -        3  128
      sctp_ifa     4     1K       -        4  128
      sctp_vrf     1     1K       -        1  64
     sctp_a_it     0     0K       -        3  16
     hostcache     1    28K       -        1
    acd_driver     1     2K       -        1  2048
      syncache     1    92K       -        1
     in6_multi    19     2K       -       19  32,64,128
  ip6_moptions     1     1K       -        1  32
       NFS FHA    13     3K       - 18480347  64,2048
           rpc  1381   716K       - 82214178  32,64,128,256,512,2048
audit_evclass   168     6K       -      205  32
        newblk     1     1K       -        1  512
      inodedep     1   512K       -        1
       pagedep     1   128K       -        1
   ufs_dirhash    45     9K       -       45  16,32,64,128,512
     ufs_mount     3    11K       -        3  512,2048
       UMAHash     3   130K       -       12  512,1024,2048,4096
       acpidev    56     4K       -       56  64
     vm_pgdata     2   129K       -        2  128
       CAM XPT   589   369K       -     2047  32,64,128,256,1024
       io_apic     2     4K       -        2  2048
      pci_link    16     2K       -       16  32,128
       memdesc     1     4K       -        1  4096
           msi     3     1K       -        3  128
      nexusdev     3     1K       -        3  16
       entropy  1024    64K       -     1024  64
  twa_commands     2   104K       -      101  256
      atkbddev     2     1K       -        2  64
          UART     6     4K       -        6  16,512,1024
         USBHC     1     1K       -        1  128
        USBdev    30    11K       -       30  16,32,64,128,256,512
           USB   157    54K       -      190  16,32,64,128,256,1024
        DEVFS1   152    76K       -      153  512
        DEVFS3   165    42K       -      167  256
         DEVFS    16     1K       -       17  16,128
       solaris 822038 707024K       - 235790398  16,32,64,128,256,512,1024,2048,4096
    kstat_data     2     1K       -        2  64
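
A table this long is easier to read sorted by MemUse. The following is a
rough Python sketch of doing that, assuming the column layout shown above
(Type, InUse, MemUse in K, a "-" for HighUse, Requests, then sizes). The
function name and the regular expression are illustrative only and have not
been checked against other FreeBSD releases. Wrapped size lists are skipped
because they do not match the row pattern.

```python
import re

# Type (may contain spaces), InUse, MemUse in K, "-" for HighUse, Requests.
ROW = re.compile(r"^\s*(\S.*?)\s+(\d+)\s+(\d+)K\s+-\s+(\d+)")


def top_malloc_types(vmstat_m_output, count=3):
    """Return the `count` largest (type, mem_use_kb) pairs by MemUse."""
    rows = []
    for line in vmstat_m_output.splitlines():
        m = ROW.match(line)
        if m:
            name, _in_use, mem_kb, _requests = m.groups()
            rows.append((name, int(mem_kb)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:count]


# A few rows copied from the table above, including wrapped size lists.
sample = """\
        devbuf 17105 36341K       -    24988
16,32,64,128,512,1024,2048,4096
       callout     7  3584K       -        7
       solaris 822038 707024K       - 235790398
16,32,64,128,256,512,1024,2048,4096
CAM dev queue     4     1K       -        4  128
"""

for name, mem_kb in top_malloc_types(sample):
    print(f"{name:>14} {mem_kb}K")
```

On the full table above, the largest entry by MemUse is "solaris" at 707024K,
followed by "devbuf" at 36341K.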
Received on Tue Nov 03 2009 - 12:32:26 UTC
