If memory serves me right, sometime around 10:52am, Weldon S Godfrey 3 told me:
>
> Up until yesterday, we have been running FreeBSD-CURRENT of 12/08. We started
> to see a couple months ago some very odd network behavior. Something happens
> to the stack that causes processes accessing the network to just hang. After
> the problem happens, usually (but not always), you can't ssh in. Always, you
> can't ssh or telnet out, and nothing can access the NFS shares on the server.
> You can ping everything from the server. You can't even do a route add, you
> can't ssh if you use just the IP address (although pinging with hostnames it
> doesn't have cached or in hosts table resolves). When you try to ssh out, do
> a route add from the box, the process just hangs. You can't control C it at
> all, it hangs forever. There is nothing in dmesg or messages to indicate an
> issue. I try to up/down the interfaces. In CURRENT-12/08, it may allow
> things to work for like 30s.
>
> We upgraded to 8.0-RC2 yesterday and, at first, the problem appeared to happen
> a lot more often. We expected that was related with the increase in network
> performance. At least in 8.0-RC2, I did see a large amount of input errors
> with netstat -in on the heavily loaded interface before it started the locking
> up behavior. I have replaced the ethernet cable and moved ports. The Catalyst
> 3650 never records any errors. The problem would reoccur in about 5 minutes
> once our load kicked in this morning.
>
>
> One change in this upgrade, we switched from NFS v2 to v3. When we downgraded
> to the previous OS, we stayed at v3. The problem was just about as bad with
> v3 with the 12/08 OS
>
> We went back to RC2 with NFS v2 and appeared to stabilize to a degree.
> It ran for about an hour and a half and then the issue came up
>
> We are currently back to the 12/08 version using NFS2 and watching things.
>
> We are using a Dell PowerEdge 2950-iii, the problem happens when using the
> onboard nics using the bce driver and with an Intel card using the em driver
>
> I am hunting down any MTU/duplex/speed problems that could cause it (haven't
> found any so far). Of course, any problems on the network wouldn't (ideally)
> freak out the network stack on the server. I don't know how to troubleshoot
> this further on the server since I am not getting any problems indicated in
> logging, panics, cores, etc.
>
> Any help is appreciated.
>

I have swapped out the computer, switch, ethernet card, and 3ware card. We are
running 8.0-CURRENT of 12/08, which is what we were using with far fewer
issues. None of it helped. If it happens again, I am going to try a netif
restart and a routing restart, although I believe I tried that at the
beginning and it did not help.

Received on Mon Nov 02 2009 - 20:11:16 UTC