PSA: If you run -current, beware!

From: Peter Wemm <peter_at_wemm.org>
Date: Tue, 03 Feb 2015 13:33:15 -0800
Sometime between Dec 10th and Jan 7th, a timing bug was introduced into 
11.x/head/-current.  With HZ=1000 (the default on bare metal, but not in a 
VM), the clocks stop just after 24 days of uptime.  This means that things 
like cron, sleep, and timeouts stop working.  TCP/IP won't time out or 
retransmit, and so on.  It can get ugly.
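
The "just after 24 days" figure is consistent with a signed 32-bit tick 
counter running out of range at 1000 ticks per second.  That cause is an 
assumption, not something stated in this post.  A minimal sketch of the 
arithmetic in Python (the function name is illustrative, and HZ=100 is 
included only for comparison):

```python
# Assumption: the affected counter is a signed 32-bit tick count that is
# incremented HZ times per second. The post does not confirm the cause.
INT32_MAX = 2**31 - 1
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_wrap(hz: int) -> float:
    """Days of uptime before a signed 32-bit tick counter passes INT32_MAX."""
    return (INT32_MAX + 1) / hz / SECONDS_PER_DAY


if __name__ == "__main__":
    for hz in (100, 1000):
        print(f"HZ={hz}: {days_until_wrap(hz):.2f} days")
```

Under that assumption, HZ=1000 gives a limit of a little under 25 days, and 
a tenfold lower HZ pushes the limit out tenfold.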

The problem is NOT in 10.x/-stable.

We hit this in the freebsd.org cluster.  The builds that we used are:
FreeBSD 11.0-CURRENT #0 r275684: Wed Dec 10 20:38:43 UTC 2014 - fine
FreeBSD 11.0-CURRENT #0 r276779: Wed Jan  7 18:47:09 UTC 2015 - broken

If you are running -current in a situation where it'll accumulate uptime, you 
may want to take precautions.  A reboot prior to 24 days uptime (as horrible a 
workaround as that is) will avoid it.
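
To schedule that workaround, a reboot deadline can be derived from the boot 
time.  This is a rough sketch only: the 24-day limit is the figure quoted in 
this post, the one-day safety margin is a hypothetical choice, and obtaining 
the boot time (for example from the kern.boottime sysctl) is left to the 
caller.

```python
from datetime import datetime, timedelta, timezone

LIMIT = timedelta(days=24)   # uptime figure quoted in the post
MARGIN = timedelta(days=1)   # hypothetical safety margin, adjust to taste


def reboot_deadline(boot_time: datetime) -> datetime:
    """Latest time by which the machine should have been rebooted."""
    return boot_time + LIMIT - MARGIN


def needs_reboot(boot_time: datetime, now: datetime) -> bool:
    """True once the current time has reached the reboot deadline."""
    return now >= reboot_deadline(boot_time)


if __name__ == "__main__":
    # Illustrative boot time only; substitute the real one for your host.
    boot = datetime(2015, 1, 7, 18, 47, 9, tzinfo=timezone.utc)
    print("reboot before:", reboot_deadline(boot).isoformat())
    print("overdue now:", needs_reboot(boot, datetime.now(timezone.utc)))
```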

Yes, this is being worked on.
-- 
Peter Wemm - peter_at_wemm.org; peter_at_FreeBSD.org; peter_at_yahoo-inc.com; KI6FJV
UTF-8: for when a ' or ... just won’t do…
Received on Tue Feb 03 2015 - 20:33:17 UTC
