On Tue, Feb 03, 2015 at 01:33:15PM -0800, Peter Wemm wrote:
> Sometime in the Dec 10th through Jan 7th timeframe a timing bug has been
> introduced to 11.x/head/-current. With HZ=1000 (the default for bare metal,
> not for a vm); the clocks stop just after 24 days of uptime. This means
> things like cron, sleep, timeouts etc stop working. TCP/IP won't time out or
> retransmit, etc etc. It can get ugly.
>
> The problem is NOT in 10.x/-stable.
>
> We hit this in the freebsd.org cluster, the builds that we used are:
> FreeBSD 11.0-CURRENT #0 r275684: Wed Dec 10 20:38:43 UTC 2014 - fine
> FreeBSD 11.0-CURRENT #0 r276779: Wed Jan 7 18:47:09 UTC 2015 - broken
>
> If you are running -current in a situation where it'll accumulate uptime, you
> may want to take precautions. A reboot prior to 24 days uptime (as horrible a
> workaround as that is) will avoid it.
>
> Yes, this is being worked on.

So the issue is reproducible within 3 minutes after boot with the following
change in kern_clock.c:

volatile int ticks = INT_MAX - (/*hz*/1000 * 3 * 60);

It is fixed (in the proper meaning of the word, not worked around or papered
over) by the patch at the end of this mail.

We already have a history of trying to enable the much less ambitious option
-fno-strict-overflow; see r259045 and the revert in r259422. I do not see any
other way than to try one more time. Too many places in the kernel depend on
correctly wrapping two's-complement arithmetic, among them the callwheel and
the scheduler.

diff --git a/sys/conf/kern.mk b/sys/conf/kern.mk
index c031b3a..eb7ce2f 100644
--- a/sys/conf/kern.mk
+++ b/sys/conf/kern.mk
@@ -158,6 +158,11 @@ INLINE_LIMIT?= 8000
 CFLAGS+= -ffreestanding
 
 #
+# Make signed arithmetic wrap.
+#
+CFLAGS+= -fwrapv
+
+#
 # GCC SSP support
 #
 .if ${MK_SSP} != "no" && \

Received on Wed Feb 04 2015 - 13:29:49 UTC
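
For readers who want to see where the "24 days" figure comes from and why
wrapping matters, here is a minimal userland sketch. It is not kernel code:
the helper names (days_until_wrap, wrapping_add, wrapping_diff,
deadline_passed) are made up for illustration. The wrap is emulated with
unsigned arithmetic, so the result does not depend on -fwrapv; it assumes a
compiler such as GCC or Clang, where converting an out-of-range unsigned
value back to int is defined as modulo 2^N.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Days of uptime before a signed int tick counter reaches INT_MAX. */
static double days_until_wrap(int hz)
{
    return (double)INT_MAX / ((double)hz * 86400.0);
}

/*
 * Wrapping add and subtract, done in unsigned arithmetic (well defined in C).
 * These give the values that plain int "a + b" and "a - b" are expected to
 * give when the code is built with -fwrapv.
 */
static int wrapping_add(int a, int b)
{
    return (int)((unsigned int)a + (unsigned int)b);
}

static int wrapping_diff(int a, int b)
{
    return (int)((unsigned int)a - (unsigned int)b);
}

/* Nonzero once 'now' has reached 'deadline', even across the wrap. */
static int deadline_passed(int now, int deadline)
{
    return wrapping_diff(now, deadline) >= 0;
}

/* Hypothetical demo: a timeout armed 5 ticks before the counter wraps. */
static void tick_wrap_demo(void)
{
    int now = INT_MAX - 5;
    int deadline = wrapping_add(now, 10);   /* lands just past INT_MIN */

    printf("HZ=1000 wraps after %.2f days\n", days_until_wrap(1000));

    /* A naive "now >= deadline" is already true here, which is wrong. */
    assert(now >= deadline);
    /* The wrap-aware difference correctly says 10 ticks remain. */
    assert(!deadline_passed(now, deadline));
    assert(deadline_passed(wrapping_add(now, 10), deadline));
}
```

Calling tick_wrap_demo() from a small main() prints the uptime figure and
runs the checks. The point of the sketch is only that tick comparisons have
to be done on the wrapped difference; if the compiler is allowed to assume
signed overflow never happens, comparisons of that kind can be optimized
into something that breaks at the wrap.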