Re: Crash in accounting code: encode_long(), due to bad rusage data?

From: Diomidis Spinellis <dds_at_aueb.gr>
Date: Mon, 20 Aug 2007 01:23:34 +0300
Robert Watson wrote:
> I recently upgraded two servers from FreeBSD 6-STABLE to FreeBSD 
> 7-CURRENT in anticipation of the forthcoming release.  Both of them run 
> with accounting enabled at all times.  When a large pine session was 
> exiting on one of the two boxes, I ran into the following panic:
> 
> panic: encode_long: -ve value -32749

Getting rid of the panic is easy:

--- kern_acct.c	2007-08-20 01:15:18.000000000 +0300
+++ kern_acct.c.new	2007-08-20 01:16:06.000000000 +0300
_at__at_ -523,8 +523,7 _at__at_
  	int norm_exp;	/* Normalized exponent */
  	int shift;

-	KASSERT(val >= 0,  ("encode_long: -ve value %ld", val));
-	if (val == 0)
+	if (val <= 0)
  		return (0);
  	norm_exp = fls(val) - 1;
  	shift = FLT_MANT_DIG - norm_exp - 1;

However, as you wrote, this doesn't fix the root of the problem.

> I find the large negative value in ru_idrss somewhat sad to contemplate, 
> and while this could well be a problem with capturing of process runtime 
> information, I'd like it if the accounting code were robust against this 
> sort of bug, rather than panicking, and I guess I'd also rather than the 
> process runtime information also be correctly captured :-).

Do you think it makes any sense for encode_long to be correctly encoding 
negative numbers, or should we concentrate on locating and fixing the 
negative ru_idrss problem?

Diomidis Spinellis - http://www.spinellis.gr
Received on Sun Aug 19 2007 - 21:32:46 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:16 UTC