Re: Fix for memory leak in setenv/unsetenv

From: Sean C. Farley <sean-freebsd_at_farley.org>
Date: Thu, 19 Oct 2006 13:22:13 -0500 (CDT)
On Thu, 19 Oct 2006, Stefan Esser wrote:

> Sean C. Farley wrote:
>> On Thu, 19 Oct 2006, John Baldwin wrote:
> [...]
>>> I don't see how you fixed the leak.  You explicitly mention that you
>>> don't free old values, so you are preserving the leak.
>> 
>> I preserve the leak, but I also make use of old entries with matching
>> names if their size is big enough.  The maximum size of the value is
>> recorded at allocation.  The code in the source tree uses the
>> strlen() of the current value which can shrink even if the space is
>> available.
>> 
>> Example:
>> setenv("FOO", "BAR1", 1);
>> w = getenv("FOO");
>> setenv("FOO", "BARBAR1", 1);
>> x = getenv("FOO");
>> setenv("FOO", "BAR2", 1);
>> y = getenv("FOO");
>> setenv("FOO", "BARBAR2", 1);
>> z = getenv("FOO");
>> 
>> This will end up with w == y ("BAR2") and y == z ("BAR1").  w was
>
> Hmmm, I assume you meant: w == y ("BAR2") and x == z ("BARBAR2") ...

Oops.  Yes, you are correct.

>> allocated first in the array of variables.  x is then allocated since
>> w is too small.  y finds w within the array and reuses it.  z does
>> the wame with x.  If the larger value had been allocated first, then
>> all setenv() calls would have used the same storage space.  Yes,
>> there is a leak, but flipping back and forth does not cause the leak
>> to keep growing.  Also, there would be no need to space-pad a value
>> to prevent the leak.
>
> Just curious, why don't you ignore the first slot allocated to "BAR1"
> during the setenv("FOO", "BAR2", 1) and overwrite the value at *x ...
>
> I.e.: w ("BAR1") and x == y == z ("BARBAR2") ...

setenv() uses the first slot with enough space that has the same name.

getenv() is performing a linear search for the first active occurrence
of "FOO" within the array.

> If the maximum size is recorded, there is no reason to prefer *w over
> *x, and the case of a cached pointer might still be wrong but not that
> irritating once the variable has been set to its longest value.

The issue is that environ may already have multiple copies of the same
variable.  The current code does not limit that.  It should, but I was
trying to see if the general idea for this change would be accepted.

> I've got to admit, that I have not looked your patch, but the only
> drawback seems to be that the last instance of a variable in the
> environment space has to be located in getenv() (maximizing the search
> time ...).

Actually, the first active variable found is returned by getenv() even
if another would be found later.  This does make me think that if
changed the way the environment variable array was built to only contain
the first instance of each variable instead of all instances then a
search by getenv() from the end of the array backwards would be faster.
A cheap alternative is to create the array in reverse.

> Always using the last allocated (largest) slot for storage of new
> values of environment variables would result in nearly reasonable
> behavior. A cached pointer does either point to the value of the
> variable at the time of the getenv(), or to the last value assigned to
> the environment variable that does not exceed the allocated size.

I will look at changing it.

Sean
-- 
sean-freebsd_at_farley.org
Received on Thu Oct 19 2006 - 16:22:21 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:01 UTC