Re: Fix for memory leak in setenv/unsetenv

From: John Baldwin <jhb_at_freebsd.org>
Date: Wed, 11 Oct 2006 14:27:41 -0400
On Wednesday 11 October 2006 12:15, Sean C. Farley wrote:
> On Tue, 10 Oct 2006, John Baldwin wrote:
> 
> > On Friday 06 October 2006 21:13, Sean C. Farley wrote:
> >> Many a moon ago[1], I put together a patch to fix the leak in
> >> setenv() and unsetenv().  A few months ago, I submitted a PR
> >> (kern/99826[2]) for the final fix.  I was wondering if anyone would
> >> take a look at it to see if any changes are still warranted.  The PR
> >> contains information about the patch and sample programs to test it
> >> out.
> >>
> >> Thank you.
> >>
> >> Sean
> >>    1. 
http://lists.freebsd.org/pipermail/freebsd-hackers/2005-February/010463.html
> >>    2. http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/99826
> >
> > This still won't work.  The reason for the intentional leak is because
> > of this code sequence:
> >
> > 	char *a;
> >
> > 	setenv("FOO", "0", 1);
> > 	a = getenv("FOO");
> > 	setenv("FOO", "bar", 1);
> > 	printf("FOO was %s\n", a);
> >
> > With the memory leak fixed this will use free'd memory.  While this
> > code may seem weird in a program, it actually is quite possible for a
> > library to read and cache the value of an environment variable.  If
> > you didn't leave the leak around, the library could cause a crash if
> > the main program (or another library) changed the environment variable
> > the first library had a cached pointer to the value of.
> 
> Although it would not crash, the following would fail anyway:
> 
>   	setenv("FOO", "bar", 1);
>   	a = getenv("FOO");
>   	setenv("FOO", "0", 1);
>   	printf("FOO was %s\n", a);
> 
> In this scenario, the printf() would print "0" since the second value
> had a string length less than or equal to the previous value.  The
> current implementation of setenv() would reuse the string instead of
> malloc'ing a new one.

Yeah, but it doesn't crash is the point actually.  The pointer is still
valid, though it may be overwritten with a newer value, it's still valid
and a library can reliably doing getenv() and that pointer will always
point to some value of that variable, but it won't ever point to anything
else.

> Also, this snippet from IEEE Std 1003.1, 2004 Edition regarding
> getenv()[3]:
> 
>      The return value from getenv() may point to static data which may be
>      overwritten by subsequent calls to getenv(), setenv(), or
>      unsetenv().
> 
> After the call to the second setenv(), a portable application should not
> assume that a still points to the same value.  Also, it says "may point
> to static data" suggesting (at least to me) that the pointer may point
> to dynamic memory and be freed following the call to setenv().

No, static memory means it won't be free'd but is in bss, etc.  This
statement basically backs up exactly what getenv/setenv currently do:  the
value may be overwritten.  That paragraph doesn't say that the pointer will
become invalid, just that what it points to may be stale or be overwritten
after the getenv(3) call.

Part of the problem is that we have no way to notify consumers of an 
environment variable when its value is changed.  Alternatively, we could add 
a different variant of getenv that required the user to supply the buffer,
but that's not the API we have.

> > I know for one app at my last job we had a problem with this with TZ,
> > and so we explicitly space padded the timezone name out to a
> > fixed-size each time to avoid the leak.
> 
> This is what I am trying to fix in setenv().  :)

I know, and we went with the above workaround rather than hacking 
setenv/getenv. :)

-- 
John Baldwin
Received on Wed Oct 11 2006 - 17:04:31 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:01 UTC