Re: <jemalloc>: jemalloc_arena.c:182: Failed assertion: "p[i] == 0"

From: Steve Wills <swills_at_FreeBSD.org>
Date: Mon, 7 May 2012 15:19:25 -0400
> On Apr 21, 2012, at 11:54 AM, David Wolfskill wrote:
>> After applying Dimitry Andric's patches to contrib/jemalloc and replacing
>> /usr/bin/as with one built last Sunday, I was finally(!) able to rebuild
>> head as of 234536:
>>
>> FreeBSD freebeast.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #797
>> 234536M: Sat Apr 21 10:23:33 PDT 2012
>> root@freebeast.catwhisker.org:/usr/obj/usr/src/sys/GENERIC  i386
>>
>> However, as I was copying a /usr/obj hierarchy via tar -- e.g.:
>>
>> root@freebeast:/common/home/david # (cd /var/tmp && rm -fr obj && mkdir obj) && (cd /usr && tar cpf - obj) | (cd /var/tmp && tar xpf -)
>>
>> it ran for a while, then:
>>
>> <jemalloc>: jemalloc_arena.c:182: Failed assertion: "p[i] == 0"
>> Abort (core dumped)
>> root@freebeast:/common/home/david # echo $?
>> 134
>> root@freebeast:/common/home/david # ls -lTio *.core
>> ls: No match.
>> root@freebeast:/common/home/david #
>>
>> So ... no core file, apparently.
>>
>> freebeast(10.0-C)[2] find /usr/src/contrib/jemalloc -type f -name jemalloc_arena.c
>> freebeast(10.0-C)[3]
>>
>> No file named "jemalloc_arena.c", either.
>>
>> But contrib/jemalloc/src/arena.c contains a function,
>> arena_chunk_validate_zeroed():
>>
>>    175 static inline void
>>    176 arena_chunk_validate_zeroed(arena_chunk_t *chunk, size_t run_ind)
>>    177 {
>>    178         size_t i;
>>    179         UNUSED size_t *p = (size_t *)((uintptr_t)chunk + (run_ind << LG_PAGE));
>>    180
>>    181         for (i = 0; i < PAGE / sizeof(size_t); i++)
>>    182                 assert(p[i] == 0);
>>    183 }
>>
>> Thoughts?
>
> I received a similar report yesterday in the context of filezilla, but
> didn't get as far as reproducing it.  I think the problem is in
> chunk_alloc_dss(), which dangerously claims that newly allocated memory is
> zeroed.  It looks like I formalized this bad assumption in early 2010,
> though the bug existed before that.  It's a bigger deal now because sbrk()
> is preferred over mmap(), so the bug has languished for a couple of years.
> I'll get a fix committed today (and revert the order of preference
> between sbrk() and mmap()).
>
> By the way, I wonder why not everyone hits this (I don't).
>

I just hit the same issue while using ports tinderbox. The "makeJail"
tinderbox subcommand invoked tar, which failed with the same assertion as
in the subject. Oddly, I had run the same command (on a different "jail")
right before this without hitting the error. What's the status of this?
Should I set MALLOC_PRODUCTION=yes in /etc/make.conf, rebuild world, and
forget about it?

Steve
Received on Mon May 07 2012 - 17:19:29 UTC
