Re: r319971 -> r320351: Fatal error 'Cannot allocate red zone for initial thread'

From: Hans Petter Selasky <hps_at_selasky.org>
Date: Thu, 29 Jun 2017 13:16:23 +0200

On 06/26/17 15:03, O. Hartmann wrote:
> On Mon, 26 Jun 2017 14:48:58 +0200
> Gary Jennejohn <gljennjohn_at_gmail.com> wrote:
> 
>> On Mon, 26 Jun 2017 14:00:48 +0200
>> "O. Hartmann" <ohartmann_at_walstatt.org> wrote:
>>
>>> On Mon, 26 Jun 2017 13:26:08 +0200
>>> Gary Jennejohn <gljennjohn_at_gmail.com> wrote:
>>>    
>>>> On Mon, 26 Jun 2017 10:29:47 +0200
>>>> "O. Hartmann" <o.hartmann_at_walstatt.org> wrote:
>>>>      
>>>>> Over the past week we did not update several development hosts
>>>>> running 12-CURRENT, so today is the first day of performing this task.
>>>>>
>>>>> First I hit the very same problem David Wolfskill reported earlier, a
>>>>> fatal trap 12, but following the thread, I did as advised:
>>>>> removing /usr/obj completely (we use filemon/WITH_META_MODE=YES all
>>>>> over the place) and recompiling world and kernel.
>>>>>
>>>>> Since tag 20170617 in /usr/src/UPDATING referred to the INO64 update
>>>>> and the INO64 update had not been performed so far, starting from
>>>>> r319971, I installed the kernel, rebooted the box in single user mode
>>>>> (this time smoothly), did a mergemaster and tried to do "make
>>>>> installworld" - but the box instantaneously bails out:
>>>>>
>>>>> [...]
>>>>> Fatal error 'Cannot allocate red zone for initial thread' at line 392 in
>>>>> file /usr/src/lib/libthr/thread/thr_init.c
>>>>> pid 60 (cc) uid0: exited on signal 6 ...
>>>>>
>>>>> [...]
>>>>>
>>>>> That way, I obviously cannot install a world :-(
>>>>>
>>>>> What is wrong here? Is the problem resolvable?
>>>>>        
>>>>
>>>> How recent was your last update?  Some changes were made just a few
>>>> hours ago to fix a stack growth problem in threads.
>>>
>>> Well, what do you mean by "...  source is not up to date ..."?
>>> Performing an svn update of /usr/src should suffice, shouldn't
>>> it?  If not, then ...  please correct me.  I think the sources
>>> are up to date as of the moment the bug occurred.
>>>
>>> I consider the sources up to date; the most recently updated
>>> box is at r320355.
>>>    
>>
>> You did not explicitly state in the original post at which SVN
>> revision your code was.  Seems to me that my question was
>> reasonable.
>>
>> Now it's clear that your source should have been up to date.
>>
>> Just for the record, I just booted a kernel from SVN r320357 which
>> immediately resulted in a kernel panic.  I had to delete everything
>> under /usr/obj/usr/src/sys to get a working kernel.
> 
> That has been made clear earlier in the thread, telling us that NO_CLEAN
> and/or META_MODE leaves the object tree in a somehow unusable state. I did so
> twice this morning.
> 
> I have to build a kernel with KTRACE capabilities as requested herein, but
> cannot do this before tomorrow morning.
> 
> Some people seem to report positive updates, but starting from a later svn
> revision. So the problem seems to be transitional ...

Hi,

This happens on systems with an identical 12-current kernel and a
different user-space, such as jails.

Fatal error 'Cannot allocate red zone for initial thread' at line 399 in
file /usr/src/lib/libthr/thread/thr_init.c (errno = 12)

In the one case I have a more recent 12-current user-space. On the other
one, which is failing, I have a 10-stable user-space with a 12-current
kernel.

Here is the kdump leading up to the error:

>   1465 xxx RET   __sysctl 0
>   1465 xxx CALL  __sysctl(0x7fffffffdb90,0x3,0x80127632c,0x7fffffffdc38,0,0)
>   1465 xxx SCTL  "kern.smp.cpus"
>   1465 xxx RET   __sysctl 0
>   1465 xxx CALL  mmap(0,0x400000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0)
>   1465 xxx RET   mmap 34389098496/0x801c00000
>   1465 xxx CALL  thr_self(0x801c06400)
>   1465 xxx RET   thr_self 0
>   1465 xxx CALL  mmap(0x7fffffbfe000,0x1000,0<PROT_NONE>,0x1000<MAP_ANON>,0xffffffff,0)
>   1465 xxx RET   mmap -1 errno 12 Cannot allocate memory
>   1465 xxx CALL  write(0x2,0x7fffffffdbc7,0x1)
>   1465 xxx GIO   fd 2 wrote 1 byte
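
The address in the failing mmap() call can be reproduced by hand. Here is a
minimal sketch, assuming that the top of the amd64 user stack (kern.usrstack)
is 0x7ffffffff000 and that the initial thread stack is 4 MiB; both values are
assumptions and are not taken from the trace. Only the 0x1000 guard size comes
from the trace. The helper name redzone_addr() is hypothetical:

```c
#include <stdint.h>

/* Assumed values, NOT taken from the trace. */
#define USRSTACK      UINT64_C(0x7ffffffff000) /* assumed amd64 kern.usrstack */
#define STACK_INITIAL UINT64_C(0x400000)       /* assumed 4 MiB initial stack */

/* Taken from the trace: length of the failing PROT_NONE mapping. */
#define GUARD_SIZE    UINT64_C(0x1000)

/*
 * Hypothetical helper: lowest address of a guard page ("red zone")
 * placed directly below an initial stack that ends at usrstack.
 */
uint64_t
redzone_addr(uint64_t usrstack, uint64_t stack_size, uint64_t guard_size)
{
	return usrstack - stack_size - guard_size;
}
```

With these inputs redzone_addr() yields 0x7fffffbfe000, the same address that
appears in the failing call. That suggests the request is for a hint address
just below the main thread's stack region rather than an arbitrary one.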

Does this give any hints about a solution? Is this related to
sign-extension of the -1U (0xffffffff) fd parameter near the end of the
mmap() argument list?
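
To make the sign-extension question concrete: kdump prints the fd argument as
0xffffffff, which is what -1 looks like as a 32-bit value. Below is a small
sketch of the two ways such a value can be widened to 64 bits. The helper
names are hypothetical, and the trace does not show whether the kernel's
mmap() path does either of these:

```c
#include <stdint.h>

/*
 * Widen a raw 32-bit argument with sign extension.
 * 0xffffffff stays -1. The uint32_t -> int32_t conversion relies on
 * two's complement behaviour, which common compilers provide.
 */
int64_t
widen_signed(uint32_t raw)
{
	return (int64_t)(int32_t)raw;
}

/*
 * Widen the same bits with zero extension.
 * 0xffffffff becomes 4294967295, which no longer compares equal to -1.
 */
int64_t
widen_unsigned(uint32_t raw)
{
	return (int64_t)(uint64_t)raw;
}
```

Small non-negative descriptors come out the same either way; only values with
the top bit set, such as -1, differ. If the fd were zero-extended somewhere, a
comparison against -1 on the 64-bit value would not match.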

--HPS
Received on Thu Jun 29 2017 - 09:18:43 UTC