Re: lang/sbcl consumes all available memory and dies

From: Anonymous <swell.k_at_gmail.com>
Date: Tue, 17 Mar 2009 03:09:09 +0300

Kostik Belousov <kostikbel_at_gmail.com> writes:

> On Tue, Mar 17, 2009 at 12:24:03AM +0300, Anonymous wrote:
>> Kostik Belousov <kostikbel_at_gmail.com> writes:
>> 
>> > On Mon, Mar 16, 2009 at 08:55:14PM +0300, Anonymous wrote:
>> >> I noticed that after commit r189771 (ELF: .note.ABI-tag) sbcl
>> >> starts to eat all memory until it dies from bus error never reaching
>> >> REPL. The process is unkillable, too.
>> >> 
>> >>   $ sbcl
>> >>   This is SBCL 1.0.25, an implementation of ANSI Common Lisp.
>> >>   More information about SBCL is available at <http://www.sbcl.org/>.
>> >> 
>> >>   SBCL is free software, provided as is, with absolutely no warranty.
>> >>   It is mostly in the public domain; some portions are provided under
>> >>   BSD-style licenses.  See the CREDITS and COPYING files in the
>> >>   distribution for more information.
>> >>   load: 0.06  cmd: sbcl 1926 [running] 0.01u 0.44s 3% 189432k
>> >>   load: 0.06  cmd: sbcl 1926 [tx->tx_quiesce_done_cv)] 0.01u 0.72s 5% 367124k
>> >>   load: 0.78  cmd: sbcl 1926 [running] 0.01u 2.91s 14% 1763028k
>> >>   load: 0.72  cmd: sbcl 1926 [tx->tx_quiesce_done_cv)] 0.01u 3.65s 14% 2237272k
>> >>   load: 0.74  cmd: sbcl 1926 [*vm page queue mutex] 0.01u 5.78s 9% 3482892k
>> >>   zsh: bus error (core dumped)  sbcl
>> >> 
>> >> This is amd64, r189876M, zfs, 4g mem, 4g swap, sbcl 1.0.17, sbcl-1.0.25,
>> >> 1.0.26.3. I can reproduce it under qemu with clean environment as well.
>> >> 
>> >> Can somebody confirm it on i386? Just run `sbcl' and exit from REPL by
>> >> either `^D' or `(quit)'.
>> >> 
>> >> The workaround is to reverse-apply diff from r189771.
>> >
>> > I think the D-state is due to quite large vm address space of the lisp,
>> > that takes a long time to dump.
>> > For the start, can you confirm that setting sysctl
>> > machdep.prot_fault_translation to 2 solves your problem ?
>> 
>> Yep, machdep.prot_fault_translation=2 solves it on my main amd64 box and
>> in qemu-amd64. Anything else?
>
> Please, try this patch.
>
> diff --git a/sys/kern/imgact_elf.c b/sys/kern/imgact_elf.c
> index f2bdcf5..5604ea5 100644
> --- a/sys/kern/imgact_elf.c
> +++ b/sys/kern/imgact_elf.c
> @@ -1330,14 +1330,14 @@ __elfN(check_note)(struct image_params *imgp, Elf_Brandnote *checknote,
>      int32_t *osrel)
>  {
>  	const Elf_Note *note, *note_end;
> -	const Elf32_Phdr *phdr, *pnote;
> -	const Elf32_Ehdr *hdr;
> +	const Elf_Phdr *phdr, *pnote;
> +	const Elf_Ehdr *hdr;
>  	const char *note_name;
>  	int i;
>  
>  	pnote = NULL;
> -	hdr = (const Elf32_Ehdr *)imgp->image_header;
> -	phdr = (const Elf32_Phdr *)(imgp->image_header + hdr->e_phoff);
> +	hdr = (const Elf_Ehdr *)imgp->image_header;
> +	phdr = (const Elf_Phdr *)(imgp->image_header + hdr->e_phoff);
>  
>  	for (i = 0; i < hdr->e_phnum; i++) {
>  		if (phdr[i].p_type == PT_NOTE) {

Double-checked on a more recent revision (r189900) under qemu-amd64, with
and without the patch. With the patch applied, the problem disappears.

I haven't tested i386, though.
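
For anyone wondering why the Elf32_* -> Elf_* change in the patch matters
on amd64, here is a minimal sketch of the layout difference. It is not
taken from the kernel sources: struct ehdr32, struct ehdr64 and the helper
functions are hypothetical stand-ins that only mirror the field order of
the standard 32-bit and 64-bit ELF headers, and the sample values are made
up. The point is that e_phoff and e_phnum sit at different offsets in the
two layouts, so reading a 64-bit image header through the 32-bit struct
picks up unrelated bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in mirroring the 32-bit ELF header layout. */
struct ehdr32 {
	unsigned char e_ident[16];
	uint16_t e_type, e_machine;
	uint32_t e_version;
	uint32_t e_entry, e_phoff, e_shoff;	/* 4-byte addresses/offsets */
	uint32_t e_flags;
	uint16_t e_ehsize, e_phentsize, e_phnum;
	uint16_t e_shentsize, e_shnum, e_shstrndx;
};

/* Hypothetical stand-in mirroring the 64-bit ELF header layout. */
struct ehdr64 {
	unsigned char e_ident[16];
	uint16_t e_type, e_machine;
	uint32_t e_version;
	uint64_t e_entry, e_phoff, e_shoff;	/* 8-byte addresses/offsets */
	uint32_t e_flags;
	uint16_t e_ehsize, e_phentsize, e_phnum;
	uint16_t e_shentsize, e_shnum, e_shstrndx;
};

/* View the raw header through the 32-bit layout (the pre-patch cast). */
static void
view_as_32(const void *image_header, uint64_t *phoff, uint16_t *phnum)
{
	struct ehdr32 h;

	memcpy(&h, image_header, sizeof(h));
	*phoff = h.e_phoff;
	*phnum = h.e_phnum;
}

/* View the raw header through the 64-bit layout (the patched cast). */
static void
view_as_64(const void *image_header, uint64_t *phoff, uint16_t *phnum)
{
	struct ehdr64 h;

	memcpy(&h, image_header, sizeof(h));
	*phoff = h.e_phoff;
	*phnum = h.e_phnum;
}

/*
 * Build a sample 64-bit header and return 1 if the two views report
 * different program header locations or counts, 0 otherwise.
 */
int
views_disagree(void)
{
	struct ehdr64 h;
	uint64_t off32, off64;
	uint16_t num32, num64;

	memset(&h, 0, sizeof(h));
	h.e_phoff = sizeof(h);	/* program headers right after the header */
	h.e_phnum = 9;		/* made-up count */

	view_as_32(&h, &off32, &num32);
	view_as_64(&h, &off64, &num64);
	return (off32 != off64 || num32 != num64);
}
```

With the 32-bit view, a loop like the one in check_note() would walk the
wrong program header table (or none at all) for a 64-bit binary, so the
PT_NOTE segment would not be found where it actually is. How exactly that
leads to the sbcl behaviour above is not something this sketch shows.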
Received on Mon Mar 16 2009 - 23:09:21 UTC
