[Backtrace] 4.9 and 5.1-RELEASE occasionly panic on RAM > 4GB without PAE (long)

From: Xin LI/ÀîöÎ <delphij_at_frontfree.net>
Date: Thu, 18 Dec 2003 08:40:45 +0800
Hello,

One of my friend is operating servers on a big ICP in China. They found that
if they run FreeBSD 4.9 or FreeBSD 5.1-RELEASE on their server, which has
more than 4GB RAM installed, then they will occasionaly have the server
panic'ed, or even rebooted without any reason.

The server hardware is: P4-Xeon-2.4G*2, 4G RAM(DDR266), 146G*2 on asr(4)
RAID-1. The kernel is not quite different from GENERIC, except that we have
AUTO_EOI1 enabled. When PAE is enabled, then the kernel will not recognize
the asr(4) device, and thus would refuse to boot from it, so we have it
disabled.

 
This problem afffects 4.9-RELEASE and 5.1-RELEASE. As the ICP is a
conservative user, they will not run -STABLE, nor -CURRENT. I have turned
the debugging symbols on their server and grabbed the following backtraces.
Also, I have found how to reproduce this problem on a machine running with >
4GB memory without a PAE configuration, which, in my opinion, should not
cause a kernel panic because it's just a resource hungry program and all it
will do should, at most, hang the system, or even being killed directly by
the operating system.


-----------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
 char* s;
 int p;
 int d;
 d = 4096;
 for(;d>0;d--)
 {
  p=fork();
  if(p==0)
  {
   s=malloc(1024*1024);
   if(s==NULL)
    printf("Pid %d malloc(1024*1024) failed\n",getpid());
   else
    memset(s,0,1024*1024);
   while(1);
  }
  else if(p==-1)
   printf("Fork() %d failed\n",d);
  else
   continue;
 }
 return 0;
}
----------------------------------------------------------------------- 

Unfortuantelly, this will cause panic after a short while. We got these
backtrace:

gdb /usr/src/sys/compile/DELPHIJ/kernel.debug -k vmcore.2

GNU gdb 4.18 (FreeBSD)

Copyright 1998 Free Software Foundation, Inc.

GDB is free software, covered by the GNU General Public License, and you are

welcome to change it and/or distribute copies of it under certain
conditions.

Type "show copying" to see the conditions.

There is absolutely no warranty for GDB.  Type "show warranty" for details.

This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read
called at
/usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line
2627 in elfstab_build_psymtabs

Deprecated bfd_read called at
/usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line
933 in fill_symbuf

 

SMP 4 cpus

IdlePTD at phsyical address 0x003af000

initial pcb at physical address 0x00313fe0

panicstr: pmap_new_proc: u_map allocation failed

panic messages:

---

panic: pmap_new_proc: u_map allocation failed

mp_lock = 02000001; cpuid = 2; lapic.id = 01000000

boot() called on cpu#2

 

syncing disks... 3

done

Uptime: 2h44m38s

 

dumping to dev #da/0x20001, offset 263168

dump 3967 3966 3965..................

.....................................

.....................................

...............10 9 8 7 6 5 4 3 2 1 0

---

#0  dumpsys () at ../../kern/kern_shutdown.c:487

487             if (dumping++) {

(kgdb) bt

#0  dumpsys () at ../../kern/kern_shutdown.c:487

#1  0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316

#2  0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation
failed") at ../../kern/kern_shutdown.c:595

#3  0xc028d575 in pmap_new_proc (p=0xffb5d1e0) at ../../i386/i386/pmap.c:919

#4  0xc0239944 in vm_fork (p1=0xfbfe2ba0, p2=0xffb5d1e0, flags=20) at
../../vm/vm_glue.c:240

#5  0xc017fa85 in fork1 (p1=0xfbfe2ba0, flags=20, procp=0xfc0fef24) at ../..
/kern/kern_fork.c:517

#6  0xc017f1fa in fork (p=0xfbfe2ba0, uap=0xfc0fef80) at
../../kern/kern_fork.c:100

#7  0xc02916e1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
tf_edi = -1077937296, tf_esi = -1077937304, tf_ebp = -1077937392,

      tf_isp = -66064428, tf_ebx = 1, tf_edx = 0, tf_ecx = -1077937066,
tf_eax = 2, tf_trapno = 12, tf_err = 2, tf_eip = 1208488192,

      tf_cs = 31, tf_eflags = 514, tf_esp = -1077937420, tf_ss = 47}) at
../../i386/i386/trap.c:1175

#8  0xc027e09b in Xint0x80_syscall ()

#9  0x8048506 in ?? ()

(kgdb) up

#1  0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316

316                     dumpsys();

(kgdb)

#2  0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation
failed") at ../../kern/kern_shutdown.c:595

595             boot(bootopt);

(kgdb)

#3  0xc028d575 in pmap_new_proc (p=0xffb5d1e0) at ../../i386/i386/pmap.c:919

919                             panic("pmap_new_proc: u_map allocation
failed");

(kgdb) l

914

915             /* get a kernel virtual address for the UPAGES for this proc
*/

916             if ((up = (vm_offset_t) p->p_addr) == 0) {

917                     up = kmem_alloc_nofault(kernel_map, UPAGES *
PAGE_SIZE);

918                     if (up == 0)

919                             panic("pmap_new_proc: u_map allocation
failed");

920                     p->p_addr = (struct user *) up;

921             }

922

923             for(i = 0; i < UPAGES; i++) {

(kgdb) p up

$1 = 0

(kgdb) p {struct proc *}p

$2 = (struct proc *) 0xffb5d040

(kgdb) p {struct proc}p

$3 = {p_procq = {tqe_next = 0xffb5d040, tqe_prev = 0x0}, p_list = {le_next =
0xffb5d380, le_prev = 0xc032ac38}, p_cred = 0xd2898620,

  p_fd = 0xd289e900, p_stats = 0x0, p_limit = 0xd275e700, p_upages_obj =
0xffbf9e60, p_procsig = 0xd289dac0, p_flag = 6, p_stat = 1 '\001',

  p_pad1 = "\000\000", p_pid = 5832, p_hash = {le_next = 0xfd8641a0, le_prev
= 0xd2198b20}, p_pglist = {le_next = 0xffb5d380,

    le_prev = 0xfbfe2bdc}, p_pptr = 0xfbfe2ba0, p_sibling = {le_next =
0xffb5d380, le_prev = 0xfbfe2bf0}, p_children = {lh_first = 0x0},

  p_ithandle = {callout = 0x0}, p_oppid = 0, p_dupfd = 0, p_vmspace =
0xffb239c0, p_estcpu = 7, p_cpticks = 0, p_pctcpu = 0, p_wchan = 0x0,

  p_wmesg = 0x0, p_swtime = 0, p_slptime = 0, p_realtimer = {it_interval =
{tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}},

  p_runtime = 0, p_uu = 0, p_su = 0, p_iu = 0, p_uticks = 0, p_sticks = 0,
p_iticks = 0, p_traceflag = 0, p_tracep = 0x0, p_siglist = {

    __bits = {0, 0, 0, 0}}, p_textvp = 0xfc0310c0, p_lock = 0 '\000',
p_oncpu = 0 '\000', p_lastcpu = 0 '\000', p_rqindex = 0 '\000',

  p_locks = 0, p_simple_locks = 0, p_stops = 0, p_stype = 0, p_step = 0
'\000', p_pfsflags = 0 '\000', p_pad3 = "\000", p_retval = {0, 0},

  p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_oldsigmask = {__bits =
{0, 0, 0, 0}}, p_sig = 0, p_code = 0, p_klist = {

    slh_first = 0x0}, p_sigmask = {__bits = {0, 0, 0, 0}}, p_sigstk = {ss_sp
= 0x0, ss_size = 0, ss_flags = 4}, p_priority = 50 '2',

  p_usrpri = 50 '2', p_nice = 0 '\000', p_comm = "fork_and_malloc_", p_pgrp
= 0xd2740f20, p_sysent = 0xc02f13e0, p_rtprio = {type = 1,

    prio = 0}, p_prison = 0x0, p_args = 0xd2770b40, p_addr = 0x0, p_md =
{md_regs = 0x0}, p_xstat = 0, p_acflag = 0, p_ru = 0x0,

  p_nthreads = 0, p_aioinfo = 0x0, p_wakeup = 0, p_peers = 0x0, p_leader =
0xffb5d1e0, p_asleep = {as_priority = 0, as_timo = 0},

  p_emuldata = 0x0, p_fdtol = 0x0}

(kgdb) 

-----------------------------------------------------------------------

On another machine, we have tested the following CGI script. When having
heavy load, it will cause panic, too:

-----------------------------------------------------------------------

#!/bin/sh

 

printf "content-type:text/html\n\n"

 

printf "<table bordercolor=a0a0a0 border=1 cellpadding=1 cellspacing=1>\n"

 

printf "<tr>\n"

printf "<td colspan=8 align=center>`/sbin/ifconfig|grep broadcast|head -n
1|perl -pi -e 's/ +/\t/g'|cut -f3|perl -pi -e 's/\n//g'`</td>\n"

printf "</tr>\n"

 

printf "<tr>\n"

printf "<td align=center width=100>Load</td>\n"

printf "<td align=center width=100>Connections</td>\n"

printf "<td align=center width=100>Processes</td>\n"

printf "<td align=center width=100>Open Files</td>\n"

printf "<td align=center width=300>Processor(User%% System%% Free%%)</td>\n"

printf "<td align=center width=100>Active Memory</td>\n"

printf "<td align=center width=100>Swap</td>\n"

printf "</tr>\n"

 

printf "<tr>\n"

printf "<td align=right>`w w|head -n 1|perl -pi -e 's/^.*, //g'|perl -pi -e
's/\n/ /g'`</td>\n"

printf "<td align=right>`netstat -an|wc|cut -c1-8|perl -pi -e 's/ //g'|perl
-pi -e 's/\n/ /g'`</td>\n"

printf "<td align=right>`ps ax|wc|cut -c1-8|perl -pi -e 's/ //g'|perl -pi -e
's/\n/ /g'`</td>\n"

printf "<td align=right>`fstat|wc|cut -c1-8|perl -pi -e 's/ //g'|perl -pi -e
's/\n/ /g'`</td>\n"

printf "<td align=right>`vmstat|tail -n 1|perl -pi -e 's/ +/\t/g'|cut
-f18-|perl -pi -e 's/\t/%% /g'|perl -pi -e 's/\n/%%/g'`</td>\n"

printf "<td align=right>`top -n|head -n 4|tail -n 1|perl -pi -e 's/
+/\t/g'|cut -f2|perl -pi -e 's/\n/ /g'`</td>\n"

printf "<td align=right>`top -n|head -n 5|tail -n 1|perl -pi -e 's/
+/\t/g'|cut -f4|perl -pi -e 's/\n/ /g'`</td>\n"

printf "</tr>\n"

 

printf "</table>\n"

-----------------------------------------------------------------------

To simulate the situation, we have used ab:

/usr/local/apache/bin/ab -n 16384000 -c 1024

And we got this:

-----------------------------------------------------------------------

gdb /usr/src/sys/compile/DELPHIJ/kernel.debug -k vmcore.1

GNU gdb 4.18 (FreeBSD)

Copyright 1998 Free Software Foundation, Inc.

GDB is free software, covered by the GNU General Public License, and you are

welcome to change it and/or distribute copies of it under certain
conditions.

Type "show copying" to see the conditions.

There is absolutely no warranty for GDB.  Type "show warranty" for details.

This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read
called at
/usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line
2627 in elfstab_build_psymtabs

Deprecated bfd_read called at
/usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line
933 in fill_symbuf

 

SMP 4 cpus

IdlePTD at phsyical address 0x003af000

initial pcb at physical address 0x00313fe0

panicstr: pmap_new_proc: u_map allocation failed

panic messages:

---

panic: pmap_new_proc: u_map allocation failed

mp_lock = 01000001; cpuid = 1; lapic.id = 06000000

boot() called on cpu#1

 

syncing disks... 1

done

Uptime: 1d0h36m40s

 

dumping to dev #da/0x20001, offset 263168

dump 3967 3966 3965..................

.....................................

.....................................

...............10 9 8 7 6 5 4 3 2 1 0

---

#0  dumpsys () at ../../kern/kern_shutdown.c:487

487             if (dumping++) {

(kgdb) bt

#0  dumpsys () at ../../kern/kern_shutdown.c:487

#1  0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316

#2  0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation
failed") at ../../kern/kern_shutdown.c:595

#3  0xc028d575 in pmap_new_proc (p=0xffa7f680) at ../../i386/i386/pmap.c:919

#4  0xc0239944 in vm_fork (p1=0xfed67780, p2=0xffa7f680, flags=20) at
../../vm/vm_glue.c:240

#5  0xc017fa85 in fork1 (p1=0xfed67780, flags=20, procp=0xfed6ef24) at ../..
/kern/kern_fork.c:517

#6  0xc017f1fa in fork (p=0xfed67780, uap=0xfed6ef80) at
../../kern/kern_fork.c:100

#7  0xc02916e1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
tf_edi = 134960572, tf_esi = 135016448, tf_ebp = -1077938448,

      tf_isp = -19468332, tf_ebx = 134959244, tf_edx = 4, tf_ecx =
-1077938376, tf_eax = 2, tf_trapno = 22, tf_err = 2, tf_eip = 134672192,

      tf_cs = 31, tf_eflags = 582, tf_esp = -1077938476, tf_ss = 47}) at
../../i386/i386/trap.c:1175

#8  0xc027e09b in Xint0x80_syscall ()

#9  0x804b34f in ?? ()

#10 0x804d915 in ?? ()

#11 0x804d403 in ?? ()

#12 0x804d17e in ?? ()

#13 0x804b4fb in ?? ()

#14 0x804ac40 in ?? ()

#15 0x80526be in ?? ()

#16 0x805262a in ?? ()

#17 0x804813e in ?? ()

(kgdb) up

#1  0xc0185beb in boot (howto=256) at ../../kern/kern_shutdown.c:316

316                     dumpsys();

(kgdb)

#2  0xc018605d in panic (fmt=0xc02e15e0 "pmap_new_proc: u_map allocation
failed") at ../../kern/kern_shutdown.c:595

595             boot(bootopt);

(kgdb)

#3  0xc028d575 in pmap_new_proc (p=0xffa7f680) at ../../i386/i386/pmap.c:919

919                             panic("pmap_new_proc: u_map allocation
failed");

(kgdb) l

914

915             /* get a kernel virtual address for the UPAGES for this proc
*/

916             if ((up = (vm_offset_t) p->p_addr) == 0) {

917                     up = kmem_alloc_nofault(kernel_map, UPAGES *
PAGE_SIZE);

918                     if (up == 0)

919                             panic("pmap_new_proc: u_map allocation
failed");

920                     p->p_addr = (struct user *) up;

921             }

922

923             for(i = 0; i < UPAGES; i++) {

(kgdb) p p

$1 = (struct proc *) 0xffa7f680

(kgdb) p {struct proc *}p

$2 = (struct proc *) 0xffa7f4e0

(kgdb) p {struct proc}p

$3 = {p_procq = {tqe_next = 0xffa7f4e0, tqe_prev = 0x0}, p_list = {le_next =
0xffa7f820, le_prev = 0xc032ac38}, p_cred = 0xd3cb9f20,

  p_fd = 0xd3c95d00, p_stats = 0x0, p_limit = 0xd3c9e100, p_upages_obj =
0xff64de04, p_procsig = 0xd3ca7400, p_flag = 4, p_stat = 1 '\001',

  p_pad1 = "\000\000", p_pid = 26450, p_hash = {le_next = 0xfd812dc0,
le_prev = 0xd2198d48}, p_pglist = {le_next = 0xff18c3c0,

    le_prev = 0xfed677bc}, p_pptr = 0xfed67780, p_sibling = {le_next = 0x0,
le_prev = 0xfed677d0}, p_children = {lh_first = 0x0},

  p_ithandle = {callout = 0x0}, p_oppid = 0, p_dupfd = 0, p_vmspace =
0xffae2500, p_estcpu = 184, p_cpticks = 0, p_pctcpu = 0, p_wchan = 0x0,

  p_wmesg = 0x0, p_swtime = 0, p_slptime = 0, p_realtimer = {it_interval =
{tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}},

  p_runtime = 0, p_uu = 0, p_su = 0, p_iu = 0, p_uticks = 0, p_sticks = 0,
p_iticks = 0, p_traceflag = 0, p_tracep = 0x0, p_siglist = {

    __bits = {0, 0, 0, 0}}, p_textvp = 0xf9b5c740, p_lock = 0 '\000',
p_oncpu = 0 '\000', p_lastcpu = 0 '\000', p_rqindex = 0 '\000',

  p_locks = 0, p_simple_locks = 0, p_stops = 0, p_stype = 0, p_step = 0
'\000', p_pfsflags = 0 '\000', p_pad3 = "\000", p_retval = {0, 0},

  p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_oldsigmask = {__bits =
{0, 0, 0, 0}}, p_sig = 0, p_code = 0, p_klist = {

    slh_first = 0x0}, p_sigmask = {__bits = {0, 0, 0, 0}}, p_sigstk = {ss_sp
= 0x0, ss_size = 0, ss_flags = 4}, p_priority = 52 '4',

  p_usrpri = 52 '4', p_nice = 0 '\000', p_comm =
"sh\000pd\000r\000\000\000\000\000\000\000\000\000", p_pgrp = 0xd21ce460,

  p_sysent = 0xc02f13e0, p_rtprio = {type = 1, prio = 0}, p_prison = 0x0,
p_args = 0xd3c97240, p_addr = 0x0, p_md = {md_regs = 0x0},

  p_xstat = 0, p_acflag = 0, p_ru = 0x0, p_nthreads = 0, p_aioinfo = 0x0,
p_wakeup = 0, p_peers = 0x0, p_leader = 0xffa7f680, p_asleep = {

    as_priority = 0, as_timo = 0}, p_emuldata = 0x0, p_fdtol = 0x0}

(kgdb) p kernel_map

$4 = 0xc031a60c

-----------------------------------------------------------------------

I have took a sight on the pmap and vm code, and did not figured out why the
panic was caused. So I post it here, and I will have to say that -CURRENT
may have the same problem, and it may affect the upcoming 5.2-RELEASE. For
additional information, please fell free to contact me, and I can act as
liaison.

 

Any suggestions? Thank you in advance!

 

Xin LI,

FreeBSD (Simplified) Chinese Project
Received on Wed Dec 17 2003 - 15:40:54 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:37:34 UTC