7-BETA3 everyday reboot (was: BETA3 crash)

From: Alexandre Biancalana <biancalana_at_gmail.com>
Date: Wed, 28 Nov 2007 11:37:21 -0200
Hi list,

 The server continues to reboot every night and I *really* need some
help to track this down.

 The messages logged before the reboot on the past 2 days are similar:
Nov 27 01:36:03 Manny syslogd: kernel boot file is /boot/kernel/kernel
Nov 27 01:36:03 Manny kernel: panic: vm_fault: fault on nofault entry,
addr: ffffffffd5e0b000
Nov 27 01:36:03 Manny kernel: cpuid = 0
Nov 27 01:36:03 Manny kernel: KDB: stack backtrace:
Nov 27 01:36:03 Manny kernel: db_trace_self_wrapper() at
db_trace_self_wrapper+0x2a
Nov 27 01:36:03 Manny kernel: panic() at panic+0x17a
Nov 27 01:36:03 Manny kernel: vm_fault() at vm_fault+0x14c1
Nov 27 01:36:03 Manny kernel: trap_pfault() at trap_pfault+0x218
Nov 27 01:36:03 Manny kernel: trap() at trap+0x30c
Nov 27 01:36:03 Manny kernel: calltrap() at calltrap+0x8
Nov 27 01:36:03 Manny kernel: --- trap 0xc, rip = 0xffffffffd81d4e54,
rsp = 0xffffffffd8245960, rbp = 0xffffffffd82459a0 ---
Nov 27 01:36:03 Manny kernel: dsl_dir_set_reservation_check() at
dsl_dir_set_reservation_check+0x24
Nov 27 01:36:03 Manny kernel: dsl_sync_task_group_sync() at
dsl_sync_task_group_sync+0x96
Nov 27 01:36:03 Manny kernel: dsl_pool_sync() at dsl_pool_sync+0xc3
Nov 27 01:36:03 Manny kernel: spa_sync() at spa_sync+0x390
Nov 27 01:36:03 Manny kernel: txg_sync_thread() at txg_sync_thread+0x12f
Nov 27 01:36:03 Manny kernel: fork_exit() at fork_exit+0x12a
Nov 27 01:36:03 Manny kernel: fork_trampoline() at fork_trampoline+0xe
Nov 27 01:36:03 Manny kernel: --- trap 0, rip = 0, rsp =
0xffffffffd8245d30, rbp = 0 ---
Nov 27 01:36:03 Manny kernel: Uptime: 22h22m55s
Nov 27 01:36:03 Manny kernel: Physical memory: 2036 MB
Nov 27 01:36:03 Manny kernel: Dumping 1256 MB: 1241 1225 1209 1193
1177 1161 1145 1129 1113 1097 1081 1065 1049 1033 1017 1001 985 969
953 937 921 905 889 873 857 841 825 809 793 777 761 745 729 713 697
681 665 649 633 617 601 585 569 553 537 521 505 489 473 457 441 425
409 393 377 361 345 329 313 297 281 265 249 233 217 201 185 169 153
137 121 105 89 73 57 41 25 9


 Last night the message before the reboot was different:

Nov 28 03:37:18 Manny syslogd: kernel boot file is /boot/kernel/kernel
Nov 28 03:37:18 Manny kernel: panic: kmem_malloc(131072): kmem_map too
small: 878166016 total allocated
Nov 28 03:37:18 Manny kernel: cpuid = 1
Nov 28 03:37:18 Manny kernel: KDB: stack backtrace:
Nov 28 03:37:18 Manny kernel: db_trace_self_wrapper() at
db_trace_self_wrapper+0x2a
Nov 28 03:37:18 Manny kernel: panic() at panic+0x17a
Nov 28 03:37:18 Manny kernel: kmem_malloc() at kmem_malloc+0x49a
Nov 28 03:37:18 Manny kernel: uma_large_malloc() at uma_large_malloc+0x4a
Nov 28 03:37:18 Manny kernel: malloc() at malloc+0x12d
Nov 28 03:37:18 Manny kernel: zil_lwb_write_start() at zil_lwb_write_start+0x283
Nov 28 03:37:18 Manny kernel: zil_commit_writer() at zil_commit_writer+0x1c4
Nov 28 03:37:18 Manny kernel: zil_commit() at zil_commit+0xb8
Nov 28 03:37:18 Manny kernel: zfs_sync() at zfs_sync+0x9a
Nov 28 03:37:18 Manny kernel: sync_fsync() at sync_fsync+0x187
Nov 28 03:37:18 Manny kernel: sched_sync() at sched_sync+0x354
Nov 28 03:37:18 Manny kernel: fork_exit() at fork_exit+0x12a
Nov 28 03:37:18 Manny kernel: fork_trampoline() at fork_trampoline+0xe
Nov 28 03:37:18 Manny kernel: --- trap 0, rip = 0, rsp =
0xffffffffd4d25d30, rbp = 0 ---
Nov 28 03:37:18 Manny kernel: Uptime: 1d1h58m22s
Nov 28 03:37:18 Manny kernel: Physical memory: 2036 MB
Nov 28 03:37:18 Manny kernel: Dumping 1617 MB: 1602 1586 1570 1554
1538 1522 1506 1490 1474 1458 1442 1426 1410 1394 1378 1362 1346 1330
1314 1298 1282 1266 1250 1234 1218 1202 1186 1170 1154 1138 1122 1106
1090 1074 1058 1042 1026 1010 994 978 962 946 930 914 898 882 866 850
834 818 802 786 770 754 738 722 706 690 674 658 642 626 610 594 578
562 546 530 514 498 482 466 450 434 418 402 386 370 354 338 322 306
290 274 258 242 226 210 194 178 162 146 130 114 98 82 66 50 34 18 2
Nov 28 03:37:18 Manny kernel: Dump complete
Nov 28 03:37:18 Manny kernel: Automatic reboot in 15 seconds - press a
key on the console to abort
Nov 28 03:37:18 Manny kernel: Rebooting...
Nov 28 03:37:18 Manny kernel: cpu_reset: Stopping other CPUs
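
Because the mail client wrapped the long syslog lines, the two panics
are easier to compare once the records are rejoined. The following is
only a rough sketch of how that could be done; the helper names
(unwrap, summarize) are made up for illustration, and the regular
expressions assume exactly the syslog layout shown in the excerpts
above.

```python
import re

# "Mon DD HH:MM:SS host tag: " prefix that syslogd puts on each record.
PREFIX = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d (\S+) (\S+): ")
# A backtrace frame such as "panic() at panic+0x17a".
FRAME = re.compile(r"^(\w+)\(\) at (\w+)\+0x([0-9a-f]+)$")


def unwrap(lines):
    """Join mail-wrapped continuation lines back onto their syslog record."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if PREFIX.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line.strip()
    return records


def summarize(lines):
    """Return (panic message, function names in the backtrace, top first)."""
    panic, frames = None, []
    for record in unwrap(lines):
        m = PREFIX.match(record)
        if not m or m.group(2) != "kernel":
            continue  # skip syslogd and other non-kernel records
        body = record[m.end():]
        if body.startswith("panic: "):
            panic = body[len("panic: "):]
        else:
            f = FRAME.match(body)
            if f:
                frames.append(f.group(1))
    return panic, frames


# A few lines copied from the Nov 28 excerpt, wrapping included.
sample = """\
Nov 28 03:37:18 Manny kernel: panic: kmem_malloc(131072): kmem_map too
small: 878166016 total allocated
Nov 28 03:37:18 Manny kernel: cpuid = 1
Nov 28 03:37:18 Manny kernel: KDB: stack backtrace:
Nov 28 03:37:18 Manny kernel: db_trace_self_wrapper() at
db_trace_self_wrapper+0x2a
Nov 28 03:37:18 Manny kernel: panic() at panic+0x17a
Nov 28 03:37:18 Manny kernel: kmem_malloc() at kmem_malloc+0x49a
""".splitlines()

panic, frames = summarize(sample)
print(panic)
print(" <- ".join(frames))
```

Run over both excerpts, this kind of summary makes it plain that the
Nov 27 panic comes through txg_sync_thread and the Nov 28 one through
sched_sync, both inside ZFS code.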

Looking at /var/crash/info.9 (the last crash) I see the message "Panic
String: kmem_malloc(131072): kmem_map too small: 878166016 total
allocated". I never had this crash before. Here is my /boot/loader.conf:

vm.kmem_size_max="1G"
vm.kmem_size="1G"
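
To put the numbers from the panic string next to these settings, here
is a small back-of-the-envelope calculation. It is only a sketch: it
assumes that "1G" is interpreted as 2**30 bytes, and it says nothing
about why the allocation failed.

```python
# Figures copied from the panic string and from /boot/loader.conf above.
allocated = 878166016          # bytes reported as "total allocated"
request = 131072               # size of the kmem_malloc() request that failed
kmem_size = 1 * 1024 ** 3      # "1G", assumed here to mean 2**30 bytes

MIB = 1024 ** 2
allocated_mib = allocated / MIB
share = allocated / kmem_size
headroom_mib = (kmem_size - allocated) / MIB

print(f"allocated: {allocated_mib:.1f} MiB ({share:.1%} of the configured size)")
print(f"failed request: {request // 1024} KiB")
print(f"nominal headroom: {headroom_mib:.1f} MiB")
```

So the failed request was for 128 KiB while the reported total was
still below the configured 1G, at least on paper.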


I have the 3 vmcore files, but as I said before, I need some help to analyze them:

Manny:/var/crash # ls -la
total 4143934
drwxr-x---   2 root  wheel         512 Nov 28 11:19 .
drwxr-xr-x  24 root  wheel         512 Nov 28 01:37 ..
-rw-r--r--   1 root  wheel           3 Nov 28 03:37 bounds
-rw-------   1 root  wheel         471 Nov 26 03:10 info.7
-rw-------   1 root  wheel         471 Nov 27 01:36 info.8
-rw-------   1 root  wheel         481 Nov 28 03:37 info.9
-rw-r--r--   1 root  wheel           5 Sep 17 11:33 minfree
-rw-------   1 root  wheel  1235959808 Nov 26 03:14 vmcore.7
-rw-------   1 root  wheel  1317441536 Nov 27 01:37 vmcore.8
-rw-------   1 root  wheel  1695657984 Nov 28 03:39 vmcore.9
Manny:/var/crash # kgdb /boot/kernel/kernel vmcore.9
[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".
(no debugging symbols found)...Attempt to extract a component of a
value that is not a structure pointer.
(kgdb) bt
#0  0xffffffff80326aea in doadump ()
#1  0xffffffffd4d25780 in ?? ()
#2  0xffffffff80327005 in boot ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

This machine is an AMD64 system with ZFS, running sources from 25/11. The
complete dmesg of this machine can be seen in the first message of the
previous thread (http://www.nabble.com/BETA3-crash-(zfs-related--)-t4865643.html)

I can provide any additional information needed.

Any help is *very* much appreciated!

Best Regards,
Alexandre
Received on Wed Nov 28 2007 - 12:37:22 UTC
