Re: zfs kernel messages

From: Olli Hauer <ohauer_at_gmx.de>
Date: Thu, 25 Oct 2007 00:04:04 +0200
Pawel Jakub Dawidek wrote:
> On Tue, Oct 23, 2007 at 09:27:56PM +0200, Olli Hauer wrote:
>> lock order reversal:
>>  1st 0xc4cea568 struct mount mtx (struct mount mtx) @ 
>>  /usr/src/sys/modules/zfs/../../compat/opensolaris/kern/opensolaris_vfs.c:209
>>  2nd 0xc3ee9010 sleep mtxpool (sleep mtxpool) @ 
>>  /usr/src/sys/kern/kern_resource.c:1266
>> KDB: stack backtrace:
>> db_trace_self_wrapper(c0a9c175,e7318734,c078510e,c0a9e63c,c3ee9010,...) at 
>> db_trace_self_wrapper+0x26
>> kdb_backtrace(c0a9e63c,c3ee9010,c0a982df,c0a982df,c0a98e54,...) at 
>> kdb_backtrace+0x29
>> witness_checkorder(c3ee9010,9,c0a98e54,4f2,38,...) at 
>> witness_checkorder+0x6de
>> _mtx_lock_flags(c3ee9010,0,c0a98e54,4f2,c4c32d00,...) at 
>> _mtx_lock_flags+0xbc
>> uifree(c3f08c20,c4967220,c4cea538,e73187d4,c431b9df,...) at uifree+0x2d
>> crfree(c4c32d00,0,c439446c,d1,c3,...) at crfree+0x54
>> domount(c48f8840,c4967220,c43995fb,c469e260,e7318810,...) at domount+0x20f
>> zfsctl_snapdir_lookup(e7318aa0,e7318aa0,c48f8840,2,c4967330,...) at 
>> zfsctl_snapdir_lookup+0x362
>> VOP_LOOKUP_APV(c439d5e0,e7318aa0,c48f8840,c0aa409d,19b,...) at 
>> VOP_LOOKUP_APV+0xa5
>> lookup(e7318b48,c0aa409d,c6,bf,c4cd772c,...) at lookup+0x58e
>> namei(e7318b48,c48f8840,c0bb51d4,c48f8840,e7318b4c,...) at namei+0x34b
>> kern_lstat(c48f8840,282111b8,0,e7318c18,c0aa5520,...) at kern_lstat+0x4f
>> lstat(c48f8840,e7318cfc,8,c0a9f13d,c0b469d0,...) at lstat+0x2f
>> syscall(e7318d38) at syscall+0x2b3
>> Xint0x80_syscall() at Xint0x80_syscall+0x20
>> --- syscall (190, FreeBSD ELF32, lstat), eip = 0x2819c48b, esp = 
> 
> Revert the previous patch, refetch and try again. The new one also
> eliminates this LOR:
> 
> 	http://people.freebsd.org/~pjd/patches/opensolaris_vfs.c.patch
> 

Thanks, the patch resolved the console output problem (the LOR messages are gone).
The reboot hang is still there, but *only* after the snapshot directory has been accessed.

test1:
create zpool
create zfs1
create zfs2
rsync 1GB source and other small files to zfs1/2
  reboot => no error

test1 + create snapshot, reboot => no problem
test1 + create snapshot, list files in snapshot, reboot => system hang
test1 + create snapshot, set snapdir=visible, list files in snapshot, reboot => system hang

test2:
create zpool
create zfs1
create zfs2
rsync 1GB source and other small files to zfs1
rsync zfs1 -> zfs2
  reboot => no error

test2 + create snapshot, reboot => no error
test2 + create snapshot, list files in snapshot, reboot => system hang
test2 + create snapshot, set snapdir=visible, list files in snapshot, reboot => system hang
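
For anyone who wants to reproduce this, the test steps above can be sketched roughly as the script below. This is only a sketch under assumptions: the pool name "tank", the snapshot name "snap1", the rsync source /usr/src/ and the plain two-disk pool layout are placeholders, not the exact values used here (twed0 and twed1 are simply the devices that show up as vdev workers in the process list). It prints the commands by default; set RUN=1 to really execute them, which destroys whatever is on the disks.

```shell
#!/bin/sh
# Sketch of the "snapshot, list files, reboot" case.
# Dry run by default: each command is printed, not executed.
# RUN=1 executes for real and DESTROYS data on $DISKS.

POOL=${POOL:-tank}              # hypothetical pool name
DISKS=${DISKS:-"twed0 twed1"}   # devices seen as vdev workers; layout is assumed
SRC=${SRC:-/usr/src/}           # assumed rsync source (about 1GB of files)
SNAP=${SNAP:-snap1}             # hypothetical snapshot name

# Print the command in dry-run mode, execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

repro_steps() {
    run zpool create "$POOL" $DISKS      # $DISKS unquoted on purpose: one word per disk
    run zfs create "$POOL/zfs1"
    run zfs create "$POOL/zfs2"
    run rsync -a "$SRC" "/$POOL/zfs1/"   # for test2, also rsync zfs1 into zfs2
    run zfs snapshot "$POOL/zfs1@$SNAP"
    run zfs set snapdir=visible "$POOL/zfs1"
    run ls "/$POOL/zfs1/.zfs/snapshot/$SNAP"   # the access that precedes the hang
    run reboot
}

repro_steps
```

Leaving out the "ls" line gives the case that reboots without a problem; the "zfs set snapdir=visible" line makes no difference to the outcome.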


When the system hangs, I break into the debugger:

db> show all procs
pid ppid pgrp uid  state  wmesg wchan   cmd
804    0    0   0  SL     vgeom:io 0xcfac1388 [vdev:worker twed1]
803    0    0   0  SL     vgeom:io 0xcfd8b348 [vdev:worker twed0]
801    1  801   0  S+     soldelay 0xc0baeb14 reboot
148    0    0   0  SL     zfs:(&tq 0xc43dcb78 [zil_clean]
147    0    0   0  SL     zfs:(&tq 0xc43dcaac [zil_clean]
146    0    0   0  SL     zfs:(&tq 0xc43dc9e0 [zil_clean]
144    0    0   0  SL     zfs:(&tx 0xc465352c [txg_thread_enter]
143    0    0   0  SL     zfs:(&tx 0xc465350c [txg_thread_enter]
142    0    0   0  SL     zfs:(&tx 0xc465351c [txg_thread_enter]
149    0    0   0  SL     zfs:(&tq 0xc43dc914 [spa_zio_intr_5]
138    0    0   0  SL     zfs:(&tq 0xc43dc848 [spa_zio_issue_5]
137    0    0   0  SL     zfs:(&tq 0xc43dc77c [spa_zio_intr_4]
136    0    0   0  SL     zfs:(&tq 0xc43dc6b0 [spa_zio_issue_4]
135    0    0   0  SL     zfs:(&tq 0xc43dc5e4 [spa_zio_intr_3]
134    0    0   0  SL     zfs:(&tq 0xc43dc050 [spa_zio_issue_3]
133    0    0   0  SL     zfs:(&tq 0xc43dc11c [spa_zio_intr_2]
132    0    0   0  SL     zfs:(&tq 0xc43dc1e8 [spa_zio_issue_2]
131    0    0   0  SL     zfs:(&tq 0xc43dc2b4 [spa_zio_intr_1]
130    0    0   0  SL     zfs:(&tq 0xc43dc380 [spa_zio_issue_1]
139    0    0   0  SL     zfs:(&tq 0xc43dc44c [spa_zio_intr_0]
128    0    0   0  SL     zfs:(&tq 0xc43dc518 [spa_zio_issue_0]
111    0    0   0  SL     zfs:(&tq 0xc439df6c [arc_reclaim_thread]
110    0    0   0  SL     zfs:(&ar 0xc43dd050 [system_taskq]
  44    0    0   0  SL     -        0xc0baeb14 [system_taskq]
...
Received on Wed Oct 24 2007 - 20:04:25 UTC
