Scott Burns wrote:
> Hello,
>
> I am running several servers using Pawel's July 27 ZFS patchset, applied
> against 8-current source from the same day. I have seen a similar panic
> on two different servers:
...
> Stopped at _mtx_lock_flags+0x15: lock cmpxchgq %rsi,0x18(%rdi)
> db> bt
> Tracing pid 95276 tid 100432 td 0xffffff010b3cc000
> _mtx_lock_flags() at _mtx_lock_flags+0x15
> zone_dataset_visible() at zone_dataset_visible+0x94
> zfs_mount() at zfs_mount+0x3e5
...

With a bit of testing, I found that this panic is easily reproducible. Simply try to list the contents of a snapshot from within a jail, as long as the snapshot isn't already mounted, and the system panics. If I mount the snapshot from outside of the jail first, and then list it inside the jail, it does not panic.

I spent a bit of time debugging this weekend. Trying to list an unmounted snapshot triggers a zfs_mount() for the snapshot, which calls zone_dataset_visible() to determine if the snapshot should be visible in the current zone. When it is run outside of a jail, it returns true early on because INGLOBALZONE(curproc) is true, otherwise it takes another code path. The panic is happening after that check, at mtx_lock(&pr->cr_mtx), because (pr = curthread->td_ucred->cr_prison) is NULL. Interestingly, it's not NULL if zone_dataset_visible() is triggered by a "zfs list" command, but it is NULL if zone_dataset_visible() is called from zfs_mount().

As a temporary workaround, I modified my copy of cddl/compat/opensolaris/kern/opensolaris_zone.c to have zone_dataset_visible() return true if it is being called for a snapshot. I modified it as below:

-if (INGLOBALZONE(curproc))
+if (INGLOBALZONE(curproc) || strchr(dataset, '@'))

This is obviously not ideal, since it allows the manipulation of the snapshot from another jail if the caller knows that it exists. Since I am the only one with root access to any of the jails, I am not concerned with that. "zfs list" continues to behave normally.
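To make the control flow of the workaround easier to follow, here is a minimal userspace sketch of the decision it implements. It is not the kernel code: `struct model_prison`, `struct model_caller`, `is_snapshot()` and `model_dataset_visible()` are hypothetical names invented for illustration, the prefix match is a simplified stand-in for the real per-jail dataset lookup, and the NULL check on the prison pointer is an assumption of this sketch rather than part of the one-line change above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct prison (not the real layout). */
struct model_prison {
    const char *visible_prefix;  /* dataset subtree delegated to this jail */
};

/* Hypothetical stand-in for the calling context. */
struct model_caller {
    bool in_global_zone;                /* models INGLOBALZONE(curproc) */
    const struct model_prison *prison;  /* models td_ucred->cr_prison; may be NULL */
};

/* ZFS snapshot names contain '@', e.g. "tank/jail1@monday". */
static bool is_snapshot(const char *dataset)
{
    assert(dataset != NULL);
    return strchr(dataset, '@') != NULL;
}

/*
 * Models the workaround: callers in the global zone, and any request for a
 * snapshot, are treated as visible before the prison is ever touched.
 */
static bool model_dataset_visible(const struct model_caller *c,
                                  const char *dataset)
{
    if (c->in_global_zone || is_snapshot(dataset))
        return true;

    /* Assumption of this sketch: fail closed instead of dereferencing NULL. */
    if (c->prison == NULL)
        return false;

    /* Simplified visibility rule: the dataset is the prefix or lies below it. */
    size_t n = strlen(c->prison->visible_prefix);
    return strncmp(dataset, c->prison->visible_prefix, n) == 0 &&
           (dataset[n] == '\0' || dataset[n] == '/');
}
```

In this model a caller with a NULL prison pointer asking for "tank/other@snap" gets true, which mirrors both why the panic goes away and why any jail can reach a snapshot whose name it knows.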

I will continue looking at this, but since my main goal of working around the panic has been taken care of, I am not sure how long my attention span will last. If the cause of curthread->td_ucred->cr_prison being NULL under these conditions is obvious to anyone, please let me know.

-- 
Scott Burns
System Administrator
BQ Internet Corporation

Received on Mon Sep 22 2008 - 15:21:18 UTC