on 13/10/2010 21:43 Andriy Gapon said the following:
> Further walking child zio hierarchy we reach the one that looks like this:
> $59 = {io_bookmark = {zb_objset = 400, zb_object = 0, zb_level = -1, zb_blkid =
> 22437}, io_prop = {zp_checksum = ZIO_CHECKSUM_INHERIT, zp_compress =
> ZIO_COMPRESS_INHERIT, zp_type = DMU_OT_NONE,
> zp_level = 0 '\0', zp_ndvas = 0 '\0'}, io_type = ZIO_TYPE_WRITE, io_child_type
> = ZIO_CHILD_VDEV, io_cmd = 0, io_priority = 0 '\0', io_reexecute = 0 '\0',
> io_state = "\001", io_txg = 0,
> io_spa = 0xffffff00056c6000, io_bp = 0xffffff01acdbaa30, io_bp_copy = {blk_dva =
> {{dva_word = {12884902144, 1678614837}}, {dva_word = {0, 0}}, {dva_word = {0,
> 0}}}, blk_prop = 9225910817809957119,
> blk_pad = {0, 0, 0}, blk_birth = 236695, blk_fill = 0, blk_cksum = {zc_word =
> {15569186404091016741, 3408946246337318984, 400, 22437}}}, io_parent_list =
> {list_size = 48, list_offset = 16,
> list_head = {list_next = 0xffffff000826b7c0, list_prev = 0xffffff000826b7c0}},
> io_child_list = {list_size = 48, list_offset = 32, list_head = {list_next =
> 0xffffff00080aca98,
> list_prev = 0xffffff00080aca98}}, io_walk_link = 0x0, io_logical =
> 0xffffff0008b8d660, io_transform_stack = 0x0, io_ready = 0, io_done =
> 0xffffffff80b99ab0 <vdev_mirror_child_done>,
> io_private = 0xffffff00b5f469a8, io_bp_orig = {blk_dva = {{dva_word =
> {12884902144, 1678614837}}, {dva_word = {0, 0}}, {dva_word = {0, 0}}}, blk_prop =
> 9225910817809957119, blk_pad = {0, 0, 0},
> blk_birth = 236695, blk_fill = 0, blk_cksum = {zc_word =
> {15569186404091016741, 3408946246337318984, 400, 22437}}}, io_data =
> 0xffffff80e6565000, io_size = 131072, io_vd = 0xffffff00084cd000,
> io_vsd = 0x0, io_vsd_free = 0, io_offset = 859454990848, io_deadline = 20883,
> io_offset_node = {avl_child = {0x0, 0x0}, avl_pcb = 18446742974333891893},
> io_deadline_node = {avl_child = {0x0, 0x0},
> avl_pcb = 1}, io_vdev_tree = 0xffffff00084cd578, io_flags = 179, io_stage =
> ZIO_STAGE_VDEV_IO_START,
> io_pipeline = 47104, io_orig_flags = 131, io_orig_stage =
> ZIO_STAGE_READY,
> io_orig_pipeline = 47104, io_error = 0, io_child_error = {0, 0, 0}, io_children
> = {{0, 0}, {0, 0}, {0, 0}}, io_stall = 0x0, io_gang_leader = 0x0, io_gang_tree = 0x0,
> io_executor = 0xffffff000875a8a0, io_waiter = 0x0, io_lock = {lock_object =
> {lo_name = 0xffffffff80c29a8b "zio->io_lock", lo_flags = 40960000, lo_data = 0,
> lo_witness = 0x0}, sx_lock = 1},
> io_cv = {cv_description = 0xffffffff80c29a9a "zio->io_cv)", cv_waiters = 0},
> io_ena = 0, io_task = {ost_task = {ta_running = 0x0, ta_link = {stqe_next = 0x0},
> ta_pending = 0, ta_priority = 0,
> ta_func = 0, ta_context = 0x0}, ost_func = 0, ost_arg = 0x0, ost_magic = 0}}

So, after some more investigation, it looks like this zio is genuinely stuck:
its bio is stuck in GEOM, because its ccb/command is stuck in arcmsr.
It looks like the driver (controller/firmware) isn't processing any more
requests. Perhaps it is a hardware issue, but I reckon that the driver should
have detected the situation, timed out the commands and reset the hardware (if
needed).

Anyway, it looks like this is not related to ZFS[*].
Maybe the firmware and BIOS should be updated, maybe the hardware replaced.

[*] Perhaps ZFS should have its own zio timeout mechanism. And/or GEOM. And/or
the peripheral or transport layer of CAM. But, IMO, the SIM drivers must have it.

-- 
Andriy Gapon

Received on Thu Oct 14 2010 - 15:20:54 UTC
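
As an illustration of the per-command timeout idea raised in the message above, here is a minimal, self-contained C sketch of a watchdog scan. It is not arcmsr, CAM, GEOM or ZFS code: `struct pending_cmd` and `watchdog_scan` are hypothetical names, and the tick-based clock is an assumption made only for the example. A real driver would presumably follow such a scan with command aborts and, if needed, a controller reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an in-flight command (e.g. a CCB). */
struct pending_cmd {
    uint64_t issued_at;  /* tick at which the command was handed to the hardware */
    uint64_t timeout;    /* ticks allowed before the command counts as stuck */
    int      completed;  /* nonzero once the hardware has reported completion */
    int      timed_out;  /* set by watchdog_scan() when the deadline has passed */
};

/*
 * Mark every command that is neither completed nor already flagged and whose
 * timeout has elapsed at tick `now`. Returns the number of commands newly
 * flagged by this call.
 */
size_t
watchdog_scan(struct pending_cmd *cmds, size_t n, uint64_t now)
{
    size_t expired = 0;

    assert(cmds != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        struct pending_cmd *c = &cmds[i];

        if (c->completed || c->timed_out)
            continue;
        /* Guard against unsigned wrap if the clock reads before issue time. */
        if (now >= c->issued_at && now - c->issued_at >= c->timeout) {
            c->timed_out = 1;
            expired++;
        }
    }
    return expired;
}
```

Such a scan would be run periodically (from a timer, for instance); any nonzero return value means at least one request has been outstanding longer than allowed, which is the condition described above that went undetected.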