Re: uath0 issues: eject-caused panic, won't work after a restart.

From: Weongyo Jeong <weongyo.jeong_at_gmail.com> Date: Fri, 2 Oct 2009 18:42:51 -0700

Hello Lucius,

On Sun, Jul 12, 2009 at 12:53:38PM +0200, Lucius Windschuh wrote:
> Hi guys,
> I'm using CURRENT r195362MP and have two issues with it.
> 
> 1st: Pulling the device while the kernel was nearly finished shutting
> down resulted in a kernel panic:
> 
> Waiting (max 60 seconds) for system process `vnlru' to stop...done
> Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
> Waiting (max 60 seconds) for system process `syncer' to stop...
> Syncing disks, vnodes remaining...2 1 1 1 0 0 done
> All buffers synced.
> 
> [LORs: ufs, syncer, defs; ELI detaches]
> 
> ugen3.2: <Atheros Communications Inc> at usbus3 (disconnected)
> uath0: at uhub3, port 1, addr 2 (disconnected)
> 
> lock order reversal:
>  1st 0xc6793208 if_afdata (if_afdata) @ /usr/src/sys/net/if.c:876
>  2nd 0xc0bf82c8 mld_mtx (mld_mtx) @ /usr/src/sys/netinet6/mld6.c:577
> KDB: stack backtrace:
> db_trace_self_wrapper(c09cafad,e8b72a64,c06f0fe5,c06e1d9b,c09cde42,...)
> at db_trace_self_wrapper+0x26
> kdb_backtrace(c06e1d9b,c09cde42,c6113ca0,c610e9c0,e8b72ac0,...) at
> kdb_backtrace+0x29
> _witness_debugger(c09cde42,c0bf82c8,c09cdfbe,c610e9c0,c09e6205,...) at
> _witness_debugger+0x25
> witness_checkorder(c0bf82c8,9,c09e6205,241,0,...) at witness_checkorder+0x839
> _mtx_lock_flags(c0bf82c8,0,c09e6205,241,c79b6ac0,...) at _mtx_lock_flags+0xc4
> mld_domifdetach(c6793000,c6793208,c0a50c60,e8b72b64,c0755e1c,...) at
> mld_domifdetach+0x2c
> in6_domifdetach(c6793000,c79b6ac0,36c,440,c679322c,...) at in6_domifdetach+0x15
> if_detach(c6793000,c79de00c,e8b72b9c,c79be200,c79ed000,...) at if_detach+0x85c
> ieee80211_ifdetach(c79ed000,0,c79c1c40,201,c6793000,...) at
> ieee80211_ifdetach+0x14
> uath_detach(c6e84c00,c67cd860,c0a336c8,a3c,c06d7d89,...) at uath_detach+0x80
> device_detach(c6e84c00,c09b6190,c6773650,1,2,...) at device_detach+0x8c
> usb_detach_device(c688c43c,0,c09b5fa1,199,19c1b4f,...) at
> usb_detach_device+0x178
> usb_unconfigure(c789d400,c05ef530,c78801e0,7b4,0,...) at usb_unconfigure+0x5a
> usb_free_device(c688c400,3,1,10,e8b72ca8,...) at usb_free_device+0x1be
> uhub_explore(c6785400,0,c09b5463,e0,c6504d34,...) at uhub_explore+0x2a7
> usb_bus_explore(c6504d34,c6504dac,c09b768a,67,c0a89000,...) at
> usb_bus_explore+0xbb
> usb_process(c6504cd4,e8b72d38,c09c3242,342,c6744d48,...) at usb_process+0xde
> fork_exit(c05faed0,c6504cd4,e8b72d38) at fork_exit+0xb8
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0, eip = 0, esp = 0xe8b72d70, ebp = 0 ---
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0xdeadc0de
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc074c318
> stack pointer           = 0x28:0xc5f3f9e8
> frame pointer           = 0x28:0xc5f3f9e8
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 11 (swi4: clock)
> 
> Held locks:
> exclusive sx 123456789ABCDEF - USB config SX lock (123456789ABCDEF -
> USB config SX lock) r = 0 (0xc688c43c) locked @
> /usr/src/sys/dev/usb/usb_device.c:409
> exclusive sleep mutex Giant (Giant) r = 0 (0xc0a84a30) locked @
> /usr/src/sys/kern/kern_intr.c:1164
> shared sx module subsystem sx lock (module subsystem sx lock) r = 0
> (0xc0a83fe0) locked @ /usr/src/sys/kern/kern_module.c:103
> 
> More information from the textdump:
> 
> db:1:locks>  show alllocks
> Process 29 (usbus3) thread 0xc6742000 (100057)
> Process 11 (intr) thread 0xc643f480 (100021)
> Process 1 (init) thread 0xc6178000 (100001)
> db:1:alllocks>  show lockedvnods
> Locked vnodes
> db:0:kdb.enter.default>  show pcpu
> cpuid        = 0
> dynamic pcpu    = 0x5b99f2
> curthread    = 0xc6176480: pid 11 "swi4: clock"
> curpcb       = 0xc5f3fd90
> fpcurthread  = none
> idlethread   = 0xc6176b40: pid 10 "idle: cpu0"
> APIC ID      = 0
> currentldt   = 0x50
> spin locks held:
> db:0:kdb.enter.default>  bt
> Tracing pid 11 tid 100006 td 0xc6176480
> strlen(deadc0de,c5f3fb38,4,80000000,c5f3fa7c,...) at strlen+0x8
> kvprintf(c09c6735,c06e0520,c5f3fb38,a,c5f3fb78,...) at kvprintf+0x8fe
> vsnprintf(c0a84d00,100,c09c6735,c5f3fb78,0,...) at vsnprintf+0x3b
> panic(c09c6735,deadc0de,c09ddcd9,645,c0a83210,...) at panic+0x8d
> _mtx_lock_flags(c67e0000,0,c09ddcd9,645,c67e0000,...) at _mtx_lock_flags+0x9a
> adhoc_age(c67f2800,c5f3fbe8,c06d2e0c,c0a83210,0,...) at adhoc_age+0x32
> sta_age(c67f2800,c5f3fc48,c078dd71,c79ed000,c6176480,...) at sta_age+0x1c
> ieee80211_scan_timeout(c79ed000,c6176480,c0a852c0,c6176480,c5f3fc2c,...)
> at ieee80211_scan_timeout+0x1c
> ieee80211_node_timeout(c79ed000,0,c09c8fe3,176,c0a852f4,...) at
> ieee80211_node_timeout+0x21
> softclock(c0a852c0,c5f3fcc8,c06a0834,c0a89680,c61b6b38,...) at softclock+0x24a
> intr_event_execute_handlers(c6174aa0,c61b6b00,c09c34c7,4fc,c61b6b70,...)
> at intr_event_execute_handlers+0x125
> ithread_loop(c610bbc0,c5f3fd38,c09c3242,342,c6174aa0,...) at ithread_loop+0x9f
> fork_exit(c0689950,c610bbc0,c5f3fd38) at fork_exit+0xb8
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0, eip = 0, esp = 0xc5f3fd70, ebp = 0 ---
> 
> PID 11 is the "intr" process.
> 
> A textdump (vm6.tar.gz) is available:
> http://sites.google.com/site/lwfreebsd/Home/files/vm6.tar.gz?attredirects=0

About 3 months have already passed.  :-)

Could you please test with CURRENT to see whether this problem still
reproduces?  In my environment it doesn't happen anymore, so I'd like to
know whether this issue is still valid.
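
For reference, the fault address 0xdeadc0de in the second trace is the
pattern that FreeBSD debug kernels write over freed memory, so the trace
is consistent with the node timeout callout firing after the detach path
had already freed the state it locks.  A minimal userland sketch of that
ordering problem follows.  All names in it (struct softc, struct timer,
detach_buggy, detach_drained) are hypothetical stand-ins, not the uath
or net80211 code, and the "drained" variant only illustrates the idea
behind callout_drain(9); it is not a claim about what changed in CURRENT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pattern a debug kernel writes over freed memory (assumed here). */
#define POISON 0xdeadc0deU

/* Hypothetical per-device state; lock_name stands in for a mutex field. */
struct softc {
	uint32_t lock_name;
	uint32_t flags;
};

/* Hypothetical periodic timer, standing in for a callout. */
struct timer {
	const struct softc *arg;	/* state the callback will touch */
	int armed;			/* nonzero while the timer may fire */
};

/* Stand-in for free(9) on a debug kernel: poison instead of releasing. */
void
poison_free(struct softc *sc)
{
	sc->lock_name = POISON;
	sc->flags = POISON;
}

/* Timer callback: returns the lock field it would use, 0 if not armed. */
uint32_t
timer_fire(const struct timer *t)
{
	assert(t != NULL);
	if (!t->armed)
		return (0);
	return (t->arg->lock_name);
}

/* Buggy order: the state is freed while the timer can still fire. */
void
detach_buggy(struct timer *t, struct softc *sc)
{
	(void)t;
	poison_free(sc);
}

/* Safer order: stop the timer first, then free the state. */
void
detach_drained(struct timer *t, struct softc *sc)
{
	t->armed = 0;
	poison_free(sc);
}
```

With the buggy order a later timer_fire() sees POISON in the lock field,
which mirrors the deadc0de argument passed to panic() and strlen() in
the trace; with the drained order the callback never touches the freed
state.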

> 2nd issue:
> 
> With a plugged-in and firmware-loaded uath device (TRENDnet TEW-504UB in
> my case), reboot the system. It is not initialized by uath until you
> pull and plug it back in.
> The USB descriptors, obtained by usbconfig dump_device_desc, are the
> same before pulling and after reinitializing the device.
> 
> Is there anything I can do to help fix these issues?

I tried to solve this issue on my amd64 machine but could not find a way
to fix it.  It looks like the HAL interface doesn't have an API to fully
reset the H/W, and endpoint 0 doesn't have such a feature either.  The
only way currently seems to be a bus reset event (EHCI_STS_RESET), which
depends on the system: for example, the powerpc machine I have doesn't
seem to have this problem, but amd64 does.

regards,
Weongyo Jeong