need some debugging help

From: Kenneth D. Merry <ken_at_kdm.org>
Date: Fri, 29 Aug 2003 22:03:57 -0600
I've been working on a set of patches to remove the sysctl variable creation
from interrupt context in the cd(4) and da(4) drivers.

To fix the problem, I've created a new taskqueue that runs in a thread
context instead of inside a software interrupt like the current task
queues.  (The eventual fix will involve moving the CAM probe into a
thread; this is an interim solution that will hopefully also work on
-stable until we can change the CAM probe code.)
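
To make the deferral idea concrete, here is a minimal user-space sketch, not
the code from the patch.  The handler only links a preallocated task onto a
queue; the work itself runs later, when a worker drains the queue.  All names
(struct taskq, taskq_enqueue, taskq_run, create_sysctl_stub) are hypothetical,
and the locking and wakeup that a real kernel queue needs are left out.

```c
#include <assert.h>
#include <stddef.h>

/* A unit of deferred work; preallocated by the caller, never malloc'ed here. */
struct task {
    void        (*fn)(void *arg);
    void         *arg;
    struct task  *next;
    int           pending;   /* non-zero while the task sits on a queue */
};

/* FIFO of pending tasks.  Single-threaded sketch: no mutex, no wakeup. */
struct taskq {
    struct task *head;
    struct task *tail;
};

static void taskq_init(struct taskq *q)
{
    q->head = q->tail = NULL;
}

static void task_init(struct task *t, void (*fn)(void *), void *arg)
{
    t->fn = fn;
    t->arg = arg;
    t->next = NULL;
    t->pending = 0;
}

/*
 * Usable from the restricted context: it neither allocates nor sleeps.
 * Returns 1 if the task was queued, 0 if it was already pending.
 */
static int taskq_enqueue(struct taskq *q, struct task *t)
{
    if (t->pending)
        return 0;
    t->pending = 1;
    t->next = NULL;
    if (q->tail != NULL)
        q->tail->next = t;
    else
        q->head = t;
    q->tail = t;
    return 1;
}

/*
 * Called from the worker: runs every queued task in FIFO order and
 * returns how many ran.  This is where sleeping work would be allowed.
 */
static int taskq_run(struct taskq *q)
{
    struct task *t;
    int n = 0;

    while ((t = q->head) != NULL) {
        q->head = t->next;
        if (q->head == NULL)
            q->tail = NULL;
        t->pending = 0;      /* cleared first so fn may re-enqueue the task */
        t->fn(t->arg);
        n++;
    }
    return n;
}

/* Hypothetical stand-in for the sysctl creation that must not run early. */
static void create_sysctl_stub(void *arg)
{
    (*(int *)arg)++;
}
```

In this sketch the "interrupt side" would call taskq_enqueue() with a task
set up by task_init(&t, create_sysctl_stub, &counter), and the counter stays
untouched until the worker calls taskq_run().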

I think I have everything set up correctly, but I keep getting panics inside
the GEOM code with these patches.  (Memory modified after free.)  I don't
know whether I've just exposed some race condition, or whether I've done
something wrong.

I've seen several different panics, all with the same root cause (memory
modified after free).  The panic message names one of two malloc types as
the most recent user of the memory: GEOM or devbuf.
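
For readers unfamiliar with this class of panic, the following user-space
sketch shows the general technique as I understand it: freed memory is filled
with a known pattern, and the next allocation of that item checks that the
pattern is still intact.  It is an illustration only, not the kernel's code;
the names (struct slot, slot_free, slot_alloc), the 128-byte item size and the
pattern value are assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JUNK       0xdeadc0deu  /* pattern written into freed memory (assumed) */
#define ITEM_WORDS 32           /* 32 * 4 = 128-byte items (assumed) */

/* One item of a toy allocator; last_user plays the role of the malloc type. */
struct slot {
    uint32_t    words[ITEM_WORDS];
    const char *last_user;
};

/* Free: record who owned the item, then overwrite it with the pattern. */
static void slot_free(struct slot *s, const char *user)
{
    size_t i;

    for (i = 0; i < ITEM_WORDS; i++)
        s->words[i] = JUNK;
    s->last_user = user;
}

/*
 * Allocate: verify the pattern before handing the item out again.
 * On success returns the memory and sets *culprit to NULL.  If the item
 * was written to while free, returns NULL and sets *culprit to the name
 * of the previous owner; a kernel would panic at this point instead.
 */
static void *slot_alloc(struct slot *s, const char **culprit)
{
    size_t i;

    for (i = 0; i < ITEM_WORDS; i++) {
        if (s->words[i] != JUNK) {
            *culprit = s->last_user;
            return NULL;
        }
    }
    *culprit = NULL;
    return s->words;
}
```

The point of the sketch is that the check fires in whichever code allocates
the item next, so the backtrace shows where the damage was detected and not
necessarily where the stray write came from.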

==========================================================================
SMP: AP CPU #1 Launched!
Memory modified after free 0xcbd4f800(124)
panic: Most recently used by GEOM

cpuid = 0; lapic.id = 00000000
Debugger("panic")
Stopped at      Debugger+0x55:  xchgl   %ebx,in_Debugger.0
db> trace
Debugger(c03e974d,0,c03fb4d8,e5e45934,100) at Debugger+0x55
panic(c03fb4d8,c03e55db,7c,c083ac14,c083ac00) at panic+0x15f
mtrash_ctor(cbd4f800,80,0,57a,cbd4f800) at mtrash_ctor+0x5d
uma_zalloc_arg(c083ac00,0,102,e5e4599c,e5e4599c) at uma_zalloc_arg+0x1e4
malloc(5b,c042fae0,102,1,c02756e4) at malloc+0xd3
g_new_providerf(cbda62c0,cbd7b130,e5e45a3c,1,1) at g_new_providerf+0xa3
g_slice_config(cbda62c0,2,1,0,0) at g_slice_config+0x259
g_bsd_modify(cbda62c0,cbd7712c,e5e45c8c,10,cbd77000) at g_bsd_modify+0x382
g_bsd_taste(c0470480,cbda5780,0,159,cbda5700) at g_bsd_taste+0x2c4
g_new_provider_event(cbda5780,0,c03e52b7,b3,66666667) at g_new_provider_event+0xad
one_event(e5e45d0c,c021cd85,c0485fb4,0,4c) at one_event+0x218
g_run_events(c0485fb4,0,4c,c03d7a28,a) at g_run_events+0x15
g_event_procbody(0,e5e45d48,c03e738b,314,34df1a6d) at g_event_procbody+0x45
fork_exit(c021cd40,0,e5e45d48) at fork_exit+0xcf
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe5e45d7c, ebp = 0 ---
db> panic

==========================================================================
SMP: AP CPU #1 Launched!
Memory modified after free 0xcbd4f600(124)
panic: Most recently used by devbuf

cpuid = 0; lapic.id = 00000000
Debugger("panic")
Stopped at      Debugger+0x55:  xchgl   %ebx,in_Debugger.0
db> trace
Debugger(c03e974d,0,c03fb4d8,e5e45af0,100) at Debugger+0x55
panic(c03fb4d8,c03e7f50,7c,c083ac14,c083ac00) at panic+0x15f
mtrash_ctor(cbd4f600,80,0,57a,cbd4f600) at mtrash_ctor+0x5d
uma_zalloc_arg(c083ac00,0,102,e5e45b58,e5e45b58) at uma_zalloc_arg+0x1e4
malloc(5a,c042fae0,102,1,c02756e4) at malloc+0xd3
g_new_providerf(cbda74c0,cb9b0d90,e5e45bf8,1,1) at g_new_providerf+0xa3
g_slice_config(cbda74c0,0,1,7e00,0) at g_slice_config+0x259
g_mbr_modify(cbda74c0,cb9d3800,cbd5b200,123,0) at g_mbr_modify+0x247
g_mbr_taste(c0470560,cbd4ee80,0,159,cbd4f580) at g_mbr_taste+0x1be
g_new_provider_event(cbd4ee80,0,c03e52b7,b3,66666667) at g_new_provider_event+0xad
one_event(e5e45d0c,c021cd85,c0485fb4,0,4c) at one_event+0x218
g_run_events(c0485fb4,0,4c,c03d7a28,a) at g_run_events+0x15
g_event_procbody(0,e5e45d48,c03e738b,314,34df1a6d) at g_event_procbody+0x45
fork_exit(c021cd40,0,e5e45d48) at fork_exit+0xcf
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe5e45d7c, ebp = 0 ---
db> panic

==========================================================================
Memory modified after free 0xcbd4f600(124)
panic: Most recently used by devbuf

cpuid = 0; lapic.id = 00000000
Debugger("panic")
Stopped at      Debugger+0x55:  xchgl   %ebx,in_Debugger.0
db> trace
Debugger(c03e974d,0,c03fb4d8,e5e45bbc,100) at Debugger+0x55
panic(c03fb4d8,c03e7f50,7c,c083ac14,c083ac00) at panic+0x15f
mtrash_ctor(cbd4f600,80,0,57a,cbd4f600) at mtrash_ctor+0x5d
uma_zalloc_arg(c083ac00,0,102,fe,cbd6b800) at uma_zalloc_arg+0x1e4
malloc(60,c042fae0,102,cbd59240,c0470560) at malloc+0xd3
g_slice_alloc(4,214,cbd4f4d4,1c2,e5e45c9c) at g_slice_alloc+0x7e
g_slice_new(c0470560,4,cbd4f480,e5e45c98,e5e45c9c) at g_slice_new+0x6f
g_mbr_taste(c0470560,cbd4f480,0,159,cbd4f580) at g_mbr_taste+0x90
g_new_provider_event(cbd4f480,0,c03e52b7,b3,66666667) at g_new_provider_event+0xad
one_event(e5e45d0c,c021cd85,c0485fb4,0,4c) at one_event+0x218
g_run_events(c0485fb4,0,4c,c03d7a28,a) at g_run_events+0x15
g_event_procbody(0,e5e45d48,c03e738b,314,34df1a6d) at g_event_procbody+0x45
fork_exit(c021cd40,0,e5e45d48) at fork_exit+0xcf
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe5e45d7c, ebp = 0 ---
db> 
==========================================================================
SMP: AP CPU #1 Launched!
Memory modified after free 0xcbd4f600(124)
panic: Most recently used by devbuf

cpuid = 0; lapic.id = 00000000
Debugger("panic")
Stopped at      Debugger+0x55:  xchgl   %ebx,in_Debugger.0
db> trace
Debugger(c03e974d,0,c03fb4d8,e5e45aa8,100) at Debugger+0x55
panic(c03fb4d8,c03e7f50,7c,c083ac14,c083ac00) at panic+0x15f
mtrash_ctor(cbd4f600,80,0,57a,cbd4f600) at mtrash_ctor+0x5d
uma_zalloc_arg(c083ac00,0,102,5c,cbd4f900) at uma_zalloc_arg+0x1e4
malloc(64,c042fae0,102,cbd4f900,2) at malloc+0xd3
g_post_event_x(c021ea70,cbd4f900,2,0,e5e45b6c) at g_post_event_x+0x54
g_post_event(c021ea70,cbd4f900,2,cbd4f900,0) at g_post_event+0x45
g_new_providerf(cbda3540,cb9b0b20,e5e45bf8,1,1) at g_new_providerf+0x151
g_slice_config(cbda3540,0,1,7e00,0) at g_slice_config+0x259
g_mbr_modify(cbda3540,cbd6c400,cbd73000,123,0) at g_mbr_modify+0x247
g_mbr_taste(c0470560,cbd4f700,0,159,cbd4f780) at g_mbr_taste+0x1be
g_new_provider_event(cbd4f700,0,c03e52b7,b3,66666667) at g_new_provider_event+0xad
one_event(e5e45d0c,c021cd85,c0485fb4,0,4c) at one_event+0x218
g_run_events(c0485fb4,0,4c,c03d7a28,a) at g_run_events+0x15
g_event_procbody(0,e5e45d48,c03e738b,314,34df1a6d) at g_event_procbody+0x45
fork_exit(c021cd40,0,e5e45d48) at fork_exit+0xcf
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe5e45d7c, ebp = 0 ---

==========================================================================

Since the panics involved either M_DEVBUF or M_GEOM, I removed all M_DEVBUF
mallocs from the cd(4) and da(4) drivers.  (None of the other affected code
used M_DEVBUF; I created new malloc types for the cd(4) and da(4) drivers.)

The problem didn't change, other than the exact place in GEOM that
triggered the malloc that caught it.

Anyway, I've attached the patch in question.  If someone could tell me what
(if anything) I'm doing wrong, I'd appreciate it!

Thanks,

Ken
-- 
Kenneth Merry
ken_at_kdm.org

Received on Fri Aug 29 2003 - 19:04:00 UTC