Re: Panic in a recent kernel (cardbus/pci related ?)

From: John Baldwin <jhb_at_freebsd.org>
Date: Wed, 30 Dec 2009 15:13:07 -0500

On Wednesday 30 December 2009 2:48:37 pm M. Warner Losh wrote:
> In message: <200912301025.56737.jhb_at_freebsd.org>
>             John Baldwin <jhb_at_freebsd.org> writes:
> : On Wednesday 30 December 2009 9:28:33 am John Baldwin wrote:
> : > On Friday 11 December 2009 12:15:27 am Thierry Herbelot wrote:
> : > > Hello,
> : > > 
> : > > I'm seeing a panic in my latest -Current kernel (config file == GENERIC minus
> : > > INVARIANTS, WITNESS and SMP). The machine is an older notebook, with a PCMCIA
> : > > network card.
> : > > 
> : > > The end of the verbose dmesg, showing the panic, follows:
> : > > [SNIP]
> : > > Device configuration finished.
> : > > procfs registered
> : > > Timecounter "TSC" frequency 169163324 Hz quality 800
> : > > Timecounters tick every 1.000 msec
> : > > firewire0: fw_sidrcv: ERROR invalid self-id packet
> : > > firewire0: 1 nodes, maxhop <= 0 Not IRM capable irm(-1)
> : > > lo0: bpf attached
> : > > hptrr: no controller detected.
> : > > ata0: Identifying devices: 00000001
> : > > ata0: New devices: 00000001
> : > > usbus0: 12Mbps Full Speed USB v1.0
> : > > battery0: battery initialization start
> : > > battery1: battery initialization start
> : > > acpi_acad0: ugen0.1: <Intel> at usbus0
> : > > uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
> : > > acline initialization start
> : > > acpi_acad0: On Line
> : > > acpi_acad0: acline initialization done, tried 1 times
> : > > ata0-master: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=80 wire
> : > > ad0: setting UDMA33
> : > > ad0: 28615MB <HITACHI DK23DA-30 00J1A0A1> at ata0-master UDMA33
> : > > ad0: 58605120 sectors [62016C/15H/63S] 16 sectors/interrupt 1 depth queue
> : > > unknown: Lazy allocation of 0x400 bytes rid 0x14 type 3 at 0x88000000
> : > > cbb1: Opening memory:
> : > > cbb1: Normal: 0x88000000-0x88000fff
> : > > cbb1: Opening memory:
> : > > cbb1: Normal: 0x88000000-0x88000fff
> : > >         map[10]: type I/O Port, range 32, base 0, size  8, port disabled
> : > >         map[14]: type Memory, range 32, base 0, size 10, enabled
> : > > panic: resource_list_add: resource entry is busy
> : > > KDB: enter: panic
> : > > [thread pid 8 tid 100032 ]
> : > > Stopped at      kdb_enter+0x3a: movl    $0,kdb_why
> : > > db> where
> : > > Tracing pid 8 tid 100032 td 0xc256cb40
> : > > kdb_enter(c0c9240f,c0c9240f,c0c93aaa,c23d4b70,c23d4b70,...) at kdb_enter+0x3a
> : > > panic(c0c93aaa,3,14,400,ffffffff,...) at panic+0xd1
> : > > resource_list_add(c26e9004,3,14,0,ffffffff,...) at resource_list_add+0x96
> : > > pci_add_map(c26e9004,1,0,c23d4c58,14,...) at pci_add_map+0x628
> : > > pci_add_resources(c256b980,c267a980,1,0,1,...) at pci_add_resources+0x59e
> : > > cardbus_attach_card(c256b980,c24fd990,c0d23d08,f889cc55,ffebf3e8,...) at cardbus_attach_card+0x56e
> : > > cbb_event_thread(c2676000,c23d4d38,4478b00,840fc085,428,...) at cbb_event_thread+0x395
> : > > fork_exit(c070db40,c2676000,c23d4d38) at fork_exit+0x90
> : > > fork_trampoline() at fork_trampoline+0x8
> : > > --- trap 0, eip = 0, esp = 0xc23d4d70, ebp = 0 ---
> : > 
> : > I think I have finally figured this out.  What is happening is that the card 
> : > stores its CIS in a PCI BAR (this is probably fairly common for cardbus 
> : > cards).  So, the PCI BAR holding the CIS is being allocated before 
> : > pci_add_resources() is called, hence the confusion.  There is a bit of a 
> : > chicken and egg problem here in that we need to parse the CIS to determine 
> : > what special requirements (e.g. prefetch) might be required for other BARs.  
> : > I'm not sure if the Cardbus spec makes certain guarantees about the properties 
> : > of a BAR that is used to hold the CIS.  I'm not actually sure how this worked 
> : > prior to my change as the resource for the CIS BAR should still have been 
> : > present in this case causing the same error (the old pci_release_resource() 
> : > would still have left the resource around).  I'll need to talk to Warner about 
> : > the best way to resolve this.
> : 
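
To make the failure mode above concrete, here is a small userland model of
it.  This is only a sketch, not the kernel code: struct rl, rl_add() and
rl_alloc() are hypothetical stand-ins for the resource list routines, and
the model returns an error code at the point where the trace shows
resource_list_add() panicking.  The type (3) and rid (0x14) in the usage
note are taken from the dmesg output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical return codes for this model only. */
enum { RL_OK = 0, RL_EBUSY = -1, RL_ENOMEM = -2 };

/* One entry per (type, rid) pair, e.g. one per BAR. */
struct rl_entry {
	int type;		/* resource type */
	int rid;		/* resource id (the BAR offset for PCI) */
	int busy;		/* nonzero once the range has been allocated */
	struct rl_entry *next;
};

struct rl {
	struct rl_entry *head;
};

/* Find the entry for (type, rid), creating an idle one if missing. */
static struct rl_entry *
rl_get(struct rl *list, int type, int rid)
{
	struct rl_entry *e;

	assert(list != NULL);
	for (e = list->head; e != NULL; e = e->next)
		if (e->type == type && e->rid == rid)
			return (e);
	e = calloc(1, sizeof(*e));
	if (e == NULL)
		return (NULL);
	e->type = type;
	e->rid = rid;
	e->next = list->head;
	list->head = e;
	return (e);
}

/* Early ("lazy") allocation, as done to read the CIS out of a BAR. */
static int
rl_alloc(struct rl *list, int type, int rid)
{
	struct rl_entry *e = rl_get(list, type, rid);

	if (e == NULL)
		return (RL_ENOMEM);
	if (e->busy)
		return (RL_EBUSY);
	e->busy = 1;
	return (RL_OK);
}

/* BAR scan: (re)register an entry; refuses if it is already in use. */
static int
rl_add(struct rl *list, int type, int rid)
{
	struct rl_entry *e = rl_get(list, type, rid);

	if (e == NULL)
		return (RL_ENOMEM);
	if (e->busy)
		return (RL_EBUSY);	/* the model's stand-in for the panic */
	return (RL_OK);
}
```

With an empty list, rl_alloc(&list, 3, 0x14) followed by rl_add(&list, 3,
0x14) yields RL_EBUSY, while rl_add() for an untouched rid such as 0x10
still succeeds.  In this model the conflict goes away if the early entry is
removed from the list before the scan runs.
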
> : This is one possible hack.  It instructs the PCI bus to completely remove the
> : resource for the CIS.  While looking at this I found some other bugs (the code to
> : disable decoding in the ROM BAR didn't actually work for example) and have come up
> : with a larger patch.  It does a few things:
> : 
> : 1) Fixes bus_generic_rl_(alloc|release)_resource() to not try to fetch a resource
> : list for grandchildren.
> 
> : 2) Add full support for device ROM BARs to the PCI bus and remove
> : the device ROM hacks from the cardbus driver now that PCI manages
> : them.
> 
> I'll have to pay special attention to this.  It is really easy to get
> wrong, and the current code in the tree doesn't quite work.
> 
> : 3) Use a resource_list_unreserve() when purging resources from a
> : cardbus card when it is removed as this is a bit cleaner.  Arguably
> : the PCI bus driver should have a 'delete all resources' method that
> : does this instead (hotplug PCI would need to use it).
> 
> Yea.  There's a conflict here.  Originally, the view was that PCI
> drivers were responsible for freeing all resources they allocated.
> And they needed to keep track.  Over time lists have sprung up to make
> this possible.  It is no wonder there's inconsistency here.

Well, cardbus is the only place that currently needs to forcefully remove a
PCI device, and when you do that I think it's best not to trust the driver
but to always clean up after it.  pccard does the same thing for ejected
cards.
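
As a rough illustration of what "clean up after it" means here, consider
the following userland sketch.  It is not the cardbus or pccard code:
struct card, struct res_slot and card_purge() are hypothetical names, and
the fixed-size table only stands in for a real resource list.  The point is
that the bus walks every entry on removal and releases whatever the driver
still holds, rather than assuming the driver's detach routine freed it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RES 8	/* arbitrary table size for the model */

struct res_slot {
	int in_use;	/* an entry exists for this slot */
	int held;	/* the driver allocated it and never released it */
};

struct card {
	struct res_slot slots[MAX_RES];
};

/*
 * Forced removal: release anything still held, then delete every entry.
 * Returns how many resources the driver had left allocated.
 */
static int
card_purge(struct card *c)
{
	int leaked = 0;
	size_t i;

	assert(c != NULL);
	for (i = 0; i < MAX_RES; i++) {
		if (!c->slots[i].in_use)
			continue;
		if (c->slots[i].held) {
			c->slots[i].held = 0;	/* release on the driver's behalf */
			leaked++;
		}
		c->slots[i].in_use = 0;		/* drop the entry itself */
	}
	return (leaked);
}
```

A nonzero return value would be a reasonable place to log a warning about
the driver, but the removal itself still completes either way.
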

> : 4) Remove unused pci_release_resource().
> : 
> : The patch is available at http://www.FreeBSD.org/~jhb/patches/cardbus.patch
> 
> Can you regenerate this -p?

Sure.  I've committed some of the more harmless bits already.  I've
regenerated the full set of remaining patches at the same URL.  I've
also split out two sub-patches:

~jhb/patches/cardbus_bus_space.patch - this just changes the cardbus_cis
 code to use bus_*() instead of bus_space_*()

~jhb/patches/rom.patch - this is the set of changes to add PCIR_BIOS
 support to the pci(4) bus driver and remove the special hacks for it
 from cardbus

-- 
John Baldwin
Received on Wed Dec 30 2009 - 19:13:43 UTC
