Re: After install - Fatal trap 18 ATA problem?

From: Anish Mistry <mistry.7_at_osu.edu>
Date: Mon, 19 Jun 2006 22:35:50 -0400
On Monday 19 June 2006 18:51, Yar Tikhiy wrote:
> On Mon, Jun 19, 2006 at 05:05:09PM -0400, Anish Mistry wrote:
> > On Monday 19 June 2006 16:36, Yar Tikhiy wrote:
> > > On Mon, Jun 19, 2006 at 03:25:19PM -0400, Anish Mistry wrote:
> > > > On Monday 19 June 2006 14:09, Yar Tikhiy wrote:
> > > > > On Fri, Jun 16, 2006 at 01:32:55PM -0400, Anish Mistry wrote:
> > > > > > I'm trying to get FreeBSD installed on one of my systems
> > > > > > and I'm getting the error stated below.  I did have
> > > > > > FreeBSD 6-STABLE installed a few months ago on this very
> > > > > > system.  The only change is that FreeBSD is now installed
> > > > > > on the second harddrive instead of the first.  This is
> > > > > > using the -CURRENT snapshot for this month.  The install
> > > > > > goes just fine.  I also get a very similar error when I
> > > > > > install 6.1 too.
> > > > > >
> > > > > > This seems to be the same problem as:
> > > > > > http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2006-03/msg00539.html
> > > > > >
> > > > > > But I don't have a built-in compact flash reader attached
> > > > > > via ATA.
> > > > > >
> > > > > > Full verbose boot+backtrace:
> > > > > > http://am-productions.biz/docs/boot-panic-script.txt.gz
> > > > > >
> > > > > > rr232x: no controller detected.
> > > > > > ata0-slave: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=80 wire
> > > > > > ata0-master: pio=PIO4 wdma=WDMA2 udma=UDMA66 cable=80 wire
> > > > > > ad0: setting PIO4 on nForce2 Pro chip
> > > > > > ad0: setting UDMA66 on nForce2 Pro chip
> > > > > > ad0: 17206MB <IBM DJNA-371800 J78OA30K> at ata0-master UDMA66
> > > > > >
> > > > > >
> > > > > > Fatal trap 18: integer divide fault while in kernel mode
> > > > > > cpuid = 0; apic id = 00
> > > > > > instruction pointer	= 0x20:0xc089b49f
> > > > > > stack pointer	        = 0x28:0xc0c20b64
> > > > > > frame pointer	        = 0x28:0xc0c20bec
> > > > > > code segment		= base 0x0, limit 0xfffff, type 0x1b
> > > > > > 			= DPL 0, pres 1, def32 1, gran 1
> > > > > > processor eflags	= interrupt enabled, resume, IOPL = 0
> > > > > > current process		= 0 (swapper)
> > > > > > [thread pid 0 tid 0 ]
> > > > > > Stopped at      __qdivrem+0x3b: divl    %ecx,%eax
> > > > > > db> bt
> > > > > > Tracing pid 0 tid 0 td 0xc0a02fb8
> > > > > > __qdivrem(219b700,0,0,0,0) at __qdivrem+0x3b
> > > > > > __udivdi3(219b700,0,0,0) at __udivdi3+0x16
> > > > >
> > > > >                       ^^^
> > > > > Looks like an attempt to divide something (0x219b700) by
> > > > > zero using quad_t arithmetic.
> > > > >
> > > > > > ad_describe(c26e8580,c26e8580,c262c280,c265e400,c25ec200) at ad_describe+0x1b3
> > > > > > ad_attach(c26e8580) at ad_attach+0x1e7
> > > > > > device_attach(c26e8580,c0957850,c26e8580,c265e000,c265e400) at device_attach+0x58
> > > > > > device_probe_and_attach(c26e8580) at device_probe_and_attach+0xe0
> > > > > > bus_generic_attach(c25d2a80,c25d2a80,1,0,c26e8580) at bus_generic_attach+0x16
> > > > > > ata_identify(c25d2a80) at ata_identify+0x1c8
> > > > > > ata_boot_attach(0) at ata_boot_attach+0x3e
> > > > > > run_interrupt_driven_config_hooks(0,c1ec00,c1e000,0,c0450af5) at run_interrupt_driven_config_hooks+0x18
> > > > > > mi_startup() at mi_startup+0x96
> > > > > > begin() at begin+0x2c
> > > > > > db> ps
> > > > > > --
> > > > > > Anish Mistry
> > > > >
> > > > > FWIW, I saw an integer divide fault apparently related to
> > > > > the ata driver when I tried to test a low-end VIA-based
> > > > > mobo with FreeBSD.  I gave it away soon afterwards and had
> > > > > no time for debugging, though.
> > > > >
> > > > > Could you use gdb to see what C code is at
> > > > > ad_describe+0x1b3 in your kernel?
> > > >
> > > > How do I do this without creating a kernel dump?  Do I need
> > > > to set up remote GDB over a serial console?
> > >
> > > No, you don't.  It's much easier than that.  You were
> > > installing FreeBSD from a CURRENT snapshot when the panic
> > > happened, weren't you?  If so, get a working machine with
> > > not-too-old GDB first. FreeBSD 5.x or 6.x will do.  Then locate
> > > kernel.debug or kernel.symbols in the boot/kernel subdir on the
> > > installation CD.  It's the kernel that panicked.  Well,
> > > kernel.symbols isn't the kernel itself, but its symbols only. 
> > > OTOH, we need nothing but the symbols.
> > >
> > > Unpack the snapshot's kernel source to somewhere.  This is as
> > > easy as typing:
> > >
> > > 	cd /cdrom/7.0-CURRENT/src
> >
> > For the archives...
> > You need to create the usr/src directory or tar will fail:
> > mkdir -p /usr/home/me/somewhere/usr/src
>
> Yes, you're quite right here!
>
> > > 	env DESTDIR=/usr/home/me/somewhere sh install.sh sys
> > >
> > > And now load the kernel binary in GDB (not kgdb):
> > >
> > > 	gdb /cdrom/boot/kernel/kernel.symbols
> > > 	(gdb) dir /usr/home/me/somewhere
> > >
> > > Perhaps GDB will find the source files more readily if you put
> > > them directly into /usr/src (after renaming the original
> > > /usr/src to, e.g., /usr/src.orig).  That way you'll also
> > > prevent GDB from picking the wrong source tree.
> > >
> > > 	mv /usr/src /usr/src.orig
> > > 	mkdir /usr/src
> > > 	cd /cdrom/7.0-CURRENT/src
> > > 	sh install.sh sys
> > > 	gdb /cdrom/boot/kernel/kernel.symbols
> > >
> > > Now you should be able to examine the source code using binary
> > > code offsets:
> > >
> > > 	(gdb) list *(ad_describe+0x1b3)
> > >
> > > The "list" command will show you which line in which source
> > > file is responsible for the division by zero, and 9 more lines
> > > around it to provide context.  The output can be posted here
> > > as is; it's quite informative.
> >
> > (gdb) list *(ad_describe+0x1b3)
> > 0xc04e224b is in ad_describe (/usr/src/sys/dev/ata/ata-disk.c:383).
>
> 				^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> I suppose you put the CURRENT sources under /usr/src at last,
> didn't you?
Correct.

>
> > 378                       device_get_unit(ch->dev),
> > 379                       (atadev->unit == ATA_MASTER) ? "master" : "slave",
> > 380                       (adp->flags & AD_F_TAG_ENABLED) ? "tagged " : "",
> > 381                       ata_mode2str(atadev->mode));
> > 382         if (bootverbose) {
> > 383             device_printf(dev, "%ju sectors [%juC/%dH/%dS] "
> > 384                           "%d sectors/interrupt %d depth queue\n", adp->total_secs,
> > 385                           adp->total_secs / (adp->heads * adp->sectors),
> > 386                           adp->heads, adp->sectors, atadev->max_iosize / DEV_BSIZE,
> > 387                           adp->num_tags + 1);
>
> Consequently, adp->heads or adp->sectors was 0 for ad0. It means 
> that the ata(4) driver had some kind of trouble when reading the
> disk's parameters from the ATA controller.  Now you may want to
> contact the author of ata(4), Soren Schmidt <sos_at_freebsd.org>, for
> further instructions on how to debug this problem.  I hope he'll
> find all this info useful.  Thanks!
Do you have any insight into what else I can do to debug this
problem?

Thanks,

-- 
Anish Mistry

Received on Tue Jun 20 2006 - 00:35:39 UTC
