Re: scsi_cd or atapicam crash in current.

From: Kenneth D. Merry <ken_at_kdm.org> Date: Fri, 12 Sep 2003 10:50:39 -0600

On Fri, Sep 12, 2003 at 08:57:22 -0700, Kevin Oberman wrote:
> I am seeing a peculiar, possibly timing sensitive, crash that looks
> like it is probably in either atapicam or scsi_cd. The system is
> CURRENT as of yesterday morning.
> 
> The crash happens frequently when nautilus starts up. It does not
> always crash, but does so fairly frequently and leaves my laptop
> locked in X with no access to the console. If nautilus starts, the
> system continues without problems until X is terminated and restarted.
> 
> I managed to get a panic printout by switching back to vty0 (console)
> while the X startup was in progress and I am entering the panic by
> hand. Slight chance of a typo, but I have checked it a couple of
> times.
> 
> For some reason I can't explain, I didn't get a crash dump, but I
> probably can get one after a future crash. The easy fix is to remove
> the DVD/CD-RW drive. FWIW, the system is an IBM T30 and it happens
> with either APM or ACPI. I am attaching the dmesg and the config
> file. Hopefully the mailer won't strip them.
> 
> Fatal trap 18: integer divide fault while in kernel mode
> instruction pointer     = 0x8:0xc0139a8b
> stack pointer           = 0x10:0xdd5b6a38
> frame pointer           = 0x10:0xdd5b6a80
> code segment            = base 0x0, limit 0xffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = Interrupt enabled, resume, IOPL = 0
> current process         = 737 (nautilus)
> kernel: type 18 trap, code 0
> Stopped at      cdstart+0xcb:   divl    0x30(%ebx), %eax
> db> tr
> cdstart(c419d500,c4192000,1,c407cc30,c407cc00) at cdstart+0xcb
> xpt_run_dev_allocq(c40b8c00,c407cc08,1,c418d800,c419d500) at xpt_run_dev_allocq+0xab
> xpt_schedule(c419d500,1,0,ce54ec78,dd5b6c70) at xpt_schedule+0xca
> cdstrategy(ce54ec78,0,0,0,d439f000) at cdstrategy+0x88
> physio(c4197700,dd5b6c70,10,dd5b6b78,c03f4900) at physio+0x2df
> spec_read(dd5b6bd0,dd5b6c20,c02b35e3,dd5b6bd0,1020002) at spec_read+0x19a
> spec_vnoperate(dd5b6bd0,1020002,c470c850,0,dd5b6c70) at spec_vnoperate+0x18
> vn_read(c489d8c4,dd5b6c70,c478ee00,0,c470c850) at vn_read+0x1a3
> dofileread(c470c850,c489d8c4,12,bfbfeb40,800) at dofileread+0xd9
> read(c470c850,dd5b6d10,c,c,3) at read+0x6b
> syscall(2f,2f,2f,80cb000,0) at syscall+0x2b0
> Xint0x80_syscall() at Xint0x80_syscall+0x1d
> --- syscall (3, FreeBSD ELF32, read), eip = 0x28da2b5f, esp = 0xbfbfeadc, ebp = 0xbfbfeb08 ---
> db>

Other folks have reported seeing bogus values returned from read capacity
for atapicam-attached drives.

I've seen it on my laptop as well (which runs -current).  (Only since the
ATAng code went in.  It worked fine before.)

cdstart() divides by the blocksize to figure out the LBA to pass to the
SCSI read or write commands, so a bogus (probably zero) blocksize is likely
what's causing the integer divide fault.
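
To illustrate, here is a minimal sketch of that kind of calculation.  This
is not the actual cdstart() code; the function name and signature are made
up for the example.  It only shows why a zero blocksize traps and what a
guard against it could look like:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the real cdstart() code: convert a byte
 * offset into a logical block address using the block size that the
 * drive reported.
 *
 * Returns 0 and stores the LBA on success, or -1 if the block size
 * is zero (dividing by it would raise an integer divide fault).
 */
static int
byte_offset_to_lba(uint64_t offset, uint32_t blksize, uint64_t *lba)
{
	if (blksize == 0)
		return (-1);
	*lba = offset / blksize;
	return (0);
}
```

Without the check, a blocksize of 0 goes straight into the divide, which is
consistent with the "divl" at cdstart+0xcb in the trace.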

What does dmesg say about the size of the disk in the drive?  Do you have a
CD in the drive?

What happens when you do:

camcontrol cmd cd0 -v -c "25 0 0 0 0 0 0 0 0 0" -i 8 "i4 i4"

That should give you the media size and blocksize of the CD in the drive,
or an error if you don't have any media.
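
For reference, the 8 bytes that come back from READ CAPACITY (opcode 0x25)
are two big-endian 32-bit values: the address of the last block, then the
block length in bytes.  The "i4 i4" above asks camcontrol to decode exactly
that.  A rough sketch of the same decoding follows; the struct and function
names are made up for illustration and are not taken from the CAM code:

```c
#include <stdint.h>

/* Read a 32-bit big-endian value from a byte buffer. */
static uint32_t
be32(const uint8_t *p)
{
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/* Hypothetical container for the decoded READ CAPACITY data. */
struct rcap10 {
	uint32_t last_lba;	/* address of the last block on the media */
	uint32_t blklen;	/* block length in bytes */
};

/* Decode the 8-byte READ CAPACITY response. */
static struct rcap10
decode_rcap10(const uint8_t data[8])
{
	struct rcap10 rc;

	rc.last_lba = be32(&data[0]);
	rc.blklen = be32(&data[4]);
	return (rc);
}

/* A zero block length is the bogus case that would break the divide. */
static int
rcap10_blklen_ok(const struct rcap10 *rc)
{
	return (rc->blklen != 0);
}
```

Note that the first value is the last block address, so the number of blocks
is that value plus one.  A data CD would normally report a block length of
2048.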

If you're getting bogus values for the media/blocksize, or if it says
there's a disk there when there isn't one, then you've got a problem either
with the ATAPI or atapicam code.

Ken
-- 
Kenneth Merry
ken_at_kdm.org