Re: 5.3-RELEASE: WARNING - WRITE_DMA interrupt timeout

From: Scott Long <scottl_at_freebsd.org>
Date: Sat, 13 Nov 2004 09:42:09 -0700

Poul-Henning Kamp wrote:
> In message <4195E1FF.5090906_at_DeepCore.dk>, Søren Schmidt writes:
> 
> 
>>>>Timeout is 5 secs, which is a pretty long time in this context IMHO..
>>>
>>>Five seconds counted from when?
>>
>>Now that's the nasty part :)
>>ATA starts the timeout when the request is issued to the device, so 
>>theoretically the disk could take 4.9999 secs to complete the request 
>>and then the timeout fires before the taskqueue gets its chance at it, 
>>but IMHO that's pretty unlikely...
> 
> 
> I find that far more likely than kernel threads being stalled for that
> long.  ATA disks doing bad-block stuff takes several seconds on some
> of the disks I've had my hands on.
> 

Bad block recovery takes a while, as do things like periodic thermal
recal.  The IBM drives are famous for this 'feature'.

> 
>>Anyhow, I can just remove the warning from ATA if that makes anyone 
>>happy, since it's just a warning and ATA doesn't do anything with it at all.
>>However, IMNHO this points at a problem somewhere that we should better 
>>understand and fix instead.
> 
> 
> I would prefer you reset the timer to five seconds in your interrupt
> routine so we can see exactly on which side of that the time is spent.
> 
> 

At least cancel the hardware timeout in the ithread.  I don't doubt that
there are times when the system is going to get busy and not service
g_up right away (and thus the bio_taskqueue), and I won't argue that
this doesn't indicate buggy or poorly implemented code elsewhere in the
system.  But the timeout warning that is given now does nothing to help
identify whatever real problem exists, and only confuses users.
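
To make the race concrete, here is a minimal user-space sketch in Python.
It is not the ATA driver code: the names Request, spurious_warning and
time_split are made up for illustration, and only the 5 second timeout
comes from this thread. It assumes the timer is armed when the request is
issued and disarmed either in the interrupt handler or only when the
taskqueue completes the request.

```python
from dataclasses import dataclass

TIMEOUT_SECS = 5.0  # timeout value quoted in this thread


@dataclass
class Request:
    # Hypothetical model of one request; all times are seconds on one clock.
    issued_at: float     # request handed to the device, timer armed here
    interrupt_at: float  # device raises its completion interrupt
    completed_at: float  # taskqueue finishes processing the request


def spurious_warning(req: Request, cancel_in_interrupt: bool) -> bool:
    """True if the timer fires even though the device answered in time."""
    deadline = req.issued_at + TIMEOUT_SECS
    # The timer is disarmed in the interrupt handler, or only at completion.
    disarmed_at = req.interrupt_at if cancel_in_interrupt else req.completed_at
    device_answered_in_time = req.interrupt_at < deadline
    timer_fired = disarmed_at >= deadline
    return device_answered_in_time and timer_fired


def time_split(req: Request) -> tuple:
    """(seconds spent in the device, seconds waiting for completion)."""
    return (req.interrupt_at - req.issued_at,
            req.completed_at - req.interrupt_at)


# A slow disk (e.g. bad-block recovery) that answers just before the deadline.
slow_disk = Request(issued_at=0.0, interrupt_at=4.75, completed_at=5.25)
print(spurious_warning(slow_disk, cancel_in_interrupt=False))
print(spurious_warning(slow_disk, cancel_in_interrupt=True))
print(time_split(slow_disk))
```

With the timer disarmed only at completion, the slow-disk case above
produces a warning although the device answered in time. Disarming it in
the interrupt handler removes that case, and recording the two intervals
separately shows on which side of the interrupt the time was spent.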

Scott
Received on Sat Nov 13 2004 - 15:41:26 UTC
