Re: 5.3-RELEASE: WARNING - WRITE_DMA interrupt timeout

From: Zoltan Frombach <tssajo_at_hotmail.com>
Date: Wed, 1 Dec 2004 23:33:45 -0800
I can now assure you guys that since the patch was applied, I have *NEVER* 
gotten the WRITE_DMA warning message again. I've been running the server 
with the patch for over a week now.

Question: Is there any chance this patch will make it into a stable release? 
IMHO, that would be great. I understand that this is "just a warning 
message" anyway, but still...

Zoltan

Zoltan Frombach wrote:
> I will apply this patch first thing tomorrow. But I don't see how I will 
> see any difference. Does it put something into a log file? Shouldn't it?

The change is that if you see the "WARNING interrupt seen" we know that
it was the upper layers that used up the 5 secs of timeout, not that
some of it was used by a disk being slow to respond.

-Søren

> Zoltan
>
> Poul-Henning Kamp wrote:
>
>> In message <4195E1FF.5090906_at_DeepCore.dk>, Søren Schmidt writes:
>>
>>
>>>>> Timeout is 5 secs, which is a pretty long time in this context IMHO..
>>>>
>>>>
>>>> Five seconds counted from when ?
>>>
>>>
>>> Now that's the nasty part :)
>>> ATA starts the timeout when the request is issued to the device, so 
>>> theoretically the disk could take 4.9999 secs to complete the request 
>>> and then the timeout fires before the taskqueue gets its chance at it, 
>>> but IMHO that's pretty unlikely...
>>
>>
>> I find that far more likely than kernel threads being stalled for that
>> long.  ATA disks doing bad-block stuff takes several seconds on some
>> of the disks I've had my hands on.
>>
>>> Anyhow, I can just remove the warning from ATA if that makes anyone 
>>> happy, since it's just a warning and ATA doesn't do anything with it at 
>>> all.
>>> However, IMNHO this points at a problem somewhere that we should better 
>>> understand and fix instead.
>>
>>
>> I would prefer you reset the timer to five seconds in your interrupt
>> routine so we can see exactly on which side of that the time is spent.
>
>
> It would be even better to time how long both ops take and be able to
> get that via a sysctl or something (I have that on my TODO list but it's
> loooong :) ).
>
> Anyhow resetting it is easy (patch against 5.3R):
>
> Index: ata-queue.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/dev/ata/ata-queue.c,v
> retrieving revision 1.32.2.5
> diff -u -r1.32.2.5 ata-queue.c
> --- ata-queue.c 24 Oct 2004 09:27:37 -0000      1.32.2.5
> +++ ata-queue.c 13 Nov 2004 10:44:40 -0000
> @@ -216,6 +216,9 @@
>         ata_completed(request, 0);
>      }
>      else {
> +       if (!dumping)
> +           callout_reset(&request->callout, request->timeout * hz,
> +                         (timeout_t*)ata_timeout, request);
>         if (request->bio && !(request->flags & ATA_R_TIMEOUT)) {
>             ATA_DEBUG_RQ(request, "finish bio_taskqueue");
>             bio_taskqueue(request->bio, (bio_task_t *)ata_completed, request);
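
For readers following the thread, here is a rough user-space model of what
the quoted patch changes. It is only a sketch, not driver code: the struct,
the function names and the latency fields are all made up for illustration,
and it assumes the warning is printed only when the interrupt arrived before
the timer fired but the completion task had not yet run. Without the reset,
the 5 seconds cover disk time plus queue time; with the reset in the
interrupt path, the 5 seconds cover queue time alone.

```c
#include <stdbool.h>

/* Timeout from the thread: 5 seconds per request. */
#define TIMEOUT_SECS 5.0

/*
 * Hypothetical model of one request (not a structure from the ATA driver).
 * Both latencies are in seconds.
 */
struct request_model {
    double device_latency; /* issue -> device interrupt */
    double queue_latency;  /* interrupt -> completion task actually runs */
};

/*
 * Unpatched behaviour: the timer is armed once, when the request is issued.
 * The "interrupt seen" warning appears if the interrupt arrived in time but
 * disk time plus queue time together exceeded the timeout.
 */
static bool warns_without_reset(const struct request_model *r)
{
    return r->device_latency <= TIMEOUT_SECS &&
           r->device_latency + r->queue_latency > TIMEOUT_SECS;
}

/*
 * Patched behaviour: the timer is re-armed when the interrupt is handled,
 * so the warning appears only if the queue alone exceeded the timeout.
 */
static bool warns_with_reset(const struct request_model *r)
{
    return r->device_latency <= TIMEOUT_SECS &&
           r->queue_latency > TIMEOUT_SECS;
}
```

In this model, a disk that takes 4.9 s followed by a 0.2 s queue delay
triggers the warning only without the reset, while a fast disk (0.01 s)
followed by a 6 s stall in the upper layers triggers it in both cases. That
matches the point made above: with the patch applied, a warning means the
upper layers used up the whole timeout.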
Received on Thu Dec 02 2004 - 06:34:02 UTC
