Re: Tracking down ata0 reset hang

From: Jesse Guardiani <jesse_at_wingnet.net>
Date: Wed, 24 Dec 2003 09:46:20 -0500

Soren Schmidt wrote:

> It seems Kevin Oberman wrote:
>> > 
>> > Pretty darn weird. Anyone have any ideas as to why it works when the
>> > serial console AND verbose boot is enabled?
>> 
>> I can confirm this on my system, as well. And I do have an explanation
>> (but a fix will probably have to come from Søren).
>> 
>> The reason is almost certainly a timing issue. When you boot -v, the
>> resume sends a LOT more data to the screen and when you use -D, it sends
>> it to both the screen and the serial port which is MUCH slower to
>> complete than the screen. This allows SOMETHING to clear or complete or
>> something on resume and all is well.
> 
> I think it's more likely that we hit the infamous console lockup bug
> that's been around for months (probably a lock messup somewhere).

Granted, I know next to nothing about low-level PC hardware, but I can assure
you that suspend/resume is VERY delay-dependent - at least on my IBM Thinkpad
A30p.

With 5.1-RELEASE, I had to perform all kinds of trickery to get APM suspend/
resume working. This trickery boils down to the following actions in rc.suspend:


# Kill any process currently using /dev/acd0 (NR > 1 skips fstat's header line)
fstat /dev/acd0 | awk 'NR > 1 {print $3}' | xargs kill -KILL

# Switch to a TTY
vidcontrol -s 1 < /dev/ttyv0

# Detach buggy CD-ROM before suspending
sync && sync && sync
atacontrol detach 1
sleep 1

sync && sync && sync
sleep 1


And then in rc.resume I had to do this:

# Switch back to X
vidcontrol -s 9 < /dev/ttyv0

sync && sync && sync

# Reattach and reinitialize ATA channel 1
atacontrol attach 1


I played with the timeouts a good bit and the above is the bare minimum my
system would accept without a hard lock.

Perhaps my system is buggy and isn't the norm. But what would it hurt to put
a 1 or 2 second delay in the ata code before the ata0 reset?
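
As a userland stand-in for such a delay (it does not touch the kernel's own
ata0 reset), the reattach step in rc.resume could be wrapped roughly like the
sketch below. RESUME_DELAY, ATACONTROL and reattach_ata_channel are made-up
names for illustration; only `atacontrol attach` comes from the script above.

```shell
#!/bin/sh
# Hypothetical rc.resume helper: wait a tunable number of seconds, then
# reattach an ATA channel. The variable and function names are illustrative.

: "${RESUME_DELAY:=2}"          # seconds to wait before touching the channel
: "${ATACONTROL:=atacontrol}"   # set to "echo" for a dry run without hardware

reattach_ata_channel() {
    channel="$1"
    sync                            # flush buffers before poking the bus
    sleep "$RESUME_DELAY"           # give the hardware time to settle
    $ATACONTROL attach "$channel"   # same command rc.resume already runs
}
```

Calling `reattach_ata_channel 1` in rc.resume would replace the bare
`atacontrol attach 1`, and changing RESUME_DELAY makes it easy to find the
smallest delay a given machine tolerates.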

It may be unrelated, but I've noticed that Windows usually takes up to 7 seconds
to suspend. Resume takes even longer. FreeBSD is incredibly speedy by comparison,
so we've got some time to spare. What would it hurt to test some delay patches?

Since learning that `boot -vD` allows me to resume without hanging, last night was
the first night in three weeks that I:

- suspended my laptop from work
- resumed it at home
- suspended it at home
- resumed it at work again this morning.

Maybe it will crash next time. Maybe not. My system used to crash on resume
somewhere between once a week and once a month even under FreeBSD 5.1-RELEASE,
so I'm not looking for perfection. I'm just looking for something that works
_most_ of the time.
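
To put a number on "most of the time", one option would be to log each
suspend and resume and count the suspends that never got a matching resume.
The sketch below is only an illustration: the function names and the log
format are my own invention, not part of rc.suspend or rc.resume.

```shell
#!/bin/sh
# Hypothetical helpers for tracking how often resume fails.

# Append one line per event: "<suspend|resume> <UTC timestamp>".
log_power_event() {   # $1 = log file, $2 = "suspend" or "resume"
    printf '%s %s\n' "$2" "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" >> "$1"
}

# Count suspend entries that were not followed by a resume entry,
# i.e. the next entry is another suspend or the log ends there.
count_failed_resumes() {   # $1 = log file
    awk '
        $1 == "suspend" { if (pending) failed++; pending = 1 }
        $1 == "resume"  { pending = 0 }
        END { if (pending) failed++; print failed + 0 }
    ' "$1"
}
```

The idea would be to call `log_power_event <file> suspend` early in
rc.suspend (before the final sync) and `log_power_event <file> resume` at the
end of rc.resume, then run `count_failed_resumes <file>` after the next
reboot following a hang.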

Also, I'm curious if the `boot -vD` thing is strictly a Thinkpad workaround, or if
it works for others too. Has anyone WITHOUT a Thinkpad BIOS tried to `boot -vD`
and suspend/resume with APM?
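
For anyone who wants to try this without typing the flags at the loader
prompt on every boot, the lines below are, as far as I know, the
/boot/loader.conf equivalents of -v and -D. Treat the variable names as an
assumption and check them against loader.conf(5) and boot(8) for your release.

```shell
# /boot/loader.conf (sketch; verify the variable names for your release)
boot_verbose="YES"     # assumed equivalent of boot -v
boot_multicons="YES"   # assumed equivalent of boot -D (dual console)
```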

-- 
Jesse Guardiani, Systems Administrator
WingNET Internet Services,
P.O. Box 2605 // Cleveland, TN 37320-2605
423-559-LINK (v)  423-559-5145 (f)
http://www.wingnet.net
Received on Wed Dec 24 2003 - 05:46:29 UTC
