Re: 7.0-CURRENT Hang

From: Cy Schubert <Cy.Schubert_at_komquats.com>
Date: Tue, 07 Feb 2006 10:06:32 -0800
In message <20060207173154.GE19674_at_comp.chem.msu.su>, Yar Tikhiy writes:
> On Mon, Feb 06, 2006 at 08:29:35PM -0800, Cy Schubert wrote:
> > 
> > On the Pentium P54C model (that's an old 120 MHz Pentium I use as a 4.x, 
> > 5.x, and 7.x ports build testbed), when the CPUID instruction is called 
> > with AL = 0x02 it returns EAX = EBX = ECX = EDX = 0. The code fragment in 
> > identcpu.c below results in "rounds" becoming 0xffffffff.
> > 
> > 	do_cpuid(0x2, regs);
> > 	rounds = (regs[0] & 0xff) - 1;
> > 
> > The subsequent loop will then run virtually forever (it takes forever for 
> > this machine to count down from 0xffffffff), performing a very great many 
> > calls to get_INTEL_TLB and virtually hanging the machine in the process.
> > 
> > 	while (rounds > 0) {
> > 		[... code ...]
> > 		rounds--;
> > 	}
> 
> FWIW, my presumably P54C machine (Family 5 Model 2 Stepping 6)
> doesn't indicate it has the CPUID 0x02 function.  That is, CPUID
> 0x00 returns EAX = 0x01, which is the highest function supported.
> Could you try to run the misc/cpuid port on your Pentium and show
> its output?  It might appear that the code around CPUID 0x02 shouldn't
> be reached at all in your case.  Zero values from CPUID 0x02 are
> pretty indicative of that.

Mine is Family 5 Model 2 Stepping 12. All of my doc is for Pentium-Pro and 
newer so you are probably correct.

> 
> Dealing with "rounds" equal to -1 can be a good idea anyway to catch
> brain dead CPUs instead of hanging the system on them.

Well, with rounds = -1 [actually (unsigned int)0xffffffff], the CPU only 
"appears" to hang: it loops virtually forever, counting down from 0xffffffff 
on a 120 MHz machine and fetching TLB info a number of times on each 
iteration, so even a few iterations take hours. I've watched mine go through 
the loop, decrementing rounds each time, for hours at a time. Before I dug 
into it with DDB it did look as though the CPU was hung, but it was just 
starting a loop of 4,294,967,295 iterations. If a few iterations take hours 
here, my guess is that on older and slower machines it would take days to get 
through this code. My patch lets it take the defaults and finally boot. If 
the CPU doesn't support AL = 0x02, what's the point of looping? It appears to 
run nicely with the patch.
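
To make the wraparound concrete, here is a minimal standalone C sketch. It is 
not the patch itself and not the identcpu.c code: the function names, and the 
max_leaf parameter (standing in for the EAX value returned by CPUID 0x00), 
are made up for illustration. It only shows why an all-zero reply turns into 
0xffffffff rounds, and one way a guard could skip the loop.

```c
#include <stdint.h>

/*
 * Hypothetical helper: regs_eax stands in for regs[0] after
 * do_cpuid(0x2, regs). Same arithmetic as the quoted fragment.
 */
uint32_t
tlb_rounds_unguarded(uint32_t regs_eax)
{
	/* If the low byte is 0, the unsigned subtraction wraps to 0xffffffff. */
	return ((regs_eax & 0xff) - 1);
}

/*
 * Hypothetical guarded variant. max_leaf is assumed to be the highest
 * basic CPUID function the CPU reports (EAX from CPUID 0x00).
 */
uint32_t
tlb_rounds_guarded(uint32_t max_leaf, uint32_t regs_eax)
{
	uint32_t count;

	/* CPUID 0x02 is not supported: skip the loop entirely. */
	if (max_leaf < 2)
		return (0);

	count = regs_eax & 0xff;

	/* A zero count would wrap around, so treat it as "no rounds". */
	if (count == 0)
		return (0);

	return (count - 1);
}
```

With regs_eax = 0 the unguarded version yields 0xffffffff, while the guarded 
one yields 0 whether or not the CPU claims to support function 0x02.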

I have another machine just like this as a firewall.


Cheers,
Cy Schubert <Cy.Schubert_at_komquats.com>
Web:  http://www.komquats.com and http://www.bcbodybuilder.com
FreeBSD UNIX:  <cy_at_FreeBSD.org>   Web:  http://www.FreeBSD.org
BC Government:  <Cy.Schubert_at_gov.bc.ca>

    "Lift long enough and I believe arrogance is replaced by
    humility and fear by courage and selfishness by generosity
    and rudeness by compassion and caring."
        -- Dave Draper
Received on Tue Feb 07 2006 - 17:06:35 UTC
