Re: how to find out what the other CPU is doing

From: Randall Stewart <rrs_at_cisco.com>
Date: Sun, 28 Jan 2007 07:38:27 -0500
All:

Ok, I did not get an answer to this.. and of course
I hit the bug again (which I have now figured out how
to fix :-D)

So let me explain what I did.. so that way I
can come back and find this email later when it
someday happens again ;-) (and for anyone else
curious).

1) I had to do this from DDB ... I could not find a
    way in kgdb.

2) When you stop the machine in ddb (at least on i386) it
    dumps the state of BOTH CPUs into something called
    stoppcbs[num-cpus]
3) It's an array of struct pcb .. which has all the info
    you need to get started.
4) With a trusty x/ stoppcbs you can work your way through
    and gather the info you need.. For x86 the second CPU's
    entry started at stoppcbs+0x270 .. if you don't want to
    look at all those 0's (of course the offset follows the
    size of struct pcb, so it could change and will vary from
    CPU type to CPU type :-D)
5) Dig out the ebp from here. You can look at the IP too,
    but it will just be in some NMI stop-CPU routine.
6) You can use the ebp to trace backward through the stack
    and figure out the running stack trace... I went back
    to kgdb after getting the ebp (with the CPU still spinning
    away).
7) You have to go several frames back to get by all the NMI
    stuff before you find your guilty party :-)
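
For anyone who wants to see steps 4-7 spelled out, here is a minimal
sketch in C. It is not kernel or kgdb code, just a model of the
arithmetic: it assumes the usual i386 frame layout (saved ebp at [ebp],
return address at [ebp+4]) and that entry N of stoppcbs sits N * sizeof(struct pcb)
past the start, with 0x270 being the size I saw on my box. The names
fake_stack, read_word, pcb_addr and walk_frames are all made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for kernel stack memory: 32-bit words
 * starting at virtual address `base`. */
struct fake_stack {
    uint32_t base;          /* address of words[0] */
    const uint32_t *words;  /* stack contents */
    size_t nwords;          /* number of valid words */
};

/* Address of CPU `cpu`'s entry in an array of pcbs, given the array
 * start and the size of one entry (0x270 in the case described above;
 * an observed value, not a constant to rely on). */
static uint32_t pcb_addr(uint32_t stoppcbs, uint32_t cpu, uint32_t pcb_size)
{
    return stoppcbs + cpu * pcb_size;
}

/* Read one aligned 32-bit word; returns 0 on success, -1 if the
 * address falls outside the modelled stack. */
static int read_word(const struct fake_stack *s, uint32_t addr, uint32_t *out)
{
    size_t idx;

    assert(s != NULL && out != NULL);
    if (addr < s->base || (addr - s->base) % 4 != 0)
        return -1;
    idx = (addr - s->base) / 4;
    if (idx >= s->nwords)
        return -1;
    *out = s->words[idx];
    return 0;
}

/* Follow the saved-ebp chain starting at `ebp`, storing each frame's
 * return address. Stops at a zero or non-increasing saved ebp, at an
 * unreadable address, or after `max` frames. Returns the frame count. */
static size_t walk_frames(const struct fake_stack *s, uint32_t ebp,
                          uint32_t *ret_addrs, size_t max)
{
    size_t n = 0;

    while (n < max && ebp != 0) {
        uint32_t next, ret;

        if (read_word(s, ebp, &next) != 0 ||
            read_word(s, ebp + 4, &ret) != 0)
            break;
        ret_addrs[n++] = ret;
        if (next <= ebp)    /* callers live at higher addresses */
            break;
        ebp = next;
    }
    return n;
}
```

The first few return addresses you get back will belong to the NMI
stuff; the interesting caller is a few entries further along.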

There might be a better way to do this.. and I am thinking
about adding a machine dependent trace that can take
an ebp argument (if one does not already exist in kgdb.. I
suppose I need to poke around in the macros a bit).. anyway
it's primitive .. but it allows you to find that spinning
kernel routine :-)

R
-- 
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 <or> 803-317-4952 (cell)
Received on Sun Jan 28 2007 - 11:39:03 UTC
