Re: how to find out what the other CPU is doing

From: Randall Stewart <rrs_at_cisco.com>
Date: Wed, 07 Feb 2007 15:02:43 -0500

John Baldwin wrote:
> On Sunday 28 January 2007 07:38, Randall Stewart wrote:
>> All:
>>
>> Ok, I did not get an answer to this.. and of course
>> I hit the bug again (which I now figured out how to
>> fix :-D)
>>
>> So let me explain what I did.. so that way I
>> can come back and find this email later when it
>> someday happens again ;-) (and for anyone else
>> curious).
>>
>> 1) I had to do this from DDB ... I could not find a
>>     way in kgdb.
>>
>> 2) When you stop the machine in ddb (at least on i386) it
>>     dumps BOTH CPUs' info in something called
>>     stoppcbs[num-cpus]
>> 3) It's an array of struct pcb .. which has all the info
>>     you need to get started.
>> 4) With a trusty x/ stoppcbs you can work your way through
>>     and gather the info you need.. For x86 the second CPU
>>     started at stoppcbs+0x270 .. if you don't want to look
>>     at all those 0's (of course the offset could change and
>>     will vary from CPU type to CPU type :-D)
>> 5) Dig out the ebp from here. You can look at the IP
>>     but it will be in some NMI stop CPU routine.
>> 6) You can use the bp to trace backward through the stack
>>     and figure out the running stack trace... I went back
>>     to kgdb after getting the ebp (with CPU still spinning away).
>> 7) You have to go several frames back to get by all the NMI
>>     stuff before you find your guilty party :-)
>>
>> There might be a better way to do this.. and I am thinking
>> about adding a machine dependent trace that can take
>> an ebp argument (if one does not already exist in kgdb.. I
>> suppose I need to poke around in the macros a bit).. anyway
>> it's primitive .. but it allows you to find that spinning
>> kernel routine :-)
> 
> When you use 'thread/tid/proc' in kgdb it uses stoppcbs[] automatically, so 
> you can do 'proc 437' and do 'bt' to get a trace as I explained earlier.  ddb 
> can also do this for you as 'tr' in ddb can take a pid or tid as an argument, 
> so in ddb you can do 'tr 437' to trace proc 437.  Note if you want to use 
> the 'tid' in kgdb you use 'tid <tid>'.  'proc' takes PIDs not TIDs in kgdb.
> 
Hmm..

I tried that the first time I had a crash in kgdb (I did not do anything
in DDB) and it did not work for me... I floundered around with it
for a very long time, too.

Maybe I have an old kgdb or something but I could not get it to work :-(
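
For anyone reading this later, the manual walk in steps 5 to 7 of the
quoted message (take the saved ebp, then follow it back through the
stack) can be sketched as a toy model. This is only an illustration
under assumptions: the addresses, the memory contents, and the name
walk_frames are all made up, and the "memory" is a plain Python dict
rather than anything read from ddb or kgdb. It relies only on the usual
i386 frame layout, where [ebp] holds the caller's saved ebp and
[ebp + 4] holds the return address.

```python
# Toy model of an i386 frame-pointer walk. All addresses are hypothetical.
WORD = 4  # bytes per 32-bit stack slot


def walk_frames(memory, ebp, max_frames=32):
    """Return the return addresses found by following saved-ebp links.

    memory maps an address to the 32-bit word stored there.
    The walk stops at a zero ebp, at an unreadable slot, or after
    max_frames frames (a guard against a corrupted, looping chain).
    """
    trace = []
    while ebp and len(trace) < max_frames:
        saved_ebp = memory.get(ebp)          # caller's frame pointer
        ret_addr = memory.get(ebp + WORD)    # return address into caller
        if saved_ebp is None or ret_addr is None:
            break
        trace.append(ret_addr)
        ebp = saved_ebp
    return trace


# Hypothetical stack: the first frames stand in for the NMI/stop-CPU
# routines, the last one for the routine that was actually spinning.
memory = {
    0x1000: 0x1020, 0x1004: 0xC0500010,  # innermost frame
    0x1020: 0x1040, 0x1024: 0xC0500200,
    0x1040: 0x0000, 0x1044: 0xC0601234,  # outermost frame
}

trace = walk_frames(memory, 0x1000)
print([hex(addr) for addr in trace])
```

In a real session each return address would still have to be resolved
to a symbol by the debugger, and the first several entries would belong
to the NMI handling rather than to the guilty routine.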

R

-- 
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 <or> 803-317-4952 (cell)
Received on Wed Feb 07 2007 - 19:03:17 UTC
