Re: nullfs broken on powerpc

From: Andreas Tobler <andreast_at_FreeBSD.org> Date: Wed, 25 Jan 2012 22:00:26 +0100

On 25.01.12 21:29, Eitan Adler wrote:
> On Wed, Jan 25, 2012 at 2:50 PM, Milan Obuch <freebsd-current_at_dino.sk> wrote:
>> On Wed, 25 Jan 2012 14:21:23 +0200
>> Kostik Belousov <kostikbel_at_gmail.com> wrote:
>>
>>> On Tue, Jan 24, 2012 at 06:31:52PM +0100, Milan Obuch wrote:
>>>> Hi,
>>>>
>>
>> [ snip ]
>>
>>>> This does not work with powerpc for me. With sources csup'ped this
>>>> morning, full system rebuild with GENERIC kernel, it is enough for
>>>> me to issue
>>>>
>>>> mount_nullfs /data/src10 /usr/src
>>>> csup /usr/share/examples/cvsup/standard-supfile
>>>>
>>>> and a system panic occurs, with the following on the system console:
>>>>
>>>> panic: mtx_lock() of spin mutex (null)
>>>> _at_ /usr/src/sys/kern/vfs_subr.c:2670 cpuid = 0
>>>> KDB: enter: panic
>>>> [ thread pid 1442 tid 100095 ]
>>>> Stopped at 0x40f734: addi r0, r0, 0x0
>>>> db>
>>>>
>>>> At this point, I am able to interact with the system; the question for
>>>> me is what I want to get from it :) I tried bt, with the following
>>>> result:
>>>>
>>>> Tracing pid 1442 tid 100095 td 0x2d6b000
>>>> 0xe22c26d0: at panic+0x274
>>>> 0xe22c2730: at _mtx_lock_flags+0xc4
>>>> 0xe22c2760: at vgonel+0x330
>>>> 0xe22c27b0: at vrecycle+0x54
>>>> 0xe22c27d0: at null_inactive+0x30
>>>> 0xe22c27f0: at VOP_INACTIVE_APV+0xdc
>>>> 0xe22c2810: at vinactive+0x98
>>>> 0xe22c2850: at vputx+0x344
>>>> 0xe22c28a0: at vput+0x18
>>>> 0xe22c28c0: at kern_statat_vnhook+0x108
>>>> 0xe22c29d0: at kern_statat+0x18
>>>> 0xe22c29f0: at kern_lstat+0x2c
>>>> 0xe22c2a10: at sys_lstat+0x30
>>>> 0xe22c2a90: at trap+0x388
>>>> 0xe22c2b60: at powerpc_interrupt+0x108
>>>> 0xe22c2b90: user SC trap by _end+0x40d88c70: srr1=0xd032
>>>>              r1=0xffaf9a70 cr=0x28004044 xer=0x20000000
>>>> ctr=0x41a0ac40
>>>> db>
>>>>
>>>> Does this shed any light for someone with more knowledge here? My
>>>> gut feeling is that there is some endianness issue at play: the same
>>>> nullfs usage works flawlessly for me on both i386 and amd64
>>>> systems, so at least it cannot be a 32 vs 64 bit issue.
>>>>
>>>> At line 2670 of /usr/src/sys/kern/vfs_subr.c I see the end of the function
>>>> void vgonel(struct vnode *vp)
>>>>
>>>>          VI_LOCK(vp);
>>>>          vp->v_vnlock = &vp->v_lock;
>>>>          vp->v_op = &dead_vnodeops;
>>>>          vp->v_tag = "none";
>>>>          vp->v_type = VBAD;
>>>> }
>>>>
>>>> so the question seems to reduce to 'why is vp null?', or is my
>>>> small attempt at analysis flawed...
>>
>>> I do not think that vp is null. It looks more like the *vp memory
>>> was zeroed. This has a very low chance of being related to endianness,
>>> and looks more like kernel memory corruption.
>>>
>>> Take a dump and print the content of *vp.
>>
>> How could I look into memory? I found the page
>> http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-online-ddb.html
>> and I can see registers (show reg) and use x with absolute addresses, but
>> something like 'x vp' just says 'Symbol not known'. Should I somehow
>> load a symbol table into memory? But the backtrace shows function names... Or
>> should I somehow modify the GENERIC kernel to include more debugging info?
>> Kernel debugging is a bit new for me; I can write simple
>> modifications to the kernel, but only in some special (and narrow) areas of
>> the code...
>
> From ddb, write 'call doadump'. Provided you have a proper dump device
> set up in rc.conf, it should work. You could then use kgdb from a
> running computer to analyze the dump in more detail.

This only works if your target is booke; AIM (Apple-based machines) does
not have 'call doadump' implemented yet. It is somewhere on my long
todo list.

Gruss,
Andreas