Re: nullfs broken on powerpc

From: Eitan Adler <lists_at_eitanadler.com>
Date: Wed, 25 Jan 2012 15:29:15 -0500
On Wed, Jan 25, 2012 at 2:50 PM, Milan Obuch <freebsd-current_at_dino.sk> wrote:
> On Wed, 25 Jan 2012 14:21:23 +0200
> Kostik Belousov <kostikbel_at_gmail.com> wrote:
>
>> On Tue, Jan 24, 2012 at 06:31:52PM +0100, Milan Obuch wrote:
>> > Hi,
>> >
>
> [ snip ]
>
>> > This does not work with powerpc for me. With sources csup'ped this
>> > morning, full system rebuild with GENERIC kernel, it is enough for
>> > me to issue
>> >
>> > mount_nullfs /data/src10 /usr/src
>> > csup /usr/share/examples/cvsup/standard-supfile
>> >
>> > and system panic occurs, with following on system console:
>> >
>> > panic: mtx_lock() of spin mutex (null)
>> > _at_ /usr/src/sys/kern/vfs_subr.c:2670 cpuid = 0
>> > KDB: enter: panic
>> > [ thread pid 1442 tid 100095 ]
>> > Stopped at 0x40f734: addi r0, r0, 0x0
>> > db>
>> >
>> > At this point, I am able to interact with system, the question for
>> > me is what I want to get from it :) I tried bt with following
>> > result:
>> >
>> > Tracing pid 1442 tid 100095 td 0x2d6b000
>> > 0xe22c26d0: at panic+0x274
>> > 0xe22c2730: at _mtx_lock_flags+0xc4
>> > 0xe22c2760: at vgonel+0x330
>> > 0xe22c27b0: at vrecycle+0x54
>> > 0xe22c27d0: at null_inactive+0x30
>> > 0xe22c27f0: at VOP_INACTIVE_APV+0xdc
>> > 0xe22c2810: at vinactive+0x98
>> > 0xe22c2850: at vputx+0x344
>> > 0xe22c28a0: at vput+0x18
>> > 0xe22c28c0: at kern_statat_vnhook+0x108
>> > 0xe22c29d0: at kern_statat+0x18
>> > 0xe22c29f0: at kern_lstat+0x2c
>> > 0xe22c2a10: at sys_lstat+0x30
>> > 0xe22c2a90: at trap+0x388
>> > 0xe22c2b60: at powerpc_interrupt+0x108
>> > 0xe22c2b90: user SC trap by _end+0x40d88c70: srr1=0xd032
>> >             r1=0xffaf9a70 cr=0x28004044 xer=0x20000000
>> > ctr=0x41a0ac40
>> > db>
>> >
>> > Does this shed any light for someone with more knowledge here? My
>> > gut feeling is there is some endianness issue at play, the same
>> > nullfs usage works for me flawlessly on both i386 and amd64
>> > systems, so it could not be 32 vs 64 bit issue at least.
>> >
>> > At line 2670 of /usr/src/sys/kern/vfs_subr.c I see end of function
>> > void vgonel(struct vnode *vp)
>> >
>> >         VI_LOCK(vp);
>> >         vp->v_vnlock = &vp->v_lock;
>> >         vp->v_op = &dead_vnodeops;
>> >         vp->v_tag = "none";
>> >         vp->v_type = VBAD;
>> > }
>> >
>> > so the question seems to be reduced to 'why is vp null?' or is my
>> > small attempt at analysis flawed...
>
>> I do not think that the vp is null. It looks more like the *vp memory
>> was zeroed. This has very low chances of being related to endianness,
>> and looks more like kernel memory corruption.
>>
>> Take a dump and print the content of *vp.
>
> How could I look into memory? I found page
> http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-online-ddb.html
> and I can see registers (show reg), use x with absolute addresses, but
> something like 'x vp' tells just 'Symbol not known' - should I somehow
> load symbol table into memory? But backtrace shows function names... or
> should I somehow modify GENERIC kernel to include more debugging info?
> Kernel debugging is a bit new for me, even if I can write simple
> modification into kernel, but only in some special (and narrow) area of
> code...

From the ddb prompt, type 'call doadump'. Provided you have a proper dump
device set up in rc.conf (the dumpdev setting), it should write a crash
dump. After a reboot you can then use kgdb on the running system to
analyze the dump in more detail, for example to print *vp.
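
A minimal sketch of the whole sequence, assuming a swap device large
enough to hold the dump, the default crash directory, and a kernel built
with debug symbols. The dump number (vmcore.0) and the frame number N are
placeholders, not values taken from this report:

```shell
# /etc/rc.conf fragment (rc.conf is plain sh variable assignments)
dumpdev="AUTO"        # let the system pick a swap device for kernel dumps
dumpdir="/var/crash"  # directory where savecore(8) stores the dump at boot

# At the ddb prompt, after the panic:
#   db> call doadump
#   db> reset
#
# After the reboot, once savecore(8) has written the dump into $dumpdir
# (vmcore.0 is an assumed example name; use the number savecore reports):
#   kgdb /boot/kernel/kernel /var/crash/vmcore.0
#   (kgdb) bt
#   (kgdb) frame N      # N = whichever frame bt shows for vgonel
#   (kgdb) print *vp
```

If kgdb reports that it has no symbols, pointing it at the kernel.debug
file from the kernel build directory instead of /boot/kernel/kernel may
help.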

-- 
Eitan Adler
Received on Wed Jan 25 2012 - 19:29:49 UTC
