I've run into this panic a couple of times over the last few days, while trying to rebuild ports using an NFS-mounted /usr/ports filesystem. It happened again today and this time I had time to look at the dump.

The problem is a null pointer dereference in nfs_putpages(), when it tries to look at np->n_size. It turns out that v_data is NULL on entry to this routine. Looking at the stack I see why:

#6  0xc0674e4a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc05eb030 in nfs_putpages (ap=0xe81c6a14) at /usr/src/sys/nfsclient/nfs_bio.c:301
#8  0xc0691148 in VOP_PUTPAGES_APV (vop=0x1000, a=0xe81c6a14) at vnode_if.c:2164
#9  0xc064fd8e in vnode_pager_putpages (object=0xcafaa840, m=0x1000, count=0x1000, sync=0x5, rtvals=0x1000) at vnode_if.h:1119
During symbol reading, Attribute value is not a constant (DW_FORM_ref4).
#10 0xc064b99e in vm_pageout_flush (mc=0xe81c6ab0, count=0x1, flags=0x5) at vm_pager.h:147
#11 0xc0647d0c in vm_object_page_collect_flush (object=0xcafaa840, p=0xc19e5218, curgeneration=0x0, pagerflags=0x5) at /usr/src/sys/vm/vm_object.c:950
#12 0xc0647800 in vm_object_page_clean (object=0xcafaa840, start=0x0, end=Unhandled dwarf expression opcode 0x93
) at /usr/src/sys/vm/vm_object.c:753
#13 0xc0647525 in vm_object_terminate (object=0xcafaa840) at /usr/src/sys/vm/vm_object.c:608
#14 0xc064e5ad in vnode_destroy_vobject (vp=0xcb58c110) at /usr/src/sys/vm/vnode_pager.c:166
#15 0xc05ee075 in nfs_reclaim (ap=0x1000) at /usr/src/sys/nfsclient/nfs_node.c:247
#16 0xc069095e in VOP_RECLAIM_APV (vop=0x1000, a=0xe81c6c90) at vnode_if.c:1589
#17 0xc0587aa5 in vgonel (vp=0xcb58c110) at vnode_if.h:818
#18 0xc0584ac2 in vlrureclaim (mp=0xc9b2e400) at /usr/src/sys/kern/vfs_subr.c:612
#19 0xc0584e8b in vnlru_proc () at /usr/src/sys/kern/vfs_subr.c:725
#20 0xc052034c in fork_exit (callout=0xc0584d00 <vnlru_proc>, arg=0x0, frame=0xe81c6d38) at /usr/src/sys/kern/kern_fork.c:789
#21 0xc0674eac in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208

In nfs_reclaim(), just before he calls vnode_destroy_vobject(), he zfrees and clears vp->v_data. When, down in the guts of vm_object.c, he tries to flush the associated pages, v_data is already NULL so he goes boom.

Now, why does he do the zfree/clear before vnode_destroy_vobject()? Is he assuming that there are no pages associated with this vnode that need to be flushed? Should there be? I looked at some other file systems and they do the same thing.

The obvious fix is to move the zfree/clear to after the vnode_destroy_vobject() but if there should be no pages that need to be flushed on the vnode at this point, that would just hide the problem.

I can keep looking at the code to answer my question but I thought I would ask here first, in case there's someone who knows the answer right away.

Thanks.
-- 
Frank Mayhar frank_at_exit.com     http://www.exit.com/
Exit Consulting                 http://www.gpsclock.com/
                                http://www.exit.com/blog/frank/

Received on Sun Jan 15 2006 - 19:40:04 UTC
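
To make the ordering question above concrete, here is a minimal user-space sketch of the two reclaim orders. It is not the kernel code: struct vnode and struct nfsnode are cut down to the two fields that matter, dirty_pages is an invented counter standing in for the VM object's dirty pages, and every function name is a hypothetical stand-in for the routine named in its comment. Where the real nfs_putpages() would fault on np->n_size, the model returns -1 instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct nfsnode {
    long n_size;
};

struct vnode {
    void *v_data;      /* per-filesystem private data (the nfsnode) */
    int   dirty_pages; /* invented: pages still waiting to be written back */
};

/* Models nfs_putpages(): it needs the nfsnode to read n_size.
 * Returns 0 on success, -1 where the real kernel would panic. */
static int putpages(struct vnode *vp)
{
    struct nfsnode *np = vp->v_data;

    if (np == NULL)
        return -1;          /* the real code dereferences np->n_size here */
    (void)np->n_size;
    vp->dirty_pages = 0;    /* pages written back */
    return 0;
}

/* Models vnode_destroy_vobject(): flushes dirty pages, if there are any. */
static int destroy_vobject(struct vnode *vp)
{
    return vp->dirty_pages > 0 ? putpages(vp) : 0;
}

/* Models the zfree/clear of vp->v_data. */
static void free_data(struct vnode *vp)
{
    free(vp->v_data);
    vp->v_data = NULL;
}

/* Order seen in the trace: free and clear v_data, then destroy the object. */
static int reclaim_free_first(struct vnode *vp)
{
    free_data(vp);
    return destroy_vobject(vp);
}

/* Order of the "obvious fix": destroy the object, then free and clear. */
static int reclaim_flush_first(struct vnode *vp)
{
    int rc = destroy_vobject(vp);

    free_data(vp);
    return rc;
}

/* Builds a vnode with the given number of dirty pages and reclaims it. */
static int try_reclaim(int dirty, int (*reclaim)(struct vnode *))
{
    struct vnode vp;
    struct nfsnode *np = malloc(sizeof(*np));

    assert(np != NULL);
    np->n_size = 4096;
    vp.v_data = np;
    vp.dirty_pages = dirty;
    return reclaim(&vp);
}
```

In this model try_reclaim(1, reclaim_free_first) fails, while try_reclaim(0, reclaim_free_first) and try_reclaim(1, reclaim_flush_first) both succeed. That is the same ambiguity raised above: the existing order is only safe if no dirty pages can remain by the time reclaim runs, and the sketch cannot say whether that assumption is supposed to hold.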