Re: ZFS leaking vnodes (sort of)

From: Simon Dircks <enderbsd_at_gmail.com>
Date: Thu, 12 Jul 2007 07:52:08 -0400
On 7/12/07, Pawel Jakub Dawidek <pjd_at_freebsd.org> wrote:
>
> On Wed, Jul 11, 2007 at 08:24:41PM -0400, Simon Dircks wrote:
> > With this patch i am still able to reproduce my ZFS crash.
> >
> > controllera# uname -a
> > FreeBSD controllera.storage.ksdhost.com 7.0-CURRENT FreeBSD 7.0-CURRENT #0:
> > Thu Jul 12 02:28:52 UTC 2007
> > graff_at_controllera.storage.ksdhost.com:/usr/obj/usr/src/sys/CONTROLLERA
> > amd64
> >
> >
> > panic: ZFS: bad checksum (read on <unknown> off 0: zio 0xffffff001d729810
> > [L0 SPA space map] 1000L/800P DVA[0]=<0:1600421800:800> DVA[1]=<0:2c000f7000:800>
> > DVA[2]=<0:4200013800:800> fletcher4 lzjb LE contiguous birth=566 fill=1
> > cksum=5d32767b98:635ff7022f8b:4251
> > cpuid = 0
> > KDB: enter: panic
> > [thread pid 802 tid 100066 ]
> > stopped at kdb_enter+0x31: leave
>
> This isn't related to the patch, actually. It looks like you don't have
> enough redundancy. Can you paste 'zpool status' output?
>
>
Sure:

controllera# zpool status
  pool: tank
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        tank          UNAVAIL      0     0     0  insufficient replicas
          mirror      UNAVAIL      0     0     0  insufficient replicas
            ggate111  UNAVAIL      0     0     0  cannot open
            ggate211  UNAVAIL      0     0     0  cannot open
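
For reference, the recovery step the "action" line points at would be
something like this (just a sketch, using the device names from the
status output above):

controllera# zpool online tank ggate111
controllera# zpool online tank ggate211

But as I describe below, it is exactly that reattach that panics the box.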


Now here is another interesting thing: I can now cause a crash just by
reattaching the disks, and I can repeat it without fail within a few
minutes. So for every test I have been running zpool destroy tank and
making a fresh pool.
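
Concretely, the per-test cycle is roughly the following (assuming the
mirror is rebuilt from the same two ggate units shown above):

controllera# zpool destroy tank
controllera# zpool create tank mirror ggate111 ggate211

and the panic hits within a few minutes of the devices dropping and
being reattached.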


Kip Macy wrote:

That looks more like a bad disk than a file system bug.


That could be; my "disks" are actually ggatec devices on other machines,
and for some reason, when they are under ZFS, I get a lot of packet loss
(and ping spikes) over the gigabit interface even when it is not maxed
out.

But should that still cause a panic? My / and /usr are on a normal local
UFS disk. Using the same ggate111 and ggate211 devices with gmirror +
mount -o async and copying file.XXX over to the gmirror, I get no packet
loss, no ping spikes, and the checksum matches after I move the file
over. All the machines in question run the same version of FreeBSD, all
have full debugging enabled, and none are in production or running
anything else.
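
For completeness, the control test looked roughly like this (the ggatec
host names and disk paths below are placeholders, and the exact flags
are from memory, so treat it as a sketch):

controllera# ggatec create -u 111 filer1 /dev/da0
controllera# ggatec create -u 211 filer2 /dev/da0
controllera# gmirror label -v gm0 ggate111 ggate211
controllera# newfs /dev/mirror/gm0
controllera# mount -o async /dev/mirror/gm0 /mnt
controllera# cp file.XXX /mnt/
controllera# md5 file.XXX /mnt/file.XXX
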
Received on Thu Jul 12 2007 - 09:52:10 UTC
