Re: Bad news re: new (20090817) ZFS patches and send/recv (broken again)

From: Thomas Backman <serenity@exscape.org>
Date: Mon, 17 Aug 2009 17:24:47 +0200
On Aug 17, 2009, at 15:25, Thomas Backman wrote:

> So, I've got myself a source tree almost completely free of patches now
> that today's batch of ZFS patches has been merged - all that remains is
> that I commented out ps -axl in /usr/sbin/crashinfo, since it only
> coredumps anyway, and added CFLAGS+=-DDEBUG=1 to zfs/Makefile.
>
> One of the changes I didn't already have prior to this must have broken
> something, though, because this script worked just fine before the
> merges earlier today.
> The script below is exactly the same one I linked back in July in
> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009174.html
> (URL to the script: http://exscape.org/temp/zfs_clone_panic.sh) - I made
> some local changes, hence the name invoked below.
>
> Now that all the patches are merged, you should need nothing but the
> script, bash, and ~200MB of free space on the partition containing
> /root/ to reproduce this problem.
> (Note that the "no such pool" in the FIRST script is normal; it simply
> tries to clean up something that isn't there, without error/sanity
> checking.)
>
> [root@chaos ~]# bash -x panic-tests/CLONE_CRASH.submitted.sh initial
> + PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/root/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin
> + MASTERPATH=/root/zfsclonecrash/crashtestmaster.disk
> + SLAVEPATH=/root/zfsclonecrash/crashtestslave.disk
> + LASTBACKUP=/root/zfsclonecrash/last-backup-name
> + MASTERNUM=1482
> + SLAVENUM=1675
> + '[' '!' -z initial ']'
> + case $1 in
> + initial
> ++ dirname /root/zfsclonecrash/crashtestmaster.disk
> + mkdir -p /root/zfsclonecrash
> + rm -rf /crashtestmaster /crashtestslave
> + mount_unmount unmount
> + '[' -z unmount ']'
> + [[ unmount == \m\o\u\n\t ]]
> + [[ unmount == \u\n\m\o\u\n\t ]]
> + zpool export crashtestmaster
> cannot open 'crashtestmaster': no such pool
> + ggatel destroy -u 1482
> + zpool export crashtestslave
> cannot open 'crashtestslave': no such pool
> + ggatel destroy -u 1675
> + echo Creating files and syncing
> Creating files and syncing
> + dd if=/dev/zero of=/root/zfsclonecrash/crashtestmaster.disk bs=1000k count=100
> 100+0 records in
> 100+0 records out
> 102400000 bytes transferred in 0.367217 secs (278854144 bytes/sec)
> + dd if=/dev/zero of=/root/zfsclonecrash/crashtestslave.disk bs=1000k count=100
> 100+0 records in
> 100+0 records out
> 102400000 bytes transferred in 0.286532 secs (357377280 bytes/sec)
> + sync
> + echo Sleeping 5 seconds
> Sleeping 5 seconds
> + sleep 5
> + echo 'Creating GEOM providers (~10 secs)'
> Creating GEOM providers (~10 secs)
> + ggatel create -u 1482 /root/zfsclonecrash/crashtestmaster.disk
> + sleep 5
> + ggatel create -u 1675 /root/zfsclonecrash/crashtestslave.disk
> + sleep 5
> + echo 'Creating pools'
> Creating pools
> + zpool create -f crashtestmaster ggate1482
> + zpool create -f crashtestslave ggate1675
> + echo 'Adding some data to the master pool'
> Adding some data to the master pool
> + zfs create crashtestmaster/test_orig
> + dd if=/dev/random of=/crashtestmaster/test_orig/file1 bs=1000k count=10
> 10+0 records in
> 10+0 records out
> 10240000 bytes transferred in 0.447472 secs (22884109 bytes/sec)
> + dd if=/dev/random of=/crashtestmaster/test_orig/file2 bs=1000k count=10
> 10+0 records in
> 10+0 records out
> 10240000 bytes transferred in 0.626774 secs (16337625 bytes/sec)
> + echo 'Cloning test_base'
> Cloning test_base
> + zfs snapshot crashtestmaster/test_orig@snap
> + zfs clone crashtestmaster/test_orig@snap crashtestmaster/test_cloned
> + zfs promote crashtestmaster/test_cloned
> ++ date +backup-%Y%m%d-%H%M%S
> + NOW=backup-20090817-151412
> + echo 'Creating snapshots'
> Creating snapshots
> + zfs snapshot -r crashtestmaster@backup-20090817-151412
> + echo 'Doing initial clone to slave pool'
> Doing initial clone to slave pool
> + zfs send -R crashtestmaster@backup-20090817-151412
> + zfs recv -vFd crashtestslave
> cannot receive: invalid stream (malformed nvlist)
> warning: cannot send 'crashtestmaster/test_cloned@snap': Broken pipe
> warning: cannot send 'crashtestmaster/test_cloned@backup-20090817-151412': Broken pipe
> + mount_unmount unmount
> + '[' -z unmount ']'
> + [[ unmount == \m\o\u\n\t ]]
> + [[ unmount == \u\n\m\o\u\n\t ]]
> + zpool export crashtestmaster
> + ggatel destroy -u 1482
> + zpool export crashtestslave
> + ggatel destroy -u 1675
> + echo backup-20090817-151412
> + echo 'Done!'
> Done!
> + exit 0
>
>
> I first noticed this after trying to make an incremental backup right
> after the installworld+reboot, as I always do; it didn't find the slave
> zpool to import... (! This may be very bad in real-life cases - I have
> no clue whether ggate is the culprit or whether people will soon start
> reporting lost pools.) So I tried wiping the backup and creating a new
> one from scratch (~15GB over the network), which gave me the very same
> problem, instantly:
>
> [...]
> + zpool create -f -R /slave slave ggate666
> ++ date +backup-%Y%m%d-%H%M
> + NOW=backup-20090817-1522
> + echo 'Creating snapshots'
> Creating snapshots
> + zfs snapshot -r tank@backup-20090817-1522
> + echo 'Cloning pool'
> Cloning pool
> + zfs send -R tank@backup-20090817-1522
> + zfs recv -vFd slave
> cannot receive: invalid stream (malformed nvlist)
> warning: cannot send 'tank@backup-20090817-1522': Broken pipe
>
>
> Regards,
> Thomas
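
For reference, the backup cycle the quoted script performs - a recursive
snapshot, then `zfs send -R` piped into `zfs recv -vFd`, incrementally
(`-I`) once a previous backup snapshot exists - can be sketched as below.
This is a minimal sketch, not Thomas's actual script: the pool names, the
state-file path, and the run() wrapper are assumptions added here, and
run() only prints the commands so the sketch is safe to execute as-is.

```shell
#!/bin/sh
# Sketch of a recursive ZFS send/recv backup cycle (hypothetical names;
# not the script from the trace above).
MASTER=tank                    # assumed source pool
SLAVE=slave                    # assumed destination pool
STATE=/tmp/last-backup-name    # assumed file recording the last snapshot suffix

# Print each command instead of executing it, so the sketch is harmless.
# Replace 'echo' with 'eval' to actually run the commands.
run() { echo "$@"; }

NOW=$(date +backup-%Y%m%d-%H%M)
LAST=$(cat "$STATE" 2>/dev/null)

run zfs snapshot -r "${MASTER}@${NOW}"
if [ -n "$LAST" ]; then
    # Incremental: replicate all snapshots between the last backup and now.
    run "zfs send -R -I ${MASTER}@${LAST} ${MASTER}@${NOW} | zfs recv -vFd ${SLAVE}"
else
    # First run: full recursive replication stream.
    run "zfs send -R ${MASTER}@${NOW} | zfs recv -vFd ${SLAVE}"
fi
echo "$NOW" > "$STATE"
```

It is the `zfs recv -vFd` step of exactly this kind of pipeline that fails
with "invalid stream (malformed nvlist)" in both traces above.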
This is perhaps more troubling...

[root@chaos ~]# dd if=/dev/zero of=./zfstestfile bs=1000k count=100 &&
ggatel create -u 142 ./zfstestfile &&
zpool create -f testpool142 ggate142
100+0 records in
100+0 records out
102400000 bytes transferred in 0.266194 secs (384681697 bytes/sec)
[root@chaos ~]# zpool list
NAME          SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank           66G  9.86G  56.1G    14%  ONLINE  -
testpool142    93M  73.5K  92.9M     0%  ONLINE  -
[root@chaos ~]# zpool export testpool142
[root@chaos ~]# ggatel destroy -u 142
[root@chaos ~]# ggatel create -u 142 ./zfstestfile
[root@chaos ~]# zpool import testpool142
cannot import 'testpool142': no such pool available
[root@chaos ~]# zpool import
[root@chaos ~]# ggatel list
ggate142
[root@chaos ~]#

Worse yet:

[root@chaos ~]# zpool create testpool ad0s1d
[root@chaos ~]# zpool export testpool
[root@chaos ~]# zpool import testpool
cannot import 'testpool': no such pool available

Regards,
Thomas
Received on Mon Aug 17 2009 - 13:25:09 UTC