Bad news re: new (20080817) ZFS patches and send/recv (broken again)

From: Thomas Backman <serenity_at_exscape.org>
Date: Mon, 17 Aug 2009 15:25:52 +0200
So, I've got myself a source tree almost completely free of patches  
after today's batch of ZFS patches merged - all that remains is that I  
uncommented ps -axl from /usr/sbin/crashinfo, since it only coredumps  
anyway, and added CFLAGS+=-DDEBUG=1 to zfs/Makefile.

One of the changes I didn't already have prior to this must have  
broken something, though, because this script worked just fine before  
the merges earlier today.
The script below is the exact same I linked in http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009174.html 
  back in July (URL to the script: http://exscape.org/temp/zfs_clone_panic.sh 
  ) - I made some local changes, thus the name invoked below.

Now that all the patches are merged, you should need nothing but the  
script, bash, and the ~200MB free space on the partition containing / 
root/ to reproduce this problem.
(Note that the "no such pool" in the FIRST script is normal; it simply  
tries to clean up something that isn't there, without error/sanity  
checking.)

[root_at_chaos ~]# bash -x panic-tests/CLONE_CRASH.submitted.sh initial
+ PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/ 
local/bin:/root/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/ 
local/sbin
+ MASTERPATH=/root/zfsclonecrash/crashtestmaster.disk
+ SLAVEPATH=/root/zfsclonecrash/crashtestslave.disk
+ LASTBACKUP=/root/zfsclonecrash/last-backup-name
+ MASTERNUM=1482
+ SLAVENUM=1675
+ '[' '!' -z initial ']'
+ case $1 in
+ initial
++ dirname /root/zfsclonecrash/crashtestmaster.disk
+ mkdir -p /root/zfsclonecrash
+ rm -rf /crashtestmaster /crashtestslave
+ mount_unmount unmount
+ '[' -z unmount ']'
+ [[ unmount == \m\o\u\n\t ]]
+ [[ unmount == \u\n\m\o\u\n\t ]]
+ zpool export crashtestmaster
cannot open 'crashtestmaster': no such pool
+ ggatel destroy -u 1482
+ zpool export crashtestslave
cannot open 'crashtestslave': no such pool
+ ggatel destroy -u 1675
+ echo Creating files and syncing
Creating files and syncing
+ dd if=/dev/zero of=/root/zfsclonecrash/crashtestmaster.disk bs=1000k  
count=100
100+0 records in
100+0 records out
102400000 bytes transferred in 0.367217 secs (278854144 bytes/sec)
+ dd if=/dev/zero of=/root/zfsclonecrash/crashtestslave.disk bs=1000k  
count=100
100+0 records in
100+0 records out
102400000 bytes transferred in 0.286532 secs (357377280 bytes/sec)
+ sync
+ echo Sleeping 5 seconds
Sleeping 5 seconds
+ sleep 5
+ echo 'Creating GEOM providers (~10 secs)'
Creating GEOM providers (~10 secs)
+ ggatel create -u 1482 /root/zfsclonecrash/crashtestmaster.disk
+ sleep 5
+ ggatel create -u 1675 /root/zfsclonecrash/crashtestslave.disk
+ sleep 5
+ echo 'Creating pools'
Creating pools
+ zpool create -f crashtestmaster ggate1482
+ zpool create -f crashtestslave ggate1675
+ echo 'Adding some data to the master pool'
Adding some data to the master pool
+ zfs create crashtestmaster/test_orig
+ dd if=/dev/random of=/crashtestmaster/test_orig/file1 bs=1000k  
count=10
10+0 records in
10+0 records out
10240000 bytes transferred in 0.447472 secs (22884109 bytes/sec)
+ dd if=/dev/random of=/crashtestmaster/test_orig/file2 bs=1000k  
count=10
10+0 records in
10+0 records out
10240000 bytes transferred in 0.626774 secs (16337625 bytes/sec)
+ echo 'Cloning test_base'
Cloning test_base
+ zfs snapshot crashtestmaster/test_orig_at_snap
+ zfs clone crashtestmaster/test_orig_at_snap crashtestmaster/test_cloned
+ zfs promote crashtestmaster/test_cloned
++ date +backup-%Y%m%d-%H%M%S
+ NOW=backup-20090817-151412
+ echo 'Creating snapshots'
Creating snapshots
+ zfs snapshot -r crashtestmaster_at_backup-20090817-151412
+ echo 'Doing initial clone to slave pool'
Doing initial clone to slave pool
+ zfs send -R crashtestmaster_at_backup-20090817-151412
+ zfs recv -vFd crashtestslave
cannot receive: invalid stream (malformed nvlist)
warning: cannot send 'crashtestmaster/test_cloned_at_snap': Broken pipe
warning: cannot send 'crashtestmaster/ 
test_cloned_at_backup-20090817-151412': Broken pipe
+ mount_unmount unmount
+ '[' -z unmount ']'
+ [[ unmount == \m\o\u\n\t ]]
+ [[ unmount == \u\n\m\o\u\n\t ]]
+ zpool export crashtestmaster
+ ggatel destroy -u 1482
+ zpool export crashtestslave
+ ggatel destroy -u 1675
+ echo backup-20090817-151412
+ echo 'Done!'
Done!
+ exit 0


I first noticed this after trying to make an incremental backup right  
after the installworld+reboot, as I always do; it didn't find the  
slave zpool to import...(! This may be very bad in real-life cases - I  
have no clue if ggate is the culprit or if people will soon start  
reporting lost pools.) So I tried wiping the backup and creating a new  
one from scratch (~15GB over the network), giving me the very same  
problem, instantly:

[...]
+ zpool create -f -R /slave slave ggate666
++ date +backup-%Y%m%d-%H%M
+ NOW=backup-20090817-1522
+ echo 'Creating snapshots'
Creating snapshots
+ zfs snapshot -r tank_at_backup-20090817-1522
+ echo 'Cloning pool'
Cloning pool
+ zfs send -R tank_at_backup-20090817-1522
+ zfs recv -vFd slave
cannot receive: invalid stream (malformed nvlist)
warning: cannot send 'tank_at_backup-20090817-1522': Broken pipe


Regards,
Thomas
Received on Mon Aug 17 2009 - 11:26:52 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:54 UTC