ZFS: data sometimes not getting correctly flushed to disk (possible mmap/write issue)?

From: Richard Todd <rmtodd_at_ichotolot.servalan.com>
Date: Fri, 15 Feb 2008 16:59:20 -0600
Bit of a long story behind this one: I've been playing with the "picard"
program in ports to do re-tagging of audio (mp3/ogg) files, and was having
problems with the tagging process creating corrupted .ogg files.  On
further investigation trying to debug this, I found that the "picard"
program seemed to be correctly writing the files if I checked them
immediately, but the files were corrupt when I looked at them later.
Further investigation and coming up with test cases brought me to the
surprising conclusion that ZFS was to blame -- under certain conditions ZFS
seems not to be writing all the data in the file out to disk correctly.
The file appears to be good only so long as the file's data is in cache, but
the data actually on disk doesn't match what's in cache.

Yeah, I know.  Sounds completely crazy, right?  I've got some test programs
in the attached shar file that reliably reproduce the bug; try it yourself
and see.  

You'll need the following things from ports to run the tests:
1) audio/sox and audio/vorbis-tools to create the test audio file
2) audio/py-mutagen and Python to do the retagging of the test audio file; 
   this is the bit with the write patterns that seem to provoke the bug.
   (Picard uses the py-mutagen Python library to do its tagging.) 

The attached shar file has three files in it: gen1.cpp, gen2.py, and
test.sh.  gen1.cpp is a simple program to generate a test audio file of a
given length.  gen2.py is a Python script using py-mutagen to rewrite the
Ogg tags on an audio file with some new ones.  test.sh invokes the other
programs (assuming you've compiled gen1.cpp to an "a.out" executable) and
looks like this:

	#!/bin/sh
	NAME=$1
	LEN=$2
	rm $NAME
	./a.out $NAME $LEN
	./gen2.py $NAME
	md5 $NAME
	ls -l $NAME
	sudo umount /u3
	sudo mount -t zfs u3 /u3
	md5 $NAME
	ls -l $NAME

It takes as arguments the name of a file to do the test on, and the length of 
the file.  The script does the following:
  1) Remove any pre-existing file from a previous run of this script
  2) Create a new .ogg file with the given length. 
  3) Use py-mutagen to rewrite the Ogg tags on that file. 
  4) MD5 the file.  Since the file is only a few meg, this should be reading
     entirely from cache.  
  5) Unmount and remount the filesystem containing the file.  (I hardwired
     the filesystem name, sorry; you'll have to tweak this bit of test.sh if
     you're testing it.  Also, the script uses sudo; if you don't have sudo,
     you'll have to run the test as root.)
  6) Redo the MD5 of the file.

Now, if everything is working okay, the "before" and "after" MD5s should always
be exactly the same, right?  Usually they are, but sometimes not.  I ran this
test on my main system, which is an Core2Duo with 4G running i386 -current as
of a week ago (Feb. 7):

Script started on Tue Feb 12 19:53:15 2008
1 ichotolot ~/bug[ 7:53PM] Z% for i in 400 401 402 403 404 405
for> do
for>  echo $i 
for>  ./test1.sh /u3/tmp/foo1.ogg $i
for> done
400
MD5 (/u3/tmp/foo1.ogg) = 19717301e4e0d9ab2e80e9636ab8adf9
-rw-r--r--  1 rmtodd  wheel  6191808 Feb 12 19:54 /u3/tmp/foo1.ogg
MD5 (/u3/tmp/foo1.ogg) = 19717301e4e0d9ab2e80e9636ab8adf9
-rw-r--r--  1 rmtodd  wheel  6191808 Feb 12 19:54 /u3/tmp/foo1.ogg
401
MD5 (/u3/tmp/foo1.ogg) = 816d410ba77eee3493bf73bc355f1092
-rw-r--r--  1 rmtodd  wheel  6207274 Feb 12 19:54 /u3/tmp/foo1.ogg
MD5 (/u3/tmp/foo1.ogg) = 816d410ba77eee3493bf73bc355f1092
-rw-r--r--  1 rmtodd  wheel  6207274 Feb 12 19:54 /u3/tmp/foo1.ogg
402
MD5 (/u3/tmp/foo1.ogg) = 366432a89b11745367e059d9d6e1d6e8
-rw-r--r--  1 rmtodd  wheel  6222579 Feb 12 19:55 /u3/tmp/foo1.ogg
MD5 (/u3/tmp/foo1.ogg) = 366432a89b11745367e059d9d6e1d6e8
-rw-r--r--  1 rmtodd  wheel  6222579 Feb 12 19:55 /u3/tmp/foo1.ogg
403
MD5 (/u3/tmp/foo1.ogg) = 72b9b992f31332fc1682cad1e97fa7e5
-rw-r--r--  1 rmtodd  wheel  6238218 Feb 12 19:55 /u3/tmp/foo1.ogg
MD5 (/u3/tmp/foo1.ogg) = 3abd285e657684f4a2f89043f230b164
-rw-r--r--  1 rmtodd  wheel  6238218 Feb 12 19:55 /u3/tmp/foo1.ogg
404
MD5 (/u3/tmp/foo1.ogg) = 3c76463bb294ec2b0db0a4746613523b
-rw-r--r--  1 rmtodd  wheel  6253713 Feb 12 19:56 /u3/tmp/foo1.ogg
MD5 (/u3/tmp/foo1.ogg) = 3c76463bb294ec2b0db0a4746613523b
-rw-r--r--  1 rmtodd  wheel  6253713 Feb 12 19:56 /u3/tmp/foo1.ogg
405
MD5 (/u3/tmp/foo1.ogg) = ee0f61cd8e68a390a99db582c715a1f9
-rw-r--r--  1 rmtodd  wheel  6269175 Feb 12 19:56 /u3/tmp/foo1.ogg
MD5 (/u3/tmp/foo1.ogg) = ee0f61cd8e68a390a99db582c715a1f9
-rw-r--r--  1 rmtodd  wheel  6269175 Feb 12 19:56 /u3/tmp/foo1.ogg
2 ichotolot ~/bug[ 7:56PM] Z% exit

Script done on Tue Feb 12 19:56:55 2008

You see that on one of the tests (the one for the 403-second-long test
file) the MD5s don't match.  The bug seems to be triggered by certain exact
patterns of writes.  Since the offsets of the writes are dependent on the
exact input .ogg, the bug is repeatable given the same input data sequence.
That is, the bug *always* seems to show up with certain patterns, i.e., certain
data files and not with others.  Some other notes:

1) I tried this test on another system with UFS filesystems and couldn't 
   reproduce it, so this does seem to be ZFS-specific.  
2) I don't have compression enabled on the ZFS filesystems in question.  
3) ktracing gen2.py shows that the py-mutagen library seems to do both write()s
   and read/write mmap()s to the file to update the file.  

Anyway, here's the shar file. 

# This is a shell archive.  Save it in a file, remove anything before
# this line, and then unpack it by entering "sh file".  Note, it may
# create directories; files and directories will be owned by you and
# have default permissions.
#
# This archive contains:
#
#	gen1.cpp
#	gen2.py
#	test.sh
#
echo x - gen1.cpp
sed 's/^X//' >gen1.cpp << 'END-of-gen1.cpp'
X/*
X** Program to create a n-second-long ogg file of random noise.
X*/
X
X#include <stdio.h>
X#include <stdlib.h>
X#include <stdint.h>
X#include <sys/types.h>
X
X/* Usage: gen1 filename number-of-seconds */
Xint
Xmain(int argc, char **argv)
X{
X    unsigned int secs = atoi(argv[2]);
X    FILE *p = popen("sox -t .sw -r 44100 -c 2 - -r 44100 -t .wav /var/tmp/foo1.wav", "w");
X
X    srandom(1);
X
X    int16_t val;
X
X    for (unsigned int i = 0 ; i < secs*44100 ; ++i) {
X	// for each time, send 2 "randomly" gen'd samples down the pipe
X	val = random();
X	(void)fwrite(&val, sizeof(int16_t), 1, p);
X	val = random();
X	(void)fwrite(&val, sizeof(int16_t), 1, p);
X    }
X
X    (void) pclose(p);
X
X    char buf[2048];
X    sprintf(buf, "oggenc -Q -s 1234 -b 128 -o %s -N 1 -t Bozo -a Bozo2 -l Bozo3 /var/tmp/foo1.wav", argv[1]);
X    system(buf);
X    exit(0);
X}
END-of-gen1.cpp
echo x - gen2.py
sed 's/^X//' >gen2.py << 'END-of-gen2.py'
X#!/usr/local/bin/python
X
Xfrom mutagen.oggvorbis import OggVorbis
Ximport sys
X
Xaudio = OggVorbis(sys.argv[1])
Xaudio["genre"]="Rock"
Xaudio["date"]="1989"
Xaudio["album"]="Can't Look Away"
Xaudio["albumartistsort"]="Rabin, Trevor"
Xaudio["musicbrainz_artistid"]="884ee438-f185-4b2c-b23b-f96886bdc91f"
Xaudio["musicip_puid"]="e8c0c0e9-e939-d482-5abf-746c558113bb"
Xaudio["title"]="Etoile Noir & Eyes Of Love"
Xaudio["artist"]="Trevor Rabin"
Xaudio["releasetype"]="album"
Xaudio["artistsort"]="Rabin, Trevor"
Xaudio["musicbrainz_trackid"]="bd4358fa-69a3-4c38-89ea-4aa18380c7e4"
Xaudio["releasestatus"]="official"
Xaudio["totaltracks"]="11"
Xaudio["albumartist"]="Trevor Rabin"
Xaudio["musicbrainz_albumartistid"]="884ee438-f185-4b2c-b23b-f96886bdc91f"
Xaudio["tracknumber"]="6"
Xaudio["musicbrainz_albumid"]="41691ad3-a6f0-4633-8f24-b782509254b6"
X
Xaudio.save()
END-of-gen2.py
echo x - test.sh
sed 's/^X//' >test.sh << 'END-of-test.sh'
X#!/bin/sh
XNAME=$1
XLEN=$2
Xrm $NAME
X./a.out $NAME $LEN
X./gen2.py $NAME
Xmd5 $NAME
Xls -l $NAME
Xsudo umount /u3
Xsudo mount -t zfs u3 /u3
X#ogginfo foo1.ogg
Xmd5 $NAME
Xls -l $NAME
END-of-test.sh
exit
Received on Fri Feb 15 2008 - 22:15:09 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:27 UTC