SCSI End of Tape problems

From: Mark Powell <M.S.Powell_at_salford.ac.uk>
Date: Sun, 11 Nov 2007 17:26:40 +0000 (GMT)
I'm having problems with multi-volume backups onto scsi tape. Perhaps it's 
just my problem understanding how sa(4) is supposed to work. If so, please 
humour me :)
   I notice that SCSI EOT seems to be behaving strangely. This is 7.0-BETA2 
amd64 with an IBM Ultrium 3580 LTO-2 drive:

# mt status
Mode      Density              Blocksize      bpi      Compression
Current:  0x42                 variable       0        0x1
---------available modes---------
0:        0x42                 variable       0        0x1
1:        0x42                 variable       0        0x1
2:        0x42                 variable       0        0x1
3:        0x42                 variable       0        0x1
---------------------------------
Current Driver State: at rest.
---------------------------------
File Number: 0  Record Number: 0        Residual Count 0

Filling a tape with data yields unexpected write sizes towards tape end:

# sdd if=/dev/random of=/dev/nsa0 -debug bs=1m
...
readbuf  (4, 800900000, 1048576) = 1048576
writebuf (5, 800900000, 1048576)
readbuf  (4, 800900000, 1048576) = 1048576
writebuf (5, 800900000, 1048576)
writebuf (5, 800910000, 983040)
writebuf (5, 800920000, 917504)
writebuf (5, 800930000, 851968)
writebuf (5, 800940000, 786432)
writebuf (5, 800950000, 720896)
writebuf (5, 800960000, 655360)
writebuf (5, 800970000, 589824)
writebuf (5, 800980000, 524288)
writebuf (5, 800990000, 458752)
writebuf (5, 8009A0000, 393216)
writebuf (5, 8009B0000, 327680)
writebuf (5, 8009C0000, 262144)
writebuf (5, 8009D0000, 196608)
writebuf (5, 8009E0000, 131072)
writebuf (5, 8009F0000, 65536)
readbuf  (4, 800900000, 1048576) = 1048576
writebuf (5, 800900000, 1048576)
sdd: File too large. Error writing '/dev/nsa0'.

sdd: Read  202157 records + 0 bytes (total of 211976978432 bytes = 207008768.00k).
sdd: Wrote 202155 records + 1048576 bytes (total of 211975929856 bytes = 207007744.00k).

and truss output:

  2255: 0.014697962 read(4,"\M^G\M-,\M-t\M^A\M-_H\240\M-l|"...,1048576) = 1048576 (0x100000)
  2255: 0.011191023 write(5,"\M^G\M-,\M-t\M^A\M-_H\240\M-l|"...,1048576) = 1048576 (0x100000)
  2255: 0.014743220 read(4,"4\M-;\M-C\M-.7\M^XM/\aC\M-R\M-p"...,1048576) = 1048576 (0x100000)
  2255: 0.016858613 write(5,"4\M-;\M-C\M-.7\M^XM/\aC\M-R\M-p"...,1048576) = 65536 (0x10000)
  2255: 0.006140005 write(5,"Z<>\M^B\^DQV\M-v\^_\M-g\M^\p\M--"...,983040) = 65536 (0x10000)
  2255: 0.008204274 write(5,""\M-Z\M-z\M^D\aOEBn\M^G\^V\M^["...,917504) = 65536 (0x10000)
  2255: 0.008075763 write(5,"0C\n\M-\\M-Yv\M^T,\M-p\^]\M-r"...,851968) = 65536 (0x10000)
  2255: 0.005905613 write(5,"\M-&\^T:\M-;\M-%\M-i\M-_at_Cb\M-k"...,786432) = 65536 (0x10000)
  2255: 0.008508229 write(5,"\M^Q\M-4O#\^T\M^TP\M-#\M-m\M-F"...,720896) = 65536 (0x10000)
  2255: 0.009589113 write(5,"W\M^Z\f\^A\M^O\M-~\f\M-,\M-Q:^T"...,655360) = 65536 (0x10000)
  2255: 0.012372202 write(5,"\M-vO1A\M-r\^[\M-m\M-y%Se&\M-"8"...,589824) = 65536 (0x10000)
  2255: 0.008358486 write(5,"-\M^F\M-pZ\rT\M-=*%U\M-_at_\M->:"...,524288) = 65536 (0x10000)
  2255: 0.006049768 write(5,"\M-r\M^Q\^V\M-|\M^GN\M--\M-D\M-<"...,458752) = 65536 (0x10000)
  2255: 0.008752399 write(5,"\M^K\^\\M-w7\M-C\M-U/\M-s\v\M^Ou"...,393216) = 65536 (0x10000)
  2255: 0.008212376 write(5,"Pp\M^Y\M^XtVg\^S\M^Z(%i\M-^9\M-C"...,327680) = 65536 (0x10000)
  2255: 0.006424125 write(5,"\M-|\M-t\M-q\a~\M-F\240\M-h\M-y"...,262144) = 65536 (0x10000)
  2255: 0.008091408 write(5,"\M-d\M^\d\M-NFO#V\M^Q[5\M^Q\M-C."...,196608) = 65536 (0x10000)
  2255: 0.017410649 write(5,"j\M-Z\M-)\M^?\M^P\M-a`Bqx\M-_at_"...,131072) = 65536 (0x10000)
  2255: 0.005774868 write(5,"bo\v\M-S\M-M\M-XZH\M-ee\^B8D\M-."...,65536) = 65536 (0x10000)
  2255: 0.014674216 read(4,"\^DK\^??0\M^L\M-eE$\M-`\M-a\M-%k"...,1048576) = 1048576 (0x100000)
  2255: 0.000014527 write(5,"\^DK\^??0\M^L\M-eE$\M-`\M-a\M-%k"...,1048576) = 0 (0x0)
  2255: 0.000008102 fsync(0x5,0x7fffffffe410,0x30,0x7fffffffe380,0xffffffff80616580,0x7fffffffe2d8) = 0 (0x0)
  2255: 1.282824673 close(5)                      = 0 (0x0)
  2255: -1.999956418 process exit, rval = 27

It seems strange that the OS split the last 1m in 16x64kb chunks. Is that a 
driver bug? Shouldn't it just write a single 1m block at the end?
   Reading that tape back seems to correctly reads every block as 1m:

# sdd if=/dev/nsa0 -onull -debug bs=1m
...
readbuf  (4, 800900000, 1048576) = 1048576
readbuf  (4, 800900000, 1048576) = 1048576
readbuf  (4, 800900000, 1048576) = 0

sdd: Read  202156 records + 0 bytes (total of 211975929856 bytes = 
207007744.00k).
sdd: Wrote 0 records + 0 bytes (total of 0 bytes = 0.00k).

And the truss:

  3316: 0.027002833 read(4,"\M^G\M-,\M-t\M^A\M-_H\240\M-l|"...,1048576) = 1048576 (0x100000)
  3316: 1.330697190 read(4,"4\M-;\M-C\M-.7\M^XM/\aC\M-R\M-p"...,1048576) = 1048576 (0x100000)
  3316: 0.005059958 read(4,0x800900000,1048576)   = 0 (0x0)
  3316: 0.000017041 fsync(0x1,0x7fffffffe520,0x4,0x7fffffffe490,0xffffffff80616500,0x7fffffffe3e8) = 0 (0x0)
  3316: 0.000016203 close(1)                      = 0 (0x0)
  3316: -1.999939377 process exit, rval = 0

So why the discrepancy on write?
   I'm also seeing other strange behaviour.
   Here's some truss from a star multi-volume tape write with bs=1m:

# star -c -xdev -sparse -acl -link-dirs -T -vv -time H=exustar 
dumpdate=$dumpdate bs=1m fs=${fifo} -M -multivol -fifostats level=${level} 
-wtardumps VOLHDR=\"Level ${level} ${fs} `date +'%F %T'`\" fs-name=/${fs} -C 
/${snapdir} ."

Star uses a fifo to speed up tape writes. That's why there are two processes. 
FD 5 are the files it's reading and fd 9 is the tape:

  6082: 6183.696515874 read(5,"\M^?\M-{\M^Pd\0\^O\M-p\0\0L\0\0"...,42691072) = 2104844 (0x201e0c)
  6082: 6183.696598009 read(5,0x804654600,42682880) = 0 (0x0)
  6083: 6183.357543470 write(9,"\M-y!\M-f\M-0 \M-fo\M-q\M^MjjV_"...,1048576) = 512 (0x200)
  6082: 6183.725543832 read(5,"\M^?\M-{\240D\0\0\^C\^QV\M-9h#"...,42681344) = 3267175 (0x31da67)
  6082: 6183.725633230 read(5,0x804972800,40462336) = 0 (0x0)
  6083: 6183.369095147 write(9,"4\M-n\M-`;\M^N\M-E\M-%\M-c\^\"...,1048576) = 1048576 (0x100000)
  6083: 6183.408098854 write(9,"\M-*\M-d\M^U$^\M-N\M-^\M-C\M-\'d"...,1048576) = 5 (0x5)
  6083: 6183.447565199 write(9,"\^\\M-G\M-*\^]l\M-_at_s\M-,`\M-s\^Y"...,1048576) = 1048576 (0x100000)
  6083: 6183.458886675 write(9,"O\M-[\M^KRN\M-5D\M^M\M-y\^W\M^?"...,1048576) = 1048576 (0x100000)
  6083: 6183.497476915 write(9,"\^N\M^O\M^^x\M-e\^T\M-H\M-J\M-n"...,1048576) = 1048576 (0x100000)
  6083: 6183.536590694 write(9,"a\M^L\M-o\M^?\M-[\M-7\fk\M-q\M^H"...,1048576) = 1048576 (0x100000)
  6083: 6183.568920807 write(9,"\M-%\M-R)4\M-K[jr\M-x\M-|\M^U"...,1048576) = 1048576 (0x100000)
  6083: 6183.580280278 write(9,"i\M^T\M-9\M-$\M-;Yz,\M-<\M^T\M-W"...,1048576) = 512 (0x200)
  6082: 6183.975335126 read(5,"\M^?\M-{\M^Pd\0\^C0\0\0N\0\0\0\0"...,42557952) = 3762476 (0x39692c)
  6082: 6183.975424804 read(5,0x804d09800,45086720) = 0 (0x0)
  6083: 6183.619253813 write(9,"\M-1\M-6\M^K\M-.\M-T\M-4L_at_l\M^_at_ "...,1048576) = 1048576 (0x100000)

It's writing blocks of all sorts of sizes; as expected 1048576, more unusually 
512, but also more surprisingly a write to tape of 5 bytes! Surely they must be 
at least modulo 512?
   Or are the size of these writes not a reflection of the size of the blocks 
that are being written to the tape?
   Later on towards the end of the tape it starts to write just 64KB blocks like 
sdd did, but it writes 1039 of these rather than the neat 16 that sdd did:

  6083: 6188.558234023 write(9,"\M-]\^D\M^Rk_at_\^D`\^P\M-`(\M-2$"...,1048576) = 65536 (0x10000)
  6083: 6188.563378066 write(9,"\M-C4\n\M-D\M-!&\^Sr\^Y_\M^OO"...,1048576) = 65536 (0x10000)
  6083: 6188.568657603 write(9,"\M-_at_\M^?M\^EO\M-%\a\M-y[\^^;HY?"...,1048576) = 65536 (0x10000)
  6083: 6188.574390838 write(9,"_at_  \M^AS\M-'\0\^^\M^_at_\M^V\^P4"...,1048576) = 65536 (0x10000)
  6083: 6188.579571478 write(9,"\M-ZX$\M-"%\M^Q\M-I\M-8A\M^B\M-,"...,1048576) = 65536 (0x10000)
  6083: 6188.584619418 write(9,"^\M-U\^E\M-'%\M-\\M^_at_\M^Bh\M-N`"...,1048576) = 65536 (0x10000)
  6083: 6188.589762064 write(9,"\M-E\M-9\M-#\M-_at_C\^R\M-;I)\v\M-T"...,1048576) = 65536 (0x10000)

But it does report EOT, in a way:

  6083: 6191.387300915 write(9,"D\M^X\^R\M-H\r\M-I.\M-{\M-m\M-0+"...,1048576) = 65536 (0x10000)
  6083: 6191.393140310 write(9,"\M-rz\^?\M^?\M^?\M^?\M-{>\M-4~"...,1048576) = 65536 (0x10000)
  6083: 6191.402045793 write(9,"\M-$\^\\M^U\M-D"\^E\b\M^A\M^R!"...,1048576) = 65536 (0x10000)
  6083: 6193.593383502 write(9,",B&.\M^_at_[\^Z=\M-c\0\M^Q\M-Qme"...,1048576) = 65536 (0x10000)
  6083: 6193.601456462 write(9,"\M-=',\^SGN\M-'\M-[\M-yY\M^Q\M-e"...,1048576) ERR#28 'No space left on device'
  6083: 6193.601564578 write(2,"star: fifo had 88479 puts 206042"...,39) = 39 (0x27)
  6083: 6193.601619056 write(2,"star: fifo was 24 times empty an"...,51) = 51 (0x33)

   Now getting to the real issue. I can use star to create multi-volume archives 
without a problem. The ERR#28 above, when writing causes star to prompt for a 
new tape.
   The issue comes when reading them back. I want to make sure my backups are 
good using star's excellent -diff option. I can successfully check the 1st tape 
using diff (so the backup is good despite those bizarre write sizes), but at 
the end of the 1st tape I get:

13908: 1.016033736 read(9,"T\M-|\M-VD\M^Rq\M-'q\M-yd\M-u3"...,1048576) = 1048576 (0x100000)
13908: 0.014666665 read(9,0x801101000,1048576)   ERR#5 'Input/output error'
13908: 0.000017042 write(2,"star: Input/output error. Error "...,53) = 53 (0x35)

Star dies and the kernel outputs:

(sa0:ahc1:0:6:0): tape is now frozen- use an OFFLINE, REWIND or MTEOM command 
to clear this state.

Shouldn't the last read return 0 or report EOT in some other way? The driver 
sa, getting into this frozen state seems wrong?
   On to blocksizes. I believe the maximum blocksize that an LTO can handle 
is 2MB. Correct me if I'm wrong.
   However, I see more unusual behaviour. Blocksize seems to have a 64KB 
maximum:

# mt blocksize 0
# sdd if=/dev/random of=/dev/sa0 bs=32k count=1
sdd: Read  1 records + 0 bytes (total of 32768 bytes = 32.00k).
sdd: Wrote 1 records + 0 bytes (total of 32768 bytes = 32.00k).

Try to read it back without specifying a buffer size:

# sdd if=/dev/sa0 -onull
(sa0:ahc1:0:6:0): 32768-byte tape record bigger than supplied buffer
sdd: Input/output error. Error reading '/dev/sa0'.
sdd: Read  0 records + 0 bytes (total of 0 bytes = 0.00k).
sdd: Wrote 0 records + 0 bytes (total of 0 bytes = 0.00k).

As expect we are warned that the blocksize is 32k. Do that with 64k and 
we get expected behaviour too:

# sdd if=/dev/random of=/dev/sa0 bs=64k count=1
sdd: Read  1 records + 0 bytes (total of 65536 bytes = 64.00k).
sdd: Wrote 1 records + 0 bytes (total of 65536 bytes = 64.00k).
# sdd if=/dev/sa0 -onull
(sa0:ahc1:0:6:0): 65536-byte tape record bigger than supplied buffer
sdd: Input/output error. Error reading '/dev/sa0'.
sdd: Read  0 records + 0 bytes (total of 0 bytes = 0.00k).
sdd: Wrote 0 records + 0 bytes (total of 0 bytes = 0.00k).

But go beyond 64k and the blocksize seems to be limited:

# sdd if=/dev/random of=/dev/sa0 bs=65k count=1
sdd: Read  1 records + 0 bytes (total of 66560 bytes = 65.00k).
sdd: Wrote 1 records + 0 bytes (total of 66560 bytes = 65.00k).
# sdd if=/dev/sa0 -onull
(sa0:ahc1:0:6:0): 65536-byte tape record bigger than supplied buffer
sdd: Input/output error. Error reading '/dev/sa0'.
sdd: Read  0 records + 0 bytes (total of 0 bytes = 0.00k).
sdd: Wrote 0 records + 0 bytes (total of 0 bytes = 0.00k).

What's going on here? Why is blocksize getting limited to 64k?
   Cheers.

-- 
Mark Powell - UNIX System Administrator - The University of Salford
Information Services Division, Clifford Whitworth Building,
Salford University, Manchester, M5 4WT, UK.
Tel: +44 161 295 4837  Fax: +44 161 295 5888  www.pgp.com for PGP key
Received on Sun Nov 11 2007 - 16:27:03 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:21 UTC