Re: [USB/MSDOSFS] Possible File System Corrupted

From: Bruce Evans <brde_at_optusnet.com.au>
Date: Thu, 15 Nov 2007 08:16:51 +1100 (EST)

On Wed, 14 Nov 2007, Rainer Alves wrote:

> On 11/13/2007 15:36, Oliver Fromme wrote:
>> Tim Clewlow wrote:
>>  > Michal Varga wrote:
>>  > > Rainer Alves wrote:
>>  > > > I'm having the exact same problem.
>>  > > > Ever since I've switched from RELENG_6 to RELENG_7 I'm still able to
>>  > > > mount my SonyEricsson W810 phone, but whatever is copied there gets
>>  > > > corrupted.
>>  > > > I first noticed this yesterday when copying a new batch of MP3s to
>>  > > > its 4 GB memory stick.
>>  > >
>>  > > "Me too" - I see this with USB2.0 support/controller enabled, tested on
>>  > > two nforce5 and amd690 boards. Random (but pretty heavy) corruptions
>>  > > with data transferred to and from digital camera and mp3 player (both
>>  > > acting as common "usb flash disks"). Disabling USB2.0 seems to fix it
>>  > > AND also no board without USB2.0 controller exhibits this here (I just
>>  > > did a few quick tests, and nothing so far). Possibly some EHCI-specific
>>  > > bug?
>> 
>> I remember there was an msdosfs corruption problem reported
>> a few months ago.  It could be worked around by disabling
>> read/write clustering when mounting the file systems (see
>> the options in the mount(8) manpage).
>> 
>> I don't think this issue is related, but it might still be
>> worth a try.

I couldn't find any trace of this problem in either testing or reading the
code, and didn't receive any response to a request for more info, including
what effect disabling clustering has.

I don't use USB under FreeBSD... maybe it is just a USB bug.  There
was a bug in the afd driver that might have caused similar problems:
the afd driver simply couldn't handle the maximum block size that it
claimed to support -- its max block size was 32K, but it claimed to
support 64K.  The clustering code is one of the few things that uses
such large block sizes.  Another is initial pagein for exec --
VM_INITIAL_PAGEIN is 16 pages for all supported machines, but that
gives a block size of 128K on machines with 8K-pages, and the limit
used to be DFLTPHYS (64K) for most disk drivers in practice.  Now the
bugs have moved, so the limit is MAXPHYS (128K) for most disk drivers
in practice.
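
The size arithmetic in the paragraph above can be checked with a small
sketch.  This is only an illustration, not kernel code: the constants are
the values quoted in this message rather than values read from any header,
the 4K page size is an added assumption for comparison, and the helper
names are made up.

```python
# Sizes as quoted in the message above; not read from any kernel header.
KB = 1024
DFLTPHYS = 64 * KB          # old practical limit for most disk drivers
MAXPHYS = 128 * KB          # current practical limit for most disk drivers
VM_INITIAL_PAGEIN = 16      # pages requested by the initial pagein for exec
AFD_CLAIMED_MAX = 64 * KB   # what the afd driver claimed to support
AFD_ACTUAL_MAX = 32 * KB    # what the afd driver could really handle


def initial_pagein_bytes(page_size):
    """Size in bytes of one initial pagein request (hypothetical helper)."""
    return VM_INITIAL_PAGEIN * page_size


def fits(request_bytes, limit_bytes):
    """True if a single request of this size is within the given limit."""
    return request_bytes <= limit_bytes


if __name__ == "__main__":
    # 4K pages are an assumption added for comparison; the text names 8K.
    for page_size in (4 * KB, 8 * KB):
        req = initial_pagein_bytes(page_size)
        print("page size %dK: request %dK, fits DFLTPHYS=%s, fits MAXPHYS=%s"
              % (page_size // KB, req // KB,
                 fits(req, DFLTPHYS), fits(req, MAXPHYS)))
    # A request at the size afd claimed would exceed what it could handle.
    print("afd claimed size fits actual limit:",
          fits(AFD_CLAIMED_MAX, AFD_ACTUAL_MAX))
```

With 8K pages the request comes to 128K, which exceeds the 64K limit but
fits the 128K one.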

> Unfortunately the problem isn't solved when I disable clustering (-o 
> noclusterr,noclusterw), files are still getting corrupted.
> Btw, it isn't possible to disable clustering in RELENG_7 unless this patch is 
> applied:
> http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/fs/msdosfs/msdosfs_vfsops.c.diff?r1=1.176;r2=1.177

Oops, noclusterr and noclusterw didn't work (except with old versions of
mount(8)) when I said to use them, but I thought that they worked in RELENG_7.
RELENG_7 is also missing the more critical fixes in msdosfs.c 1.179 :-(.

> Let me know if I can do further testing, so far all I can say is that md5 
> checksums don't match before/after the copy process.

cp uses mmap(), and the nocluster[rw] flags don't affect mmap(), but the new
clustering code in msdosfs is used by mmap() too (mmap() uses essentially
the same clustering code as read()/write() but doesn't honor the
nocluster[rw] mount flags).  This at least makes the problem easier
to test for:

- try copying files larger than 8MB.  cp only uses mmap() for files smaller
   than this size.
- try copying files using a version of cp that doesn't have the dubious
   mmap() optimization.  This optimization is controlled by the bogus
   VM_AND_BUFFER_CACHE_SYNCHRONIZED option in cp/Makefile.
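
One way to run both tests without rebuilding cp is a small script that
copies with plain read()/write() and compares MD5 checksums.  The following
is a minimal sketch under stated assumptions: the function names and the
64K chunk size are made up for illustration, and the 8MB threshold is the
one quoted above.  It has not been run against an msdosfs mount.

```python
import hashlib
import os

CHUNK = 64 * 1024               # arbitrary I/O size chosen for this sketch
MMAP_LIMIT = 8 * 1024 * 1024    # cp's mmap() threshold as quoted above


def make_test_file(path, size):
    """Fill path with size bytes of random data."""
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(os.urandom(n))
            remaining -= n


def md5_of(path):
    """MD5 hex digest of a file, read with plain read() calls."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            h.update(block)
    return h.hexdigest()


def copy_plain(src, dst):
    """Copy src to dst using read()/write() only, never mmap()."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for block in iter(lambda: fin.read(CHUNK), b""):
            fout.write(block)
        fout.flush()
        os.fsync(fout.fileno())


def check_copy(src, dst):
    """Copy src to dst and report whether the checksums still match."""
    copy_plain(src, dst)
    return md5_of(src) == md5_of(dst)
```

For a real test, dst would be a path on the mounted msdosfs file system and
src a file just over MMAP_LIMIT bytes.  Reading the copy back immediately
may be satisfied from cache, so unmounting and remounting before comparing
checksums is probably the more reliable check.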

> umass0: <Sony Ericsson Sony Ericsson W810, class 0/0, rev 2.00/0.00, addr 3> 
> on uhub0
> umass1: <Sony Ericsson Sony Ericsson W810, class 0/0, rev 2.00/0.00, addr 3> 
> on uhub0
> da0 at umass-sim0 bus 0 target 0 lun 0
> da0: <SEMC Int.Memory 0000> Removable Direct Access SCSI-0 device
> da0: 1.000MB/s transfers
> da0: 26MB (54008 512 byte sectors: 64H 32S/T 26C)
> da1 at umass-sim1 bus 1 target 0 lun 0
> da1: <SEMC Mem-Stick 0000> Removable Direct Access SCSI-0 device
> da1: 1.000MB/s transfers
> da1: 3905MB (7999298 512 byte sectors: 255H 63S/T 497C)
> GEOM_LABEL: Label for provider da0s1 is msdosfs/PHONE.
> GEOM_LABEL: Label for provider da1s1 is msdosfs/PHONE CARD.
>
> [rainer_at_bsd ~]$ sudo usbdevs -v | grep Sony
> port 6 addr 3: full speed, power 500 mA, config 1, Sony Ericsson 
> W810(0xe042), Sony Ericsson(0x0fce), rev 0.00
>
> [rainer_at_bsd ~]$ sudo camcontrol devlist | grep SE
> <SEMC Int.Memory 0000>             at scbus4 target 0 lun 0 (da0,pass1)
> <SEMC Mem-Stick 0000>              at scbus5 target 0 lun 0 (da1,pass2)

The da driver has always advertised support for block sizes up to
DFLTPHYS (64K).  This is wrong if lower layers only support smaller
sizes and cam doesn't do any reblocking (I don't know if cam does any
reblocking but think it shouldn't.  Reblocking would be just a bug for
tape devices.  Reblocking now happens for disks in the geom layer, but
things still depend on lower levels advertising the correct max).  The
bug in the afd driver may even have given an example of a lower layer
which doesn't support DFLTPHYS, since afd can be da using atapicam (?).
I don't know anything about block size limits in USB.
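
To make "reblocking" concrete: it means splitting one large request into
pieces no bigger than the maximum the lower layer advertises.  The sketch
below is purely illustrative; split_request is a hypothetical helper, not
the geom or cam code, and it only works if the advertised maximum is
correct, which is the point made above.

```python
def split_request(offset, length, max_io):
    """Split one (offset, length) request into chunks of at most max_io bytes.

    Hypothetical helper for illustration; returns a list of (offset, length)
    pairs that together cover the original request exactly once.
    """
    if max_io <= 0:
        raise ValueError("max_io must be positive")
    chunks = []
    end = offset + length
    while offset < end:
        n = min(max_io, end - offset)
        chunks.append((offset, n))
        offset += n
    return chunks


if __name__ == "__main__":
    KB = 1024
    # A 128K request against a layer that advertises 64K is split in two.
    for chunk in split_request(0, 128 * KB, 64 * KB):
        print(chunk)
```

If the lower layer advertises 64K but can really only handle 32K, as afd
did, every chunk produced this way is still too large for it.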

Bruce