Re: ZFS data error without reasons

From: Mark Powell <M.S.Powell_at_salford.ac.uk>
Date: Wed, 25 Mar 2009 09:13:09 +0000 (GMT)

Kevin,
   Did you fix your ZFS CRC errors?
   I responded to your thread, but no-one got back to me.
   I'm gonna start another thread later.
   This time I re-created the zpool in 8 in a form compatible with 7. Once
the errors started showing up in 8 I moved back to 7, on the same hardware,
to perform the scrub to prove the problem is with 8. The 1st scrub in 7
found some errors, but of course it would if 8 had messed up the data. I
removed the few unimportant bad files (all were in snapshots).
   I'm just performing the 2nd scrub in 7 now. If this comes back with no 
errors, then we have stronger evidence that there is something wrong in 8, 
seemingly quite intermittent, that randomly writes bad data.
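   As a rough illustration of that comparison, here is a minimal Python
sketch (not something that was run for this thread) that pulls the scrub
error count and the list of damaged files out of saved "zpool status -v"
output, so that the results of two scrubs can be compared. It assumes the
2009-era output format quoted further down in this message; the names
parse_zpool_status and new_errors, and the SAMPLE text, are made up for the
example.

```python
import re

# Abbreviated copy of the second status output quoted in this thread.
SAMPLE = """\
> scrub: scrub completed after 0h10m with 2 errors on Mon Mar 16 21:01:12 2009
> config:
>
>       NAME        STATE     READ WRITE CKSUM
>       tank        ONLINE       0     0     2
>         ad4s1d    ONLINE       0     0     4
>
> errors: Permanent errors have been detected in the following files:
>
>       /usr/home/kevin/memtest86+-2.11.iso
>
> Should i just delete memtest86+-2.11.iso ?
"""


def parse_zpool_status(text):
    """Return (scrub_error_count, damaged_entries) for one status report.

    scrub_error_count is None when no "scrub completed" line is present
    (for example while a scrub is still in progress). Leading "> " mail
    quote markers are stripped. Only the first error list is read.
    """
    scrub_errors = None
    entries = []
    in_list = False
    for raw in text.splitlines():
        # Drop any mail quoting, then surrounding whitespace.
        line = re.sub(r"^(>\s?)+", "", raw).strip()
        m = re.match(r"scrub: scrub completed after \S+ with (\d+) errors", line)
        if m:
            scrub_errors = int(m.group(1))
        if line.startswith("errors: Permanent errors"):
            in_list = True
            continue
        if in_list:
            if line:
                entries.append(line)   # one damaged file or object per line
            elif entries:
                break                  # blank line after the entries ends the list
    return scrub_errors, entries


def new_errors(before, after):
    """Entries reported by the later scrub that the earlier one did not list."""
    return sorted(set(after) - set(before))


if __name__ == "__main__":
    count, damaged = parse_zpool_status(SAMPLE)
    print(count, damaged)
```

   If new_errors() comes back empty for consecutive scrubs in 7, that is
consistent with, though not proof of, the bad writes happening only in 8.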
   Cheers.

On Mon, 16 Mar 2009, kevin wrote:
> Daniel Eriksson wrote:
>> kevin wrote:
>>
>> 
>>> Hi,
>>> Will any changes cause ZFS data errors? I find data errors on my disk
>>> for no apparent reason (shutdowns and reboots were normal). The disk was
>>> bought yesterday. Sometimes the errors can be fixed with a zpool scrub,
>>> but mostly zpool scrub returns more errors. Even after I restore the
>>> whole zpool from my backup, within 5 minutes zpool status shows data
>>> errors and many checksum errors.
>>> 
>> 
>> Is the drive connected to an "nVidia nForce MCP55 SATA300 controller"? I
>> have two machines with on-board MCP55 controllers. One of them works
>> perfectly, the other causes silent data corruption (each time I run a
>> zpool scrub it finds new checksum errors).
>> 
>> If you also have an MCP55 controller then maybe this is related.
>>
>> 
> My laptop is a T61. The RAM was also tested with memtest86+ and returned
> no errors.
> "zfs send tank/usr/home/kevin@2009-03-15-16:51:21|zfs receive backup/kevin" 
> hangs the system and I have to power off the machine. When the system is
> back up, I find a file error in snapshot
> tank/usr/home/kevin@2009-03-15-16:51:21. When I destroy
> tank/usr/home/kevin@2009-03-15-16:51:21 and then reboot the system, I find
> more errors.
>
> #zpool status -v
> pool: tank
> state: ONLINE
> status: One or more devices has experienced an error resulting in data
>       corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>       entire pool from backup.
>  see: http://www.sun.com/msg/ZFS-8000-8A
> scrub: scrub in progress for 0h10m, 96.10% done, 0h0m to go
> config:
>
>       NAME        STATE     READ WRITE CKSUM
>       tank        ONLINE       0     0     2
>         ad4s1d    ONLINE       0     0     4
>
> errors: Permanent errors have been detected in the following files:
>
>       /usr/bin/less
>       /usr/lib/libstdc++.so.6
>       /usr/bin/tbl
>       /usr/share/misc/termcap.db
>       /usr/bin/ssh-agent
>       /usr/local/bin/sudo
>       /usr/local/lib/libX11.so.6
>       /usr/home/kevin/memtest86+-2.11.iso
>
> When the zpool scrub ends:
> #zpool status -v
> pool: tank
> state: ONLINE
> status: One or more devices has experienced an error resulting in data
>       corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>       entire pool from backup.
>  see: http://www.sun.com/msg/ZFS-8000-8A
> scrub: scrub completed after 0h10m with 2 errors on Mon Mar 16 21:01:12 2009
> config:
>
>       NAME        STATE     READ WRITE CKSUM
>       tank        ONLINE       0     0     2
>         ad4s1d    ONLINE       0     0     4
>
> errors: Permanent errors have been detected in the following files:
>
>       /usr/home/kevin/memtest86+-2.11.iso
>
> Should I just delete memtest86+-2.11.iso?
>
> dmesg:
> Copyright (c) 1992-2009 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>       The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.0-CURRENT #0: Sun Mar 15 21:11:36 CST 2009
>   root@datastream-laptop.people.163.org:/usr/obj/usr/src/sys/G8laptop
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Core(TM)2 Duo CPU     T7700  @ 2.40GHz (2394.02-MHz K8-class 
> CPU)
> Origin = "GenuineIntel"  Id = 0x6fb  Stepping = 11
> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
> AMD Features=0x20100800<SYSCALL,NX,LM>
> AMD Features2=0x1<LAHF>
> TSC: P-state invariant
> Cores per package: 2
> usable memory = 4210061312 (4015 MB)
> avail memory  = 4039487488 (3852 MB)
> ACPI APIC Table: <LENOVO TP-7L   >
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> cpu0 (BSP): APIC ID:  0
> cpu1 (AP): APIC ID:  1
> This module (opensolaris) contains code covered by the
> Common Development and Distribution License (CDDL)
> see http://opensolaris.org/os/licensing/opensolaris_license/
> ACPI Warning (tbfadt-0505): Optional field "Gpe1Block" has zero address or 
> length:        0    102C/0 [20070320]
> ioapic0: Changing APIC ID to 1
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> kbd1 at kbdmux0
> acpi0: <LENOVO TP-7L> on motherboard
> acpi0: [ITHREAD]
> acpi_ec0: <Embedded Controller: GPE 0x12, ECDT> port 0x62,0x66 on acpi0
> acpi0: Power Button (fixed)
> acpi0: reservation of 0, a0000 (3) failed
> acpi0: reservation of 100000, bff00000 (3) failed
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
> acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
> Timecounter "HPET" frequency 14318180 Hz quality 900
> acpi_lid0: <Control Method Lid Switch> on acpi0
> acpi_button0: <Sleep Button> on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> vgapci0: <VGA-compatible display> port 0x2000-0x207f mem 
> 0xd2000000-0xd2ffffff,0xe0000000-0xefffffff,0xd0000000-0xd1ffffff irq 16 at 
> device 0.0 on pci1
> em0: <Intel(R) PRO/1000 Network Connection 6.9.6> port 0x1840-0x185f mem 
> 0xfe200000-0xfe21ffff,0xfe225000-0xfe225fff irq 20 at device 25.0 on pci0
> em0: Using MSI interrupt
> em0: [FILTER]
> em0: Ethernet address: 00:1c:25:1c:fb:d0
> uhci0: <Intel 82801H (ICH8) USB controller USB-D> port 0x1860-0x187f irq 20 
> at device 26.0 on pci0
> uhci0: [ITHREAD]
> uhci0: LegSup = 0x0000
> usbus0: <Intel 82801H (ICH8) USB controller USB-D> on uhci0
> uhci1: <Intel 82801H (ICH8) USB controller USB-E> port 0x1880-0x189f irq 21 
> at device 26.1 on pci0
> uhci1: [ITHREAD]
> uhci1: LegSup = 0x0000
> usbus1: <Intel 82801H (ICH8) USB controller USB-E> on uhci1
> ehci0: <Intel 82801H (ICH8) USB 2.0 controller USB2-B> mem 
> 0xfe226c00-0xfe226fff irq 22 at device 26.7 on pci0
> ehci0: [ITHREAD]
> usbus2: EHCI version 1.0
> usbus2: <Intel 82801H (ICH8) USB 2.0 controller USB2-B> on ehci0
> hdac0: <Intel 82801H High Definition Audio Controller> mem 
> 0xfe220000-0xfe223fff irq 17 at device 27.0 on pci0
> hdac0: HDA Driver Revision: 20090226_0129
> hdac0: [ITHREAD]
> pcib2: <ACPI PCI-PCI bridge> irq 20 at device 28.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> pci2: <memory> at device 0.0 (no driver attached)
> pcib3: <ACPI PCI-PCI bridge> irq 21 at device 28.1 on pci0
> pci3: <ACPI PCI bus> on pcib3
> iwn0: <Intel(R) PRO/Wireless 4965BGN> mem 0xd7dfe000-0xd7dfffff irq 17 at 
> device 0.0 on pci3
> iwn0: Reg Domain: MoW1, address 00:1d:e0:48:13:2f
> iwn0: [ITHREAD]
> iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
> iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
> iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 
> 36Mbps 48Mbps 54Mbps
> iwn0: 11na MCS: 15Mbps 30Mbps 45Mbps 60Mbps 90Mbps 120Mbps 135Mbps 150Mbps 
> 30Mbps 60Mbps 90Mbps 120Mbps 180Mbps 240Mbps 270Mbps 300Mbps
> iwn0: 11ng MCS: 15Mbps 30Mbps 45Mbps 60Mbps 90Mbps 120Mbps 135Mbps 150Mbps 
> 30Mbps 60Mbps 90Mbps 120Mbps 180Mbps 240Mbps 270Mbps 300Mbps
> pcib4: <ACPI PCI-PCI bridge> irq 22 at device 28.2 on pci0
> pci4: <ACPI PCI bus> on pcib4
> pcib5: <ACPI PCI-PCI bridge> irq 23 at device 28.3 on pci0
> pci5: <ACPI PCI bus> on pcib5
> pcib6: <ACPI PCI-PCI bridge> irq 20 at device 28.4 on pci0
> pci13: <ACPI PCI bus> on pcib6
> uhci2: <Intel 82801H (ICH8) USB controller USB-A> port 0x18a0-0x18bf irq 16 
> at device 29.0 on pci0
> uhci2: [ITHREAD]
> uhci2: LegSup = 0x0000
> usbus3: <Intel 82801H (ICH8) USB controller USB-A> on uhci2
> uhci3: <Intel 82801H (ICH8) USB controller USB-B> port 0x18c0-0x18df irq 17 
> at device 29.1 on pci0
> uhci3: [ITHREAD]
> uhci3: LegSup = 0x0000
> usbus4: <Intel 82801H (ICH8) USB controller USB-B> on uhci3
> uhci4: <Intel 82801H (ICH8) USB controller USB-C> port 0x18e0-0x18ff irq 18 
> at device 29.2 on pci0
> uhci4: [ITHREAD]
> uhci4: LegSup = 0x0000
> usbus5: <Intel 82801H (ICH8) USB controller USB-C> on uhci4
> ehci1: <Intel 82801H (ICH8) USB 2.0 controller USB2-A> mem 
> 0xfe227000-0xfe2273ff irq 19 at device 29.7 on pci0
> ehci1: [ITHREAD]
> usbus6: EHCI version 1.0
> usbus6: <Intel 82801H (ICH8) USB 2.0 controller USB2-A> on ehci1
> pcib7: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> pci21: <ACPI PCI bus> on pcib7
> cbb0: <RF5C476 PCI-CardBus Bridge> mem 0xf8100000-0xf8100fff irq 16 at device 
> 0.0 on pci21
> cardbus0: <CardBus bus> on cbb0
> pccard0: <16-bit PCCard bus> on cbb0
> cbb0: [FILTER]
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> isa0: <ISA bus> on isab0
> atapci0: <Intel ICH8M UDMA100 controller> port 
> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1830-0x183f at device 31.1 on pci0
> ata0: <ATA channel 0> on atapci0
> ata0: [ITHREAD]
> atapci1: <Intel AHCI controller> port 
> 0x1c48-0x1c4f,0x1c1c-0x1c1f,0x1c40-0x1c47,0x1c18-0x1c1b,0x1c20-0x1c3f mem 
> 0xfe226000-0xfe2267ff irq 16 at device 31.2 on pci0
> atapci1: [ITHREAD]
> atapci1: AHCI Version 01.10 controller with 3 ports PM not supported
> ata2: <ATA channel 0> on atapci1
> ata2: [ITHREAD]
> ata3: <ATA channel 2> on atapci1
> ata3: [ITHREAD]
> pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
> acpi_tz0: <Thermal Zone> on acpi0
> acpi_tz1: <Thermal Zone> on acpi0
> atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> atkbd0: [ITHREAD]
> psm0: <PS/2 Mouse> irq 12 on atkbdc0
> psm0: [GIANT-LOCKED]
> psm0: [ITHREAD]
> psm0: model Synaptics Touchpad, device ID 0
> battery0: <ACPI Control Method Battery> on acpi0
> acpi_acad0: <AC Adapter> on acpi0
> acpi_ibm0: <IBM ThinkPad ACPI Extras> on acpi0
> cpu0: <ACPI CPU> on acpi0
> coretemp0: <CPU On-Die Thermal Sensors> on cpu0
> est0: <Enhanced SpeedStep Frequency Control> on cpu0
> p4tcc0: <CPU Frequency Thermal Control> on cpu0
> cpu1: <ACPI CPU> on acpi0
> coretemp1: <CPU On-Die Thermal Sensors> on cpu1
> est1: <Enhanced SpeedStep Frequency Control> on cpu1
> p4tcc1: <CPU Frequency Thermal Control> on cpu1
> orm0: <ISA Option ROMs> at iomem 
> 0xc0000-0xcefff,0xcf000-0xcffff,0xd0000-0xd0fff,0xe0000-0xeffff on isa0
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> WARNING: ZFS is considered to be an experimental feature in FreeBSD.
> Timecounters tick every 1.000 msec
> usbus0: 12Mbps Full Speed USB v1.0
> usbus1: 12Mbps Full Speed USB v1.0
> usbus2: 480Mbps High Speed USB v2.0
> usbus3: 12Mbps Full Speed USB v1.0
> usbus4: 12Mbps Full Speed USB v1.0
> usbus5: 12Mbps Full Speed USB v1.0
> usbus6: 480Mbps High Speed USB v2.0
> ZFS filesystem version 13
> ZFS storage pool version 13
> ugen0.1: <Intel> at usbus0
> uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
> ugen1.1: <Intel> at usbus1
> uhub1: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus1
> ugen2.1: <Intel> at usbus2
> uhub2: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus2
> ugen3.1: <Intel> at usbus3
> uhub3: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus3
> ugen4.1: <Intel> at usbus4
> uhub4: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus4
> ugen5.1: <Intel> at usbus5
> uhub5: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus5
> ugen6.1: <Intel> at usbus6
> uhub6: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus6
> acd0: DVDR <Optiarc DVD RW AD-7910A/1.D1> at ata0-master UDMA33
> ad4: 305245MB <Seagate ST9320320AS SD03> at ata2-master SATA150
> hdac0: HDA Codec #0: Analog Devices AD1984
> hdac0: HDA Codec #1: Conexant (Unknown)
> pcm0: <HDA Analog Devices AD1984 PCM #0 Analog> at cad 0 nid 1 on hdac0
> pcm1: <HDA Analog Devices AD1984 PCM #1 Digital> at cad 0 nid 1 on hdac0
> SMP: AP CPU #1 Launched!
> uhub0: 2 ports with 2 removable, self powered
> uhub1: 2 ports with 2 removable, self powered
> uhub3: 2 ports with 2 removable, self powered
> uhub4: 2 ports with 2 removable, self powered
> uhub5: 2 ports with 2 removable, self powered
> GEOM: ad4s1: geometry does not match label (255h,63s != 16h,63s).
> Root mount waiting for: usbus6 usbus2
> Root mount waiting for: usbus6 usbus2
> uhub2: 4 ports with 4 removable, self powered
> uhub6: 6 ports with 6 removable, self powered
> Root mount waiting for: usbus2
> Trying to mount root from ufs:/dev/ad4s1a
> ugen0.2: <Broadcom Corp> at usbus0
> ubt0: <Broadcom Corp BCM2045B, class 224/1, rev 2.00/1.00, addr 2> on usbus0
> ugen0.3: <STMicroelectronics> at usbus0
> wlan0: Ethernet address: 00:1d:e0:48:13:2f
> WARNING: attempt to net_add_domain(bluetooth) after domainfinalize()
> WARNING: attempt to net_add_domain(netgraph) after domainfinalize()
> wlan0: link state changed to UP
> iwn0: need multicast update callback
> iwn0: need multicast update callback
> iwn0: need multicast update callback
>
> Thanks,
> kevin
>
>
> _______________________________________________
> freebsd-current_at_freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
>
>

-- 
Mark Powell - UNIX System Administrator - The University of Salford
Information & Learning Services, Clifford Whitworth Building,
Salford University, Manchester, M5 4WT, UK.
Tel: +44 161 295 6843  Fax: +44 161 295 5888  www.pgp.com for PGP key
Received on Wed Mar 25 2009 - 08:13:13 UTC