Re: A tool for remapping bad sectors in CURRENT?

From: Miroslav Lachman <000.fbsd_at_quip.cz>
Date: Sun, 14 Mar 2010 10:55:19 +0100
Dag-Erling Smørgrav wrote:
> Miroslav Lachman<000.fbsd_at_quip.cz>  writes:
>> So... can somebody with enough knowledge write some docs / script how
>> to find the affected file based on LBA read error from messages /
>> SMART log?
>
> ZFS will tell you straight away, but I guess if you used ZFS, you
> wouldn't be asking :)

Yes, but we have ZFS on only two servers; the others use UFS2 (some 
with gmirror, some with gjournal).

> For FFS, you can unmount the file system (boot from a CD or memory stick
> or whatever if that file system is / or /usr), run fsdb on the failing
> disk, use findblk to look up the inode number for the file that contains
> the bad sector.  Note that you have to convert the LBA to an offset
> relative to the start of the partition.

As I wrote in my first post to this thread, I already tried fsdb + 
findblk, but without success. Findblk did not return any inode. Maybe 
the block size it expects is different from what I assume, or there is 
something else I don't understand.

So can you please show me some real world example?


I have one from the past:

__________________
/var/log/messages:
Sep 23 23:58:00 edith kernel: ad4: FAILURE - READ_DMA 
status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=79725056
Sep 23 23:58:00 edith kernel: GEOM_MIRROR: Request failed (error=5). 
ad4[READ(offset=40819228672, length=131072)]

__________________
SMART log:
   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   40 51 00 6f 82 c0 44  Error: UNC at LBA = 0x04c0826f = 79725167


The LBA of the bad sector is *79725167*
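
To cross-check the two logs against each other, here is a minimal Python sketch. It assumes the usual 28-bit ATA register layout (SN, CL, CH and the low nibble of DH hold the LBA) and 512-byte sectors; the function name is only illustrative.

```python
SECTOR_SIZE = 512  # assumed logical sector size in bytes


def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA.

    Only the low 4 bits of DH carry LBA bits 27..24.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register values copied from the SMART log line above.
smart_lba = lba28_from_registers(sn=0x6F, cl=0x82, ch=0xC0, dh=0x44)

# Byte offset from the GEOM_MIRROR message, converted to a sector number.
geom_first_lba = 40819228672 // SECTOR_SIZE

print(smart_lba)                   # 79725167
print(geom_first_lba)              # 79725056
print(smart_lba - geom_first_lba)  # 111
```

Under these assumptions the kernel message reports the first sector of the failed 131072-byte request, while SMART reports a sector 111 sectors further into that same request.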

__________________
Information about disk slices:

sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
     start 63, size 209712447 (102398 Meg), flag 80 (active)
         beg: cyl 0/ head 1/ sector 1;
         end: cyl 1023/ head 254/ sector 63
The data for partition 2 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
     start 209712510, size 1743807555 (851468 Meg), flag 0
         beg: cyl 1023/ head 255/ sector 63;
         end: cyl 1023/ head 254/ sector 63

__________________
According to the LBA and the size of s1, I think the error is in s1

# /dev/mirror/gm0s1:
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
   a:  2097152        0    /
   b: 25165824  2097152    swap
   c: 209712447        0
   d: 12582912 27262976    /var
   e: 146800640 39845888   /var/db
   f: 16777216 186646528   /usr
   g:  6288703 203423744   /tmp


And LBA 79725056 is on */var/db* (between offset 39845888 and 186646528)

__________________
s1 starts 63 sectors from the beginning of the drive and /var/db has 
offset 39845888. So am I right that I need to look up block number 
*39879105* with the findblk command?

LBA err - s1 start - /var/db offset = findblk inside /dev/mirror/gm0s1e
79725056 - 63 - 39845888 = 39879105
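
The same arithmetic as a small Python sketch, using the slice start and the partition table listed above. All values are in 512-byte sectors, which is an assumption; whether findblk expects this unit is exactly the open question here, so the sketch only reproduces the subtraction. The names are illustrative.

```python
SLICE_START = 63  # start of s1, from the slice table

# name: (size, offset) in sectors, copied from the label of gm0s1
# ('c' is the whole slice and is left out on purpose)
PARTITIONS = {
    "a": (2097152, 0),
    "b": (25165824, 2097152),
    "d": (12582912, 27262976),
    "e": (146800640, 39845888),
    "f": (16777216, 186646528),
    "g": (6288703, 203423744),
}


def locate(lba):
    """Return (partition name, sector offset inside that partition)."""
    rel_slice = lba - SLICE_START
    for name, (size, offset) in PARTITIONS.items():
        if offset <= rel_slice < offset + size:
            return name, rel_slice - offset
    raise ValueError("LBA is outside every listed partition")


print(locate(79725056))  # ('e', 39879105)  LBA from /var/log/messages
print(locate(79725167))  # ('e', 39879216)  LBA from the SMART log
```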

__________________
/# fsdb -r /dev/mirror/gm0s1e
** /dev/mirror/gm0s1e (NO WRITE)
Examining file system '/dev/mirror/gm0s1e'
Last Mounted on /var/db
current inode: directory
I=2 MODE=40755 SIZE=512
         BTIME=May  1 08:07:23 2009 [0 nsec]
         MTIME=Sep 24 15:52:01 2009 [0 nsec]
         CTIME=Sep 24 15:52:01 2009 [0 nsec]
         ATIME=Sep 24 16:24:34 2009 [0 nsec]
OWNER=root GRP=wheel LINKCNT=11 FLAGS=0 BLKCNT=4 GEN=4ebc65fc

findblk 39879105
findblk 39879106
findblk 39879107
findblk 39879108
.
.

I tried more than 256 consecutive block numbers, but findblk didn't 
find any inode! (length=131072 in the error message means 256 sectors, 
right?)
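
The sector count and the resulting range of candidate numbers can be checked with a short Python sketch. The 512-byte sector size is an assumption, and the starting value 39879105 is the one computed earlier in this message.

```python
SECTOR_SIZE = 512        # assumed sector size in bytes
request_length = 131072  # length= from the GEOM_MIRROR message
first = 39879105         # partition-relative start computed above

count = request_length // SECTOR_SIZE
candidates = range(first, first + count)

print(count)                          # 256
print(candidates[0], candidates[-1])  # 39879105 39879360

# Print the first few commands as they would be typed at the fsdb prompt.
for sector in candidates[:4]:
    print(f"findblk {sector}")
```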


So there must be some misunderstanding on my part and that's why I am 
asking for some step-by-step documentation or script "how to find file 
by LBA read error message"


I tried the fsdb + findblk on well known data, but again without success.

I created the file /tmp/test.txt, which has inum 3, then I ran fsdb on 
gm0s1f (gm0s1f is mounted as /tmp). The command "inode 3" at the fsdb 
prompt returned information about this file and the command "blocks" 
returned 3001 as the block number, but the command "findblk 3001" 
returned nothing instead of inum 3!
Where is the error? What am I doing wrong?

__________________
~/# echo test > /tmp/test.txt

  ~/# ls -i /tmp/test.txt
3 /tmp/test.txt

~/# fsdb -r /dev/mirror/gm0s1f
** /dev/mirror/gm0s1f (NO WRITE)
Examining file system '/dev/mirror/gm0s1f'
Last Mounted on /tmp
current inode: directory
I=2 MODE=41777 SIZE=512
         BTIME=Feb  7 18:32:22 2008 [0 nsec]
         MTIME=Mar 14 10:33:22 2010 [0 nsec]
         CTIME=Mar 14 10:33:22 2010 [0 nsec]
         ATIME=Mar 14 10:33:35 2010 [0 nsec]
OWNER=root GRP=wheel LINKCNT=7 FLAGS=0 BLKCNT=4 GEN=3f7c9384

fsdb (inum: 2)> inode 3
current inode: regular file
I=3 MODE=100644 SIZE=5
         BTIME=Mar 14 10:33:22 2010 [0 nsec]
         MTIME=Mar 14 10:33:22 2010 [0 nsec]
         CTIME=Mar 14 10:33:22 2010 [0 nsec]
         ATIME=Mar 14 10:33:22 2010 [0 nsec]
OWNER=root GRP=wheel LINKCNT=1 FLAGS=0 BLKCNT=4 GEN=45c26de1

fsdb (inum: 3)> blocks
Blocks for inode 3:
Direct blocks:
3001 (1 frag)

fsdb (inum: 3)> findblk 3001
fsdb (inum: 3)>

                ^^^^^^^^ findblk did not return inode 3!
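
One possible explanation, not confirmed anywhere in this message, is a unit mismatch: "blocks" may print file system fragment numbers while findblk may expect 512-byte sector offsets from the start of the partition. A minimal Python sketch of that conversion follows. The fragment size of 2048 bytes is purely an assumption (the fsize column of the label above is empty), and the function name is illustrative.

```python
DEV_BSIZE = 512  # size of a disk sector in bytes


def frag_to_sector(frag_number, fsize=2048):
    """Convert a fragment number to a 512-byte sector offset.

    fsize=2048 is an assumed fragment size; the real value for this
    file system is not shown in this message.
    """
    return frag_number * fsize // DEV_BSIZE


print(frag_to_sector(3001))  # 12004
```

If the assumption holds, the number to try would be 12004 rather than 3001. With a different fragment size the result changes accordingly.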

> Unfortunately, you can't easily go from inode to file name; you have to
> mount the file system and use something like find -inum.

Yes, I know this.

Thanks in advance for helping me understand and use the fsdb + findblk 
commands.

Miroslav Lachman


PS: all of the above was tested on gmirror gm0, but I ran the same tests 
on the single drive ad4 with the same "empty" result (mentioned just in 
case fsdb can't be used on gmirror, but I don't think that is the case).
Received on Sun Mar 14 2010 - 08:55:25 UTC
