Re: boot errors since upgrading to 12-current

From: tech-lists <tech-lists_at_zyxst.net>
Date: Wed, 15 Aug 2018 04:06:15 +0100

On 14/08/2018 21:16, Toomas Soome wrote:
> 
> 
>> On 14 Aug 2018, at 22:37, tech-lists <tech-lists_at_zyxst.net> wrote:
>> 
>> Hello,
>> 
>> context: amd64, FreeBSD 12.0-ALPHA1 #0 r337682, ZFS. The system is
>> *not* root-on-zfs. It boots to an SSD. The three disks indicated
>> below are spinning rust.
>> 
>> NAME        STATE     READ WRITE CKSUM
>> storage     ONLINE       0     0     0
>>   raidz1-0  ONLINE       0     0     0
>>     ada1    ONLINE       0     0     0
>>     ada2    ONLINE       0     0     0
>>     ada3    ONLINE       0     0     0
>> 
>> This machine was running 11.2 up until about a month ago.
>> 
>> Recently I've seen this flash up on the screen before getting to
>> the beastie screen:
>> 
>> BIOS drive C: is disk0
>> BIOS drive D: is disk1
>> BIOS drive E: is disk2
>> BIOS drive F: is disk3
>> BIOS drive G: is disk4
>> BIOS drive H: is disk5
>> BIOS drive I: is disk6
>> BIOS drive J: is disk7
>> 
>> [the above is normal and has always been seen on every boot]
>> 
>> read 1 from 0 to 0xcbdb1330, error: 0x31
>> read 1 from 0 to 0xcbdb1330, error: 0x31
>> read 1 from 0 to 0xcbdb1330, error: 0x31
>> read 1 from 0 to 0xcbdb1330, error: 0x31
>> read 1 from 0 to 0xcbdb1330, error: 0x31
>> read 1 from 0 to 0xcbdb1330, error: 0x31
>> read 1 from 0 to 0xcbdb1330, error: 0x31
>> read 1 from 0 to 0xcbdb1330, error: 0x31
>> 
>> the above has been happening since upgrading to -current a month
>> ago
>> 
>> ZFS: i/o error - all block copies unavailable
>> ZFS: can't read MOS of pool storage
>> 
>> the above is alarming and has been happening for the past couple of
>> days, since upgrading to r337682 on the 12th August.
>> 
>> The beastie screen then loads and it boots normally.
>> 
>> Should I be concerned? Is the output indicative of a problem?
>> 
> 
> Not immediately and yes. In the BIOS loader, we do all disk I/O with
> INT13, and error 0x31 often hints at missing media or some other
> controller-related error. Could you paste the output of lsdev -v from
> the loader?
> 
> The drive list appears as a result of probing the disks in
> biosdisk.c. The read errors come from an attempt to read 1 sector
> from sector 0 (that is, to read the partition table from the disk).
> It would be interesting to know why this ends in an error;
> unfortunately, the error does not tell us which disk was probed.
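
For illustration only, here is a minimal Python sketch of what "read 1 sector from sector 0 and look for a partition table" amounts to for a classic MBR disk. It is not the loader's code (that is C, in biosdisk.c); the function name parse_mbr and the sample values are made up for the example. It only relies on the standard MBR layout: four 16-byte entries starting at byte 446 and the 0x55 0xAA signature in the last two bytes of the sector.

```python
import struct

SECTOR_SIZE = 512
PART_TABLE_OFFSET = 446   # first of four 16-byte MBR partition entries
ENTRY_FORMAT = "<B3sB3sII"  # flag, CHS start, type, CHS end, LBA start, sector count


def parse_mbr(sector):
    """Return [(type, lba_start, sector_count), ...] or None if no valid MBR.

    'sector' is the 512 bytes read from LBA 0. A failed read (such as the
    0x31 errors in this thread) never gets this far, so nothing can be
    learned about the disk's partitions.
    """
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        return None
    entries = []
    for i in range(4):
        _flag, _chs0, ptype, _chs1, lba, count = struct.unpack_from(
            ENTRY_FORMAT, sector, PART_TABLE_OFFSET + 16 * i
        )
        if ptype != 0:  # type 0 marks an unused entry
            entries.append((ptype, lba, count))
    return entries


def make_sample_sector():
    """Build a synthetic sector 0 with one FreeBSD (type 0xA5) slice."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into(ENTRY_FORMAT, sector, PART_TABLE_OFFSET,
                     0x80, b"\0\0\0", 0xA5, b"\0\0\0", 63, 1000)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


if __name__ == "__main__":
    print(parse_mbr(make_sample_sector()))
```

An empty card-reader slot has no sector 0 to return, which fits the read failing before any partition parsing is attempted.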

Hi Toomas, thanks for looking at this.

lsdev -v looks like this:

OK lsdev -v
disk devices:
	disk0: BIOS drive C (16514064 X 512):
	disk0s1: FreeBSD          111GB
	disk0s1a: FreeBSD UFS     108GB
	disk0s1b: FreeBSD swap    3881MB

	disk1: BIOS drive D (16514064 X 512):
	disk2: BIOS drive E (16514064 X 512):
	disk3: BIOS drive F (16514064 X 512):
	disk4: BIOS drive G (2880 X 512):
read 1 from 0 to 0xcbde0a20, error 0x31
	disk5: BIOS drive D (2880 X 512):
read 1 from 0 to 0xcbde0a20, error 0x31
	disk6: BIOS drive D (2880 X 512):
read 1 from 0 to 0xcbde0a20, error 0x31
	disk7: BIOS drive D (2880 X 512):
read 1 from 0 to 0xcbde0a20, error 0x31
OK

disk4 to disk7 correspond to da0 to da3, which are SD/MMC devices with
no media inserted. I noticed this because 11-stable never showed the
"read 1 from 0 to $random_value" lines. The system runs 12-current now.
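
To make the pairing of error lines and disks explicit, here is a small Python sketch that parses lsdev -v text like the output above. The names parse_lsdev and SAMPLE are made up for the example, and it assumes each "read ... error" line belongs to the disk printed immediately before it, which is how the output above appears to be ordered.

```python
import re

# Matches whole-disk lines such as "disk4: BIOS drive G (2880 X 512):"
# but not slice/partition lines such as "disk0s1: FreeBSD 111GB".
DISK_RE = re.compile(r"^\s*(disk\d+):\s+BIOS drive (\w) \((\d+) X (\d+)\):")
# Matches "read 1 from 0 to 0xcbde0a20, error 0x31" (colon after "error" optional).
ERR_RE = re.compile(r"^read (\d+) from (\d+) to (0x[0-9a-f]+), error:? (0x[0-9a-f]+)")


def parse_lsdev(text):
    """Return one dict per whole disk, in the order lsdev listed them."""
    disks = []
    for line in text.splitlines():
        m = DISK_RE.match(line)
        if m:
            name, letter, sectors, sector_size = m.groups()
            disks.append({
                "name": name,
                "bios_drive": letter,
                "bytes": int(sectors) * int(sector_size),
                "errors": [],
            })
            continue
        m = ERR_RE.match(line)
        if m and disks:
            # Assumption: the error belongs to the most recently listed disk.
            disks[-1]["errors"].append(m.group(4))
    return disks


SAMPLE = """\
disk devices:
\tdisk0: BIOS drive C (16514064 X 512):
\tdisk0s1: FreeBSD          111GB
\tdisk3: BIOS drive F (16514064 X 512):
\tdisk4: BIOS drive G (2880 X 512):
read 1 from 0 to 0xcbde0a20, error 0x31
"""

if __name__ == "__main__":
    for d in parse_lsdev(SAMPLE):
        status = ",".join(d["errors"]) or "ok"
        print(f'{d["name"]}: {d["bytes"]} bytes, {status}')
```

Under that assumption, only the 2880-sector devices (the empty card slots) carry the 0x31 error, and the disks backing the pool do not.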

disk1 to disk3 are the hard drives making up the ZFS pool. These are
4TB Western Digital SATA-3 drives (WDC WD4001FAEX).

> Since you are getting errors from the data pool ‘storage’, it does not
> affect the boot. As for why the pool ‘storage’ is unreadable, it
> likely has to do with the errors above, but I cannot tell for sure
> based on the data presented here.

Thing is, the data pool works fine once boot completes, i.e. it loads
read/write and behaves normally.

thanks,
-- 
J.
Received on Wed Aug 15 2018 - 01:06:23 UTC
