Re: head -r338804 boots threadripper 1950X fine; head -r338810+ do not; -r338807 seems implicated

From: Mark Millard <marklmi_at_yahoo.com>
Date: Mon, 22 Oct 2018 10:01:17 -0700
[I will note that the loader problem has been shown to
not be involved in the kernel problem that this
"Subject:" was originally for.]

On 2018-Oct-22, at 9:26 AM, Warner Losh <imp at bsdimp.com> wrote:

> On Mon, Oct 22, 2018 at 6:39 AM Mark Millard <marklmi_at_yahoo.com> wrote:
>> On 2018-Oct-22, at 4:07 AM, Toomas Soome <tsoome at me.com> wrote:
>> 
>> > On 22 Oct 2018, at 13:58, Mark Millard <marklmi at yahoo.com> wrote:
>> >> 
>> >> On 2018-Oct-22, at 2:27 AM, Toomas Soome <tsoome at me.com> wrote:
>> >>> 
>> >>>> On 22 Oct 2018, at 06:30, Warner Losh <imp_at_bsdimp.com> wrote:
>> >>>> 
>> >>>> On Sun, Oct 21, 2018 at 9:28 PM Warner Losh <imp_at_bsdimp.com> wrote:
>> >>>> 
>> >>>>> 
>> >>>>> 
>> >>>>> On Sun, Oct 21, 2018 at 8:57 PM Mark Millard via freebsd-stable <
>> >>>>> freebsd-stable_at_freebsd.org> wrote:
>> >>>>> 
>> >>>>>> [I built based on WITHOUT_ZFS= for other reasons. But,
>> >>>>>> after installing the build, Hyper-V based boots are
>> >>>>>> working.]
>> >>>>>> 
>> >>>>>> On 2018-Oct-20, at 2:09 AM, Mark Millard <marklmi at yahoo.com> wrote:
>> >>>>>> 
>> >>>>>>> On 2018-Oct-20, at 1:39 AM, Mark Millard <marklmi at yahoo.com> wrote:
>> >>>>>>> . . .
>> >>>> 
>> >>> 
>> >>> It would help to get the output from the loader's lsdev -v command.
>> >> 
>> >> That turned out to be very interesting: The non-ZFS loader
>> >> crashes during the listing, during disk8, which shows a
>> >> x0 instead of a x512.
>> >> 
>> > 
>> > Yes, that's the root cause there. The non-zfs loader only *reads* the boot disk; that's why the issue was not revealed there.
>> > 
>> > It would help to identify the sector size for that disk, at least from OS, so we can compare with what we can get from INT13.
>> > 
>> > I have a pretty good idea what to look for there, but I am afraid we need to run a few tests with you to understand why that disk is reporting sector size 0 there.
>> > 
>> > 
>> 
>> Looks like I guessed wrong about the device
>> for "drive8".
>> 
>> So I unplugged the only other external
>> storage device, so the original drives
>> 0-13 become 0-11 overall.
>> 
>> The machine has a multi-LUN media card reader with
>> no cards plugged in. It is built-in rather than
>> one that I plugged into a port. It has 4 LUNs.
>> 
>> So 8+4=12 and drives 0-7 show up with media before
>> it tries any of the 4 LUNs with no card in place.
>> 
>> I conclude that "drive8" is an empty LUN in a media
>> card reader.
>> 
>> I conclude that there is no sector size available for
>> any of the empty LUNs in the media reader.
>> 
> I think you are probably right and we're hitting some divide by 0 error when we should just ignore the disk.
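
To make the "just ignore the disk" idea concrete, here is a minimal
sketch in C. It is not the actual loader code: the struct, its fields,
and both function names are hypothetical, chosen only to illustrate
checking for a sector size of 0 before dividing by it, so that an empty
card-reader LUN is skipped rather than causing a divide-by-0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-drive data; not a real loader structure. */
struct disk_info {
	int      unit;        /* drive number as listed, e.g. 8 for "disk8" */
	uint32_t sector_size; /* bytes per sector; 0 when no media is present */
	uint64_t media_bytes; /* total capacity in bytes */
};

/*
 * Store the number of sectors in *count and return 0, or return -1
 * when the drive reports no usable sector size (assumed here to be
 * the empty-LUN case), so the division is never attempted.
 */
int
disk_sector_count(const struct disk_info *d, uint64_t *count)
{
	if (d == NULL || count == NULL || d->sector_size == 0)
		return (-1);
	*count = d->media_bytes / d->sector_size;
	return (0);
}

/* Walk a drive list and count only the drives that can be used. */
size_t
count_usable_disks(const struct disk_info *disks, size_t n)
{
	size_t usable = 0;
	uint64_t sectors;

	for (size_t i = 0; i < n; i++) {
		if (disk_sector_count(&disks[i], &sectors) == 0)
			usable++;
	}
	return (usable);
}
```

With a list where one entry has sector_size 0, that entry is skipped
and the remaining drives are still processed.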

In the Hyper-V context, the loader and kernel do not
see the 4-LUN media reader at all: only drives with
normal freebsd-* style partitions and free space.
This explains why I did not see a loader problem
in that context.

So I conclude that the kernel crash under Hyper-V
associated with -r338807 is a separate issue even
though WITHOUT_ZFS= seems to have avoided the
crash.

My plan is to continue with the -r338807 investigation
after the loader problem is fixed in my builds. Then
I'll go back to trying builds using WITH_ZFS= (implicit),
both native boots and Hyper-V based ones.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Received on Mon Oct 22 2018 - 15:01:22 UTC
