Hang near end of kernel probes since r213267 (likely earlier)

From: David Wolfskill <david_at_catwhisker.org>
Date: Fri, 1 Oct 2010 14:20:38 -0700

I have recently acquired a new laptop (to replace the "Frankenlaptop"
I've been using for the last several years).

The new machine is a Dell Precision M4400, so it's pretty recent
technology compared to what I'm used to.  :-}

I installed FreeBSD 8.1-R on slice 1, customized it a bit to work in my
environment, then cloned slice 1 to slice 2, booted from slice 2,
populated /usr/src via "svn co" (pointing to stable/8), and upgraded
slice 2 to stable/8 as of r213245.

So far, so good.

I tinkered with it a bit more, building ports the way I want them, &c.;
the following day, I upgraded to r213267.

That went well, so I cloned slice 2 to slice 4, used "svn switch" to
flip /usr/src on slice 4 to head, booted from slice 4, and upgraded
slice 4 to 9.0-CURRENT as of r213267.

On the reboot following the install (the "smoke test"), I noticed that
the machine got most of the way through the kernel probes, then hung
(requiring a power cycle to break out of it).

It did this a few more times, then the next boot worked.

I thought this odd, but not necessarily demonstrating a problem with
FreeBSD: I hadn't had much experience with this particular hardware,
after all.

The following day (as is my usual pattern), I upgraded slice 2 to
stable/8 as of r213295 without incident.  After upgrading the installed
ports, I then booted from slice 4 (after several tries), then upgraded
slice 4 to head as of r213295.  Again, attempts to boot from slice 4
usually -- but not always -- would hang, always in the same place.

Now, I had had a somewhat-similar hang on my work desktop, which is
also a Dell machine.  And in that case -- though there were several
differences, some of which may well be relevant -- a BIOS upgrade
resolved that issue.

So I checked; the laptop had BIOS A19, and Dell had A23 available.

This morning, I upgraded slice 2 to stable/8 as of r213322, booted slice
4, and upgraded it to head as of r213322.  Again, it would hang more
often than not.

This afternoon, after receiving appropriate encouragement (that yes, I
probably could use Dell's Linux BIOS updater from a KNOPPIX
environment), I was able to successfully update the BIOS to A23.

Unfortunately, booting head (slice 4) still hangs -- usually.  I'm
unable to detect a pattern in why it sometimes boots OK, while most of
the time it hangs.

So when it hangs (today), it's running:

FreeBSD localhost 9.0-CURRENT FreeBSD 9.0-CURRENT #1 r213322: Fri Oct  1 10:18:30 PDT 2010     root_at_g1-222.catwhisker.org.:/usr/obj/usr/src/sys/CANARY  i386

And looking at the stable/8 /var/log/messages, when it boots under head,
it runs along:

...
Oct  1 13:37:41 localhost kernel: ugen6.1: <Intel> at usbus6
Oct  1 13:37:41 localhost kernel: uhub6: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus6
Oct  1 13:37:41 localhost kernel: ugen7.1: <Intel> at usbus7
Oct  1 13:37:41 localhost kernel: uhub7: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus7
Oct  1 13:37:41 localhost kernel: ad4: 238475MB <Seagate ST9250421ASG DE16> at ata2-master UDMA100 SATA 3Gb/s
Oct  1 13:37:41 localhost kernel: acd0: DVDR <TSSTcorp DVD+/-RW TS-U633A/D200> at ata3-master UDMA100 SATA 1.5Gb/s
Oct  1 13:37:41 localhost kernel: hdac0: HDA Codec #0: IDT 92HD71B7
Oct  1 13:37:41 localhost kernel: pcm0: <HDA IDT 92HD71B7 PCM #0 Analog> at cad 0 nid 1 on hdac0
Oct  1 13:37:41 localhost kernel: pcm1: <HDA IDT 92HD71B7 PCM #1 Analog> at cad 0 nid 1 on hdac0
Oct  1 13:37:41 localhost kernel: pcm2: <HDA IDT 92HD71B7 PCM #2 Digital> at cad 0 nid 1 on hdac0
Oct  1 13:37:41 localhost kernel: uhub0: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub1: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub2: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub4: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub5: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub6: 2 ports with 2 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub3: 6 ports with 6 removable, self powered
Oct  1 13:37:41 localhost kernel: uhub7: 6 ports with 6 removable, self powered
Oct  1 13:37:41 localhost kernel: acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00 
Oct  1 13:37:41 localhost kernel: (probe0:ata3:0:0:0): TEST UNIT READY. CDB: 0 0 0 0 0 0 
Oct  1 13:37:41 localhost kernel: (probe0:ata3:0:0:0): CAM status: SCSI Status Error
Oct  1 13:37:41 localhost kernel: (probe0:ata3:0:0:0): SCSI status: Check Condition
Oct  1 13:37:41 localhost kernel: (probe0:ata3:0:0:0): SCSI sense: NOT READY asc:3a,1 (Medium not present - tray closed)
Oct  1 13:37:41 localhost kernel: cd0 at ata3 bus 0 scbus1 target 0 lun 0
Oct  1 13:37:41 localhost kernel: cd0: <TSSTcorp DVD+-RW TS-U633A D200> Removable CD-ROM SCSI-0 device 
Oct  1 13:37:41 localhost kernel: cd0: 100.000MB/s transfersSMP: AP CPU #1 Launched!
Oct  1 13:37:41 localhost kernel: 
Oct  1 13:37:41 localhost kernel: cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
Oct  1 13:37:41 localhost kernel: Trying to mount root from ufs:/dev/ad4s2a
Oct  1 13:37:41 localhost kernel: ugen2.2: <Broadcom Corp> at usbus2


and stops right there when it hangs.

When it does not hang, the boot continues at that point with:

Oct  1 13:37:41 localhost kernel: WARNING: TMPFS is considered to be a highly experimental feature in FreeBSD.
Oct  1 13:37:41 localhost kernel: wlan0: Ethernet address: 00:21:6a:26:34:c0
Oct  1 13:37:41 localhost kernel: iwn0: radio is disabled by hardware switch
Oct  1 13:37:42 localhost kernel: em0: link state changed to DOWN
Oct  1 13:37:44 localhost kernel: em0: link state changed to UP
....
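
For anyone wanting to repeat the comparison above, here is a minimal
sketch in Python.  It is only an illustration, not something I ran as
part of this report: the helper name lines_after_marker is made up, and
the sample lines are copied from the excerpts above with the syslog
timestamp prefix trimmed.  Given the log of a boot that completed and
the last message seen before a hang, it lists what a good boot logs
next.

```python
def lines_after_marker(log_lines, marker):
    """Return the lines that follow the last line containing marker.

    Returns an empty list when marker does not appear at all.
    """
    last = None
    for i, line in enumerate(log_lines):
        if marker in line:
            last = i
    if last is None:
        return []
    return log_lines[last + 1:]


# Sample taken from a boot that completed (timestamps trimmed).
good_boot = [
    "kernel: Trying to mount root from ufs:/dev/ad4s2a",
    "kernel: ugen2.2: <Broadcom Corp> at usbus2",
    "kernel: WARNING: TMPFS is considered to be a highly experimental feature in FreeBSD.",
    "kernel: wlan0: Ethernet address: 00:21:6a:26:34:c0",
]

# Last message seen on a boot that hung.
hang_marker = "ugen2.2: <Broadcom Corp> at usbus2"

for line in lines_after_marker(good_boot, hang_marker):
    print(line)
```

The first line it prints is the earliest message that a hanging boot
never reaches, which narrows down where to start looking.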


Now, I never saw this behavior with the old laptop, and I had been
tracking stable/7, stable/8, and head on that on a daily basis ever
since a stable/8 existed.  (And I've been tracking head rather
longer than that.)

While I'm not about to assume that this indicates something wrong
with FreeBSD, I'm a bit less inclined to believe that it might be a
hardware/BIOS issue than I was yesterday.

Here are some differences between what I saw with my work desktop vs.
the new laptop:

* Desktop would reliably hang on each alternate boot.  No pattern
  detected for laptop, but hangs predominate (by a factor of about 4:1).

* Desktop would hang on alternate boots regardless of which branch of
  FreeBSD I was trying to boot.  Laptop only hangs on head.

* BIOS upgrade resolved issue with desktop.  So far, it hasn't with the
  laptop.

How might I get sufficient appropriate additional detail that I might be
able to help get this figured out, and possibly even fixed?

I've attached a copy of the stable/8 dmesg.boot.  I can get one from
head, but this is what I have at the moment.

I will, of course, be happy to test patches.

Thanks!

Peace,
david
-- 
David H. Wolfskill				david_at_catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.

Received on Fri Oct 01 2010 - 19:48:22 UTC