Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

From: Svein Skogen (Listmail account)
Date: Wed, 21 Jul 2010 20:40:47 +0200
On 21.07.2010 18:33, Ståle Kristoffersen wrote:
> On 2010-07-20 at 14:16, Svein Skogen (Listmail account) wrote:
>> Sorry for the late response here, but what you're describing matches
>> fairly well what I saw with RELENG_8 (just after 8.0 was released), but
>> luckily I didn't have any disks on my MPT, just my tape autoloader.
>>
>> Random timeouts, and then bus resets (that made tape IO unreliable).
>>
>> The bad news is that I had the exact same trouble with OpenSolaris
>> (134), and something similar with Linux (can't remember versions), at
>> the time.
>>
>> I never did find a solution, and ended up throwing Windows on the box,
>> just to get reliable backups.
>>
>> My MPT is a 3801 LSI1068e based card running the latest BIOS.
> 
> Hmm, that does not sound good. Did Windows work on the same hardware
> without problems?

Yup. But notice that I do _NOT_ have any disks on my MPT (I have an MFI
for that); it's just a mini-SAS<-->mini-SAS link into an HP 1/8G2 LTO3 Autoloader.

> I -might- have solved my problem. It has now run for 24h without timeouts,
> and with a bit of load on it. I think I might have run into the Seagate +
> NCQ problem, even though Seagate's webpage told me my drives were not affected
> (according to the serial numbers). I did however update the following
> num  drives        firmware
> 6x   ST31000340AS  SD15
> 4x   ST31500341AS  SD17

I have 8 of the last type (ST31500341AS), mine running CC1H firmware,
connected to my MFI. Not a single glitch so far.

> 
> to firmware SD1B (old SD17) and SD1A (old SD15), and that looks like it has
> done the trick. I'll report back in a week or so if the problem has not
> reappeared.

Hope it's fixed for you. I'm still keeping an eye on the MPT code to see
if someone changes something that could affect my timeout/reset
issues, and if I see something promising, I'm willing to dump the
entire server to tape and do a test run (I have sufficient spare tapes
to actually test without losing data), but such a job will take me a
week to prepare, and another to test. Quite a bit of time for something
that "may" solve my problem... ;)

//Svein

-- 
--------+-------------------+-------------------------------
  /"\   |Svein Skogen       | svein_at_d80.iso100.no
  \ /   |Solberg Østli 9    | PGP Key:  0xE5E76831
   X    |2020 Skedsmokorset | svein_at_jernhuset.no
  / \   |Norway             | PGP Key:  0xCE96CE13
        |                   | svein_at_stillbilde.net
 ascii  |                   | PGP Key:  0x58CD33B6
 ribbon |System Admin       | svein-listmail_at_stillbilde.net
Campaign|stillbilde.net     | PGP Key:  0x22D494A4
        +-------------------+-------------------------------
        |msn messenger:     | Mobile Phone: +47 907 03 575
        |svein_at_jernhuset.no | RIPE handle:    SS16503-RIPE
--------+-------------------+-------------------------------
         If you really are in a hurry, mail me at
               svein-mobile_at_stillbilde.net
 This mailbox goes directly to my cellphone and is checked
        even when I'm not in front of my computer.
------------------------------------------------------------
                     Picture Gallery:
          https://gallery.stillbilde.net/v/svein/
------------------------------------------------------------


Received on Wed Jul 21 2010 - 16:41:26 UTC
