Re: Difference between 6.2 and 7.0 Adaptec 39320D - 7.0 performing less

From: Scott Long <scottl_at_samsco.org>
Date: Thu, 19 Apr 2007 08:24:24 -0600

Gelsema, P (Patrick) wrote:
> On Wed, April 18, 2007 22:51, Scott Long wrote:
>> Gelsema, P (Patrick) wrote:
>>> On Tuesday 17 April 2007 18:24, Scott Long wrote:
>>>> Gelsema, P (Patrick) - FreeBSD wrote:
>>>>> On Tue, April 17, 2007 16:45, Scott Long wrote:
>>>>>> Gelsema, P (Patrick) - FreeBSD wrote:
>>>>> <SNAP></SNAP>
>>>>>
>>>>>> The 39320D is a finicky card.  I don't recall putting in the code
>>>>>> that would downshift the speed like this, but it wouldn't surprise
>>>>>> me if it is a side effect of the system going slower.  Anyways, it
>>>>>> sounds like you're a good candidate/victim for the MPSAFE locking
>>>>>> changes that I just made to the SCSI layer and the ahc/ahd drivers.
>>>>>> Would you mind testing it out (just update to the latest 7-CURRENT
>>>>>> sources) and let me know how it works for you?
>>> <SNAP></SNAP>
>>>
>>>>> Is building world/kernel sufficient as a test, or do you want me to
>>>>> do more tests?
>>>> Any amount of testing that you can do is appreciated.  Even verifying
>>>> that it boots is helpful =-)
>>> Cvsupped this evening at about 6.15 UTC time (20:15 CET zone)
>>> FreeBSD hulk.superhero.nl 7.0-CURRENT FreeBSD 7.0-CURRENT #0: Wed Apr 18 21:56:58 CEST 2007
>>>     root@hulk.superhero.nl:/usr/obj/usr/src/sys/GENERIC  amd64
>>>
>>> After buildworld and the whole lot the computer boots fine, however
>>> the disk is still detected as only 160.00MB/s.
>>>
>>> I do get the following crash. It seems to be related to pressing
>>> scroll lock on the console and hitting the page up/down buttons. When
>>> I just log on locally or remotely it seems to be ok. When I hit the
>>> scroll lock key before or after logging on I get the below crash.
>>>
>>> Apr 18 22:08:22 hulk kernel: lock order reversal: (Giant after non-sleepable)
>>> Apr 18 22:08:22 hulk kernel: 1st 0xffffff007b413358 ahd_lock (ahd_lock) @ /usr/src/sys/cam/cam_periph.c:559
>>> Apr 18 22:08:22 hulk kernel: 2nd 0xffffffff80977f20 Giant (Giant) @ /usr/src/sys/vm/vm_contig.c:590
>>> Apr 18 22:08:22 hulk kernel: KDB: stack backtrace:
>>> Apr 18 22:08:22 hulk kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x3a
>>> Apr 18 22:08:22 hulk kernel: witness_checkorder() at witness_checkorder+0x4f9
>>> Apr 18 22:08:22 hulk kernel: _mtx_lock_flags() at _mtx_lock_flags+0x75
>>> Apr 18 22:08:22 hulk kernel: contigmalloc() at contigmalloc+0x63
>>> Apr 18 22:08:22 hulk kernel: bus_dmamem_alloc() at bus_dmamem_alloc+0x8d
>>> Apr 18 22:08:22 hulk kernel: ahd_alloc_scbs() at ahd_alloc_scbs+0x32a
>>> Apr 18 22:08:22 hulk kernel: ahd_get_scb() at ahd_get_scb+0x69
>>> Apr 18 22:08:22 hulk kernel: ahd_action() at ahd_action+0x47c
>>> Apr 18 22:08:22 hulk kernel: xpt_run_dev_sendq() at xpt_run_dev_sendq+0x1ae
>>> Apr 18 22:08:22 hulk kernel: xpt_action() at xpt_action+0x4d3
>>> Apr 18 22:08:22 hulk kernel: dastart() at dastart+0x211
>>> Apr 18 22:08:22 hulk kernel: xpt_run_dev_allocq() at xpt_run_dev_allocq+0xf4
>>> Apr 18 22:08:22 hulk kernel: dastrategy() at dastrategy+0x78
>>> Apr 18 22:08:22 hulk kernel: g_disk_start() at g_disk_start+0xe6
>>> Apr 18 22:08:22 hulk kernel: g_io_schedule_down() at g_io_schedule_down+0x189
>>> Apr 18 22:08:22 hulk kernel: g_down_procbody() at g_down_procbody+0x7a
>>> Apr 18 22:08:22 hulk kernel: fork_exit() at fork_exit+0xaa
>>> Apr 18 22:08:22 hulk kernel: fork_trampoline() at fork_trampoline+0xe
>>> Apr 18 22:08:22 hulk kernel: --- trap 0, rip = 0, rsp = 0xffffffffac102d30, rbp = 0 ---
>>>
>>> Is this information sufficient? If not please let me know what more is
>>> required.
>>>
>>> Rgds,
>>>
>>> Patrick
>>>
>> Thanks for the info.  Fixing this problem is going to be a royal pain.
>> You can probably get around it by disabling WITNESS and INVARIANTS.
>>
>> Scott
> 
> The computer seems to remain working even with the crash. Disabling
> WITNESS and INVARIANTS only disables the checking but does not fix the
> actual problem, is that correct?
> 
> If you want, I can give you full SSH access to the box to make working
> on a fix for this problem easier. I am not using this box for anything
> other than toying with it to get a better understanding. Just let me know.
> HTH.
> 

Thanks for the offer.  I have tons of hardware, I just didn't think to 
check the adaptec drivers on amd64 specifically.  On i386 they don't 
trigger the warning (though they do still have the same problem) so I
didn't notice it.
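
To make the distinction between the warning and the underlying ordering
problem concrete, here is a toy model of the one rule named in the report
(Giant taken while a non-sleepable lock is held). It is a sketch for
illustration only, not how the kernel's checker is implemented: ToyWitness,
io_path and the enabled flag are hypothetical names, and the lock names are
simply copied from the trace above.

```python
class ToyWitness:
    """Toy lock-order checker for one thread (not the kernel's WITNESS)."""

    def __init__(self, enabled=True):
        self.enabled = enabled   # False models a kernel built without the checker
        self.held = []           # (name, sleepable) pairs, oldest first
        self.reports = []        # messages the checker would have printed

    def acquire(self, name, sleepable=False):
        # Only rule modelled: Giant may not be taken while a
        # non-sleepable lock is already held.
        if self.enabled and name == "Giant":
            for held_name, held_sleepable in self.held:
                if not held_sleepable:
                    self.reports.append(
                        "lock order reversal: (Giant after non-sleepable): "
                        "1st %s, 2nd Giant" % held_name
                    )
                    break
        # The acquisition itself goes ahead whether or not it was reported.
        self.held.append((name, sleepable))

    def release(self, name):
        # Drop the most recent acquisition of this lock.
        for i in reversed(range(len(self.held))):
            if self.held[i][0] == name:
                del self.held[i]
                return
        raise ValueError("lock not held: %s" % name)


def io_path(witness):
    """Hypothetical stand-in for the call chain in the backtrace."""
    witness.acquire("ahd_lock")   # 1st lock in the report
    witness.acquire("Giant")      # 2nd lock, taken under contigmalloc()
    order = [name for name, _ in witness.held]
    witness.release("Giant")
    witness.release("ahd_lock")
    return order
```

Running io_path() with ToyWitness(enabled=False) acquires the locks in the
same order but leaves reports empty, which is the sense in which turning the
checker off hides the message without changing the lock order.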

> Also the disk is still detected as only 160.00MB/s, any thought about this?
> 

I'll look into this as well.  Actually, it might be a result of the
simple domain validation code that was added to 7-current a while back.
DV is both very tricky to implement and very tricky to predict in
operation, so what you're seeing might be a bug or it might be a
legitimate problem with your disk or cables.
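
As a rough picture of why the outcome is hard to read from the outside, the
general idea of domain validation can be sketched as "try the fastest rate,
check the link, back off on failure". This is an assumption-laden sketch and
not the 7-current code: validate_domain and link_ok_at are hypothetical
names, and the rate table is just the usual wide SCSI steps.

```python
# Wide SCSI transfer rates in MB/s, fastest first (Ultra320 down to Ultra Wide).
RATES_MBPS = [320, 160, 80, 40]


def validate_domain(link_ok_at, rates=RATES_MBPS):
    """Return the fastest rate that passes the check, or None if none does.

    link_ok_at is a hypothetical callback standing in for the real
    integrity test (for example, re-reading known data and comparing it).
    """
    for rate in rates:
        if link_ok_at(rate):
            return rate
    return None


# A marginal cable that only works up to 160 MB/s ...
marginal_cable = validate_domain(lambda rate: rate <= 160)

# ... and a check that wrongly rejects 320 MB/s give the same answer.
buggy_check = validate_domain(lambda rate: rate != 320)
```

Both calls settle on the same rate, so the negotiated speed alone does not
say whether the check or the hardware is at fault.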

Scott
Received on Thu Apr 19 2007 - 12:24:46 UTC
