Re: Multiple MSI on SMP, misrouting or misunderstanding?

From: John Baldwin <jhb_at_freebsd.org>
Date: Mon, 8 Jun 2009 11:16:40 -0400

On Monday 08 June 2009 5:15:24 am Alexander Motin wrote:
> Hi.
> 
> While experimenting with multiple-MSI support on an AHCI controller, I 
> ran into a problem. When the system boots as UP, everything is fine: 
> the driver allocates all 16 available MSIs and works. But when the 
> system boots as SMP, interrupts begin to behave strangely: I don't 
> receive the expected AHCI IRQs, but instead receive IRQ1 interrupts for 
> atkbd0, even though I have no PS/2 keyboard or mouse attached.
> 
> As far as I can tell, the problem is caused by IRQ rebalancing between 
> CPUs. As I understand it, MSI requires all vectors in the same group to 
> be allocated sequentially, but IRQ rebalancing breaks the correct order 
> established during the initial allocation.
> 
> I was quite surprised by this issue. If multiple MSI vectors of the same 
> device have to be allocated sequentially and bound to the same CPU, then 
> they cannot provide any SMP scalability benefit. Am I right, or is 
> there some special technique for distributing grouped MSI vectors 
> between CPUs that we don't have?
> 
> I have made a small patch that disables rebalancing for grouped MSIs, 
> to make them work at least somehow. It works fine for me, but I am not 
> sure that it is the best solution.

It is a limitation of MSI.  With MSI, you have a single address register for 
the entire group of messages (for a group of 2^N messages, the individual 
messages are distinguished only by the low N bits of the message data 
value).  On x86 the 
address register includes the APIC ID.  That means that all of the messages 
get sent to the same CPU.  With MSI-X, there is a table with separate address 
and data registers for each message.  This allows a driver to distribute 
interrupts across CPUs.  I had old patches prior to the per-CPU IDT stuff to 
handle this quirk of MSI groups.  The approach I used there was to allow 
reassigning only the entire group, by assigning to the first interrupt in 
the group.  With per-CPU IDTs that gets trickier, though, as you 
need to allocate a whole block of aligned, consecutive IDT vectors in the new 
CPU.
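
To make the two constraints above concrete, here is a minimal sketch in C.  
It is not FreeBSD's actual MSI code: the names msi_message_data(), 
find_aligned_block(), NUM_VECTORS and the `first` parameter are hypothetical 
and exist only for illustration.  The first function shows how each message 
in a group differs only in the low bits of the data value; the second shows 
the kind of search needed to find an aligned, consecutive run of free 
vectors on the target CPU.  Both assume `count` is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical table size: an x86 IDT has 256 entries. */
#define NUM_VECTORS 256

/*
 * Plain MSI: the whole group shares one address/data pair.  The data
 * value sent for message `msg` is the programmed data with its low
 * log2(count) bits replaced by the message number.  `count` must be a
 * power of two.
 */
static uint16_t
msi_message_data(uint16_t base_data, unsigned count, unsigned msg)
{
	uint16_t mask = (uint16_t)(count - 1);

	return ((uint16_t)((base_data & ~mask) | (msg & mask)));
}

/*
 * Search one CPU's vector table for `count` free vectors that are
 * consecutive and aligned to a multiple of `count`, starting no lower
 * than `first`.  Returns the first vector of the block, or -1 if no
 * such block exists.  `count` must be a power of two.
 */
static int
find_aligned_block(const bool used[NUM_VECTORS], unsigned first,
    unsigned count)
{
	/* Round `first` up to the next multiple of `count`. */
	unsigned start = (first + count - 1) & ~(count - 1);
	unsigned i;

	for (; start + count <= NUM_VECTORS; start += count) {
		for (i = 0; i < count; i++)
			if (used[start + i])
				break;
		if (i == count)
			return ((int)start);
	}
	return (-1);
}
```

Because the device only substitutes the low bits, moving a group to another 
CPU means finding such a block there first and then reprogramming the single 
address/data pair, which is why the group can only move as a unit.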

-- 
John Baldwin
Received on Mon Jun 08 2009 - 13:33:59 UTC
