I have seen similar behaviour before. The problem is that every CPU receives an NMI concurrently. As I recall, one of them gets some kind of pseudo-spinlock and tries to stop the other CPUs with an NMI. However, because they are already in an NMI handler, they don't get the second NMI and don't stop properly.

The case that I saw actually had to do with a panic triggered by an NMI, not entering the debugger, but I believe that both cases use stop_cpus_hard() under the hood and have a similar issue.

(I also recall seeing the exact situation that you describe while originally developing SR-IOV on an alpha version of the Fortville hardware and firmware with a very buggy SR-IOV implementation. I've never seen it on ixgbe before, although I haven't used SR-IOV there very much at all.)

On Thu, Aug 20, 2015 at 6:15 PM, Adrian Chadd <adrian_at_freebsd.org> wrote:
> Hi!
>
> This has started happening on -HEAD recently. No, I don't have any
> more details yet than "recently."
>
> Whenever I get an NMI panic (and getting an NMI is a separate issue,
> sigh) I get a slew of "failed to stop cpu" messages, and all CPUs
> enter ddb. This is .. sub-optimal. Has anyone seen this? Does anyone
> have any ideas?
>
>
> -adrian

Received on Fri Aug 21 2015 - 12:23:37 UTC
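The race described in the reply above can be illustrated with a small toy model. This is only a sketch of the reasoning, not FreeBSD code: the names `Cpu`, `deliver_stop_nmi` and `stop_others` are hypothetical, the "lowest CPU id wins the lock" rule is an assumption made to keep the model deterministic, and the real behaviour of stop_cpus_hard() is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    cpu_id: int
    in_nmi: bool = False   # CPU is already running an NMI handler
    stopped: bool = False  # CPU has been parked by a stop request


def deliver_stop_nmi(cpu):
    """Try to park `cpu` with a stop NMI; return True if it was parked."""
    if cpu.in_nmi:
        # Assumption taken from the reply: a CPU that is already inside an
        # NMI handler does not take the second NMI, so it never parks.
        return False
    cpu.stopped = True
    return True


def stop_others(ncpus, nmi_targets):
    """Return the ids of the CPUs that the lock winner failed to stop.

    `nmi_targets` is the non-empty set of CPU ids hit by the original NMI.
    """
    cpus = [Cpu(i) for i in range(ncpus)]
    for i in nmi_targets:
        cpus[i].in_nmi = True

    # Hypothetical tie-break: the lowest-numbered CPU in the handler wins
    # the pseudo-spinlock and tries to stop everyone else.
    winner = cpus[min(nmi_targets)]
    return [c.cpu_id for c in cpus
            if c is not winner and not deliver_stop_nmi(c)]


if __name__ == "__main__":
    print("NMI on one CPU, failed to stop:", stop_others(4, {0}))
    print("NMI on all CPUs, failed to stop:", stop_others(4, {0, 1, 2, 3}))
```

In this model, an NMI delivered to a single CPU lets that CPU stop all the others, while an NMI delivered to every CPU leaves each non-winning CPU unreachable, which would correspond to one "failed to stop cpu" message per remaining CPU.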