Re: ipmi patch for review

From: Alfred Perlstein <bright_at_mu.org>
Date: Fri, 30 May 2014 17:44:05 -0700
On 5/30/14, 10:44 AM, Doug Ambrisko wrote:
> On Thu, Sep 19, 2013 at 03:04:46PM -0400, John Baldwin wrote:
> | On Tuesday, September 17, 2013 6:21:10 am Gleb Smirnoff wrote:
> | >   Hi!
> | >
> | >   When the system is writing a kernel core dump, it issues a
> | > watchdog pat, wdog_kern_pat(WD_LASTVAL). If ipmi is in action, it
> | > registers ipmi_wd_event() as an event handler for the watchdog.
> | > Thus ipmi_wd_event() is called in dumping context.
> | >
> | > The problem is that ipmi_wd_event() calls into ipmi_set_watchdog(),
> | > which calls into ipmi_alloc_request(), which uses M_WAITOK and
> | > thus may sleep. This is a smaller problem, since it can be converted
> | > to M_NOWAIT. But ipmi_set_watchdog() then calls into
> | > ipmi_submit_driver_request(), which can call msleep() at any time.
> | >
> | >   The attached patch allows me to successfully write cores in
> | > the presence of IPMI.
> |
> | Of course, the watchdog might go off during your dump. :)
> |
> | The real fix is more complicated: we should not use a worker
> | thread for at least SMIC and KCS.
>
> I haven't looked at this patch but I have local code that switches
> KCS into polling direct mode when the kernel goes into panic mode.
> I use this to write Linux-compatible backtraces into the system
> event logs.  This could allow the watchdogd to continue to work.
> This should be easily extended to SMIC mode.  SMBUS would be
> harder but at a prior company I made the SMBIO driver work in polled
> mode.
>
> If someone wants to look at this I can post the changes for KCS and
> the kernel backtrace to the system event log.  We find this useful
> when looking at customer machines.
>
> IPMI gets upset if things get intermixed/interrupted, so requests
> need to be serialized, and cancelled if they are interrupted.
>
These patches would be really nice to have in base.  I noticed this
problem too: in a panic(9) situation you often can't touch the
watchdogs at all, which leaves you unable to stop or reset them
while you are dumping.
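
To make the polled idea concrete, here is a rough userland sketch of
what "switch to polling when panicking" could look like.  This is not
the ipmi(4) driver code and not Doug's patch: struct fake_bmc,
choose_wait_mode() and submit_polled() are hypothetical names invented
for illustration, and the device is simulated with a simple counter.
The point is only that the panic path spins on the status with a
bounded budget and never sleeps.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a KCS-style interface; not a real driver type. */
struct fake_bmc {
	int busy_polls_left;	/* status reads until the device is ready */
	int requests_done;	/* completed requests, for inspection */
};

enum wait_mode { WAIT_SLEEP, WAIT_POLL };

/* Sleeping is only legal outside panic/dump context, so poll when panicking. */
static enum wait_mode
choose_wait_mode(bool panicking)
{
	return (panicking ? WAIT_POLL : WAIT_SLEEP);
}

/* Simulated status-register read: true once the device reports ready. */
static bool
fake_bmc_ready(struct fake_bmc *bmc)
{
	if (bmc->busy_polls_left > 0) {
		bmc->busy_polls_left--;
		return (false);
	}
	return (true);
}

/*
 * Polled submit: spin on the status with a bounded retry budget.
 * Never sleeps.  Returns 0 on success, -1 if the budget runs out.
 */
static int
submit_polled(struct fake_bmc *bmc, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (fake_bmc_ready(bmc)) {
			bmc->requests_done++;
			return (0);
		}
	}
	return (-1);
}
```

A real version would also need the serialization and cancellation
Doug mentions, since a request interrupted by the panic may still be
in flight when the polled path takes over.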

Thank you for looking at this.

-Alfred
Received on Fri May 30 2014 - 22:44:06 UTC