Re: minidump fails on SMP machines

From: Attilio Rao <attilio_at_freebsd.org>
Date: Sun, 28 Feb 2010 22:20:46 +0100
2010/2/28 Andrew Brampton <brampton+freebsd_at_gmail.com>:
> Hello,
>
> When many interrupts are firing on my amd64 SMP machine, and a panic
> occurs, dump will fail with the error "Attempt to write outside dump
> device boundaries". This problem has been discussed before[1][2], and
> I even filed a PR about it last year[3].
>
> From what I understand the reason this occurs is because interrupts
> are not disabled on all cores during the dump. There was even a TODO
> comment about this in kern_shutdown.c:
>         /* XXX This doesn't disable interrupts any more.  Reconsider? */
>         splhigh();
>
> However, looking at my FreeBSD 8.0 and HEAD source, these two lines
> were removed by commit r196198 by attilio. Now that commit seems to
> deal with disabling interrupts on other cores, but perhaps not
> specifically to fix this bug. So my question is, should this dump bug
> now be fixed? Or should the XXX comment have been left in? Or is the dump
> bug caused by something else? I would appreciate this problem being
> fixed, as it causes me a lot of headaches when trying to debug my
> kernel module :(.

So, I haven't looked at your problem in great detail, but specifically,
spinlock_enter() only disables interrupts on the CPU it runs on. If you
also want to handle the APs, you need to install an IPI, or perhaps you
can do this with a rendezvous point.
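
To illustrate the idea of parking the other CPUs before dumping, here is
a rough userland simulation, not kernel code. Threads stand in for the
APs, an atomic flag stands in for the stop IPI, and a counter stands in
for interrupt-driven activity. All names (simulate_dump, ap_main, and so
on) are made up for this sketch; a real kernel would use an IPI handler
or something like smp_rendezvous(9) instead.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_APS 8

static atomic_bool stop_requested;  /* stands in for the stop IPI */
static atomic_int  parked_aps;      /* APs that acknowledged the request */
static atomic_bool release_aps;     /* set once the "dump" is finished */
static atomic_long device_writes;   /* stands in for interrupt activity */

/* Hypothetical AP loop: generate activity until told to stop, then park. */
static void *ap_main(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_requested))
        atomic_fetch_add(&device_writes, 1);

    /* "IPI handler": acknowledge, then spin until released. */
    atomic_fetch_add(&parked_aps, 1);
    while (!atomic_load(&release_aps))
        sched_yield();
    return NULL;
}

/*
 * Returns the number of AP writes observed during the dump window
 * (0 when every AP was parked first), or -1 if a thread could not be
 * created.
 */
long simulate_dump(int n_aps)
{
    pthread_t aps[MAX_APS];
    int started = 0;

    assert(n_aps >= 0 && n_aps <= MAX_APS);
    atomic_store(&stop_requested, false);
    atomic_store(&parked_aps, 0);
    atomic_store(&release_aps, false);
    atomic_store(&device_writes, 0);

    for (int i = 0; i < n_aps; i++) {
        if (pthread_create(&aps[i], NULL, ap_main, NULL) != 0)
            break;
        started++;
    }

    /* "Send the IPI" and wait at the rendezvous for every AP to park. */
    atomic_store(&stop_requested, true);
    while (atomic_load(&parked_aps) < started)
        sched_yield();

    /* Dump window: no AP should touch the device from here on. */
    long before = atomic_load(&device_writes);
    for (int block = 0; block < 1000; block++)
        sched_yield();              /* a dump block would be written here */
    long after = atomic_load(&device_writes);

    /* Let the APs go and clean up. */
    atomic_store(&release_aps, true);
    for (int i = 0; i < started; i++)
        pthread_join(aps[i], NULL);

    return started == n_aps ? after - before : -1;
}
```

The point of the sketch is only the ordering: the dumping CPU does not
start writing until every other CPU has acknowledged the stop request,
so simulate_dump() should report no stray writes in the dump window.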

That said, please also note that the place where spinlock_enter() is
called now is too late with respect to dumping; I assume the physical
dump happens far earlier than the spinlock_exit().


Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
Received on Sun Feb 28 2010 - 20:20:50 UTC