Re: taskqgroup_adjust kernel panic

From: Shawn Webb <shawn.webb_at_hardenedbsd.org>
Date: Tue, 6 Sep 2016 12:25:41 -0400
On Mon, Sep 05, 2016 at 07:51:02PM -0400, Shawn Webb wrote:
> On Mon, Sep 05, 2016 at 02:54:54PM -0700, Mark Johnston wrote:
> > On Mon, Sep 05, 2016 at 01:55:38PM -0400, Shawn Webb wrote:
> > > Hey all,
> > > 
> > > I'm at revision 3872750 of the hardened/current/drm-next-4.7 branch in
> > > the HardenedBSD/hardenedBSD-playground repo. I've gotten this kernel
> > > panic a couple times when booting. I'm using full-disk encryption with
> > > ZFS and encrypted swap. The hardware is a Purism 15 2K laptop.
> > > 
> > > The panic doesn't happen often, and I haven't found a way to
> > > reproduce it reliably.
> > > 
> > > Here's my `uname -a` output:
> > > 
> > > FreeBSD hbsd-dev-laptop 12.0-CURRENT-HBSD FreeBSD 12.0-CURRENT-HBSD #0 3872750(hardened/current/drm-next-4.7): Tue Aug 30 17:41:53 EDT 2016     shawn_at_hbsd-dev-laptop:/usr/obj/usr/src/sys/LATT-SEC  amd64
> > > 
> > > Here are a couple of pictures I took of the panic:
> > > 
> > > https://goo.gl/photos/P5kiwabPYjwQX7Kr8
> > > https://goo.gl/photos/BWtvBnq7QLnwgRP28
> > 
> > Based on the faulting instruction, the panic probably happened because
> > qid is uninitialized in the loop that starts with
> > 
> >     while ((gtask = LIST_FIRST(&gtask_head))) {
> > 
> > I don't know this code very well, so I'm not sure how that can happen. I
> > suspect iflib_irq_alloc_generic() is buggy: it calls
> > 
> >     taskqgroup_attach_cpu(... CPU_FFS(&cpus) ...);
> > 
> > and CPU_FFS returns 1-indexed IDs, but taskqgroup_attach_cpu() pretty
> > clearly expects 0-indexed CPU IDs. There's a similar bug in find_nth()
> > in iflib.c.
> 
> I think you hit the nail right on the head. Attached is a patch that
> doesn't fix the underlying issue, but at least detects when qid isn't
> set properly: it trips a KASSERT in that case.
> 
> I'll study this code a bit more within the next couple of days and I
> hope to have a full patch to address the underlying issue.

I've now verified, using that patch, that qid is always uninitialized in
my case. That KASSERT is hit 100% of the time when the laptop is booting
up.
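
To illustrate the off-by-one Mark described, here is a minimal userland
sketch. It is not the kernel code: ffs() from <strings.h> stands in for
CPU_FFS() (both return a 1-based bit position, or 0 for an empty set),
and find_queue(), queue_cpu[] and the helper names are hypothetical. The
assumption is that a lookup keyed on a 0-based CPU ID finds no match
when handed the 1-based value, which is roughly analogous to qid never
being assigned.

```c
#include <assert.h>
#include <strings.h>	/* ffs() */

/*
 * 1-based position of the lowest set bit in mask, or 0 if mask is empty.
 * Userland stand-in for CPU_FFS(); not the kernel macro itself.
 */
static int
first_cpu_one_based(unsigned int mask)
{
	return (ffs((int)mask));
}

/* Same bit as a 0-based CPU ID; returns -1 if mask is empty. */
static int
first_cpu_zero_based(unsigned int mask)
{
	return (first_cpu_one_based(mask) - 1);
}

/*
 * Hypothetical lookup that expects a 0-based CPU ID. Returns the index
 * of the queue bound to that CPU, or -1 when no queue matches.
 */
static int
find_queue(const int queue_cpu[], int nqueues, int cpu)
{
	assert(nqueues > 0);
	for (int i = 0; i < nqueues; i++) {
		if (queue_cpu[i] == cpu)
			return (i);
	}
	return (-1);
}
```

With four queues bound to CPUs 0 through 3 and a mask containing only
CPU 3 (0x8), first_cpu_one_based() yields 4, so find_queue() matches
nothing; first_cpu_zero_based() yields 3 and the lookup succeeds.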

I've filed a bug report. The problematic code exists in 11-STABLE and
11.0-RELENG as well.

Link to bug report:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212418

Thanks,

-- 
Shawn Webb
Cofounder and Security Engineer
HardenedBSD

GPG Key ID:          0x6A84658F52456EEE
GPG Key Fingerprint: 2ABA B6BD EF6A F486 BE89  3D9E 6A84 658F 5245 6EEE

Received on Tue Sep 06 2016 - 14:25:45 UTC