Re: panic: refcount inconsistency: found: 0 total: 1

From: David Wolfskill <david_at_catwhisker.org>
Date: Tue, 3 Nov 2015 13:10:01 -0800
On Tue, Nov 03, 2015 at 08:39:52AM -0800, David Wolfskill wrote:
> On Tue, Nov 03, 2015 at 06:15:35PM +0300, Alexander V. Chernikov wrote:
> > ...
> > > I tried booting it, and during the transition to multi-user mode,
> > > once ipfw was being invoked, I got the above-cited panic. Circumvention
> > > was to leave it disconnected from a network (turn off the WiFi
> > > switch, in my case), so we don't get a chance to use the network.
> > It is most probably related with r290334. Would you mind reverting it and checking if ipfw works correctly ?
> > ....
> 
>  ... [Reverting r290334 gets things working again -- dhw]
> 
> at a *guess*, we'll need a bit more effort to keep "found" and
> "ci->object_opcodes" in sync, at least by the time the KASSERT on
> src/sys/netpfil/ipfw/ip_fw_table.c:3395 looks at the values.
> ...

So... looking at the code a bit (and bearing in mind that I still am
unfamiliar with said code, and I hack more than I "write" code)...
here's the original bit of src/sys/netpfil/ipfw/ip_fw_table.c in
question, with the modification from r290334 shown:

...
/*
 * Finds and bumps refcount for objects referenced by given _at_rule.
 * Auto-creates non-existing tables.
 * Fills in _at_oib array with userland/kernel indexes.
 *
 * Returns 0 on success.
 */
static int
ref_rule_objects(struct ip_fw_chain *ch, struct ip_fw *rule,
    struct rule_check_info *ci, struct obj_idx *oib, struct tid_info
*ti)
{
...
        if (error != 0) {
                /* Unref everything we have already done */
                unref_oib_objects(ch, rule->cmd, oib, pidx);
                IPFW_UH_WUNLOCK(ch);
                return (error);
        }

        IPFW_UH_WUNLOCK(ch);

        found = pidx - oib;
        KASSERT(found == ci->object_opcodes,
            ("refcount inconsistency: found: %d total: %d",
            found, ci->object_opcodes));
   
        /* Perform auto-creation for non-existing objects */
        if (numnew != 0)
                error = create_objects_compat(ch, rule->cmd, oib, pidx, ti);
   
+	/* Calculate real number of dynamic objects */
+	ci->object_opcodes = (uint16_t)(pidx - oib);
+
        return (error);
}
...


The added code to "Calculate real number of dynamic objects" is
apparently setting ci->object_opcodes to the same value that "found" was
just set to (pidx - oib -- cast to uint16_t in the case of the added
code).

But that appears to come a bit late for the KASSERT.

Moving the KASSERT relative to the added code (so the KASSERT comes
after) would be an option, but I'm not sure it makes sense to
actually do the KASSERT unless we have some reason to believe that
the difference between pidx and oib might change in the interval
represented by a couple lines of code AND we have not coded to
handle that situation any more "gracefully" than to panic.

So... with the change from r290334, what's the point of the KASSERT?

Peace,
david
-- 
David H. Wolfskill				david_at_catwhisker.org
Those who would murder in the name of God or prophet are blasphemous cowards.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.

Received on Tue Nov 03 2015 - 20:10:06 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:00 UTC