On Sat, 3 Sep 2005, Robert Watson wrote:

> I believe I've chatted with Gleb about this some, but want to confirm
> that I understand the problem here: this occurs when an interface is
> removed while IP multicast membership is still present for multicast
> groups on the interface. When the multicast socket is closed, then the
> kernel panics because it has a now invalid cached pointer to the
> interface structure (now freed), which causes an assertion failure
> because the mutex code detects that it is operating on an invalid mutex.
>
> So it sounds like we need to figure out how the multicast code should
> behave on interface removal -- I wonder what other operating systems do
> here? Do they simply invalidate current membership related with the
> interface, or do they leave the multicast sockets in a state such that
> if the interface comes back, the memberships are re-bound?

I've now committed a regression test for this bug:

  src/tools/regression/netinet/msocket_ifnet_remove

It basically simulates the removal of an interface while it is in use for
multicast, resulting in a panic similar to one of the ones you've
reported. An if_disc discard interface is used. It tests both raw and UDP
socket variants, and should panic 6.x and 7.x boxes; on 4.x and 5.x it may
panic, or it may just corrupt kernel memory silently.

I believe the solution for now is that on ifnet tear-down, we will need to
walk the various pcb lists and trim references to the multicast address.
I chatted a little with Bill Fenner today about what the application
semantics should be, and likely we need to substantially change the way
IPv4 and IPv6 multicast handle group membership for sockets in order to
get the "right" behavior, so a panic work-around for 6.0 is the right
thing to do, even though it won't be the final answer.

I should have an opportunity to look into a possible solution for this in
the next few days.

Robert N M Watson

Received on Mon Sep 05 2005 - 11:26:41 UTC
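The interim work-around described in the message (on ifnet tear-down, walk the pcb lists and trim the multicast references) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the names toy_ifnet, toy_pcb, toy_membership, toy_join and toy_purge_ifp are hypothetical, the fixed-size array is a simplification, and none of it is the kernel's actual data structures, locking, or the fix that was eventually committed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_MEMBERSHIPS 8

/* Hypothetical stand-in for a network interface. */
struct toy_ifnet {
	const char *name;
};

/* One multicast membership: a group address bound to an interface. */
struct toy_membership {
	unsigned int group;		/* IPv4 group address, host order */
	struct toy_ifnet *ifp;		/* cached interface pointer */
};

/* Hypothetical stand-in for a protocol control block (one per socket). */
struct toy_pcb {
	struct toy_membership memberships[TOY_MAX_MEMBERSHIPS];
	size_t nmemberships;
	struct toy_pcb *next;		/* next pcb on the list */
};

/* Record a membership on a pcb; returns 0 on success, -1 if full. */
int
toy_join(struct toy_pcb *pcb, unsigned int group, struct toy_ifnet *ifp)
{
	assert(pcb != NULL && ifp != NULL);
	if (pcb->nmemberships == TOY_MAX_MEMBERSHIPS)
		return (-1);
	pcb->memberships[pcb->nmemberships].group = group;
	pcb->memberships[pcb->nmemberships].ifp = ifp;
	pcb->nmemberships++;
	return (0);
}

/*
 * Walk every pcb on the list and drop the memberships that still point
 * at ifp, compacting each array in place.  Call this before the
 * interface is freed so no pcb keeps a stale pointer.  Returns the
 * number of memberships dropped.
 */
size_t
toy_purge_ifp(struct toy_pcb *head, const struct toy_ifnet *ifp)
{
	size_t dropped = 0;

	for (struct toy_pcb *pcb = head; pcb != NULL; pcb = pcb->next) {
		size_t keep = 0;

		for (size_t i = 0; i < pcb->nmemberships; i++) {
			if (pcb->memberships[i].ifp == ifp) {
				dropped++;
				continue;
			}
			pcb->memberships[keep++] = pcb->memberships[i];
		}
		pcb->nmemberships = keep;
	}
	return (dropped);
}
```

In this model, a later close of the socket only sees memberships whose interface is still alive, which is the property the panic work-around needs. It simply invalidates the memberships; it does not attempt the re-binding behaviour raised in the quoted question.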