Brian, thanks for your work. Question: is this patch is something I can feel comfortable about trying it on a production server? Did you test it? On Thu, 7 Oct 2004 14:18:52 -0400, Brian Fundakowski Feldman <green_at_freebsd.org> wrote: > > > On Thu, Oct 07, 2004 at 01:00:08AM +0200, Marc UBM Bocklet wrote: > > On Wed, 6 Oct 2004 17:51:34 -0400 > > Brian Fundakowski Feldman <green_at_freebsd.org> wrote: > > > > > On Wed, Oct 06, 2004 at 03:55:21PM -0400, Vlad wrote: > > > > Brian, > > > > > > > > I've created ticket a while ago in regards to this problem: > > > > > > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/72126 > > > > > > > > also, attached are some additional debug info I could get during > > > > last two times it crashed. > > > > > > > > unfortunately I don't have big enough stand-alone swap partition to > > > > get kernel crash dump, so those files are the best I could get. > > > > > > > > here is week-old thread for the same problem: > > > > http://groups.google.com/groups?th=fc0d7d881f0713cc > > > > > > Can you tell me where sorele() crashed when you did _not_ have > > > INVARIANTS enabled? That will help because the INVARIANTS panic tells > > > us one half of where the problem is and the sorele() panic is the > > > other half. Want to try this (untested) patch for the problem that I'm > > > guessing about? > > > > [possible patch for sodealloc panic] > > > > Ok, I'm just now recompiling my kernel with your patch, I should be able > > to boot the kernel tomorrow, then we'll see if there are any further > > panics. > > > > Thanks a lot for your help! :-) > > You're welcome! If that does work (or mostly works), please try these > deltas instead. I have only tried to make TCP work correctly in this > respect, but all protocols need to be able to atomically with regard > to both the accept lock and the socket lock give up their pcb from > the proto bottom half and give final "ownership" of the socket to its > parent (listening socket) before never touching it again. > > The order of SOCK_LOCK(), so_pcb = NULL, check for 0 references, check > for no fd reference, SOCK_UNLOCK(), ACCEPT_LOCK() violates this. This > is mostly something that could easily be fixed by using regular GC > methods to do the final bit of deallocation. > > Index: sys/socketvar.h > =================================================================== > RCS file: /usr/ncvs/src/sys/sys/socketvar.h,v > retrieving revision 1.133 > diff -u -r1.133 socketvar.h > --- sys/socketvar.h 12 Jul 2004 21:42:33 -0000 1.133 > +++ sys/socketvar.h 7 Oct 2004 17:52:49 -0000 > _at__at_ -205,6 +205,7 _at__at_ > #define SS_ASYNC 0x0200 /* async i/o notify */ > #define SS_ISCONFIRMING 0x0400 /* deciding to accept connection req */ > #define SS_ISDISCONNECTED 0x2000 /* socket disconnected from peer */ > +#define SS_HASBEENCOMP 0x4000 /* proto made socket SQ_COMP */ > > /* > * Socket state bits now stored in the socket buffer state field. > _at__at_ -360,6 +361,16 _at__at_ > SOCK_UNLOCK(so); \ > } while(0) > > +#define sopcbfreed(so) do { \ > + SOCK_LOCK_ASSERT(so); \ > + (so)->so_pcb = NULL; \ > + if ((so)->so_count == 0 && \ > + ((so)->so_state & SS_HASBEENCOMP) == 0) \ > + sofree(so); \ > + else \ > + SOCK_UNLOCK(so); \ > +} while(0) > + > /* > * In sorwakeup() and sowwakeup(), acquire the socket buffer lock to > * avoid a non-atomic test-and-wakeup. However, sowakeup is > Index: kern/uipc_socket2.c > =================================================================== > RCS file: /usr/ncvs/src/sys/kern/uipc_socket2.c,v > retrieving revision 1.137 > diff -u -r1.137 uipc_socket2.c > --- kern/uipc_socket2.c 15 Aug 2004 06:24:41 -0000 1.137 > +++ kern/uipc_socket2.c 7 Oct 2004 17:42:15 -0000 > _at__at_ -117,12 +117,15 _at__at_ > { > struct socket *head; > > + ACCEPT_LOCK(); > SOCK_LOCK(so); > so->so_state &= ~(SS_ISCONNECTING|SS_ISDISCONNECTING|SS_ISCONFIRMING); > so->so_state |= SS_ISCONNECTED; > - SOCK_UNLOCK(so); > - ACCEPT_LOCK(); > head = so->so_head; > + if (head && (so->so_qstate & SQ_INCOMP) && > + (so->so_options & SO_ACCEPTFILTER) == 0) > + so->so_state |= SS_HASBEENCOMP; > + SOCK_UNLOCK(so); > if (head != NULL && (so->so_qstate & SQ_INCOMP)) { > if ((so->so_options & SO_ACCEPTFILTER) == 0) { > TAILQ_REMOVE(&head->so_incomp, so, so_list); > Index: kern/uipc_syscalls.c > =================================================================== > RCS file: /usr/ncvs/src/sys/kern/uipc_syscalls.c,v > retrieving revision 1.200 > diff -u -r1.200 uipc_syscalls.c > --- kern/uipc_syscalls.c 15 Aug 2004 06:24:41 -0000 1.200 > +++ kern/uipc_syscalls.c 7 Oct 2004 17:55:22 -0000 > _at__at_ -314,9 +314,8 _at__at_ > KASSERT(so->so_qstate & SQ_COMP, ("accept1: so not SQ_COMP")); > > /* > - * Before changing the flags on the socket, we have to bump the > - * reference count. Otherwise, if the protocol calls sofree(), > - * the socket will be released due to a zero refcount. > + * XXX This reference should really be done by soaccept(). Once > + * SQ_COMP, a socket will never disappear from underneath us. > */ > SOCK_LOCK(so); > soref(so); /* file descriptor reference */ > cvs diff: Diffing netinet > Index: netinet/in_pcb.c > =================================================================== > RCS file: /usr/ncvs/src/sys/netinet/in_pcb.c,v > retrieving revision 1.155 > diff -u -r1.155 in_pcb.c > --- netinet/in_pcb.c 29 Sep 2004 04:01:13 -0000 1.155 > +++ netinet/in_pcb.c 7 Oct 2004 17:58:38 -0000 > _at__at_ -688,8 +688,7 _at__at_ > in_pcbremlists(inp); > if (so) { > SOCK_LOCK(so); > - so->so_pcb = NULL; > - sotryfree(so); > + sopcbfreed(so); > } > if (inp->inp_options) > (void)m_free(inp->inp_options); > Index: netinet/tcp_subr.c > =================================================================== > RCS file: /usr/ncvs/src/sys/netinet/tcp_subr.c,v > retrieving revision 1.203 > diff -u -r1.203 tcp_subr.c > --- netinet/tcp_subr.c 5 Sep 2004 02:34:12 -0000 1.203 > +++ netinet/tcp_subr.c 7 Oct 2004 17:59:01 -0000 > _at__at_ -1686,10 +1686,9 _at__at_ > tcp_discardcb(tp); > so = inp->inp_socket; > SOCK_LOCK(so); > - so->so_pcb = NULL; > tw->tw_cred = crhold(so->so_cred); > tw->tw_so_options = so->so_options; > - sotryfree(so); > + sopcbfreed(so); > inp->inp_socket = NULL; > if (acknow) > tcp_twrespond(tw, TH_ACK); > > > > -- > Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\ > <> green_at_FreeBSD.org \ The Power to Serve! \ > Opinions expressed are my own. \,,,,,,,,,,,,,,,,,,,,,,\ > _______________________________________________ > freebsd-current_at_freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org" > -- VladReceived on Thu Oct 07 2004 - 17:11:53 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:16 UTC