On Wed, May 01, 2013 at 06:45:53PM +0100, Robert N. M. Watson wrote:
>
> On 1 May 2013, at 16:56, John Baldwin wrote:
>
> > It looks like the ipi_hash_lock is locked (and udp_connect() locks
> > it), so I think the offending code is somewhere else. Also, I can't
> > find anything that removes an inp without holding the correct
> > pcbinfo lock. The only thing I can think of is if the pcbinfo
> > pointer for an inp could change, so we could maybe lock the wrong
> > one while removing it?
> >
> > Hmmmmmm, you know. In in_pcbremlists() and in_pcbdrop(), we read
> > inp_phd without holding the hash lock. I think that probably doesn't
> > actually break anything, but this feels like a locking issue of some
> > sort.
>
> I'll need to catch up on this thread later, but a few questions:
>
> Do we know if the application in question is multithreaded, and
> if so, might it be attempting concurrent operations on this socket?

I do not know if zabbix-agent is multithreaded, but cf-agent is.

> The corrupted pointer is worrying ... but interesting, and suggests
> something else is going on here -- stack corruption earlier in the
> system call, perhaps?
>
> In general, to modify our various hash lists you must lock both
> the inpcb and the list. It's therefore sufficient to hold either
> lock to read, so reading inp_phd should be OK with the inpcb lock
> held, even without the hash lock held.
>
> Do we have a dump of *inp, and if so, can we confirm that the
> inpcb is still properly referenced? If there is an associated socket,
> can we likewise get a dump of *inp->inp_socket to check that things
> are properly referenced there?
>

I will follow up with this information as soon as possible.

Glen
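
For reference, the locking rule quoted above (writers hold both the inpcb lock and the list lock, so a reader needs only one of them) can be sketched in userspace C with pthreads. This is only an illustration under stated assumptions, not the FreeBSD kernel code: struct pcb, struct pcb_table, pcb_insert(), pcb_remove() and pcb_read_head() are hypothetical stand-ins, and the lock order used here (per-pcb lock first, then hash lock) is an assumption of the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's inpcb structures. */
struct pcb;
struct pcb_head { struct pcb *first; };

struct pcb {
    pthread_mutex_t  pcb_lock;  /* plays the role of the inpcb lock */
    struct pcb_head *phd;       /* plays the role of inp_phd */
    struct pcb      *next;
};

struct pcb_table {
    pthread_mutex_t hash_lock;  /* plays the role of ipi_hash_lock */
    struct pcb_head bucket;     /* a single bucket keeps the sketch short */
};

/* Writer: takes BOTH locks before changing list membership or phd. */
static void pcb_insert(struct pcb_table *t, struct pcb *p)
{
    pthread_mutex_lock(&p->pcb_lock);
    pthread_mutex_lock(&t->hash_lock);
    assert(p->phd == NULL);     /* must not already be on a list */
    p->next = t->bucket.first;
    t->bucket.first = p;
    p->phd = &t->bucket;
    pthread_mutex_unlock(&t->hash_lock);
    pthread_mutex_unlock(&p->pcb_lock);
}

/* Writer: removal also takes BOTH locks. */
static void pcb_remove(struct pcb_table *t, struct pcb *p)
{
    pthread_mutex_lock(&p->pcb_lock);
    pthread_mutex_lock(&t->hash_lock);
    if (p->phd != NULL) {
        struct pcb **pp = &p->phd->first;
        while (*pp != NULL && *pp != p)
            pp = &(*pp)->next;
        if (*pp == p)
            *pp = p->next;      /* unlink p from the bucket */
        p->next = NULL;
        p->phd = NULL;
    }
    pthread_mutex_unlock(&t->hash_lock);
    pthread_mutex_unlock(&p->pcb_lock);
}

/*
 * Reader: every writer holds pcb_lock while changing p->phd, so holding
 * pcb_lock alone is enough to read a stable value, without the hash lock.
 */
static struct pcb_head *pcb_read_head(struct pcb *p)
{
    struct pcb_head *h;

    pthread_mutex_lock(&p->pcb_lock);
    h = p->phd;
    pthread_mutex_unlock(&p->pcb_lock);
    return h;
}
```

The point of the sketch is that the read in pcb_read_head() cannot race with a writer only as long as every writer really does take both locks; a code path that modified phd under just one of them would break that guarantee.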