Re: Weird issue with hastd(8)

From: Mikolaj Golub <trociny_at_freebsd.org>
Date: Sun, 29 May 2011 14:11:55 +0300

On Wed, 25 May 2011 11:21:04 -0700 Maxim Sobolev wrote:

 MS> Hi Pawel,

 MS> I am observing strange errors while synchronizing the data between
 MS> primary and secondary. I keep getting the following error messages:

 MS> May 25 11:09:19 eights hastd[10113]: [test] (secondary) Unable to
 MS> receive request header: Socket is not connected.
 MS> May 25 11:09:24 eights hastd[37571]: [test] (secondary) Worker process
 MS> exited ungracefully (pid=10113, exitcode=75).
 MS> May 25 11:10:17 eights hastd[12109]: [test] (secondary) Unable to
 MS> receive request header: Socket is not connected.
 MS> May 25 11:10:18 eights hastd[37571]: [test] (secondary) Worker process
 MS> exited ungracefully (pid=12109, exitcode=75).
 MS> May 25 11:10:39 eights hastd[14685]: [test] (secondary) Unable to
 MS> receive request header: Socket is not connected.
 MS> May 25 11:10:44 eights hastd[37571]: [test] (secondary) Worker process
 MS> exited ungracefully (pid=14685, exitcode=75).

 MS> The synchronization still proceeds, but it's slow due to the need to
 MS> re-negotiate and re-spawn the secondary worker. I have tried to ktrace
 MS> both server and client at the same time. For some reason the primary
 MS> keeps sending data, while client gets 0-read from the recvfrom at some
 MS> point, while the primary keeps sending more data. This is 8-STABLE
 MS> code on both ends.

 MS> Any ideas of what could be wrong here are appreciated.

This might be the MSG_WAITALL issue I described on net_at_ (look for the
thread "recv() with MSG_WAITALL might stuck when receiving more than rcvbuf",
and also kern/154504).

Could you please try the patch?

http://people.freebsd.org/~trociny/uipc_socket.c.patch
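
For illustration only, here is a minimal userland sketch of receiving a
fixed-size request without relying on MSG_WAITALL: it loops over plain recv()
calls and treats a 0-byte read as the peer having closed the connection. This
is not hastd's code and not what the patch does. The names recv_exact and
chunk, and the 32 kB default, are assumptions made up for the example.

```python
import socket


def recv_exact(sock, size, chunk=32768):
    """Receive exactly `size` bytes using plain recv() calls (no MSG_WAITALL).

    `chunk` is a hypothetical per-call limit, chosen here to stay well
    below a typical receive buffer size.
    """
    buf = bytearray()
    while len(buf) < size:
        # Never ask for more than `chunk` bytes in a single call.
        data = sock.recv(min(chunk, size - len(buf)))
        if not data:
            # A 0-byte read means the peer closed the connection.
            raise ConnectionError(
                "peer closed after %d of %d bytes" % (len(buf), size))
        buf.extend(data)
    return bytes(buf)
```

A caller would use recv_exact(sock, header_size) and handle ConnectionError as
a dropped connection. Each call then requests less than the receive buffer
holds, at the cost of more system calls for large requests.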

-- 
Mikolaj Golub
Received on Sun May 29 2011 - 09:38:16 UTC
