Re: sendmail Broken Pipe Error

From: Willy Offermans <Willy_at_Offermans.Rompen.nl>
Date: Fri, 28 Mar 2014 12:11:06 +0100
Hello John-Mark and FreeBSD friends,

On Thu, Mar 27, 2014 at 08:27:01PM -0700, John-Mark Gurney wrote:
> Willy Offermans wrote this message on Thu, Mar 27, 2014 at 15:46 +0100:
> > Hello John-Mark and FreeBSD friends,
> > 
> > On Wed, Mar 26, 2014 at 04:04:27PM -0700, John-Mark Gurney wrote:
> > > Willy Offermans wrote this message on Wed, Mar 26, 2014 at 18:22 +0100:
> > > > Hello John-Mark and FreeBSD friends,
> > > > 
> > > > On Wed, Mar 26, 2014 at 09:20:35AM -0700, John-Mark Gurney wrote:
> > > > > Willy Offermans wrote this message on Wed, Mar 26, 2014 at 12:17 +0100:
> > > > > > On Tue, Mar 25, 2014 at 09:43:16AM -0700, John-Mark Gurney wrote:
> > > > > > > Willy Offermans wrote this message on Tue, Mar 25, 2014 at 11:39 +0100:
> > > > > > > > I'm not an expert in tcpdump. Can anyone make sense out of the messages?
> > > > > > > 
> > > > > > > If you dumped the contents, using -s 0 -X, and look at that last packet
> > > > > > > you should see 0d 0a 2e 0d 0a at the end..  which is CR/LF/./CR/LF.. If
> > > > > > > you don't see that, then for some reason sendmail/FreeBSD isn't telling
> > > > > > > the server that it's done sending which would prevent the receiving
> > > > > > > side from ack'ing the email causing the timeout...
> > > > > > 
> > > > > > I followed your suggestions. However I'm not able to distinguish the last
> > > > > > packet. Is there a way to find this with help of the Flags? The following
> > > > > > is the output of tcpdump -r /root/tmp/tcpdump -X | grep Flags
> > > > > > 
> > > > > > 11:57:56.539788 IP MyServer.com.41115 > Smarthost.com.smtp: Flags [S], seq 1001452351, win 65535, options [mss 1448,nop,wscale 6,sackOK,TS val 407239960 ecr 0], length 0
> > > > > > 11:57:56.555262 IP Smarthost.com.smtp > MyServer.com.41115: Flags [S.], seq 1277075046, ack 1001452352, win 8192, options [mss 1452], length 0
> > > > > 
> > > > > It should look something like:
> > > > > 09:18:34.723280 IP jmgmac.funkthat.com.64724 > h2.funkthat.com.ssh: Flags [.], ack 177, win 33280, options [nop,nop,TS val 1854905469 ecr 3482476972], length 0
> > > > >         0x0000:  4510 0034 d7ac 4000 4006 e1af c0a8 0003  E..4.._at_._at_.......
> > > > >         0x0010:  c0a8 0004 fcd4 0016 7e48 238e d872 43dc  ........~H#..rC.
> > > > >         0x0020:  8010 8200 7c08 0000 0101 080a 6e8f 9c7d  ....|.......n..}
> > > > >         0x0030:  cf92 61ac                                ..a.
> > > > > 
> > > > > Notice the hex output...  I didn't see any of that in your output...
> > > > > The last packet I was talking about is the last one that had length
> > > > > 1448 that your server sent...
> > > > 
> > > > I sent two e-mails consecutively: the first without an attachment, the
> > > > second with attachment. I dumped tcp of the NIC for port smtp. I got the
> > > > following:
> > > > 
> > > > 
> > > > 12:20:55.988622 IP MyServer.com.37191 > Smarthost.com.smtp: Flags [P.], seq 18943:19104, ack 412, win 65535, length 161
> > > >         0x0000:  4500 00c9 eebd 4000 4006 0000 c0a8 0004  E....._at_._at_.......
> > > >         0x0010:  d54b 3f0d 9147 0019 4ea0 36dd 15a7 38a0  .K?..G..N.6...8.
> > > >         0x0020:  5018 ffff 4481 0000 2020 2020 2020 2020  P...D...........
> > > >         0x0030:  2020 2020 2020 2020 2020 2020 2020 2020  ................
> > > >         0x0040:  2020 2020 2020 2020 2020 2020 2020 205c  ...............\
> > > >         0x0050:  2f20 205c 205e 0d0a 2020 2020 2020 2020  /..\.^..........
> > > >         0x0060:  2020 2020 2020 2020 2020 2020 2020 2020  ................
> > > >         0x0070:  2020 2020 2020 2020 2020 2020 2020 2020  ................
> > > >         0x0080:  2020 202e 5c2e 5f2f 5f29 0d0a 0d0a 2020  ....\._/_)......
> > > >         0x0090:  2020 2020 2020 2020 2020 2020 2020 2020  ................
> > > >         0x00a0:  2020 2020 2020 2020 2020 2020 2020 2020  ................
> > > >         0x00b0:  2020 2020 2077 7777 2e46 7265 6542 5344  .....www.FreeBSD
> > > >         0x00c0:  2e6f 7267 0d0a 2e0d 0a                   .org.....
> > > > 
> > > > As predicted by John-Mark, the first ended with "0d0a 2e0d 0a". However it
> > > > was not the last packet with length 1448. I hope that this will not spoil
> > > > the party. Is the Flag [P.] more indicative? It looks like to me, but I'm
> > > > just learning.
> > > > 
> > > > Anyway the second mail ended with:
> > > > 
> > > > 12:22:17.960896 IP MyServer.com.37191 > Smarthost.com.smtp: Flags [.], seq 35127:36575, ack 638, win 65535, length 1448
> > > >         0x0000:  4500 05d0 fe9d 4000 4006 0000 c0a8 0004  E....._at_._at_.......
> > > > <snip>
> > > > </snip>
> > > >         0x0560:  5670 6876 4a67 5a5a 5a50 4b2f 4b78 3774  VphvJgZZZPK/Kx7t
> > > >         0x0570:  382f 4230 594f 6b78 3449 0d0a 4a76 6551  8/B0YOkx4I..JveQ
> > > >         0x0580:  2b6e 7765 5647 2f33 6e79 6231 6133 496f  +nweVG/3nyb1a3Io
> > > >         0x0590:  5474 554f 4d61 4374 696b 714b 436b 4959  TtUOMaCtikqKCkIY
> > > >         0x05a0:  704a 7668 3055 416d 6c33 4754 4f4c 6455  pJvh0UAml3GTOLdU
> > > >         0x05b0:  774b 4145 7151 5741 7841 4141 5a66 7647  wKAEqQWAxAAAZfvG
> > > >         0x05c0:  706b 6c36 0d0a 7a4e 6234 745a 6633 5a6c  pkl6..zNb4tZf3Zl
> > > > 
> > > > Being packet with length 1448 and sent from my side. The code "0d0a 2e0d
> > > > 0a" is missing. Immediately thereafter the following packets:
> > > 
> > > We clearly haven't gotten the last mime-boundary, we are still in the
> > > base64 encoded data of the attachment...
> > > 
> > > > 12:22:18.003557 IP Smarthost.com.smtp > MyServer.com.37191: Flags [.], ack 36575, win 65160, length 0
> > > >         0x0000:  4500 0028 11fb 4000 7a06 19d0 d54b 3f0d  E..(.._at_.z....K?.
> > > >         0x0010:  c0a8 0004 0019 9147 15a7 3982 4ea0 7bbd  .......G..9.N.{.
> > > >         0x0020:  5010 fe88 315f 0000 0000 0000 0000       P...1_........
> > > 
> > > The remote acking that it got your last packet...
> > > 
> > > > 12:22:37.665889 IP MyServer.com.37191 > Smarthost.com.smtp: Flags [R.], seq 39471, ack 638, win 65535, length 0
> > > >         0x0000:  4500 0028 492c 4000 4006 0000 c0a8 0004  E..(I,_at_._at_.......
> > > >         0x0010:  d54b 3f0d 9147 0019 4ea0 870d 15a7 3982  .K?..G..N.....9.
> > > >         0x0020:  5014 ffff 2494 0000                      P...$...
> > > 
> > > Local host closing down the connection because of time out...
> > > 
> > > > 12:22:37.680857 IP Smarthost.com.smtp > MyServer.com.37191: Flags [.], ack 36575, win 65160, length 0
> > > >         0x0000:  4500 0028 0584 0000 f906 e746 d54b 3f0d  E..(.......F.K?.
> > > >         0x0010:  c0a8 0004 0019 9147 15a7 3982 4ea0 7bbd  .......G..9.N.{.
> > > >         0x0020:  5010 fe88 315f 0000 0000 0000 0000       P...1_........
> > > > 12:22:37.680920 IP MyServer.com.37191 > Smarthost.com.smtp: Flags [R], seq 1319140285, win 0, length 0
> > > >         0x0000:  4500 0028 4935 4000 4006 0000 c0a8 0004  E..(I5_at_._at_.......
> > > >         0x0010:  d54b 3f0d 9147 0019 4ea0 7bbd 0000 0000  .K?..G..N.{.....
> > > >         0x0020:  5004 0000 7f1d 0000                      P.......
> > > > 
> > > > It looks like Smarthost.com asks for more, but there is not more to sent.
> > > > The final packet seems to be absent. I cannot look for the closing remark
> > > > "www.FreeBSD.org", since there is the attachment at the end of the second
> > > > mail. Is there any connection between the encoded attachment in the second
> > > > mail and the output of tcpdump? 
> > > > 
> > > > Am I the only one noticing this error? I can hardly believe this.
> > > > 
> > > > Is there a way to force the insertion of <CRLF>.<CRLF>?
> > > 
> > > So, this is more looking like a kernel problem where the last packet(s)
> > > aren't being sent out...  Could you possibly catch the output of
> > > netstat -anfinet of the connection between the last packet and the
> > > reset?  It'll be interesting to see if there is data in the send-q
> > > for the connection (third column)...  If it's zero, that seems to imply
> > > that the server process hasn't sent all the data necessary...  Then
> > > the next bit of investigation would be to run ktrace on the sendmail
> > > process and make sure that it writes all the correct data to the
> > > socket...
> > > 
> > > Do you know what OS the remote side is running?  You could use nmap
> > > to try to figure it out, as it could be a TCP stack interaction issue..
> > 
> > netstat -anfinet gave:
> > 
> > tcp4       0  33304 MyServerIP.35395      SmarthostIP.25 ESTABLISHED
> > 
> > So there is still data to be sent, if I interpret correctly. What is next
> > to be investigated?
> 
> Yeh, if there is data in the send-q (per above) and you aren't seeing
> any more packets, someone on -net with some TCP clue should help you
> debug this...
> 
> Also, did you figure out what the OS is of the other end?  Knowing
> that will also help them...
> 
> For -net's reference, the opening syns look like:
> 11:57:56.539788 IP MyServer.com.41115 > Smarthost.com.smtp: Flags [S], seq 1001452351, win 65535, options [mss 1448,nop,wscale 6,sackOK,TS val 407239960 ecr 0], length 0
> 11:57:56.555262 IP Smarthost.com.smtp > MyServer.com.41115: Flags [S.], seq 1277075046, ack 1001452352, win 8192, options [mss 1452], length 0
> 
> -- 
>   John-Mark Gurney				Voice: +1 415 225 5579
> 
>      "All that I will do, has been done, All that I have, has not."
> _______________________________________________
> freebsd-current_at_freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"

I did some more investigations and I found the following:

1) The error depends on the size of the e-mail not the presence or absence
of an attachment. I put rfc3028.txt into a test mail and try to sent it. It
got stuck in the mail server with eventually the same Broken pipe/time out
error as previously discussed.

2) I noticed a substantial increase in CPU usage of natd and dhcpd when the 
e-mail is being processed by the mail server.

>From top:

 PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND

 1235 root          1  93    0 28908K  2144K RUN     0  54:05  71.78% natd
 1614 dhcpd         1   4    0 26784K 14936K RUN     0  29:24  38.77% dhcpd

The idle cpu load of these services is ca. 0.00%.

3) I experienced extremely slow copy processes with scp not related with
e-mail processing.

4) All other processes run smoothly to my notice and that makes it tricky 
to discover.

5) I tried to figure out the OS on the Smarthost:

Nmap scan report for Smarthost.com (SmarthostIP)
Host is up (0.023s latency).
Not shown: 994 closed ports
PORT     STATE    SERVICE
25/tcp   open     smtp
135/tcp  filtered msrpc
139/tcp  filtered netbios-ssn
445/tcp  filtered microsoft-ds
1720/tcp filtered H.323/Q.931
5555/tcp filtered freeciv

This looks like some kind of MS OS to me.

*)
So I get the feeling that the encountered problems are somehow 
hardware related or related to my settings. Can someone suggest a good
description of my problem?

-- 
Met vriendelijke groeten,
With kind regards,
Mit freundlichen Gruessen,
De jrus wah,

Wiel

*************************************
 W.K. Offermans
e-mail: Willy_at_Offermans.Rompen.nl

                                       Powered by ....

                                            (__)
                                         \\\'',)
                                           \/  \ ^
                                           .\._/_)

                                       www.FreeBSD.org
Received on Fri Mar 28 2014 - 10:13:17 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:48 UTC