Postfix behind AWS NLB with Proxy Protocol does not send banner until CRLF is sent
I've redeployed my mail stack as a Kubernetes pod. The pod runs on an EKS cluster in a private subnet, behind an NLB. Postfix and the NLB are both configured to speak proxy protocol v2.
Originally I ran this setup without proxy protocol, and the Postfix ports responded as expected, sending the banner immediately on connect. However, Postfix could not correctly identify the remote server sending mail to it, and it marked everything as spam, so I decided to go the proxy protocol route.
When connecting via telnet, the connection opens, but Postfix does not send its banner. The banner is not sent until a CRLF is sent (the Enter key is pressed); you can type any other character and nothing happens until the CRLF. This affects the submission port (587) and breaks client connections, because SMTP requires the receiving server to speak first.
Initial connection:
❯ telnet mx01.example.com 587
Trying x.x.x.x...
Connected to mx01.example.com.
Escape character is '^]'.
After CRLF is sent:
❯ telnet mx01.example.com 587
Trying x.x.x.x...
Connected to mx01.example.com.
Escape character is '^]'.
220 mx01.example.com ESMTP Postfix (Ubuntu)
500 5.5.2 Error: bad syntax
And this is without the Proxy Protocol configuration:
❯ telnet mx01.example.com 587
Trying x.x.x.x...
Connected to mx01.example.com.
Escape character is '^]'.
220 mx01.example.com ESMTP Postfix (Ubuntu)
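The three transcripts above can also be checked with a small probe that connects, sends nothing, and reports whether a banner arrives before a timeout. This is only a rough sketch: `wait_for_banner` and `start_fake_server` are hypothetical helper names, and the fake server is a local stand-in that imitates the two behaviours seen above, not Postfix or the NLB.

```python
import socket
import threading

BANNER = b"220 test.invalid ESMTP\r\n"  # made-up banner for the stand-in server


def wait_for_banner(host, port, timeout=3.0):
    """Connect, send nothing, and return the first bytes received (None on timeout)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            data = sock.recv(512)
        except socket.timeout:
            return None  # server stayed silent, as in the "Initial connection" case
        return data or None


def start_fake_server(speak_first):
    """Hypothetical local stand-in; returns the ephemeral port it listens on."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]

    def serve():
        try:
            conn, _ = srv.accept()
            with conn:
                if not speak_first:
                    # Imitate the broken case: stay silent until the client sends data.
                    conn.settimeout(5.0)
                    conn.recv(512)
                conn.sendall(BANNER)
        except OSError:
            pass  # client went away or timed out; nothing more to do
        finally:
            srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return port
```

Against the real server this would be called as `wait_for_banner("mx01.example.com", 587)`; a `None` result corresponds to the silent connection shown in the first transcript.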
Versions:
OS: Ubuntu 20.10
Postfix version: 3.5.6-1
Postfix master.cf
#
# Postfix master process configuration file. For details on the format
# of the file, see the master(5) manual page (command: "man 5 master" or
# on-line: http://www.postfix.org/master.5.html).
#
# Do not forget to execute "postfix reload" after editing this file.
#
# ==========================================================================
# service type private unpriv chroot wakeup maxproc command + args
# (yes) (yes) (no) (never) (100)
# ==========================================================================
smtp inet n - y - 1 postscreen
smtpd pass - - y - - smtpd
dnsblog unix - - y - 0 dnsblog
tlsproxy unix - - y - 0 tlsproxy
submission inet n - - - - smtpd
  -o syslog_name=postfix/submission
  -o smtpd_tls_security_level=encrypt
  -o smtpd_client_restrictions=permit_sasl_authenticated,reject
  -o cleanup_service_name=header_cleanup
  -o smtpd_upstream_proxy_protocol=haproxy
#smtps inet n - y - - smtpd
# -o syslog_name=postfix/smtps
# -o smtpd_tls_wrappermode=yes
# -o smtpd_sasl_auth_enable=yes
# -o smtpd_reject_unlisted_recipient=no
# -o smtpd_client_restrictions=$mua_client_restrictions
# -o smtpd_helo_restrictions=$mua_helo_restrictions
# -o smtpd_sender_restrictions=$mua_sender_restrictions
# -o smtpd_recipient_restrictions=
# -o smtpd_relay_restrictions=permit_sasl_authenticated,reject
# -o milter_macro_daemon_name=ORIGINATING
#628 inet n - y - - qmqpd
pickup unix n - y 60 1 pickup
cleanup unix n - y - 0 cleanup
header_cleanup unix n - - - 0 cleanup
  -o header_checks=regexp:/etc/postfix/submission_header_cleanup.cf
qmgr unix n - n 300 1 qmgr
#qmgr unix n - n 300 1 oqmgr
tlsmgr unix - - y 1000? 1 tlsmgr
rewrite unix - - y - - trivial-rewrite
bounce unix - - y - 0 bounce
defer unix - - y - 0 bounce
trace unix - - y - 0 bounce
verify unix - - y - 1 verify
flush unix n - y 1000? 0 flush
proxymap unix - - n - - proxymap
proxywrite unix - - n - 1 proxymap
smtp unix - - y - - smtp
relay unix - - y - - smtp
  -o syslog_name=postfix/$service_name
# -o smtp_helo_timeout=5 -o smtp_connect_timeout=5
showq unix n - y - - showq
error unix - - y - - error
retry unix - - y - - error
discard unix - - y - - discard
local unix - n n - - local
virtual unix - n n - - virtual
lmtp unix - - y - - lmtp
anvil unix - - y - 1 anvil
scache unix - - y - 1 scache
postlog unix-dgram n - n - 1 postlogd
#
# ====================================================================
# Interfaces to non-Postfix software. Be sure to examine the manual
# pages of the non-Postfix software to find out what options it wants.
#
# Many of the following services use the Postfix pipe(8) delivery
# agent. See the pipe(8) man page for information about ${recipient}
# and other message envelope options.
# ====================================================================
#
# maildrop. See the Postfix MAILDROP_README file for details.
# Also specify in main.cf: maildrop_destination_recipient_limit=1
#
maildrop unix - n n - - pipe
  flags=DRhu user=vmail argv=/usr/bin/maildrop -d ${recipient}
#
# ====================================================================
#
# See the Postfix UUCP_README file for configuration details.
#
uucp unix - n n - - pipe
  flags=Fqhu user=uucp argv=uux -r -n -z -a$sender - $nexthop!rmail ($recipient)
#
# Other external delivery methods.
#
ifmail unix - n n - - pipe
  flags=F user=ftn argv=/usr/lib/ifmail/ifmail -r $nexthop ($recipient)
bsmtp unix - n n - - pipe
  flags=Fq. user=bsmtp argv=/usr/lib/bsmtp/bsmtp -t$nexthop -f$sender $recipient
scalemail-backend unix - n n - 2 pipe
  flags=R user=scalemail argv=/usr/lib/scalemail/bin/scalemail-store ${nexthop} ${user} ${extension}
mailman unix - n n - - pipe
  flags=FR user=list argv=/usr/lib/mailman/bin/postfix-to-mailman.py
  ${nexthop} ${user}
I have run into this issue too, not with Postfix but with SSH (OpenSSH < 6.2). I found this Stack Overflow post, which matched the problem I had: https://stackoverflow.com/questions/66770798/awss-proxy-protocol-v2-breaking-application-due-to-absence-of-psh-flag
The answer posted there by dade describes a feature flag that you can set on the target group, and setting it fixed the issue on my side. I'm fairly sure it is related to the behaviour you're seeing too.
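For reference, here is a rough sketch of how setting that flag might look. The attribute key and value below are my recollection of the linked answer, not something confirmed in this thread, so treat them as assumptions and check them against that answer before applying anything. `build_modify_command` is a hypothetical helper that only builds the AWS CLI command; it does not run it.

```python
import shlex

# Assumption: key and value as recalled from the linked Stack Overflow answer.
HEADER_PLACEMENT_KEY = "proxy_protocol_v2.client_to_server.header_placement"
HEADER_PLACEMENT_VALUE = "on_first_ack"


def build_modify_command(target_group_arn):
    """Return the argv for `aws elbv2 modify-target-group-attributes` (not executed)."""
    return [
        "aws", "elbv2", "modify-target-group-attributes",
        "--target-group-arn", target_group_arn,
        "--attributes",
        f"Key={HEADER_PLACEMENT_KEY},Value={HEADER_PLACEMENT_VALUE}",
    ]


# Print the command for review; the ARN is a placeholder to fill in.
print(shlex.join(build_modify_command("<your-target-group-arn>")))
```

Printing the command rather than executing it leaves room to review it and substitute the real target group ARN first.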