How to diagnose/fix CloudFormation/autoscaling SSL errors on file download
I have an autoscaling group that was created by AWS CloudFormation. It runs on Amazon Linux 2. Last week, it was working fine. Now, new instances throw a "certificate has expired" error when trying to download phpMyAdmin. It appears cfn-init is running yum fine and grabbing files from S3 fine, but failing when it goes to grab remote files (which it appears to do with Python).
EDIT: see below, but I initially thought this was some problem with the Python scripts. I'm no longer sure that's true, because it works fine in a different autoscaling group grabbing different files. So ... now I'm thinking it's actually some problem with the certificate on phpMyAdmin, but ... I'm not sure. And I still don't know how to diagnose further.
/var/log/cfn-init.log:
2021-10-06 10:08:24,895 [INFO] -----------------------Starting build-----------------------
2021-10-06 10:08:24,896 [DEBUG] Not setting a reboot trigger as scheduling support is not available
2021-10-06 10:08:24,898 [INFO] Running configSets: default
2021-10-06 10:08:24,899 [INFO] Running configSet default
2021-10-06 10:08:24,906 [INFO] Running config config
2021-10-06 10:08:34,173 [DEBUG] Installing/updating ['mod_ssl', 'ksh', 'httpd'] via yum
2021-10-06 10:08:37,523 [INFO] Yum installed ['mod_ssl', 'ksh', 'httpd']
2021-10-06 10:08:37,523 [DEBUG] No groups specified
2021-10-06 10:08:37,523 [DEBUG] No users specified
2021-10-06 10:08:37,523 [DEBUG] No sources specified
... [successfully installs some files from S3 ... ]
2021-10-06 10:08:37,621 [DEBUG] Writing content to /tmp/phpmyadmin.tgz
2021-10-06 10:08:37,621 [DEBUG] Retrieving contents from https://www.phpmyadmin.net/downloads/phpMyAdmin-latest-all-languages.tar.gz
2021-10-06 10:08:37,640 [ERROR] SSLError
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/packages/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
    body=body, headers=headers)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/packages/requests/packages/urllib3/connectionpool.py", line 341, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/packages/requests/packages/urllib3/connectionpool.py", line 762, in _validate_conn
    conn.connect()
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/packages/requests/packages/urllib3/connection.py", line 238, in connect
    ssl_version=resolved_ssl_version)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/packages/requests/packages/urllib3/util/ssl_.py", line 265, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib64/python3.7/ssl.py", line 423, in wrap_socket
    session=session
  File "/usr/lib64/python3.7/ssl.py", line 870, in _create
    self.do_handshake()
  File "/usr/lib64/python3.7/ssl.py", line 1139, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1091)
...
But if I log into the machine, downloading with wget works just fine (EDIT: no longer true), so I don't think it's a problem on the remote site; I think it's something with cfn-init or the associated Python scripts.
EDIT: From a different autoscaling group, also created by CloudFormation, the remote downloads work:
...
2021-10-07 04:21:10,855 [DEBUG] Writing content to /tmp/mysql80.rpm
2021-10-07 04:21:10,855 [DEBUG] Retrieving contents from https://dev.mysql.com/get/mysql80-community-release-el7-2.noarch.rpm
2021-10-07 04:21:11,425 [DEBUG] Setting mode for /tmp/mysql80.rpm to 000600
2021-10-07 04:21:11,425 [DEBUG] Setting owner 0 and group 0 for /tmp/mysql80.rpm
...
So, it's successfully downloading from https://dev.mysql.com/, but failing with an "expired certificate" from https://www.phpmyadmin.net, even though wget and going directly there through a browser don't show any errors.
end EDIT
Any idea on how to debug this?
Solution 1:
Short answer:
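Use openssl to check whether the site's certificate chain contains an expired certificate (redirecting stdin from /dev/null makes s_client exit after the handshake, and 2>&1 includes the verification messages written to stderr):
openssl s_client -showcerts -connect <failingwebsite.com>:443 </dev/null 2>&1 | grep "certificate has expired"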
Get the new, unexpired certificate and append it to the CA bundle (cacert.pem) that ships with the AWS cfnbootstrap package:
cat <newcert>.pem >> /path/to/your/python/env/lib/python3.9/site-packages/cfnbootstrap/packages/requests/cacert.pem
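The cat command above appends the certificate every time it runs. If the step is scripted, a guarded append may be preferable. The following is only a minimal sketch in Python: the helper names (find_cfn_bundle, pem_certs_as_der, append_cert_if_missing) are made up for illustration, and the bundle location is assumed from the paths shown in the traceback and in the command above.

```python
import importlib.util
import os
import ssl

PEM_BEGIN = "-----BEGIN CERTIFICATE-----"
PEM_END = "-----END CERTIFICATE-----"


def find_cfn_bundle():
    """Return the path of cfnbootstrap's bundled cacert.pem, or None if not found.

    Assumes the layout seen in the traceback:
    <site-packages>/cfnbootstrap/packages/requests/cacert.pem
    """
    spec = importlib.util.find_spec("cfnbootstrap")
    if spec is None or not spec.submodule_search_locations:
        return None
    for pkg_dir in spec.submodule_search_locations:
        candidate = os.path.join(pkg_dir, "packages", "requests", "cacert.pem")
        if os.path.isfile(candidate):
            return candidate
    return None


def pem_certs_as_der(text):
    """Extract every PEM certificate in text and return them as a set of DER bytes."""
    certs = set()
    start = text.find(PEM_BEGIN)
    while start != -1:
        end = text.find(PEM_END, start)
        if end == -1:
            break
        block = text[start:end + len(PEM_END)]
        certs.add(ssl.PEM_cert_to_DER_cert(block))
        start = text.find(PEM_BEGIN, end)
    return certs


def append_cert_if_missing(bundle_path, cert_path):
    """Append the certificate(s) in cert_path to bundle_path unless all are present.

    Returns True if the bundle was modified, False if nothing had to be done.
    """
    with open(cert_path, encoding="utf-8", errors="replace") as fh:
        new_text = fh.read()
    new_certs = pem_certs_as_der(new_text)
    if not new_certs:
        raise ValueError("no PEM certificate found in " + cert_path)
    with open(bundle_path, encoding="utf-8", errors="replace") as fh:
        existing = pem_certs_as_der(fh.read())
    if new_certs <= existing:
        return False
    with open(bundle_path, "a", encoding="utf-8") as fh:
        fh.write("\n" + new_text.strip() + "\n")
    return True
```

On an affected instance, something like append_cert_if_missing(find_cfn_bundle(), "isrgrootx1.pem") would do the append once and then become a no-op (the file name is a placeholder for wherever you saved the certificate, and find_cfn_bundle() returns None if the package is not importable from the interpreter you run it with).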
Long answer:
I ended up writing some Python scripts to try to download the server-side certificate, to see if something was going awry with the download. I never did get the certificates downloaded directly with urllib3, but while chasing that down on StackOverflow, @clockwatcher discovered the actual problem, and it seems likely it will affect others, and not just with phpMyAdmin.
It has to do with the expiration of the DST Root CA X3 certificate, which happened on September 30th, 2021. The CA bundle that AWS ships with cfnbootstrap does not include the new root certificate, so you have to add it manually.
You can grab the self-signed ISRG Root X1 pem file from Let's Encrypt, and then append it to cfnbootstrap's cacert.pem file, as described above.
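To confirm which trust store is at fault (and that the append worked), it can help to attempt the same handshake against different CA bundles. This is a minimal sketch, not what cfn-init itself does; probe_tls is a made-up helper, and the bundle path in the usage note is assumed from the traceback.

```python
import socket
import ssl


def probe_tls(host, port=443, cafile=None, timeout=10.0):
    """Attempt a verified TLS handshake and return a short status string.

    With cafile=None the system default trust store is used; otherwise
    only the certificates in the given PEM bundle are trusted.
    """
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "ok"
    except ssl.SSLCertVerificationError as exc:
        # verify_message carries the reason, e.g. an expired certificate.
        return "verify failed: " + exc.verify_message
    except OSError as exc:
        return "connection failed: " + str(exc)
```

Calling probe_tls("www.phpmyadmin.net") and then probe_tls("www.phpmyadmin.net", cafile="/usr/lib/python3.7/site-packages/cfnbootstrap/packages/requests/cacert.pem") on the instance should show whether the system store and the cfnbootstrap bundle disagree. If they do, that would explain why one tool succeeds where cfn-init fails.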
Now, this solves the problem in the case where you have a running machine; however, it's still an issue for autoscaling groups, as there's no easy way to use the
AWS::CloudFormation::Init:
  config:
    files:
      /path/to/file:
        source: https://somewhere.using.letsencrypt.com/file
behavior. So anywhere in your CloudFormation templates that uses that pattern will need to be rewritten so that both the file you actually want and the Let's Encrypt certificate are downloaded with wget as part of the commands: section. For example:
01_get_phpmyadmin:
  command: !Join ["",
    ["wget -O phpmyadmin.tgz ",
     "https://www.phpmyadmin.net/downloads/",
     "phpMyAdmin-latest-all-languages.tar.gz"]]
  cwd: /tmp