How to split a PEM file

Solution 1:

The awk snippet works for extracting the different parts, but you still need to know which section is the key / cert / chain. I needed to extract a specific section, and found this on the OpenSSL mailinglist: http://openssl.6102.n7.nabble.com/Convert-pem-to-crt-and-key-files-tp47681p47697.html

# Extract key
openssl pkey -in foo.pem -out foo-key.pem

# Extract all the certs
openssl crl2pkcs7 -nocrl -certfile foo.pem |
  openssl pkcs7 -print_certs -out foo-certs.pem

# Extract the textually first cert as DER
openssl x509 -in foo.pem -outform DER -out first-cert.der

Solution 2:

The split command is available on most systems, and its invocation is likely easier to remember.

If you have a file collection.pem that you want to split into individual-* files, use:

split -p "-----BEGIN CERTIFICATE-----" collection.pem individual-

If you don't have split, you could try csplit:

csplit -s -z -f individual- collection.pem '/-----BEGIN CERTIFICATE-----/' '{*}'

-s skips printing output of file sizes

-z does not create empty files

Solution 3:

This was previously answered on StackOverflow :

awk '
  split_after == 1 {n++;split_after=0}
  /-----END CERTIFICATE-----/ {split_after=1}
  {print > "cert" n ".pem"}' < $file

Edit 29/03/2016 : See @slugchewer answer

Solution 4:

If you want to get a single certificate out of a multi-certificate PEM bundle, try:

$ openssl crl2pkcs7 -nocrl -certfile INPUT.PEM | \
    openssl pkcs7 -print_certs | \
    awk '/subject.*CN=host.domain.com/,/END CERTIFICATE/'

The first two openssl commands will process a PEM file and and spit it back out with pre-pended "subject:" and "issuer:" lines before each cert. If your PEM is already formatted this way, all you need is the final awk command.
The awk command will spit out the individual PEM matching the CN (common name) string.

source1 , source2

Solution 5:

If you are handling full chain certificates (i.e. the ones generated by letsencrypt / certbot etc), which are a concatenation of the certificate and the certificate authority chain, you can use bash string manipulation.

For example:

# content of /path/to/fullchain.pem
-----BEGIN CERTIFICATE-----
some long base64 string containing
the certificate
-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----
another base64 string
containing the first certificate
in the authority chain
-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----
another base64 string
containing the second certificate
in the authority chain
(there might be more...)
-----END CERTIFICATE-----

To extract the certificate and the certificate authority chain into variables:

# load the certificate into a variable
FULLCHAIN=$(</path/to/fullchain.pem)
CERTIFICATE="${FULLCHAIN%%-----END CERTIFICATE-----*}-----END CERTIFICATE-----"
CHAIN=$(echo -e "${FULLCHAIN#*-----END CERTIFICATE-----}" | sed '/./,$!d')

Explanation:

Instead of using awk or openssl (which are powerful tools but not always available, i.e. in Docker Alpine images), you can use bash string manipulation.

"${FULLCHAIN%%-----END CERTIFICATE-----*}-----END CERTIFICATE-----": from end of the content of FULLCHAIN, return the longest substring match, then concat -----END CERTIFICATE----- as it get stripped away. The * matches all the characters after -----END CERTIFICATE-----.

$(echo -e "${FULLCHAIN#*-----END CERTIFICATE-----}" | sed '/./,$!d'): from the beginning of the content of FULLCHAIN, return the shortest substring match, then strip leading new lines. Likewise, the * matches all the characters before -----END CERTIFICATE-----.

For a quick reference (while you can find more about string manipulation in bash here):

${VAR#substring}= the shortest substring from the beginning of the content of VAR

${VAR%substring}= the shortest substring from the end of the content of VAR

${VAR##substring}= the longest substring from the beginning of the content of VAR

${VAR%%substring}= the longest substring from the end of the content of VAR