New server unable to fetch configuration from Puppetmaster due to some ssl error

Three machines in the production environment had some hardware issues and were decommissioned. The infrastructure team has reinstalled them and gave them the same hostnames and IP addresses. The aim is to run Puppet on these systems so these can be commissioned again.


Attempt

1) The old Puppet certificates were removed from the Puppetmaster by issuing the following commands:

puppet cert revoke grb16.company.com
puppet cert clean grb16.company.com

2) Once the old certificate was removed, a new certificate request was created by issuing the following command from one of the reinstalled nodes:

[root@grb16 ~]# puppet agent -t
Info: csr_attributes file loading from /etc/puppet/csr_attributes.yaml
Info: Creating a new SSL certificate request for grb16.company.com
Info: Certificate Request fingerprint (SHA256): 6F:2D:1D:71:67:18:99:86:2C:22:A1:14:80:55:34:35:FD:20:88:1F:36:ED:A7:7B:2A:12:09:4D:F8:EC:BF:6D
Exiting; no certificate found and waitforcert is disabled
[root@grb16 ~]#

3) Once the certificate request was visible on the Puppetmaster, the following command was issued to sign the certificate request:

[root@foreman ~]# puppet cert sign grb16.company.com
Notice: Signed certificate request for grb16.company.com
Notice: Removing file Puppet::SSL::CertificateRequest grb16.company.com at '/var/lib/puppet/ssl/ca/requests/grb16.company.com.pem'
[root@foreman ~]# 

Problem

Once the certificate request has been signed and a Puppet run has been started the following error is thrown:

[root@grb16 ~]# puppet agent -t
Info: Caching certificate for grb16.company.com
Error: Could not request certificate: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [CRL is not yet valid for /CN=Puppet CA: foreman.company.com]
Exiting; failed to retrieve certificate and waitforcert is disabled
[root@grb16 ~]# 

Running Puppet for the second time results in:

[root@grb16 ~]# puppet agent -t
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [CRL is not yet valid for /CN=Puppet CA: foreman.company.com]
Info: Retrieving pluginfacts
Error: /File[/var/lib/puppet/facts.d]: Failed to generate additional resources using 'eval_generate': SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [CRL is not yet valid for /CN=Puppet CA: foreman.company.com]
Error: /File[/var/lib/puppet/facts.d]: Could not evaluate: Could not retrieve file metadata for puppet://foreman.company.com/pluginfacts: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [CRL is not yet valid for /CN=Puppet CA: foreman.company.com]
Wrapped exception:
SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [CRL is not yet valid for /CN=Puppet CA: foreman.company.com]
Info: Retrieving plugin
Error: /File[/var/lib/puppet/lib]: Failed to generate additional resources using 'eval_generate': SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [CRL is not yet valid for /CN=Puppet CA: foreman.company.com]
Error: /File[/var/lib/puppet/lib]: Could not evaluate: Could not retrieve file metadata for puppet://foreman.company.com/plugins: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [CRL is not yet valid for /CN=Puppet CA: foreman.company.com]
Wrapped exception:
SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [CRL is not yet valid for /CN=Puppet CA: foreman.company.com]
Error: Could not retrieve catalog from remote server: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [CRL is not yet valid for /CN=Puppet CA: foreman.company.com]
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Error: Could not send report: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [CRL is not yet valid for /CN=Puppet CA: foreman.company.com]
[root@grb16 ~]# 

Analysis

In order to solve the issue, the error message were investigated and it looks like that the problem is SSL or Puppet related. Perhaps one of these packages has been installed incorrectly or a wrong version has been installed on the reinstalled node.

Puppet

[root@grb16 ~]# yum list installed |grep puppet
facter.x86_64          1:2.3.0-1.el6    @puppetlabs_6_products                  
hiera.noarch           1.3.4-1.el6      @puppetlabs_6_products                  
puppet.noarch          3.7.3-1.el6      @puppetlabs_6_products                  
puppetlabs-release.noarch
                       6-11             @puppetlabs_6_products                  
ruby-augeas.x86_64     0.4.1-3.el6      @puppetlabs_6_deps                      
ruby-shadow.x86_64     1:2.2.0-2.el6    @puppetlabs_6_deps                      
rubygem-json.x86_64    1.5.5-3.el6      @puppetlabs_6_deps  

SSL

[root@grb16 ~]# yum list installed |grep ssl
nss_compat_ossl.x86_64 0.9.6-1.el6      @anaconda-CentOS-201410241409.x86_64/6.6
openssl.x86_64         1.0.1e-30.el6_6.4
openssl-devel.x86_64   1.0.1e-30.el6_6.4
[root@grb16 ~]# 

No discrepancies were found between the SSL and Puppet packages that are installed on various servers. The systems that have not been decommissioned or reinstalled are still able to run Puppet. The issue is restricted to the reinstalled server. Note that Puppet has not been run on the other two reinstalled servers. What is causing this issue and how to solve it?


Solution 1:

Concise answer

The issue CRL is not yet valid for indicates that the time between the Puppet-agent and the Puppetmaster is out of sync. Sync the time (NTP). Remove the certificate from the Puppet-agent and Puppetmaster as well and run Puppet on the agent.


Comprehensive answer

CRL is not yet valid for resides in the following snippet.

The following test code snippet describes what causes the issue:

it 'includes the CRL issuer in the verify error message' do
  crl = OpenSSL::X509::CRL.new
  crl.issuer = OpenSSL::X509::Name.new([['CN','Puppet CA: puppetmaster.example.com']])
  crl.last_update = Time.now + 24 * 60 * 60
  ssl_context.stubs(:current_crl).returns(crl)

  subject.call(false, ssl_context)
  expect(subject.verify_errors).to eq(["CRL is not yet valid for /CN=Puppet CA: puppetmaster.example.com"])
end

ssl_context

let(:ssl_context) do
  mock('OpenSSL::X509::StoreContext')
end

subject

subject do
  described_class.new(ssl_configuration,
  ssl_host)
end

The code includes snippets from the OpenSSL::X509::CRL class.

issuer=(p1)

               static VALUE
ossl_x509crl_set_issuer(VALUE self, VALUE issuer)
{
    X509_CRL *crl;

    GetX509CRL(self, crl);

    if (!X509_CRL_set_issuer_name(crl, GetX509NamePtr(issuer))) { /* DUPs name */
        ossl_raise(eX509CRLError, NULL);
    }
    return issuer;
}

last_update=(p1)

               static VALUE
ossl_x509crl_set_last_update(VALUE self, VALUE time)
{
    X509_CRL *crl;
    time_t sec;

    sec = time_to_time_t(time);
    GetX509CRL(self, crl);
    if (!X509_time_adj(crl->crl->lastUpdate, 0, &sec)) {
        ossl_raise(eX509CRLError, NULL);
    }

    return time;
}

The last_updated time will be the current time plus an additional day and will be passed to the subject function that calls the call function that resides in the default_validator class.

class Puppet::SSL::Validator::DefaultValidator #< class Puppet::SSL::Validator
  attr_reader :peer_certs
  attr_reader :verify_errors
  attr_reader :ssl_configuration

  FIVE_MINUTES_AS_SECONDS = 5 * 60

  def initialize(
    ssl_configuration = Puppet::SSL::Configuration.new(
    Puppet[:localcacert], {
      :ca_auth_file => Puppet[:ssl_client_ca_auth]
    }),

    ssl_host = Puppet::SSL::Host.localhost)
    reset!
    @ssl_configuration = ssl_configuration
    @ssl_host = ssl_host
  end

  def call(preverify_ok, store_context)
    if preverify_ok
      ...
    else
      ...
      crl = store_context.current_crl
      if crl
        if crl.last_update && crl.last_update < Time.now + FIVE_MINUTES_AS_SECONDS
          ...
        else
          @verify_errors << "#{error_string} for #{crl.issuer}"
        end
        ...
      end
    end
  end

If preverify_ok is false the else clause is applicable. As if crl.last_update && crl.last_update < Time.now + FIVE_MINUTES_AS_SECONDS results in false because the time has been stubbed with an additional day the else statement will be applicable. The evaluation of @verify_errors << "#{error_string} for #{crl.issuer}" results in CRL is not yet valid for /CN=Puppet CA: puppetmaster.example.com.

In order to solve the issue:

  1. Sync the time between the Puppet-agent and the Puppetmaster. Does the NTP server run (well) on both nodes?
  2. Remove or rename the complete ssl folder (/var/lib/puppet/ssl) from the agent.
  3. Revoke the cert from the master by issuing sudo puppet cert clean <fqdn-puppet-agent>
  4. Sign the cert if autosign is disabled
  5. Run puppet on the agent

In conclusion, the time on Puppet-agents and Puppetmaster should be synced all the time. Exceeding the maximum allowed deviation of 5 minutes will cause the issue.

Solution 2:

Ran into the same issue.

Our puppet setup is version controlled using GitHub, so every time we provision a new puppetmaster, we run into cert issues. Normally puppet ca --clean --all works, but we have found the following more reliable:

rm -rf $(puppet master --configprint ssldir)