Add Kubernetes scrape target to Prometheus instance that is NOT in Kubernetes

Solution 1:

If I understand your question correctly, you want to monitor a remote Kubernetes cluster with a Prometheus instance that is not installed in that cluster.

I monitor many different Kubernetes clusters from one Prometheus instance installed on a standalone server.

You can do this by generating a token in the Kubernetes cluster for a service account that has the proper permissions to access the Kubernetes API.

Kubernetes-api:

The following steps are required to configure the Prometheus scrape job.

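  1. Create a service account with permission to get, list, and watch the resources you want to discover (nodes for the job below, pods if you use the pod role).
  2. Generate a token for the service account.
  3. Create the scrape job as follows:

- job_name: kubernetes
  kubernetes_sd_configs:
  - role: node
    api_server: https://kubernetes-cluster-api.com
    # Token used for service discovery requests to the API server.
    # It belongs next to tls_config, not inside it.
    bearer_token: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    tls_config:
      insecure_skip_verify: true
  # Token and TLS settings used for the scrape requests themselves
  bearer_token: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
  scheme: https
  tls_config:
    insecure_skip_verify: true
  relabel_configs:
  # Copy the Kubernetes node labels onto the target, dropping the prefix
  - separator: ;
    regex: __meta_kubernetes_node_label_(.+)
    replacement: $1
    action: labelmap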
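
Steps 1 and 2 could look roughly like the manifest below. This is only a sketch: the names prometheus-external and prometheus-external-token and the monitoring namespace are placeholders I made up, and the resource list is an assumption about what a node-level job typically needs, so trim or extend it for your setup.

```yaml
# Hypothetical names throughout; adjust to your cluster.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-external
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-external
rules:
# Read-only access for service discovery and kubelet metrics
- apiGroups: [""]
  resources: ["nodes", "nodes/metrics", "pods", "services", "endpoints"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-external
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-external
subjects:
- kind: ServiceAccount
  name: prometheus-external
  namespace: monitoring
---
# Long-lived token for the service account; Kubernetes fills in
# the token once the Secret exists.
apiVersion: v1
kind: Secret
metadata:
  name: prometheus-external-token
  namespace: monitoring
  annotations:
    kubernetes.io/service-account.name: prometheus-external
type: kubernetes.io/service-account-token
```

After applying it, the token is stored base64-encoded in the Secret's data.token field; decode it and paste it into the bearer_token fields of the scrape job.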

I have explained this in more detail in the article "Monitor remote kubernetes cluster using prometheus": https://amjadhussain3751.medium.com/monitor-remote-kubernetes-cluster-using-prometheus-a3781b041745

Solution 2:

In my opinion, deploying a Prometheus instance in each cluster is simpler and cleaner than setting up external access. The main problem is that the targets discovered with kubernetes_sd_configs are cluster-internal DNS names and IP addresses (at least, that is the case in my AWS EKS cluster). To resolve and reach them, you have to be inside the cluster.

This problem can be solved with a proxy, so the configuration below uses the API server's proxy endpoint to reach the targets. I'm not sure how well it performs in large clusters, but in that case it is well worth deploying an internal Prometheus instance.

External access through API-server proxy

Things you need (for each cluster):

  1. API-server CA certificate for HTTPS to work (see below how to get it).
  2. Service account token with appropriate permissions (depends on your needs).
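
As a rough sketch of what "appropriate permissions" might mean for the node-proxy job in this answer: requests to /api/v1/nodes/[node]/proxy/... are authorized against the nodes/proxy subresource, and node discovery needs list and watch on nodes. The role name prometheus-apiserver-proxy is hypothetical, and the role still has to be bound to your service account with a ClusterRoleBinding.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-apiserver-proxy  # hypothetical name
rules:
# Discover nodes through the API server
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list", "watch"]
# Allow scrape requests to be proxied to the kubelet
- apiGroups: [""]
  resources: ["nodes/proxy"]
  verbs: ["get"]
```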

Assuming you already have these, here is an example Prometheus configuration:

- job_name: 'kubelet-cadvisor'
  scheme: https

  kubernetes_sd_configs:
  - role: node
    api_server: https://api-server.example.com

    # TLS and auth settings to perform service discovery
    authorization:
      credentials_file: /kube/token  # the file with your service account token
    tls_config:
      ca_file: /kube/CA.crt  # the file with the CA certificate

  # The same as above but for actual scrape request.
  # We're going to send scrape requests back to the API-server
  # so the credentials are the same.
  authorization:
    credentials_file: /kube/token
  tls_config:
    ca_file: /kube/CA.crt

  relabel_configs:
  # This is just to drop this long __meta_kubernetes_node_label_ prefix
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)

  # By default Prometheus goes to /metrics endpoint.
  # This relabeling changes it to /api/v1/nodes/[kubernetes_io_hostname]/proxy/metrics/cadvisor
  - source_labels: [kubernetes_io_hostname]
    replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
    target_label: __metrics_path__

  # This relabeling defines that Prometheus should connect to the
  # API-server instead of the actual instance. Together with the relabeling
  # from above this will make the scrape request proxied to the node kubelet.
  - replacement: api-server.example.com
    target_label: __address__

The above is tailored for scraping role: node. To make it work with other roles, you have to change the __metrics_path__ label. The "Manually constructing apiserver proxy URLs" article can help with constructing the path.
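
For example, an adaptation for role: pod might look like the sketch below. It is untested and rests on a few assumptions: the pods serve plain-HTTP metrics at /metrics on a declared container port, the job name is made up, and the service account is allowed to get pods/proxy in addition to listing and watching pods.

```yaml
- job_name: 'pods-via-apiserver-proxy'  # hypothetical job name
  scheme: https

  kubernetes_sd_configs:
  - role: pod
    api_server: https://api-server.example.com
    authorization:
      credentials_file: /kube/token
    tls_config:
      ca_file: /kube/CA.crt

  authorization:
    credentials_file: /kube/token
  tls_config:
    ca_file: /kube/CA.crt

  relabel_configs:
  # Skip targets without a declared container port, otherwise the
  # path below cannot be built.
  - action: keep
    source_labels: [__meta_kubernetes_pod_container_port_number]
    regex: .+

  # Build /api/v1/namespaces/[namespace]/pods/[pod]:[port]/proxy/metrics
  - source_labels:
    - __meta_kubernetes_namespace
    - __meta_kubernetes_pod_name
    - __meta_kubernetes_pod_container_port_number
    regex: (.+);(.+);(.+)
    replacement: /api/v1/namespaces/$1/pods/$2:$3/proxy/metrics
    target_label: __metrics_path__

  # Send the request to the API server instead of the pod IP
  - replacement: api-server.example.com
    target_label: __address__
```

The three source labels are joined with the default ";" separator, which is why the regex captures three groups separated by semicolons.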

How to get API-server CA certificate

There are several ways to get it, but taking it from kubeconfig seems the simplest to me:

❯ kubectl config view --raw
apiVersion: v1
clusters:
- cluster:                      # you need this ⤋ long value 
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJ...
    server: https://api-server.example.com
  name: default
...

The certificate in kubeconfig is base64-encoded, so you have to decode it before it can be used:

echo LS0tLS1CRUdJTiBDRVJUSUZJ... | base64 -d > CA.crt