I'm following the "Getting started with Endpoints for GKE with ESPv2" tutorial. I'm using Workload Identity Federation and Autopilot on the GKE cluster.

I've been running into the error:

F0110 03:46:24.304229 8 server.go:54] fail to initialize config manager: http call to GET https://servicemanagement.googleapis.com/v1/services/name:bookstore.endpoints.<project>.cloud.goog/rollouts?filter=status=SUCCESS returns not 200 OK: 403 Forbidden

This ultimately leads to a transport failure error and the Pod shutting down.

My first step was to investigate permission issues, but I've been going around in circles and could really use an outside perspective.

Here's my config:

>> gcloud container clusters describe $GKE_CLUSTER_NAME \
--zone=$GKE_CLUSTER_ZONE \
--format='value[delimiter="\n"](nodePools[].config.oauthScopes)'
['https://www.googleapis.com/auth/devstorage.read_only', 
'https://www.googleapis.com/auth/logging.write', 
'https://www.googleapis.com/auth/monitoring', 
'https://www.googleapis.com/auth/service.management.readonly', 
'https://www.googleapis.com/auth/servicecontrol', 
'https://www.googleapis.com/auth/trace.append']

>> gcloud container clusters describe $GKE_CLUSTER_NAME \
--zone=$GKE_CLUSTER_ZONE \
--format='value[delimiter="\n"](nodePools[].config.serviceAccount)'
default
default

Service-Account-Name: test-espv2

Roles

Cloud Trace Agent
Owner
Service Account Token Creator
Service Account User
Service Controller
Workload Identity User
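
One way to confirm that the roles above are really bound to the service account at the project level (and not on some other resource) is to filter the project's IAM policy for that member. This is only a sketch: `PROJECT_ID` stands in for the elided project, and the helper names `sa_member` and `list_project_roles` are made up for illustration.

```shell
#!/bin/sh
# Build the IAM member string for a Google service account.
# Hypothetical helper; arguments: <account-name> <project-id>.
sa_member() {
  printf 'serviceAccount:%s@%s.iam.gserviceaccount.com' "$1" "$2"
}

# List the roles bound to that member in the project's IAM policy.
# Needs gcloud and permission to read the project policy.
list_project_roles() {
  gcloud projects get-iam-policy "$2" \
    --flatten="bindings[].members" \
    --filter="bindings.members:$(sa_member "$1" "$2")" \
    --format="value(bindings.role)"
}

# Usage (not run automatically):
#   list_project_roles test-espv2 "$PROJECT_ID"
```

If the output is empty or lacks the roles listed above, the bindings may have been made against a different member or project.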

I've associated the Workload Identity service account with the cluster using the following YAML:

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: test-espv2@<project>.iam.gserviceaccount.com
  name: test-espv2
  namespace: eventing
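
The annotation alone does not grant anything; the Google service account also needs a Workload Identity User binding for the Kubernetes service account. A rough check, assuming the default workload pool `PROJECT_ID.svc.id.goog` (the helper names here are hypothetical):

```shell
#!/bin/sh
# Build the Workload Identity member for a Kubernetes service account.
# Hypothetical helper; arguments: <project-id> <namespace> <ksa-name>.
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]' "$1" "$2" "$3"
}

# Look for that member in the IAM policy of the Google service account.
# This only shows that the member appears in the policy; inspect the full
# output to confirm it sits under roles/iam.workloadIdentityUser.
check_wi_binding() {
  gcloud iam service-accounts get-iam-policy \
    "test-espv2@$1.iam.gserviceaccount.com" --format=json |
    grep -F "$(wi_member "$1" eventing test-espv2)"
}

# Usage (not run automatically):
#   check_wi_binding "$PROJECT_ID"
```

No match would suggest the binding is missing or was created for a different namespace or service account name.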

I've then associated the Pod with the test-espv2 service account:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: esp-grpc-bookstore
  namespace: eventing
spec:
  replicas: 1
  selector:
    matchLabels:
      app: esp-grpc-bookstore
  template:
    metadata:
      labels:
        app: esp-grpc-bookstore
    spec:
      serviceAccountName: test-espv2

Since the gcr.io/endpoints-release/endpoints-runtime:2 image has limited tooling, I created a test container and deployed it into the same eventing namespace.

Within the container, I'm able to retrieve the endpoint service config with the following command:

curl --fail -o "service.json" -H "Authorization: Bearer $(gcloud auth print-access-token)" \
 "https://servicemanagement.googleapis.com/v1/services/${SERVICE}/configs/${CONFIG_ID}?view=FULL" 

Also within the container, I've confirmed that I'm running as the impersonated service account, using:

curl -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/service-accounts/
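
A further test that might narrow this down is to replay the exact request from the error using a token taken from the metadata server, since `gcloud auth print-access-token` can use different credentials than the ones the Pod gets through Workload Identity. As far as I understand, ESPv2 fetches its token from the metadata server unless a key file is supplied, but treat that as an assumption. The function names below are hypothetical; the metadata paths and the rollouts URL are the documented and logged ones.

```shell
#!/bin/sh
# Metadata server path for the Pod's default service account.
MD=http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default

# Extract the access_token field from the token JSON read on stdin.
extract_access_token() {
  sed -n 's/.*"access_token" *: *"\([^"]*\)".*/\1/p'
}

# Build the rollouts URL that ESPv2 reports in its log line.
rollouts_url() {
  printf 'https://servicemanagement.googleapis.com/v1/services/%s/rollouts?filter=status=SUCCESS' "$1"
}

# Print the identity the Pod has, then the HTTP status of the replayed call.
replay_rollouts() {
  curl -s -H "Metadata-Flavor: Google" "$MD/email"; echo
  token=$(curl -s -H "Metadata-Flavor: Google" "$MD/token" | extract_access_token)
  curl -s -o /dev/null -w '%{http_code}\n' \
    -H "Authorization: Bearer $token" "$(rollouts_url "$1")"
}

# Usage (not run automatically):
#   replay_rollouts "bookstore.endpoints.${PROJECT_ID}.cloud.goog"
```

A 200 here with a 403 from ESPv2 would point at the proxy's arguments rather than IAM. It may also be worth comparing the generated URL with the logged one: the log shows `name:` in front of the service name, so checking whether the value passed to `--service` carries that prefix could be useful.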

Are there any other tests I can run to help me debug this issue?

Thanks in advance,


Solution 1:

On debugging: I've often found my mistakes by following the same Google tutorial in one of its other variants or programming languages.

Have you looked at the OpenAPI notes and tried to follow along?