403 Forbidden on ESPv2, GKE AutoPilot, WIF
I'm following the Getting started with Endpoints for GKE with ESPv2. I'm using Workload Identity Federation and Autopilot on the GKE cluster.
I've been running into the error:
F0110 03:46:24.304229 8 server.go:54] fail to initialize config manager: http call to GET https://servicemanagement.googleapis.com/v1/services/name:bookstore.endpoints.<project>.cloud.goog/rollouts?filter=status=SUCCESS returns not 200 OK: 403 Forbidden
This ultimately leads to a transport failure error and the Pod shutting down.
My first step was to investigate permission issues, but I've been going around in circles and could really use an outside perspective.
Here's my config:
>> gcloud container clusters describe $GKE_CLUSTER_NAME \
     --zone=$GKE_CLUSTER_ZONE \
     --format='value[delimiter="\n"](nodePools[].config.oauthScopes)'
['https://www.googleapis.com/auth/devstorage.read_only',
'https://www.googleapis.com/auth/logging.write',
'https://www.googleapis.com/auth/monitoring',
'https://www.googleapis.com/auth/service.management.readonly',
'https://www.googleapis.com/auth/servicecontrol',
'https://www.googleapis.com/auth/trace.append']
>> gcloud container clusters describe $GKE_CLUSTER_NAME \
     --zone=$GKE_CLUSTER_ZONE \
     --format='value[delimiter="\n"](nodePools[].config.serviceAccount)'
default
default
Service account name: test-espv2
Roles:
  Cloud Trace Agent
  Owner
  Service Account Token Creator
  Service Account User
  Service Controller
  Workload Identity User
I've associated the WIF service account with the cluster using the following YAML:
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: test-espv2@<project>.iam.gserviceaccount.com
  name: test-espv2
  namespace: eventing
I've then associated the pod with the test-espv2 service account:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: esp-grpc-bookstore
  namespace: eventing
spec:
  replicas: 1
  selector:
    matchLabels:
      app: esp-grpc-bookstore
  template:
    metadata:
      labels:
        app: esp-grpc-bookstore
    spec:
      serviceAccountName: test-espv2
Since the gcr.io/endpoints-release/endpoints-runtime:2 image is limited, I created a test container and deployed it into the same eventing namespace.
Within the container, I'm able to retrieve the endpoint service config with the following command:
curl --fail -o "service.json" -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://servicemanagement.googleapis.com/v1/services/${SERVICE}/configs/${CONFIG_ID}?view=FULL"
Also within the container, I've confirmed that I'm running as the impersonated service account, tested with:
curl -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/service-accounts/
Are there any other tests I can run to help me debug this issue?
Thanks in advance,
Solution 1:
On debugging: I've often found my mistakes by following the same tutorial for one of the other methods or programming languages in the Google documentation.
Have you looked at the OpenAPI notes and tried to follow along?
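One more test that may be worth running from the test container: the check in the question uses a token from gcloud auth print-access-token and fetches a config, whereas the call that fails in the log is a GET on the rollouts URL. The sketch below requests a token from the metadata server instead (as far as I know, that is where ESPv2 gets its token by default, but treat that as an assumption) and calls the same rollouts URL. The helper names rollouts_url, extract_token and check_rollouts are made up for this example; they are not part of ESPv2 or gcloud.

```shell
#!/usr/bin/env bash
# Sketch only: helper names below are hypothetical, not part of ESPv2 or gcloud.

# Build the rollouts URL seen in the ESPv2 error log for a given service name.
rollouts_url() {
  printf 'https://servicemanagement.googleapis.com/v1/services/%s/rollouts?filter=status=SUCCESS\n' "$1"
}

# Pull the access_token field out of the metadata server's JSON response (read from stdin).
extract_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Fetch a token from the metadata server, then print the HTTP status of the rollouts call.
check_rollouts() {
  local token
  token="$(curl -sf -H 'Metadata-Flavor: Google' \
    'http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token' \
    | extract_token)"
  curl -s -o /dev/null -w '%{http_code}\n' \
    -H "Authorization: Bearer ${token}" "$(rollouts_url "$1")"
}
```

After sourcing this in the test container, check_rollouts "bookstore.endpoints.<project>.cloud.goog" prints only the HTTP status code. If it prints 403 here too, the problem is reproducible outside ESPv2. It may also be worth running it once with the service name exactly as it appears in the log (including the name: prefix) and once without it, to see whether the prefix makes a difference; that the prefix is the cause is only a guess.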