Multiple environments (Staging, QA, production, etc) with Kubernetes

What is considered a good practice with K8S for managing multiple environments (QA, Staging, Production, Dev, etc)?

As an example, say that a team is working on a product which requires deploying a few APIs, along with a front-end application. Usually, this will require at least 2 environments:

Staging: For iterations/testing and validation before releasing to the client
Production: This is the environment the client has access to. It should contain stable and well-tested features.

So, assuming the team is using Kubernetes, what would be a good practice to host these environments? So far we've considered two options:

(1) Use a K8s cluster for each environment
(2) Use only one K8s cluster and keep the environments in different namespaces

(1) seems the safest option, since it minimizes the risk of human mistakes and machine failures putting the production environment in danger. However, it comes at the cost of more master machines and more infrastructure management.

(2) looks like it simplifies infrastructure and deployment management because there is a single cluster, but it raises a few questions, such as:

How does one make sure that a human mistake won't impact the production environment?
How does one make sure that a high load in the staging environment won't cause a loss of performance in the production environment?

There might be other concerns, so I'm reaching out to the K8s community on StackOverflow to better understand how people are dealing with these sorts of challenges.

Multiple Clusters Considerations

Take a look at this blog post from Vadim Eisenberg (IBM / Istio): Checklist: pros and cons of using multiple Kubernetes clusters, and how to distribute workloads between them.

I'd like to highlight some of the pros/cons:

Reasons to have multiple clusters

Separation of production/development/test: especially for testing a new version of Kubernetes, of a service mesh, of other cluster software

Compliance: according to some regulations some applications must run in separate clusters/separate VPNs

Better isolation for security

Cloud/on-prem: to split the load between cloud and on-premise services

Reasons to have a single cluster

Reduce setup, maintenance and administration overhead

Improve utilization

Cost reduction

For a setup that keeps cost and maintenance moderate while still ensuring security isolation for production applications, I would recommend:

1 cluster for DEV and STAGING (separated by namespaces, possibly further isolated with Network Policies, as supported by Calico)
1 cluster for PROD
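
As a rough illustration of the namespace isolation mentioned above, the sketch below builds a NetworkPolicy manifest that only admits ingress traffic from pods in the same namespace. The function name, the policy name and the `staging` namespace are assumptions made for the example, and the policy only has an effect if the cluster's network plugin enforces Network Policies (Calico does).

```python
import json


def same_namespace_only_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that accepts ingress only from pods in `namespace`."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            # Hypothetical policy name, chosen for this example.
            "name": "allow-same-namespace-only",
            "namespace": namespace,
        },
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A `from` entry with only a podSelector matches pods in the
            # policy's own namespace, so other namespaces are not admitted.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML ones.
    print(json.dumps(same_namespace_only_policy("staging"), indent=2))
```

Applying one such policy per namespace (for example `dev` and `staging`) would keep the two environments from calling each other's pods directly while still sharing the cluster.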

Environment Parity

It's a good practice to keep development, staging, and production as similar as possible:

Differences between backing services mean that tiny incompatibilities crop up, causing code that worked and passed tests in development or staging to fail in production. These types of errors create friction that disincentivizes continuous deployment.

Combine a powerful CI/CD tool with Helm. You can use the flexibility of Helm values to set default configurations, overriding only the settings that differ from one environment to another.
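
To make the "defaults plus per-environment overrides" idea concrete, here is a simplified model of how layered values behave: nested maps are merged key by key, and anything else (scalars, lists) is replaced by the later value. This is a sketch of the concept, not Helm's actual implementation, and the chart values shown (`replicaCount`, `image`, `resources`) are hypothetical.

```python
from copy import deepcopy


def merge_values(base: dict, override: dict) -> dict:
    """Return `base` with `override` layered on top, merging nested maps."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are maps: merge them recursively.
            merged[key] = merge_values(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale by the override.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults shared by every environment.
DEFAULTS = {
    "replicaCount": 1,
    "image": {"repository": "example/api", "tag": "latest"},
    "resources": {"limits": {"cpu": "250m"}},
}

# Hypothetical production overrides: only what differs is listed.
PRODUCTION = {
    "replicaCount": 3,
    "image": {"tag": "1.4.2"},
}

if __name__ == "__main__":
    print(merge_values(DEFAULTS, PRODUCTION))
```

With Helm itself, the equivalent layering is done by passing several values files, for example `-f values.yaml -f values-production.yaml`, where later files take precedence over earlier ones.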

GitLab CI/CD with Auto DevOps has a powerful integration with Kubernetes, which lets you manage multiple Kubernetes clusters with Helm support built in.

Managing multiple clusters (`kubectl` interactions)

When you are working with multiple Kubernetes clusters, it’s easy to mix up contexts and run kubectl against the wrong cluster. Beyond that, Kubernetes restricts the version skew allowed between the client (kubectl) and the server (Kubernetes master), so running commands in the right context does not guarantee that you are running the right client version.
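
The version restriction can be checked mechanically. The sketch below assumes the documented kubectl policy of staying within one minor version (older or newer) of the API server. The function names are made up for the example, and the version strings are expected to be supplied by the caller, for instance from the output of `kubectl version`.

```python
import re
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.28.3' or '1.29'."""
    match = re.match(r"v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def within_supported_skew(client: str, server: str, max_minor_skew: int = 1) -> bool:
    """True if the client is at most `max_minor_skew` minor versions from the server."""
    client_major, client_minor = major_minor(client)
    server_major, server_minor = major_minor(server)
    return (
        client_major == server_major
        and abs(client_minor - server_minor) <= max_minor_skew
    )


if __name__ == "__main__":
    print(within_supported_skew("v1.28.3", "v1.29.0"))
    print(within_supported_skew("v1.26.0", "v1.29.1"))
```

A check like this could run at the start of a deployment script and abort before any command reaches a cluster with a mismatched client.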

To overcome this:

Use asdf to manage multiple kubectl versions
Set the KUBECONFIG env var to change between multiple kubeconfig files
Use kube-ps1 to keep track of your current context/namespace
Use kubectx and kubens to change fast between clusters/namespaces
Use aliases to combine them all together
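
In scripts, where a shell prompt is not there to warn you, the same ideas can be applied explicitly. The sketch below assumes hypothetical kubeconfig paths and context names. It joins several kubeconfig files into one `KUBECONFIG` value (the variable is a path list using the platform's separator) and always passes `--context` and `--namespace` to kubectl, so a command never depends on whichever context happens to be current.

```python
import os
from typing import Iterable, List


def kubeconfig_value(paths: Iterable[str]) -> str:
    """Join kubeconfig file paths into a value suitable for KUBECONFIG."""
    return os.pathsep.join(str(path) for path in paths)


def kubectl_command(context: str, namespace: str, *args: str) -> List[str]:
    """Build a kubectl argument list pinned to one context and namespace."""
    return ["kubectl", "--context", context, "--namespace", namespace, *args]


if __name__ == "__main__":
    # Hypothetical file locations, one kubeconfig per cluster.
    env = dict(os.environ)
    env["KUBECONFIG"] = kubeconfig_value(
        ["/home/me/.kube/staging.yaml", "/home/me/.kube/production.yaml"]
    )
    # Hypothetical context and namespace names.
    command = kubectl_command("staging", "api", "get", "pods")
    print(env["KUBECONFIG"])
    print(" ".join(command))
```

The resulting list can be handed to `subprocess.run(command, env=env, check=True)` when the command should actually be executed.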

I have an article that exemplifies how to accomplish this: Using different kubectl versions with multiple Kubernetes clusters

I also recommend the following reads:

Mastering the KUBECONFIG file by Ahmet Alp Balkan (Google Engineer)
How Zalando Manages 140+ Kubernetes Clusters by Henning Jacobs (Zalando Tech)

Definitely use a separate cluster for development and for building Docker images, so that your staging/production clusters can be locked down security-wise. Whether you use separate clusters for staging and production is up to you to decide based on risk and cost; keeping them separate will certainly help prevent staging from affecting production.

I'd also highly recommend using GitOps to promote versions of your apps between your environments.

To minimise human error I also recommend you look into automating as much as you can for your CI/CD and promotion.

Here's a demo of how to automate CI/CD with multiple environments on Kubernetes, using GitOps for promotion between environments and Preview Environments on Pull Requests. It was done live on GKE, though Jenkins X supports most Kubernetes clusters.

It depends on what you want to test in each of the scenarios. In general I would try to avoid running test scenarios on the production cluster to avoid unnecessary side effects (performance impact, etc.).

If your intention is to test with a staging system that exactly mimics the production system, I would recommend firing up an exact replica of the complete cluster, then shutting it down once you're done testing and have moved the deployments to production.

If your purpose is to have a staging system for testing the application to be deployed, I would run a smaller staging cluster permanently and update its deployments (also scaled down) as required for continuous testing.

To control the different clusters, I prefer having a separate CI/CD machine that is not part of any cluster but is used for firing up and shutting down clusters, as well as for performing deployments, initiating tests, etc. This makes it possible to set up and shut down clusters as part of automated testing scenarios.

It's clear that by keeping the production cluster apart from the staging one, the risk of errors impacting the production services is reduced. However, this comes at the cost of more infrastructure/configuration management, since it requires:

at least 3 masters for the production cluster and at least one master for the staging one
2 kubectl config files to be added to the CI/CD system

Let’s also not forget that there could be more than one environment. For example I've worked at companies where there are at least 3 environments:

QA: This is where we did daily deploys and our internal QA before releasing to the client.
Client QA: This is where we deployed before deploying to production, so that the client could validate the environment before the production release.
Production: This is where production services are deployed.

I think ephemeral/on-demand clusters make sense, but only for certain use cases (load/performance testing or very "big" integration/end-to-end testing). For more persistent/sticky environments, I see an overhead that might be reduced by running them within a single cluster.
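
If several persistent environments do end up sharing one cluster, a per-namespace ResourceQuota is one way to cap how much each environment can claim. The sketch below builds such a manifest. The function name, the quota name and the figures are assumptions for illustration only. Note that once a compute quota is in place, pods in that namespace need to declare requests and limits (directly or through LimitRange defaults) to be admitted.

```python
import json


def namespace_quota(namespace: str, cpu: str, memory: str) -> dict:
    """Build a ResourceQuota capping total CPU and memory for a namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {
            # Hypothetical quota name derived from the namespace.
            "name": f"{namespace}-compute",
            "namespace": namespace,
        },
        "spec": {
            "hard": {
                # Caps on the sum of all pod requests in the namespace.
                "requests.cpu": cpu,
                "requests.memory": memory,
                # Caps on the sum of all pod limits in the namespace.
                "limits.cpu": cpu,
                "limits.memory": memory,
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical figures for a QA namespace.
    print(json.dumps(namespace_quota("qa", "4", "8Gi"), indent=2))
```

This limits what a namespace may request in total; it does not replace the stronger isolation that separate clusters provide.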

I guess I wanted to reach out to the k8s community to see what patterns are used for such scenarios like the ones I've described.
