Automatically run script on startup with conda environment

I have configured a compute image with everything I need to complete this task. When I create a new instance, I want it to automatically run a command on startup. I used the --metadata option with gcloud compute instances create and supplied the script. Everything worked well until the script needed some applications that live in a custom conda environment. I tried adding conda activate where applicable, and nothing seemed to work. Is there some way to activate this environment when running this startup script?


This one in particular was a little bit tricky because I ran into a few caveats while testing:

At first, I tried to dynamically choose the interpreter of the script, but right away I realized it wouldn't be possible.

Instead, I chose to use two different startup scripts (init.sh and conda.sh). The first installs and configures conda, then replaces itself with the second, which runs the conda commands afterwards. I did this in part because scripts with an interpreter line like #!/usr/bin/env python will not run properly as a startup script. I haven't been able to pinpoint the exact reason why this happens.

Splitting the work in two is also necessary because, once you install conda, you need to close and re-open your current shell before conda init takes effect, and I achieved that with a simple reboot.

Keep in mind that startup scripts run as root.

I'm hosting both scripts in Cloud Storage, but you can keep them anywhere you want as long as the instance can reach and download them. I tested everything on a Debian 9 instance. Feel free to edit any of this to suit your needs.

First, add a startup script to your instance, then power it on:

gcloud compute instances add-metadata [INSTANCE_NAME] --metadata startup-script-url=gs://[BUCKET]/init.sh

Here are the contents of init.sh:

#!/bin/bash

# Optional: update all packages as a precaution.
# apt-get is used instead of apt because apt's command-line interface is not meant for scripts.
apt-get -y update && apt-get -y upgrade

# Downloads and installs miniconda silently
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
bash ~/miniconda.sh -b -p "$HOME/miniconda"

# Also prophylactic just to make sure commands run properly
export PATH="$HOME/miniconda/bin:$PATH"

# Initialize conda for the shell (this edits the shell startup file, e.g. ~/.bashrc)
conda init

# Make sure the environment is activated by default
conda config --set auto_activate_base true

# Removes the current startup script and replaces it with the other
gcloud compute instances remove-metadata [INSTANCE_NAME] --zone [ZONE] --keys=startup-script-url
gcloud compute instances add-metadata [INSTANCE_NAME] --zone [ZONE] --metadata startup-script-url=gs://[BUCKET]/conda.sh

# Necessary to run conda commands still within a startup script (workaround to close and re-open your current shell)
reboot
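
The [INSTANCE_NAME] and [ZONE] placeholders in init.sh have to be filled in by hand. As a rough sketch, the script could instead look them up from the Compute Engine metadata server. The helper names metadata, instance_name and instance_zone are made up for this example, and it assumes curl is available on the image:

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, curl is assumed to be installed.

# Fetch one value from the instance section of the metadata server.
metadata() {
    curl -s -H "Metadata-Flavor: Google" \
        "http://metadata.google.internal/computeMetadata/v1/instance/$1"
}

# The instance name is returned as plain text.
instance_name() {
    metadata name
}

# The zone is returned as projects/<number>/zones/<zone>; keep the last part.
instance_zone() {
    basename "$(metadata zone)"
}

# Possible use inside init.sh:
# gcloud compute instances remove-metadata "$(instance_name)" \
#     --zone "$(instance_zone)" --keys=startup-script-url
```

This keeps the same script usable across instances without editing it for each one.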

Here are the contents of conda.sh:

#!/bin/bash

# Again, prophylactic just to make sure commands run properly
export PATH="$HOME/miniconda/bin:$PATH"

# Simple test to make sure everything works
conda -V > /condatest.txt

If /condatest.txt contains your installed version (in my case, conda 4.7.10), the setup worked.
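
To go from conda -V to actually activating a custom environment, one option is to source the hook script that conda installs under etc/profile.d. conda init only edits the shell startup file, which a non-interactive startup script does not read, so conda activate is otherwise undefined there. This is a minimal sketch, not a tested drop-in: myenv is a hypothetical environment name, and activate_conda_env, CONDA_ROOT and ENV_NAME are names invented for the example.

```shell
#!/bin/bash
# Sketch only: CONDA_ROOT matches the install prefix used in init.sh;
# ENV_NAME is a hypothetical environment name, replace it with yours.
CONDA_ROOT="${CONDA_ROOT:-$HOME/miniconda}"
ENV_NAME="${ENV_NAME:-myenv}"

activate_conda_env() {
    local hook="$CONDA_ROOT/etc/profile.d/conda.sh"
    if [ ! -f "$hook" ]; then
        echo "conda hook not found at $hook" >&2
        return 1
    fi
    # Sourcing the hook defines the conda shell function in this shell,
    # which is what makes "conda activate" work in a non-interactive script.
    . "$hook"
    conda activate "$1"
}

# Only run the test command if activation succeeded.
if activate_conda_env "$ENV_NAME"; then
    python --version > /condatest.txt 2>&1
fi
```

If the environment activates, /condatest.txt should show the Python version from that environment rather than the system one.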

Finally, it is very important that you set the Compute Engine entry in your instance's Cloud API access scopes to Read Write. By default, you won't be able to edit your instance's metadata from within the instance (I ran into something similar to this).
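
The scope change can also be made from the command line. The following is a sketch under a few assumptions: the function name grant_metadata_write is invented here, the instance has to be stopped before its scopes can be changed, and --scopes replaces the whole scope list, so storage-ro is included to keep the gs:// startup script downloadable. Add back any other scopes your instance relies on.

```shell
#!/bin/bash
# Sketch only: run from a machine where gcloud is configured.
# grant_metadata_write is a hypothetical helper name.

grant_metadata_write() {
    local instance="$1" zone="$2"
    # Scopes can only be changed while the instance is stopped.
    gcloud compute instances stop "$instance" --zone "$zone" &&
    # compute-rw allows metadata edits; storage-ro keeps Cloud Storage reads.
    gcloud compute instances set-service-account "$instance" --zone "$zone" \
        --scopes compute-rw,storage-ro &&
    gcloud compute instances start "$instance" --zone "$zone"
}

# Example (placeholders as in the commands above):
# grant_metadata_write [INSTANCE_NAME] [ZONE]
```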

If you need debugging tools, you can enable the Serial Console output to see what the instance does during boot and you can also run startup scripts manually with debugging enabled to see where it's crashing.
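
As an illustration of the serial console route, the output can be pulled with gcloud and filtered down to the startup script lines. The function name startup_log is made up for this sketch, and the grep pattern assumes the log lines contain the text startup-script, which may differ between images:

```shell
#!/bin/bash
# Sketch only: run from a machine where gcloud is configured, not from the instance.
# startup_log is a hypothetical helper name.

startup_log() {
    # $1 = instance name, $2 = zone
    gcloud compute instances get-serial-port-output "$1" --zone "$2" 2>/dev/null \
        | grep -i 'startup-script'
}

# Example (placeholders as in the commands above):
# startup_log [INSTANCE_NAME] [ZONE]
```

To re-run the startup script by hand on the instance, images from around the time of this answer shipped a runner invoked as sudo google_metadata_script_runner --script-type startup --debug. Newer guest environments may use a different invocation, so check the documentation for your image.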