Gunicorn (with Flask) parameters for Google Cloud Run - production setup?

Solution 1:

The right settings depend less on gunicorn itself than on what your API does: how long it takes to respond and how much memory it needs to run.

Assuming it's a simple API that responds in tens of milliseconds and doesn't load any heavy libraries, I think 1 worker + 4 threads should handle your traffic (100-150 requests per 30 minutes) comfortably.
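To put rough numbers on that, here is a back-of-the-envelope sketch using Little's law (average requests in flight = arrival rate x response time). The 50 ms latency is an assumed value standing in for "tens of milliseconds", not a measurement of your API, and the result is an average that says nothing about bursts.

```python
# Back-of-the-envelope load estimate; the latency is an assumption, not a measurement.
requests_per_window = 150        # upper end of the stated traffic
window_seconds = 30 * 60         # 30 minutes
latency_seconds = 0.05           # assumed average response time (50 ms)

# Average arrival rate in requests per second.
arrival_rate = requests_per_window / window_seconds

# Little's law: average number of requests being handled at any moment.
avg_in_flight = arrival_rate * latency_seconds

# 1 worker x 4 threads can handle up to 4 requests concurrently.
concurrent_slots = 1 * 4

print(f"arrival rate: {arrival_rate:.3f} req/s")
print(f"average in flight: {avg_in_flight:.4f}")
print(f"concurrent slots: {concurrent_slots}")
```

Under these assumptions the average number of in-flight requests is far below one, so 4 concurrent slots leave a lot of headroom. A slower endpoint or bursty traffic would change the picture.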

That said, provided your application is thread-safe, I'd use 4 workers to have extra capacity in case of unexpectedly high traffic.
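As a minimal sketch, these settings could live in a gunicorn.conf.py like the one below. The GUNICORN_WORKERS and GUNICORN_THREADS environment variable names are hypothetical, chosen only so the counts can be changed without rebuilding the image. The timeout of 0 follows the common Cloud Run practice of disabling gunicorn's worker timeout and relying on the platform's request timeout, so treat it as an assumption to verify for your service.

```python
# gunicorn.conf.py (illustrative sketch, not a tuned production config)
import os

# Cloud Run passes the port to listen on through the PORT environment variable.
bind = f"0.0.0.0:{os.environ.get('PORT', '8080')}"

# Hypothetical variable names; the defaults match 1 worker + 4 threads.
# Set GUNICORN_WORKERS=4 for the extra-capacity variant.
workers = int(os.environ.get("GUNICORN_WORKERS", "1"))
threads = int(os.environ.get("GUNICORN_THREADS", "4"))

# 0 disables gunicorn's worker timeout (assumption: Cloud Run's own
# request timeout is what should bound long requests).
timeout = 0
```

You would start the server with something like `gunicorn -c gunicorn.conf.py main:app`, where `main:app` is a placeholder for your Flask module and application object. Keep in mind that each worker is a separate process with its own copy of the app, so going from 1 to 4 workers multiplies memory use accordingly.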