How to handle in-progress requests when applying a rolling update?

Given a set of web servers subject to a rolling update (for example, a Kubernetes rolling update), suppose a request reaches a web server that is pending termination just milliseconds before SIGTERM is sent to that server:

  1. Should the server signal the client that it is being terminated and tell it to "try again" at a different (or the same) network address, possibly after a delay?
  2. Otherwise, could the server automatically redirect the request to another pod/instance of the web server that has already been updated?
  3. In the specific case of Kubernetes, could the request be handed back to the Service, which would then forward it once at least one of the pods has been rolled out?

Solution 1:

When a pod is terminating, it is given a grace period (30 seconds by default) to finish in-flight requests after it receives SIGTERM and before it is sent SIGKILL. You can configure a longer grace period with the pod's terminationGracePeriodSeconds setting. There is also a preStop hook, which is called before SIGTERM is sent to the pod. See the "Kubernetes best practices: terminating with grace" blog post for details.
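On the application side, the server has to use that grace period itself: stop accepting new connections on SIGTERM and let requests already in progress finish. Below is a minimal sketch using only the Python standard library. The names `Handler`, `GracefulServer`, and `serve_until_sigterm` are made up for illustration, and a real server would normally rely on its framework's own graceful-shutdown support.

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short body."""

    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


class GracefulServer(ThreadingHTTPServer):
    # Non-daemon handler threads are tracked by the server, so
    # server_close() waits for in-flight requests to finish.
    daemon_threads = False
    block_on_close = True


def serve_until_sigterm(server):
    """Serve until SIGTERM arrives, then drain in-flight requests and return.

    Must be called from the main thread, because signal.signal() only
    works there.
    """

    def on_sigterm(signum, frame):
        # shutdown() blocks until serve_forever() has returned, so it must
        # run in a different thread from the one running serve_forever().
        threading.Thread(target=server.shutdown).start()

    signal.signal(signal.SIGTERM, on_sigterm)
    server.serve_forever()   # returns once shutdown() has been called
    server.server_close()    # closes the socket and joins handler threads
```

A pod's entry point could then call `serve_until_sigterm(GracefulServer(("0.0.0.0", 8080), Handler))`, where port 8080 is an arbitrary choice. The process needs to exit within the grace period, otherwise it is killed with SIGKILL regardless of what is still in flight.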

Alternatively, you can configure load balancers to retry failed requests, but this is safe only for idempotent requests.
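The same rule applies if the retry is done by the client rather than the load balancer. Here is a rough sketch of the logic, not any particular load balancer's implementation. The `send` callable is a hypothetical stand-in for whatever actually performs the request: it takes a method and URL, returns a status code, and raises `ConnectionError` when the connection is dropped. The set of retryable status codes and the backoff values are also assumptions.

```python
import time

# Methods that HTTP defines as idempotent; POST and PATCH are not.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}
# Assumed set of gateway-style statuses worth retrying.
RETRYABLE_STATUSES = {502, 503, 504}


def send_with_retry(send, method, url, attempts=3, backoff=0.1, sleep=time.sleep):
    """Call send(method, url), retrying only when the method is idempotent.

    Non-idempotent requests get exactly one try, because the server may
    already have applied the change before the connection was lost.
    """
    max_tries = max(1, attempts) if method.upper() in IDEMPOTENT_METHODS else 1
    for attempt in range(1, max_tries + 1):
        try:
            status = send(method, url)
            if status not in RETRYABLE_STATUSES or attempt == max_tries:
                return status
        except ConnectionError:
            if attempt == max_tries:
                raise
        # Exponential backoff before the next try: backoff, 2*backoff, ...
        sleep(backoff * 2 ** (attempt - 1))
```

With this sketch, a GET that hits a terminating pod is retried and will likely land on a healthy one, while a POST fails immediately and leaves the decision to the caller.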