Where does the convention of using /healthz for application health checks come from?
It historically comes from Google’s internal practices. They're called "z-pages".
The reason it ends with z
is to reduce collisions with actual application endpoints with the same name (like /status
). See this talk for more: https://vimeo.com/173610242
Similar endpoints (at least inside Google) are /varz
, /statusz
, /rpcz
. Services developed at Google automatically get these endpoints to export their health and metrics and there are tools that collect the exposed metrics/statuses from all the deployed services.
Open source tools like Prometheus implement this pattern (since original authors of Prometheus are also ex-Googlers) by coming to a well-known endpoint to collect metrics from your application. Similarly OpenCensus allows you to expose z-pages from your app (ideally on a different port) to diagnose problems.