Elastic Beanstalk disable health state change based on 4xx responses
I have a REST API running on Elastic Beanstalk, which works great. Everything application-wise is running well and working as expected.
The application is a REST API used to look up different users.
Example URL: http://service.com/user?uid=xxxx&anotherid=xxxx
If a user with either ID is found, the API responds with 200 OK; if not, it responds with 404 Not Found, as per the HTTP/1.1 status code definitions.
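For example, a lookup that matches no user simply returns a 404 (the IDs here are placeholders, and the response shown is just an illustration):

curl -i "http://service.com/user?uid=xxxx&anotherid=xxxx"
# -> HTTP/1.1 404 Not Found
# This is correct behaviour for the API, but every such response still
# counts towards Elastic Beanstalk's 4xx health statistics.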
It is not uncommon for our API to answer 404 Not Found to a lot of requests, and Elastic Beanstalk transitions our environment from OK into Warning or even into Degraded because of this. It also looks like nginx has refused connections to the application because of this degraded state. (It appears the thresholds are 30%+ for the Warning state and 50%+ for the Degraded state.)
This is a problem, because the application is actually working as expected, but Elastic Beanstalk's default settings treat it as a problem when it really is not.
Does anyone know of a way to edit the thresholds for the 4xx warnings and state transitions in EB, or to disable them completely?
Or should I really resort to treating the symptom and stop using 404 Not Found for a call like this? (I really do not like that option.)
Solution 1:
Update: AWS EB finally includes a built-in setting for this: https://stackoverflow.com/a/51556599/1123355
Old Solution: After diving into the EB instance and spending several hours looking for where EB's health check daemon actually reports the status codes back to EB for evaluation, I finally found it and came up with a patch that serves as a perfectly fine workaround: it prevents 4xx response codes from turning the environment health into a Degraded state, and from pointlessly notifying you with this e-mail:
Environment health has transitioned from Ok to Degraded. 59.2 % of the requests are erroring with HTTP 4xx.
The status code reporting logic is located within healthd-appstat, a Ruby script developed by the EB team that constantly monitors /var/log/nginx/access.log and reports the status codes to EB, specifically in the following path:
/opt/elasticbeanstalk/lib/ruby/lib/ruby/gems/2.2.0/gems/healthd-appstat-1.0.1/lib/healthd-appstat/plugin.rb
The following .ebextensions file will patch this Ruby script to avoid reporting 4xx response codes back to EB. This means that EB will never degrade the environment health due to 4xx errors, because it just won't know that they're occurring. This also means that the "Health" page in your EB environment will always display 0 for the 4xx response code count.
container_commands:
  01-patch-healthd:
    command: "sudo /bin/sed -i 's/\\# normalize units to seconds with millisecond resolution/if status \\&\\& status.index(\"4\") == 0 then next end/g' /opt/elasticbeanstalk/lib/ruby/lib/ruby/gems/2.2.0/gems/healthd-appstat-1.0.1/lib/healthd-appstat/plugin.rb"
  02-restart-healthd:
    command: "sudo /usr/bin/kill $(/bin/ps aux | /bin/grep -e '/bin/bash -c healthd' | /usr/bin/awk '{ print $2 }')"
    ignoreErrors: true
Yes, it's a bit ugly, but it gets the job done, at least until the EB team provides a way to ignore 4xx errors via some configuration parameter. Include it with your application when you deploy, in the following path relative to the root directory of your project:
.ebextensions/ignore_4xx.config
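If you want to confirm on a running instance that the patch was actually applied, a quick check could look something like this (the grep pattern is simply the guard that the sed command above injects, and the path is the same one used there):

# Should print the injected line:
#   if status && status.index("4") == 0 then next end
grep -n 'status.index("4")' /opt/elasticbeanstalk/lib/ruby/lib/ruby/gems/2.2.0/gems/healthd-appstat-1.0.1/lib/healthd-appstat/plugin.rb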
Good luck, and let me know if this helped!
Solution 2:
There is a dedicated health monitoring rule customization called Ignore HTTP 4xx (screenshot attached). Just enable it and EB will not degrade the instance health on 4xx errors.
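For reference, the same rule can also be switched off from an .ebextensions file through the enhanced health reporting ConfigDocument, so the setting lives in source control instead of only in the console. The snippet below is a sketch based on the AWS enhanced health rules documentation; double-check the keys against the current docs for your platform, and drop the ELB rule if your environment has no load balancer:

option_settings:
  - namespace: aws:elasticbeanstalk:healthreporting:system
    option_name: ConfigDocument
    value: '{"Rules": {"Environment": {"Application": {"ApplicationRequests4xx": {"Enabled": false}}, "ELB": {"ELBRequests4xx": {"Enabled": false}}}}, "Version": 1}'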
Solution 3:
Thank you for your answer, Elad Nava. I had the same problem and your solution worked perfectly for me!
However, after I opened a ticket in the AWS Support Center, they recommended that I modify the nginx configuration to ignore 4xx responses in the health check instead of modifying the Ruby script. To do that, I also had to add a config file to the .ebextensions directory, in order to overwrite the default nginx.conf file:
files:
  "/tmp/nginx.conf":
    content: |
      # Elastic Beanstalk managed configuration file
      # Some configuration of nginx can be done by placing files in /etc/nginx/conf.d
      # using Configuration Files.
      # http://docs.amazonwebservices.com/elasticbeanstalk/latest/dg/customize-containers.html
      #
      # Modifications of nginx.conf can be performed using container_commands to modify the staged version
      # located in /tmp/deployment/config/#etc#nginx#nginx.conf
      #
      # For more information on configuration, see:
      #   * Official English Documentation: http://nginx.org/en/docs/
      #   * Official Russian Documentation: http://nginx.org/ru/docs/

      user nginx;
      worker_processes auto;
      error_log /var/log/nginx/error.log;
      pid /var/run/nginx.pid;
      worker_rlimit_nofile 1024;

      events {
          worker_connections 1024;
      }

      http {
          ###############################
          # CUSTOM CONFIG TO IGNORE 4xx #
          ###############################

          # $loggable can be used with an access_log "if=" parameter to skip
          # logging 4xx responses entirely (note that "if=" is only valid on
          # access_log directives, not on log_format).
          map $status $loggable {
              ~^[4]   0;
              default 1;
          }

          # $modstatus rewrites any 4xx status to 200 for the healthd log
          # format below, so healthd never counts them as errors.
          map $status $modstatus {
              ~^[4]   200;
              default $status;
          }

          #####################
          # END CUSTOM CONFIG #
          #####################

          port_in_redirect off;

          include /etc/nginx/mime.types;
          default_type application/octet-stream;

          log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                          '$status $body_bytes_sent "$http_referer" '
                          '"$http_user_agent" "$http_x_forwarded_for"';

          access_log /var/log/nginx/access.log main;

          # The healthd log format was modified to report $modstatus instead
          # of the real $status, so 4xx responses are hidden from health reporting.
          log_format healthd '$msec"$uri"'
                             '$modstatus"$request_time"$upstream_response_time"'
                             '$http_x_forwarded_for';

          sendfile on;

          include /etc/nginx/conf.d/*.conf;

          keepalive_timeout 1200;
      }
container_commands:
  01_modify_nginx:
    command: cp /tmp/nginx.conf /tmp/deployment/config/#etc#nginx#nginx.conf
Although this solution is quite a bit more verbose, I personally believe it is safer to implement, since it does not depend on any AWS proprietary script. What I mean is that if, for some reason, AWS decides to remove or modify their Ruby script (believe it or not, they love to change scripts without prior notice), there is a big chance that the workaround with sed will no longer work.
Solution 4:
Here is a solution based on Adriano Valente's answer. I couldn't get the $loggable bit to work, although skipping logging for the 404s seems like it would be a good solution. I simply created a new .conf file that defines the $modstatus variable, and then overwrote the healthd log format to use $modstatus in place of $status. This change also required restarting nginx. This is working on Elastic Beanstalk's 64bit Amazon Linux 2016.09 v2.3.1 running Ruby 2.3 (Puma).
# .ebextensions/nginx.conf
files:
  "/tmp/nginx.conf":
    content: |
      # Custom config to ignore 4xx in the health file only
      map $status $modstatus {
          ~^[4]   200;
          default $status;
      }

container_commands:
  modify_nginx_1:
    command: "cp /tmp/nginx.conf /etc/nginx/conf.d/custom_status.conf"
  modify_nginx_2:
    command: sudo sed -r -i 's@\$status@$modstatus@' /opt/elasticbeanstalk/support/conf/webapp_healthd.conf
  modify_nginx_3:
    command: sudo /etc/init.d/nginx restart
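If you want to verify the change on an instance before relying on it, a quick check along these lines should work (the paths are the same ones used in the commands above):

# The healthd log format should now reference $modstatus instead of $status
grep modstatus /opt/elasticbeanstalk/support/conf/webapp_healthd.conf
# And the combined nginx configuration should still parse cleanly
sudo nginx -t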