Ruby on Rails "invalid byte sequence in UTF-8" due to bot

So that you don't have to piece together the comments on my other answer, here is what I'm doing now. I've seen no errors for 24 hours, so it looks very promising:

Add rack-utf8_sanitizer to your Gemfile:

gem 'rack-utf8_sanitizer'

and run

bundle

Put this middleware in app/middleware/handle_invalid_percent_encoding.rb and rename the class to HandleInvalidPercentEncoding (because ExceptionApp is a bit too general).
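The middleware itself is linked rather than quoted here, so for orientation, this is a minimal sketch of what such a class might look like. It is not the linked code: it assumes that checking only the query string is enough, uses the standard library's URI.decode_www_form_component to detect malformed escapes, and the 400 response body is made up for illustration.

```ruby
require "uri"

# Sketch only (not the linked middleware): answers with 400 Bad Request
# when the query string contains a malformed percent escape like "%" or "%zz".
class HandleInvalidPercentEncoding
  def initialize(app)
    @app = app
  end

  def call(env)
    # Raises ArgumentError ("invalid %-encoding ...") when a "%" is not
    # followed by two hex digits. Assumption: only QUERY_STRING is checked.
    URI.decode_www_form_component(env["QUERY_STRING"].to_s)
    @app.call(env)
  rescue ArgumentError => e
    # Re-raise anything that is not the percent-encoding error.
    raise unless e.message.include?("invalid %-encoding")
    [400, { "content-type" => "text/plain" }, ["Bad request"]]
  end
end
```

Note that a well-formed escape that decodes to invalid UTF-8 (such as %FF) passes this check untouched, which is the case Rack::UTF8Sanitizer is there for.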

In the config block of config/application.rb do:

require "#{Rails.root}/app/middleware/handle_invalid_percent_encoding.rb"


# NOTE: These must be in this order relative to each other.
# HandleInvalidPercentEncoding just raises for encoding errors it doesn't cover,
# so it must run after (= be inserted before) Rack::UTF8Sanitizer.
config.middleware.insert 0, HandleInvalidPercentEncoding
config.middleware.insert 0, Rack::UTF8Sanitizer  # from a gem
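If the ordering seems backwards, a plain Ruby array shows the same effect. This is only an analogy (the Rails middleware stack is not a plain Array, and the two pre-existing entries are stand-ins), but inserting at index 0 twice leaves the last-inserted item on top in the same way:

```ruby
# Stand-in for an existing middleware stack; the entries are illustrative only.
stack = ["Rack::Sendfile", "ActionDispatch::Static"]

# Same order of calls as in config/application.rb above.
stack.insert(0, "HandleInvalidPercentEncoding")
stack.insert(0, "Rack::UTF8Sanitizer")

# The item inserted last sits at index 0, so it sees the request first.
puts stack.first(2).inspect
```

So Rack::UTF8Sanitizer gets the request first and HandleInvalidPercentEncoding second. You can check the real order in your app with rake middleware (bin/rails middleware on newer Rails versions).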

Deploy. Done.

(app/middleware happens to be where middleware lives in the project I'm working on, but I'd probably prefer lib. Either should work.)


Add this line to your Gemfile, then run bundle in your terminal:

gem "handle_invalid_percent_encoding_requests"

This solution is based on Henrik's answer, turned into a Rails Engine gem.