Ruby/Rails CSV parsing, invalid byte sequence in UTF-8

Solution 1:

You need to tell Ruby that the file is in ISO-8859-1. Change your file open line to this:

file=File.open("input_file", "r:ISO-8859-1")

The second argument tells Ruby to open read only with the encoding ISO-8859-1.

Solution 2:

Specify the encoding with encoding option:

CSV.foreach(file.path, headers: true, encoding:'iso-8859-1:utf-8') do |row|
  ...
end

Solution 3:

You can supply source encoding straight in the file mode parameter:

CSV.foreach( "file.csv", "r:windows-1250" ) do |row|
   <your code>
end

Solution 4:

If you have only one (or few) file, so when its not needed to automatically declare encoding on whatever file you get from input, and you have the contents of this file visible in plaintext (txt, csv etc) separated with i.e. semicolon, you can create new file with .csv extension manually, and paste the contents of your file there, then parse the contents like usual.

Keep in mind, that this is a workaround, but in need of parsing in linux only one big excel file, converted to some flavour of csv, it spares time on experimenting with all those fancy encodings