Get "embedded nul(s) found in input" when reading a csv using read.csv()
Solution 1:
Your CSV might be encoded in UTF-16. This isn't uncommon when working with some Windows-based tools.
You can try loading a UTF-16 CSV like this:
read.csv("mycsv.csv", ..., fileEncoding="UTF-16LE")
Solution 2:
You can try using the skipNul = TRUE
option.
mydata = read.csv("mycsv.csv", quote = "\"", skipNul = TRUE)
From ?read.csv
Embedded nuls in the input stream will terminate the field currently being read, with a warning once per call to scan. Setting
skipNul = TRUE
causes them to be ignored.
It worked for me.
Solution 3:
This is nothing to do with the encoding. This is the problem with reading of the nulls in the file. To handle that, you need to pass the skipNul = TRUE
paramater.
for example:
neg = scan('F:/Natural_Language_Processing/negative-words.txt', what = 'character', comment.char = '', encoding = "UTF-8", skipNul = TRUE)