Replace NA with interpolated value for specific column fields in R

Solution 1:

This behavior is documented in ?na.approx (zoo package):

An object of similar structure as object with NAs replaced by interpolation. For na.approx only the internal NAs are replaced and leading or trailing NAs are omitted if na.rm = TRUE or not replaced if na.rm = FALSE.

By default, na.approx uses na.rm = TRUE, as its usage shows:

na.approx(object, x = index(object), xout, ..., na.rm = TRUE, maxgap = Inf, along)


Thus, we can change the code to

my_data[, 42] <- na.approx(my_data[, 42], na.rm = FALSE)

A large dataset can easily have leading or trailing NAs in a column. With the default na.rm = TRUE, na.approx drops them, so the OP's code returns a vector with fewer elements than the column has rows, and assigning it back triggers the length-mismatch error in the replacement.
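To make the length difference concrete, here is a minimal sketch on a made-up vector (x and my_df are hypothetical, not the OP's data) with one leading and one trailing NA. It assumes the zoo package is installed.

```r
library(zoo)

# Hypothetical vector: leading NA, one internal NA, trailing NA
x <- c(NA, 1, NA, 3, NA)

# Default na.rm = TRUE: leading/trailing NAs are dropped, so the result is shorter
length(na.approx(x))
# [1] 3

# na.rm = FALSE: only the internal NA is interpolated, length is preserved
na.approx(x, na.rm = FALSE)
# [1] NA  1  2  3 NA

# Assigning the shorter result back into a data frame column fails
my_df <- data.frame(v = x)
res <- tryCatch({ my_df[, 1] <- na.approx(my_df[, 1]); "ok" },
                error = function(e) "error")
res
# [1] "error"

# With na.rm = FALSE the assignment works
my_df[, 1] <- na.approx(my_df[, 1], na.rm = FALSE)
```

The leading and trailing NAs are still NA after this; na.rm = FALSE only keeps the vector the same length so the replacement succeeds.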

Solution 2:

If you only need linear interpolation, base R's approxfun is an alternative. For example:

> y <- c(17.58, rep(NA, 28), 16.58)

> approxfun(which(!is.na(y)), na.omit(y))(seq_along(y))
 [1] 17.58000 17.54552 17.51103 17.47655 17.44207 17.40759 17.37310 17.33862
 [9] 17.30414 17.26966 17.23517 17.20069 17.16621 17.13172 17.09724 17.06276
[17] 17.02828 16.99379 16.95931 16.92483 16.89034 16.85586 16.82138 16.78690
[25] 16.75241 16.71793 16.68345 16.64897 16.61448 16.58000
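The same idea can be wrapped in a small helper so that it can be applied to a single column. This is only a sketch: interp_na is a hypothetical name, and it assumes the vector has at least two non-NA values. By default approxfun uses rule = 1, which returns NA outside the range of the known points, so leading and trailing NAs stay NA; rule = 2 fills them with the nearest known value instead.

```r
# Hypothetical helper: linearly interpolate NAs in a numeric vector.
# Assumes at least two non-NA values are present.
interp_na <- function(v, rule = 1) {
  ok <- !is.na(v)
  approxfun(which(ok), v[ok], rule = rule)(seq_along(v))
}

v <- c(NA, 1, NA, 3, NA)

interp_na(v)            # leading/trailing NAs are kept
# [1] NA  1  2  3 NA

interp_na(v, rule = 2)  # ends filled with the nearest known value
# [1] 1 1 2 3 3
```

Because the result always has the same length as the input, it can be assigned straight back, e.g. my_data[, 42] <- interp_na(my_data[, 42]).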