Replace all occurrences of a string in a data frame
I'm working on a data frame that has non-detects coded with '<'. Sometimes there is a space after the '<' and sometimes not, e.g. '< 2' or '<2'. I'd like to remove every occurrence of that space.
Example:
data <- data.frame(name = rep(letters[1:3], each = 3), var1 = rep('< 2', 9), var2 = rep('<3', 9))
name var1 var2
1 a < 2 <3
2 a < 2 <3
3 a < 2 <3
4 b < 2 <3
5 b < 2 <3
6 b < 2 <3
7 c < 2 <3
8 c < 2 <3
9 c < 2 <3
This is where I've got to:
I can extract all the values and make the new strings but I can't put them back in the data frame.
library(stringr)  # provides str_detect() and str_replace_all()
index <- str_detect(unlist(data), '<')
index <- matrix(index, nrow = 3)
data[index]
#[1] "< 2" "< 2" "< 2" "<3" "<3" "<3"
replacements <- str_replace_all(data[index], "<[ ]+","<")
replacements
#[1] "<2" "<2" "<2" "<3" "<3" "<3"
data[index] <- replacements
#Error in `[<-.data.frame`(`*tmp*`, index, value = c("<2", "<2", "<2", :
# unsupported matrix index in replacement
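On the error itself: replacement into a data frame accepts a logical matrix index only when the matrix has the same dimensions as the data frame. The example data has 9 rows and 3 columns, so a matrix built with nrow = 3 is 3 x 9 and gets rejected. Below is a minimal base R sketch of the same extract-and-replace idea. It assumes the columns are character vectors rather than factors, which is why stringsAsFactors = FALSE is set explicitly; vapply() and grepl() stand in here for the stringr call in the question.

```r
# Same example data as in the question; stringsAsFactors = FALSE is an
# assumption so that the columns are character vectors in any R version.
data <- data.frame(name = rep(letters[1:3], each = 3),
                   var1 = rep("< 2", 9),
                   var2 = rep("<3", 9),
                   stringsAsFactors = FALSE)

# Build a logical matrix with one column per data frame column, so that
# dim(index) equals dim(data) (9 x 3 here).
index <- vapply(data, function(x) grepl("<", x, fixed = TRUE),
                logical(nrow(data)))

# Extract the matching cells, strip any spaces after "<", and write them back.
replacements <- gsub("<[ ]+", "<", data[index])
data[index] <- replacements
```

Both the extraction and the assignment walk the cells column by column, so the replacements land back in the cells they came from.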
If you are only looking to replace all occurrences of "< " (with space) with "<" (no space), then you can do an lapply over the data frame, with a gsub for replacement:
> data <- data.frame(lapply(data, function(x) {
+ gsub("< ", "<", x)
+ }))
> data
name var1 var2
1 a <2 <3
2 a <2 <3
3 a <2 <3
4 b <2 <3
5 b <2 <3
6 b <2 <3
7 c <2 <3
8 c <2 <3
9 c <2 <3
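A possible variation on the lapply and gsub approach, sketched under two assumptions that are not in the original post: some values may contain more than one space after the "<", and the columns are character vectors. The var1 values below are made up to show the different spacings. Assigning into data[] instead of rebuilding with data.frame() keeps the existing column names and row names untouched.

```r
# Illustrative data (not from the question): var1 mixes one space,
# two spaces, and no space after "<".
data <- data.frame(name = rep(letters[1:3], each = 3),
                   var1 = rep(c("< 2", "<  2", "<2"), 3),
                   var2 = rep("<3", 9),
                   stringsAsFactors = FALSE)

# "<\\s+" matches "<" followed by one or more whitespace characters, so
# every run of spaces after "<" is removed in a single pass.
# data[] <- ... replaces the columns in place and keeps the data frame.
data[] <- lapply(data, function(x) gsub("<\\s+", "<", x))
```

With the literal pattern "< " only one space per match is removed, so a value such as "<  2" would still contain a space afterwards.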
Equivalent to "find and replace." Don't overthink it.
Try it with one column:
library(tidyverse)
df <- data.frame(name = rep(letters[1:3], each = 3), var1 = rep('< 2', 9), var2 = rep('<3', 9))
df %>%
  mutate(var1 = str_replace(var1, " ", ""))
#> name var1 var2
#> 1 a <2 <3
#> 2 a <2 <3
#> 3 a <2 <3
#> 4 b <2 <3
#> 5 b <2 <3
#> 6 b <2 <3
#> 7 c <2 <3
#> 8 c <2 <3
#> 9 c <2 <3
Apply to all columns:
df %>%
  mutate_all(funs(str_replace(., " ", "")))
#> name var1 var2
#> 1 a <2 <3
#> 2 a <2 <3
#> 3 a <2 <3
#> 4 b <2 <3
#> 5 b <2 <3
#> 6 b <2 <3
#> 7 c <2 <3
#> 8 c <2 <3
#> 9 c <2 <3
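A note for newer package versions: funs() has been deprecated since dplyr 0.8.0, and across() (available from dplyr 1.0.0) is the current way to apply a function to several columns. Also, str_replace(., " ", "") removes only the first space anywhere in each value, not specifically the one after "<". A hedged sketch of an equivalent that targets just the whitespace following "<", assuming dplyr 1.0.0 or later and character columns:

```r
library(dplyr)
library(stringr)

df <- data.frame(name = rep(letters[1:3], each = 3),
                 var1 = rep("< 2", 9),
                 var2 = rep("<3", 9),
                 stringsAsFactors = FALSE)

# across(everything(), ...) applies the function to every column;
# str_replace_all() with "<\\s+" only touches whitespace that follows "<".
cleaned <- df %>%
  mutate(across(everything(), ~ str_replace_all(.x, "<\\s+", "<")))
```

The name cleaned is just a placeholder for the result; values without a "<" pass through unchanged.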
If the extra space was produced by uniting columns, think about making str_trim part of your workflow.
Created on 2018-03-11 by the reprex package (v0.2.0).
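The trimming suggestion above could look roughly like this. The column names qualifier and value are hypothetical, and trimws() is used as the base R counterpart of stringr's str_trim(); the idea is to strip stray whitespace from each piece before the pieces are joined, so the space never ends up in the combined string.

```r
# Hypothetical raw columns: a qualifier with stray whitespace and a value.
raw <- data.frame(qualifier = c("< ", "<", " <"),
                  value = c("2", " 2", "3"),
                  stringsAsFactors = FALSE)

# Trim both pieces first, then paste them together without a separator.
raw$result <- paste0(trimws(raw$qualifier), trimws(raw$value))
raw$result
```

Trimming only handles leading and trailing whitespace, so it helps before the columns are united but does not remove a space that is already inside a value like "< 2".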