Use gsub to remove everything before the first white space in R
I have a data frame like this:
name weight
r apple 0.5
y pear 0.4
y cherry 0.1
g watermelon 5.0
pp grape 0.5
y apple pear 0.4
... ...
I would like to remove all characters before the first white space in the name column. Can anybody help? Thank you!
Try this:
sub(".*? ", "", D$name)
Edit:
The pattern matches any character zero or more times (.*) up to and including the first space. The ? after .* makes the match "lazy" rather than "greedy", and that is what makes it stop at the first space found. So the .*? part matches everything before the first space, and the literal space in the pattern matches the first space itself. sub() then replaces that whole match with an empty string, so everything after the first space is kept.
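As a small illustration of the greedy versus lazy difference, the sketch below runs both patterns on a few of the sample names from the question. The vector name x is made up for this example.

```r
# Sample values taken from the question's name column
x <- c("r apple", "pp grape", "y apple pear")

# Greedy: .* runs to the LAST space, so only the final word survives
greedy <- sub(".* ", "", x)

# Lazy: .*? stops at the FIRST space, so everything after it is kept
lazy <- sub(".*? ", "", x)

print(greedy)  # "apple" "grape" "pear"
print(lazy)    # "apple" "grape" "apple pear"
```

The two results only differ for "y apple pear", the one value that contains more than one space.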
If D is your data frame, try
sub(".+? ", "", D$name)
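For the sample data, .+? and .*? give the same result. They should only differ when a value starts with a space, which the question's data does not contain. The sketch below uses a made-up vector x with one such value to show the difference.

```r
# Hypothetical input: the second value starts with a space
x <- c("y apple pear", " apple")

# .*? may match zero characters before the space,
# so a leading space is removed
star <- sub(".*? ", "", x)

# .+? needs at least one character before the space,
# so " apple" has no match and is returned unchanged
plus <- sub(".+? ", "", x)

print(star)  # "apple pear" "apple"
print(plus)  # "apple pear" " apple"
```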
Let's say your data frame is called 'df'
library(reshape2)
df$name = colsplit(df$name," ", names = c("chuck","name"))[,2]
The following solution does not use gsub, but it can be applied to a data frame using the pipe operator %>%.
library(tidyverse)
# The data
df <- structure(list(name = c("r apple", "y pear", "y cherry", "g watermelon",
"pp grape", "y apple pear"), weight = c(0.5, 0.4, 0.1, 5.0, 0.5, 0.4)),
class = "data.frame", row.names = c(NA, -6L))
# Remove the first characters preceding a white space in the column "name"
df2 <- df %>%
  mutate(name = str_replace(name, "^\\S* ", ""))
The regular expression "^\\S* " matches all non-whitespace characters from the beginning of the string up to and including the first space.
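If loading the tidyverse is not an option, the same anchored pattern should also work with base R's sub(). This is a sketch of an equivalent call, not part of the answer above, and the vector x is invented for the example. It also shows that a value without any space is left unchanged.

```r
# Sample names from the question plus one made-up value with no space
x <- c("r apple", "y apple pear", "watermelon")

# ^ anchors at the start, \\S* takes the leading non-whitespace run,
# and the literal space consumes the first space
res <- sub("^\\S* ", "", x)

print(res)  # "apple" "apple pear" "watermelon"
```

Assigning the result back with df$name <- sub("^\\S* ", "", df$name) would update the column in place.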