R: Select values from data table in range

I have a data table in R:

name    date
----    ----
John    1156649280
Adam    1255701960
...etc...

I want to get all of the rows that have a date within a range. In SQL, I might say SELECT * FROM mytable WHERE date > 5 AND date < 15

What is the equivalent in R, to select rows based on the range of values in a particular column?


Solution 1:

Construct some data

df <- data.frame( name=c("John", "Adam"), date=c(3, 5) )

Extract exact matches:

subset(df, date==3)

  name date
1 John    3

Extract matches in range:

subset(df, date>4 & date<6)

  name date
2 Adam    5

The following syntax produces identical results:

df[df$date>4 & df$date<6, ]

  name date
2 Adam    5

Solution 2:

Lots of options here, but one of the easiest to follow is subset. Consider:

> set.seed(43)
> df <- data.frame(name = sample(letters, 100, TRUE), date = sample(1:500, 100, TRUE))
> 
> subset(df, date > 5 & date < 15)
   name date
11    k   10
67    y   12
86    e    8

You can also insert logic directly into the index for your data.frame. The comma separates the rows from columns. We just have to remember that R indexes rows first, then columns. So here we are saying rows with date > 5 & < 15 and then all columns:

df[df$date > 5 & df$date < 15 ,]

I'd also recommend checking out the help pages for subset, ?subset and the logical operators ?"&"

Solution 3:

One should also consider another intuitive way to do this using filter() from dplyr. Here are some examples:

set.seed(123)
df <- data.frame(name = sample(letters, 100, TRUE),
                 date = sample(1:500, 100, TRUE))
library(dplyr)
filter(df, date < 50) # date less than 50
filter(df, date %in% 50:100) # date between 50 and 100
filter(df, date %in% 1:50 & name == "r") # date between 1 and 50 AND name is "r"
filter(df, date %in% 1:50 | name == "r") # date between 1 and 50 OR name is "r"

# You can also use the pipe (%>%) operator
df %>% filter(date %in% 1:50 | name == "r")