Removing specific rows from a dataframe
DF[ ! ( ( DF$sub ==1 & DF$day==2) | ( DF$sub ==3 & DF$day==4) ) , ] # note the ! (negation)
Or if sub is a factor as suggested by your use of quotes:
DF[ ! paste(sub,day,sep="_") %in% c("1_2", "3_4"), ]
Could also use subset:
subset(DF, ! paste(sub,day,sep="_") %in% c("1_2", "3_4") )
(And I endorse the use of which
in Dirk's answer when using "[" even though some claim it is not needed.)
This boils down to two distinct steps:
- Figure out when your condition is true, and hence compute a vector of booleans, or, as I prefer, their indices by wrapping it into
which()
- Create an updated
data.frame
by excluding the indices from the previous step.
Here is an example:
R> set.seed(42)
R> DF <- data.frame(sub=rep(1:4, each=4), day=sample(1:4, 16, replace=TRUE))
R> DF
sub day
1 1 4
2 1 4
3 1 2
4 1 4
5 2 3
6 2 3
7 2 3
8 2 1
9 3 3
10 3 3
11 3 2
12 3 3
13 4 4
14 4 2
15 4 2
16 4 4
R> ind <- which(with( DF, sub==2 & day==3 ))
R> ind
[1] 5 6 7
R> DF <- DF[ -ind, ]
R> table(DF)
day
sub 1 2 3 4
1 0 1 0 3
2 1 0 0 0
3 0 1 3 0
4 0 2 0 2
R>
And we see that sub==2
has only one entry remaining with day==1
.
Edit The compound condition can be done with an 'or' as follows:
ind <- which(with( DF, (sub==1 & day==2) | (sub=3 & day=4) ))
and here is a new full example
R> set.seed(1)
R> DF <- data.frame(sub=rep(1:4, each=5), day=sample(1:4, 20, replace=TRUE))
R> table(DF)
day
sub 1 2 3 4
1 1 2 1 1
2 1 0 2 2
3 2 1 1 1
4 0 2 1 2
R> ind <- which(with( DF, (sub==1 & day==2) | (sub==3 & day==4) ))
R> ind
[1] 1 2 15
R> DF <- DF[-ind, ]
R> table(DF)
day
sub 1 2 3 4
1 1 0 1 1
2 1 0 2 2
3 2 1 1 0
4 0 2 1 2
R>