Count number of rows within each group
I have a dataframe and I would like to count the number of rows within each group. I regularly use the aggregate function to sum data as follows:
df2 <- aggregate(x ~ Year + Month, data = df1, sum)
Now, I would like to count observations but can't seem to find the proper argument for FUN. Intuitively, I thought it would be as follows:
df2 <- aggregate(x ~ Year + Month, data = df1, count)
But, no such luck.
Any ideas?
Some toy data:
set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))
Solution 1:
Current best practice (tidyverse) is:
library(dplyr)
df1 %>% count(Year, Month)
Solution 2:
Following @Joshua's suggestion, here's one way you might count the number of observations in your df dataframe where Year = 2007 and Month = Nov (assuming they are columns):
nrow(df[df$Year == 2007 & df$Month == "Nov", ])
and with aggregate, following @GregSnow:
aggregate(x ~ Year + Month, data = df, FUN = length)
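To make the aggregate approach concrete, here is a minimal sketch run against the toy data from the question. The names counts and tab are only illustrative, and the table() variant is an extra base R alternative rather than something the answer above proposed.

```r
# Toy data from the question, rebuilt so the block runs on its own
set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))

# length() counts the x values that fall into each Year/Month group
counts <- aggregate(x ~ Year + Month, data = df1, FUN = length)
names(counts)[names(counts) == "x"] <- "n"  # rename the count column

# Alternative: a contingency table. Unlike aggregate(), this also lists
# Year/Month combinations that never occur, with Freq = 0
tab <- as.data.frame(table(Year = df1$Year, Month = df1$Month))

counts
tab
```

One caveat: the formula interface of aggregate uses na.action = na.omit by default, so rows where x (or a grouping column) is NA are dropped before counting.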
Solution 3:
The dplyr package does this with the count/tally commands, or the n() function:
First, some data:
df <- data.frame(x = rep(1:6, rep(c(1, 2, 3), 2)), year = 1993:2004, month = c(1, 1:11))
Now the count:
library(dplyr)
count(df, year, month)
#piping
df %>% count(year, month)
We can also use a slightly longer version with piping and the n() function:
df %>%
  group_by(year, month) %>%
  summarise(number = n())
or the tally function:
df %>%
  group_by(year, month) %>%
  tally()
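As a rough check that the short and long forms agree, the sketch below compares count with the explicit group_by/summarise version on the same data. The names short and long are illustrative, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Same example data as above
df <- data.frame(x = rep(1:6, rep(c(1, 2, 3), 2)),
                 year = 1993:2004,
                 month = c(1, 1:11))

# count() is roughly equivalent to group_by() followed by summarise(n = n())
short <- df %>% count(year, month)

long <- df %>%
  group_by(year, month) %>%
  summarise(n = n(), .groups = "drop")  # "drop" returns an ungrouped result

# Both have one row per year/month combination, with the count in column n
all(short$n == long$n)
```

Without .groups = "drop", summarise and tally keep the result grouped by year, which can matter if further verbs are chained afterwards.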
Solution 4:
An old question without a data.table solution. So here goes...
Using .N:
library(data.table)
DT <- data.table(df)
DT[, .N, by = list(year, month)]
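The same idea applied to the toy data from the question might look like the sketch below. The names counts and sorted are illustrative; .() and keyby are standard data.table features, shown here as optional variations on the call above.

```r
library(data.table)

# Toy data from the question, rebuilt so the block runs on its own
set.seed(2)
df1 <- data.frame(x = 1:20,
                  Year = sample(2012:2014, 20, replace = TRUE),
                  Month = sample(month.abb[1:3], 20, replace = TRUE))
DT <- as.data.table(df1)

# .() is an alias for list() inside a data.table call
counts <- DT[, .N, by = .(Year, Month)]

# keyby groups the same way but also sorts the result by Year, then Month
sorted <- DT[, .N, keyby = .(Year, Month)]

counts
sorted
```

With by, groups appear in order of first occurrence in the data; keyby is convenient when a sorted result is wanted.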