R: Compute Cohen's d based on t-statistic of a coefficient in multiple linear regression
I'm looking at age- and sex-adjusted group differences in a continuous variable of interest. As done in other studies in my field, I want to calculate Cohen's d based on contrasts extracted from a multiple linear regression model.
The original formula (Nakagawa & Cuthill, 2007) is as follows:
d = t * (n1 + n2) / (sqrt(n1 * n2) * sqrt(df'))
where:
n1 = sample size in Group 1
n2 = sample size in Group 2
df' = degrees of freedom used for the corresponding t value in the linear model
t = t-statistic corresponding to the contrast of interest
So far I've attempted to apply this in R, but the results are looking strange (much larger effect sizes than expected).
Here's some simulated data:
library(broom)
df = data.frame(ID = c(1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010),
                Group = as.numeric(c('0','1','0','0','1','1','0','1','0','1')),
                age = as.numeric(c('23','28','30','15','7','18','29','27','14','22')),
                sex = as.numeric(c('1','0','1','0','0','1','1','0','0','1')),
                test_score = as.numeric(c('18','20','19','15','20','23','19','25','10','14')))
# run lm and extract regression coefficients
model <- lm(test_score ~ Group + age + sex, data = df)
tidy_model <- tidy(model)
tidy_model
# A tibble: 4 x 5
#   term        estimate std.error statistic p.value
#   <chr>          <dbl>     <dbl>     <dbl>   <dbl>
# 1 (Intercept)   11.1       4.41     2.52    0.0451
# 2 Group          4.63      2.65     1.75    0.131
# 3 age            0.225     0.198    1.13    0.300
# 4 sex            0.131     2.91     0.0452  0.965
t_statistic <- tidy_model[2,4] # = 1.76
n <- 5 #(equal n of participants in Group1 as in Group2)
cohens_d <- t_statistic*(n + n)/(sqrt(n * n) * sqrt(1)) # 1 dof for 1 estimated parameter (group contrast)
cohens_d # = 3.518096
Could you please flag up where I'm going wrong?
Solution 1:
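You have set the degrees of freedom to 1. However, df' in the formula is the degrees of freedom of the t value itself, i.e. the residual degrees of freedom of the model. Here that is 6 (10 observations minus 4 estimated coefficients), which you can see if you type: summary(model).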
If you set the degrees of freedom to 6, your Cohen's d becomes 3.518 / sqrt(6), or roughly 1.4, which should be more in line with what you expect.
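As a minimal sketch of the corrected calculation, the snippet below refits the model from the question and reads the residual degrees of freedom and the group sizes from the data rather than typing them in. The helper name cohens_d_from_t is hypothetical (it does not come from any package) and simply wraps the formula used in the question. The t-statistic is extracted as a plain number from summary(model) instead of from the tidy tibble.

```r
# Data from the question (ID column omitted; it is not used in the model)
df <- data.frame(
  Group      = c(0, 1, 0, 0, 1, 1, 0, 1, 0, 1),
  age        = c(23, 28, 30, 15, 7, 18, 29, 27, 14, 22),
  sex        = c(1, 0, 1, 0, 0, 1, 1, 0, 0, 1),
  test_score = c(18, 20, 19, 15, 20, 23, 19, 25, 10, 14)
)

model <- lm(test_score ~ Group + age + sex, data = df)

# t-statistic for the Group coefficient, as a plain numeric value
t_stat <- summary(model)$coefficients["Group", "t value"]

# Residual degrees of freedom: 10 observations - 4 coefficients = 6
df_resid <- df.residual(model)

# Group sizes computed from the data
n1 <- sum(df$Group == 0)
n2 <- sum(df$Group == 1)

# Hypothetical helper: d = t * (n1 + n2) / (sqrt(n1 * n2) * sqrt(df'))
cohens_d_from_t <- function(t, n1, n2, df_prime) {
  t * (n1 + n2) / (sqrt(n1 * n2) * sqrt(df_prime))
}

cohens_d <- cohens_d_from_t(t_stat, n1, n2, df_resid)
cohens_d
```

Using df.residual(model) keeps the calculation correct if covariates are added or removed later, since the degrees of freedom then update automatically.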