library(tidyverse)
library(rstatix)
Effect Size
What is the Effect Size of the independent two-sample t-test with unequal variances?
The effect size is the difference between the two means divided by a combined standard deviation. Because the two variances are not assumed to be equal, the combined standard deviation is the square root of the unweighted average of the two sample variances. The effect size measures the magnitude of the difference between the two means in standard deviation units, so it can be read as the strength of the relationship between the grouping variable and the outcome.
The formula for the effect size is:
\[d = \frac{\mu_1 - \mu_2}{\sqrt{\frac{\sigma_1^2 + \sigma_2^2}{2}}}\]
where \(\mu_1\) and \(\mu_2\) are the means of the two populations and \(\sigma_1\) and \(\sigma_2\) are the standard deviations of the two populations.
The formula for the estimated effect size is:
\[d = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2 + s_2^2}{2}}}\]
where \(\bar{x}_1\) and \(\bar{x}_2\) are the means of the two samples and \(s_1\) and \(s_2\) are the standard deviations of the two samples.
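As a minimal sketch, the estimated effect size can be computed by hand in base R. The variable names (`len_oj`, `len_vc`, `s_avg`, `d_manual`) are illustrative and not part of any package; the data are the `ToothGrowth` groups used in the Example section.

```r
# Base R only; ToothGrowth ships with the datasets package.
data("ToothGrowth")

# Split tooth length by supplement type (OJ vs. VC).
len_oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
len_vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Denominator: square root of the average of the two sample variances.
s_avg <- sqrt((var(len_oj) + var(len_vc)) / 2)

# Estimated effect size d: difference in sample means over s_avg.
d_manual <- (mean(len_oj) - mean(len_vc)) / s_avg
print(round(d_manual, 3))
```

The rounded value should agree with the `effsize` of 0.495 that `cohens_d(len ~ supp, var.equal = FALSE)` reports in the Example section.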
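Conventional benchmarks for interpreting the absolute value of the effect size are given below. A value that falls between two benchmarks takes the lower label; for instance, 0.495 is reported as small in the example output.
Effect Size | Lower Bound of Absolute d |
---|---|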
Small | 0.2 |
Medium | 0.5 |
Large | 0.8 |
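The benchmarks can be applied programmatically. The sketch below uses a hypothetical helper, `classify_d`, which is not an rstatix function. It assumes that values below 0.2 are labeled "negligible" and that a value exactly on a benchmark takes the higher label; both are choices made for this illustration.

```r
# Hypothetical helper (not part of rstatix): label |d| using the
# benchmarks 0.2, 0.5 and 0.8 from the table.
classify_d <- function(d) {
  cut(
    abs(d),
    breaks = c(0, 0.2, 0.5, 0.8, Inf),
    labels = c("negligible", "small", "medium", "large"),
    right = FALSE,          # intervals are [lower, upper)
    ordered_result = TRUE   # keep the labels ordered by size
  )
}

# The sign of d is ignored; only its magnitude is classified.
magnitudes <- classify_d(c(0.1, 0.495, -0.6, 2.46))
print(as.character(magnitudes))
```

Packages may treat the boundary values differently, so results that sit exactly on 0.2, 0.5 or 0.8 are worth checking against the package documentation.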
A nice reference
Example
The rstatix package provides a simple and intuitive pipe-friendly framework, coherent with the ‘tidyverse’ design philosophy, for performing basic statistical tests, including the t-test, Wilcoxon test, ANOVA, Kruskal-Wallis test and correlation analyses.
The output of each test is automatically transformed into a tidy data frame to facilitate visualization.
data("ToothGrowth")
df <- ToothGrowth
get_summary_stats(df)
# A tibble: 2 × 13
variable n min max median q1 q3 iqr mad mean sd se
<fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 len 60 4.2 33.9 19.2 13.1 25.3 12.2 9.04 18.8 7.65 0.988
2 dose 60 0.5 2 1 0.5 2 1.5 0.741 1.17 0.629 0.081
# ℹ 1 more variable: ci <dbl>
# One-sample test
df %>% t_test(len ~ 1, mu = 0)
# A tibble: 1 × 7
.y. group1 group2 n statistic df p
* <chr> <chr> <chr> <int> <dbl> <dbl> <dbl>
1 len 1 null model 60 19.1 59 6.94e-27
df |> cohens_d(len ~ 1)
# A tibble: 1 × 6
.y. group1 group2 effsize n magnitude
* <chr> <chr> <chr> <dbl> <int> <ord>
1 len 1 null model 2.46 60 large
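For the one-sample case, the effect size against a null value of 0 is the sample mean divided by the sample standard deviation. The following base R sketch reproduces the figures above by hand; the names `d_one` and `t_one` are illustrative only.

```r
# Base R only; ToothGrowth ships with the datasets package.
data("ToothGrowth")
len <- ToothGrowth$len
n <- length(len)

# One-sample d against mu = 0: (mean - mu) / sample SD.
d_one <- (mean(len) - 0) / sd(len)

# The one-sample t statistic is d scaled by sqrt(n).
t_one <- d_one * sqrt(n)

print(round(c(d = d_one, t = t_one), 2))
```

This makes the link between the two outputs explicit: the t statistic grows with the sample size, while d does not.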
# Two-samples unpaired test
df %>% t_test(len ~ supp)
# A tibble: 1 × 8
.y. group1 group2 n1 n2 statistic df p
* <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl>
1 len OJ VC 30 30 1.92 55.3 0.0606
df |> cohens_d(len ~ supp, var.equal = FALSE)
# A tibble: 1 × 7
.y. group1 group2 effsize n1 n2 magnitude
* <chr> <chr> <chr> <dbl> <int> <int> <ord>
1 len OJ VC 0.495 30 30 small
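To see where the unequal-variance test statistic and its fractional degrees of freedom come from, the sketch below recomputes them in base R and cross-checks the result against `stats::t.test`, which performs the Welch test by default. The variable names are illustrative.

```r
# Base R only; ToothGrowth ships with the datasets package.
data("ToothGrowth")
len_oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
len_vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]
n_oj <- length(len_oj)
n_vc <- length(len_vc)

# Squared standard error of each group mean.
se2_oj <- var(len_oj) / n_oj
se2_vc <- var(len_vc) / n_vc

# Welch t statistic: the variances are not pooled.
t_welch <- (mean(len_oj) - mean(len_vc)) / sqrt(se2_oj + se2_vc)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_welch <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (n_oj - 1) + se2_vc^2 / (n_vc - 1))

# Reference values from stats::t.test (var.equal = FALSE by default).
ref <- t.test(len ~ supp, data = ToothGrowth)

print(round(c(t = t_welch, df = df_welch), 2))
```

Because both groups have the same size n = 30 here, the effect size and the test statistic are related by d = t * sqrt(2 / n), which links the 1.92 and 0.495 shown above.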