
Effect Size

What is the effect size of the independent two-sample t-test with unequal variances?

The effect size is the difference between the two means divided by a standardizing standard deviation. When the two variances are not assumed equal, that standardizer is the square root of the average of the two variances, rather than the sample-size-weighted pooled standard deviation used in the equal-variance case. The effect size measures the magnitude of the difference between the two means in standard-deviation units, so it does not depend on the measurement scale. It can also be read as a measure of how strongly the grouping variable is related to the outcome.

The formula for the effect size is:

\[d = \frac{\mu_1 - \mu_2}{\sqrt{\frac{\sigma_1^2 + \sigma_2^2}{2}}}\]

where \(\mu_1\) and \(\mu_2\) are the population means of the two groups and \(\sigma_1\) and \(\sigma_2\) are their population standard deviations.

The estimated effect size replaces the population parameters with their sample estimates:

\[d = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2 + s_2^2}{2}}}\]

where \(\bar{x}_1\) and \(\bar{x}_2\) are the sample means and \(s_1\) and \(s_2\) are the sample standard deviations.
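As a rough sketch, the estimated effect size can be computed directly from two numeric vectors in base R. The function name `cohens_d_unpooled` and the toy vectors are made up for illustration and are not part of any package.

```r
# Hypothetical helper: Cohen's d with the unpooled standardizer,
# i.e. the square root of the average of the two sample variances.
cohens_d_unpooled <- function(x, y) {
  (mean(x) - mean(y)) / sqrt((var(x) + var(y)) / 2)
}

# Toy data (illustrative only, not from the post)
x <- c(2, 4, 6)  # mean 4, variance 4
y <- c(1, 2, 3)  # mean 2, variance 1

d <- cohens_d_unpooled(x, y)
d  # 2 / sqrt(2.5), about 1.26
```

Swapping the two arguments flips the sign of the result, so the order of the groups matters when the effect size is reported.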

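Conventional benchmarks for interpreting the absolute value of the effect size are:

Effect Size    Range
Small          0.2 ≤ |d| < 0.5
Medium         0.5 ≤ |d| < 0.8
Large          |d| ≥ 0.8

Values below 0.2 are usually regarded as negligible.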
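The benchmarks above can be turned into a small labelling function. This is only a sketch: the name `effect_magnitude` is hypothetical, the "negligible" label for values under 0.2 is an assumption, and so is the choice to assign a value that falls exactly on a threshold to the higher category.

```r
# Hypothetical helper: label |d| using the 0.2 / 0.5 / 0.8 benchmarks.
# Assumptions: |d| < 0.2 is called "negligible", and boundary values
# go to the higher category (intervals are closed on the left).
effect_magnitude <- function(d) {
  cut(abs(d),
      breaks = c(-Inf, 0.2, 0.5, 0.8, Inf),
      labels = c("negligible", "small", "medium", "large"),
      right = FALSE)
}

# The sign of d is ignored; only its absolute value is classified.
effect_magnitude(c(0.1, 0.3, -0.6, 1.2))
```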

A nice reference

Example

library(tidyverse)
library(rstatix)

The rstatix package provides a simple and intuitive pipe-friendly framework, coherent with the ‘tidyverse’ design philosophy, for performing basic statistical tests, including t-test, Wilcoxon test, ANOVA, Kruskal-Wallis and correlation analyses.

The output of each test is automatically transformed into a tidy data frame to facilitate visualization.

data("ToothGrowth")
df <- ToothGrowth
get_summary_stats(df)
# A tibble: 2 × 13
  variable     n   min   max median    q1    q3   iqr   mad  mean    sd    se
  <fct>    <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 len         60   4.2  33.9   19.2  13.1  25.3  12.2 9.04  18.8  7.65  0.988
2 dose        60   0.5   2      1     0.5   2     1.5 0.741  1.17 0.629 0.081
# ℹ 1 more variable: ci <dbl>
# One-sample test of mean tooth length against mu = 0
df |> t_test(len ~ 1, mu = 0)
# A tibble: 1 × 7
  .y.   group1 group2         n statistic    df        p
* <chr> <chr>  <chr>      <int>     <dbl> <dbl>    <dbl>
1 len   1      null model    60      19.1    59 6.94e-27
# One-sample Cohen's d: (mean - mu) / sd, with mu = 0 by default
df |> cohens_d(len ~ 1)
# A tibble: 1 × 6
  .y.   group1 group2     effsize     n magnitude
* <chr> <chr>  <chr>        <dbl> <int> <ord>    
1 len   1      null model    2.46    60 large    
# Two-sample unpaired test (Welch's t-test, since t_test() defaults to var.equal = FALSE)
df |> t_test(len ~ supp)
# A tibble: 1 × 8
  .y.   group1 group2    n1    n2 statistic    df      p
* <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>  <dbl>
1 len   OJ     VC        30    30      1.92  55.3 0.0606
# Cohen's d with the unpooled standardizer sqrt((s1^2 + s2^2) / 2)
df |> cohens_d(len ~ supp, var.equal = FALSE)
# A tibble: 1 × 7
  .y.   group1 group2 effsize    n1    n2 magnitude
* <chr> <chr>  <chr>    <dbl> <int> <int> <ord>    
1 len   OJ     VC       0.495    30    30 small
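
As a sanity check, the two effect sizes reported by rstatix above can be reproduced by hand from the formulas, using base R only. The variable names here (`oj`, `vc`, `d_two`, `d_one`) are chosen for illustration.

```r
# Base R only; ToothGrowth ships with R's datasets package.
data("ToothGrowth")

# Tooth lengths for each supplement group
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Two-sample d with the unpooled standardizer sqrt((s1^2 + s2^2) / 2)
d_two <- (mean(oj) - mean(vc)) / sqrt((var(oj) + var(vc)) / 2)

# One-sample d against mu = 0: mean divided by standard deviation
d_one <- mean(ToothGrowth$len) / sd(ToothGrowth$len)

# Should agree with the rstatix values (about 0.495 and 2.46)
c(two_sample = d_two, one_sample = d_one)
```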