---
title: "effect_size"
author: "Prof. Eric A. Suess"
format:
  html:
    embed-resources: true
---

## What is the Effect Size of the independent two-sample t-test with unequal variances?

The *effect size* is the difference between the two means divided by the pooled standard deviation. Here the pooled standard deviation is the square root of the average of the two variances. The effect size measures the magnitude of the difference between the two means in standard deviation units, and so the strength of the relationship between the grouping variable and the response.

The formula for the population effect size is:

$$d = \frac{\mu_1 - \mu_2}{\sqrt{\frac{\sigma_1^2 + \sigma_2^2}{2}}}$$

where $\mu_1$ and $\mu_2$ are the means of the two populations and $\sigma_1$ and $\sigma_2$ are the standard deviations of the two populations.

The formula for the estimated effect size is:

$$d = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2 + s_2^2}{2}}}$$

where $\bar{x}_1$ and $\bar{x}_2$ are the means of the two samples and $s_1$ and $s_2$ are the standard deviations of the two samples.

Cohen's conventional benchmarks for interpreting the absolute value of $d$ are:

| Effect Size | Value of $d$ |
|:-----------:|:------------:|
|    Small    |     0.2      |
|   Medium    |     0.5      |
|    Large    |     0.8      |

A nice [reference](https://www.datanovia.com/en/lessons/t-test-effect-size-using-cohens-d-measure/#:~:text=The%20effect%20size%20for%20a,the%20difference%2C%20as%20shown%20below.&text=Where%20D%20is%20the%20differences%20of%20the%20paired%20samples%20values.) on computing Cohen's d for t-tests.

## Example

```{r}
#| warning: false
#| message: false
library(tidyverse)
library(rstatix)
```

The [rstatix](https://rpkgs.datanovia.com/rstatix/) package provides a simple and intuitive pipe-friendly framework, coherent with the ‘tidyverse’ design philosophy, for performing basic statistical tests, including t-test, Wilcoxon test, ANOVA, Kruskal-Wallis and correlation analyses. The output of each test is automatically transformed into a tidy data frame to facilitate visualization.

```{r}
data("ToothGrowth")
df <- ToothGrowth
```

```{r}
get_summary_stats(df)
```

```{r}
# One-sample test
df %>% t_test(len ~ 1, mu = 0)
```

```{r}
# One-sample effect size
df |> cohens_d(len ~ 1)
```

```{r}
# Two-samples unpaired test
df %>% t_test(len ~ supp)
```

```{r}
# Two-samples effect size, variances not assumed equal
df |> cohens_d(len ~ supp, var.equal = FALSE)
```
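
As a rough cross-check of the formula for the estimated effect size, the sketch below recomputes $d$ for `len ~ supp` by hand in base R. The names `oj`, `vc` and `d_manual` are illustrative and are not part of rstatix. The result should be close to the `effsize` value reported by `cohens_d()` with `var.equal = FALSE`, assuming that function uses the same average-variance denominator.

```{r}
# Manual check of the estimated effect size (illustrative names, base R only)
data("ToothGrowth")

# Tooth length for each supplement group
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Difference in sample means divided by the square root
# of the average of the two sample variances
d_manual <- (mean(oj) - mean(vc)) / sqrt((var(oj) + var(vc)) / 2)
d_manual
```

Because the two groups have the same number of observations, the average-variance denominator used here coincides with the usual pooled standard deviation, so the equal-variance and unequal-variance versions of $d$ agree for this data set.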