---
title: "Statistics 652 - Section 1 or 2: Homework 1"
author: "Prof. Eric A. Suess"
date: "February 17, 2020"
output: html_notebook
---

### Exercise 7.1

Using the [ModernDive](https://moderndive.com) book, [Chapter 8 Bootstrapping and Confidence Intervals](https://moderndive.com/8-confidence-intervals.html).

```{r}
library(pacman)
p_load(mdsr, tidyverse, mosaicData, infer)
```

```{r}
Gestation
```

```{r}
# Sample size, mean, and standard deviation of age
Gestation %>%
  select(age) %>%
  drop_na(age) %>%
  summarize(
    n = n(),
    age_mean = mean(age, na.rm = TRUE),
    age_sd = sd(age, na.rm = TRUE))
```

```{r}
Gestation %>%
  select(age) %>%
  drop_na(age) %>%
  ggplot(aes(x = age)) +
  geom_histogram(binwidth = 1, color = "white")
```

```{r}
# Observed sample mean, used later as the point estimate
x_bar <- Gestation %>%
  drop_na(age) %>%
  select(age) %>%
  specify(response = age) %>%
  calculate(stat = "mean")

x_bar
```

Generate the bootstrap distribution of the sample mean. Take 1000 random samples with replacement from the original sample and compute the sample mean of each bootstrap sample.

```{r}
Gestation_distribution <- Gestation %>%
  drop_na(age) %>%
  select(age) %>%
  specify(response = age) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")
```

Bootstrap distribution.

```{r}
Gestation_distribution %>% visualize()
```

Bootstrap confidence interval.

```{r}
Gestation_distribution %>%
  get_confidence_interval(level = 0.95, type = "percentile")
```

```{r}
percentile_ci <- Gestation_distribution %>%
  get_confidence_interval(level = 0.95, type = "percentile")

percentile_ci
```

```{r}
Gestation_distribution %>%
  visualize() +
  shade_confidence_interval(endpoints = percentile_ci)
```

The traditional one-sample confidence interval.
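
For reference, the classical t-based one-sample interval, $\bar{x} \pm t^{*} s / \sqrt{n}$, can be computed directly in base R. The chunk below is only a minimal sketch and is not part of the original solution: the helper `t_interval` and the small `ages` vector are hypothetical, and the values in `ages` are made up for illustration rather than taken from `Gestation`.

```{r}
# Hypothetical helper (not from the original notebook):
# classical one-sample t confidence interval for a mean.
t_interval <- function(x, level = 0.95) {
  x <- x[!is.na(x)]                      # drop missing values
  n <- length(x)
  x_bar <- mean(x)                       # point estimate
  se <- sd(x) / sqrt(n)                  # estimated standard error of the mean
  t_star <- qt(1 - (1 - level) / 2, df = n - 1)  # t critical value
  c(lower = x_bar - t_star * se, upper = x_bar + t_star * se)
}

# Made-up ages, for illustration only
ages <- c(27, 33, 28, 36, 23, 25, 33, 23, 25, 30, NA, 27)
t_interval(ages)
```

On the homework data this would be called as `t_interval(Gestation$age)`, and the result should agree with `t.test(Gestation$age)$conf.int`.
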
```{r}
# Standard error method: point estimate plus or minus a multiple of the bootstrap SE
standard_error_ci <- Gestation_distribution %>%
  get_confidence_interval(type = "se", point_estimate = x_bar)

standard_error_ci
```

```{r}
Gestation_distribution %>%
  visualize() +
  shade_confidence_interval(endpoints = standard_error_ci)
```

### Exercise 7.2

```{r}
Gestation %>%
  select(age) %>%
  drop_na(age) %>%
  summarize(
    n = n(),
    age_mean = mean(age, na.rm = TRUE),
    age_sd = sd(age, na.rm = TRUE))
```

```{r}
# Bootstrap distribution of the sample median (overwrites the earlier object)
Gestation_distribution <- Gestation %>%
  drop_na(age) %>%
  select(age) %>%
  specify(response = age) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "median")
```

Bootstrap distribution.

```{r}
Gestation_distribution %>% visualize()
```

```{r}
percentile_ci <- Gestation_distribution %>%
  get_confidence_interval(level = 0.95, type = "percentile")

percentile_ci
```

```{r}
Gestation_distribution %>%
  visualize() +
  shade_confidence_interval(endpoints = percentile_ci)
```

### Exercise 7.5

Read [Appendix E Regression modeling](https://mdsr-book.github.io/excerpts/mdsr-regression.pdf).

```{r}
ds
```