---
title: "Practice for the practice Quiz"
output:
  word_document: default
  pdf_document: default
  html_notebook: default
---

Using Problem 12.2.1 Exercise 2 as a guide, use the ideas from Chapter 13 to answer the questions for *table2*.

1. Compute the rate and include it in a final dataframe with the years as columns.

**Answer:** The first answer approaches the problem by splitting the dataset into two and then joining the two datasets.

```{r message=FALSE}
library(tidyverse)
table2
```

```{r}
table2 %>%
  arrange(type)
```

```{r}
# Keep only the "cases" rows and give the count column a meaningful name.
table2_cases <- table2 %>%
  filter(type == "cases") %>%
  select(country, year, count) %>%
  rename(cases = count)
table2_cases
```

```{r}
# Do the same for the "population" rows.
table2_pop <- table2 %>%
  filter(type == "population") %>%
  select(country, year, count) %>%
  rename(population = count)
table2_pop
```

Now join the two datasets using two variables as the unique key.

```{r}
table2_join <- table2_cases %>%
  inner_join(table2_pop, by = c("country", "year"))
table2_join
```

Create the new column.

```{r}
# Rate is expressed per 10,000 people.
table2_new <- table2_join %>%
  mutate(rate = cases / population * 10000)
table2_new
```

Now spread the data so that each year becomes its own column.

```{r}
table2_new_spread <- table2_new %>%
  select(country, year, rate) %>%
  spread(year, rate)
table2_new_spread
```

Now try the newer function *pivot_wider()*. Note that this function was introduced in *tidyr* 1.0 as the replacement for *spread()*.

```{r}
# Name id_cols explicitly; passing it by position is deprecated in recent tidyr.
table2_new_spread2 <- table2_new %>%
  select(country, year, rate) %>%
  pivot_wider(id_cols = country, names_from = year, values_from = rate)
table2_new_spread2
```

Are the two data frames the same? Let's give the *comparedf()* function a try. It is from the *arsenal* R package.

```{r}
library(arsenal)
comparedf(table2_new_spread, table2_new_spread2)
```

**Alternative Solution:** Can we use spread from the beginning? Yes.
```{r}
table2 %>%
  spread(key = type, value = count) %>%
  # Same scaling as the first answer (rate per 10,000 people).
  mutate(rate = cases / population * 10000) %>%
  select(-cases, -population) %>%
  spread(key = year, value = rate)
```

Or, using *pivot_wider()*:

```{r}
table2 %>%
  pivot_wider(names_from = type, values_from = count) %>%
  # Same scaling as the first answer (rate per 10,000 people).
  mutate(rate = cases / population * 10000) %>%
  select(-cases, -population) %>%
  pivot_wider(names_from = year, values_from = rate)
```

2. Now make a clustered bar graph. Which table is the one to use, table2_new or table2_new_spread?

**Answer:** Use the table that is in tidy format, which is table2_new. Note the use of the as.factor() function, which turns year into a discrete variable so that each year gets its own bar color. This is our next topic of discussion.

```{r}
table2_new %>%
  ggplot(aes(x = country, y = rate, fill = as.factor(year))) +
  geom_bar(stat = "identity", position = "dodge") +
  theme_light()
```

Or you can make the plot using year to group the bars.

```{r}
table2_new %>%
  ggplot(aes(x = as.factor(year), y = rate, fill = country)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme_light()
```
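
The plots for question 2 rely on as.factor(), so here is a minimal sketch of what that call changes. The names `rates` and `rates_f` are introduced only for this illustration, and the rate table is rebuilt from *table2* so the chunk runs on its own. It also uses geom_col(), which is the ggplot2 shorthand for geom_bar(stat = "identity"), as one possible way to write the same plot.

```{r}
library(tidyverse)

# Rebuild a tidy rate table from table2 (illustrative name: rates).
rates <- table2 %>%
  pivot_wider(names_from = type, values_from = count) %>%
  mutate(rate = cases / population * 10000)

# year is stored as a number, so ggplot2 would treat it as continuous.
is.numeric(rates$year)

# Converting to a factor makes each distinct year a discrete level.
rates_f <- rates %>%
  mutate(year = as.factor(year))
levels(rates_f$year)

# With a factor, fill gets one color per year and the bars can be dodged.
rates_f %>%
  ggplot(aes(x = country, y = rate, fill = year)) +
  geom_col(position = "dodge") +
  theme_light()
```

Converting the column once with mutate() keeps the aes() call short and gives the legend the title "year" rather than "as.factor(year)".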