---
title: "Practice for the Practice Quiz"
output:
  word_document: default
  pdf_document: default
  html_notebook: default
---

Using Problem 12.2.1 Exercise 2 as a guide, use the ideas from Chapter 13 to answer the questions for *table2*.

1. Compute the rate and include it in a final dataframe with the years as columns.  

**Answer:**

The first answer splits the dataset into two, one for cases and one for population, and then joins the two datasets.

```{r message=FALSE}
library(tidyverse)

table2
```

```{r}
table2 %>% arrange(type)
```


```{r}
table2_cases <- table2 %>% filter(type == "cases") %>% 
  select(country, year, count) %>%
  rename(cases = count)
table2_cases
```


```{r}
table2_pop <- table2 %>% filter(type == "population") %>% 
  select(country, year, count) %>%
  rename(population = count)
table2_pop
```

Now join the two datasets using two variables as the unique key.

```{r}
table2_join <- table2_cases %>% inner_join(table2_pop, by=c("country", "year")) 

table2_join
```
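
Before relying on a two-variable key, it can help to confirm that the key really is unique. The following is a minimal sketch, assuming the *table2* that ships with *tidyr*; the name `dup_keys` is only illustrative and does not appear elsewhere in this document.

```{r}
library(tidyverse)

# Count how often each (country, year) pair occurs within each type,
# and keep only the pairs that occur more than once
dup_keys <- table2 %>%
  count(type, country, year) %>%
  filter(n > 1)

# Zero rows means (country, year) is unique within "cases" and within "population"
nrow(dup_keys)
```

If this returns zero, every row of one table matches at most one row of the other, so the join cannot duplicate rows.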
 
Create the new column: the rate of cases per 10,000 people.
 
```{r}
table2_new <- table2_join %>% mutate(rate = cases / population * 10000)

table2_new
```
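
As a quick check of the arithmetic, the rate for a single row can be computed by hand. This is only a sketch under the assumption that *table2* is the version shipped with *tidyr*; the names `afg_1999`, `afg_cases`, `afg_pop`, and `afg_rate` are illustrative.

```{r}
library(tidyverse)

# Pull out the two rows (cases and population) for Afghanistan in 1999
afg_1999 <- table2 %>% filter(country == "Afghanistan", year == 1999)

afg_cases <- afg_1999$count[afg_1999$type == "cases"]
afg_pop   <- afg_1999$count[afg_1999$type == "population"]

# Cases per 10,000 people: the same formula used in mutate()
afg_rate <- afg_cases / afg_pop * 10000
afg_rate
```

The result should match the *rate* value in the Afghanistan 1999 row of the joined table.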

Now spread the data out so that each year becomes its own column.
 
```{r}
table2_new_spread <- table2_new %>% select(country, year, rate) %>%
  spread(year, rate)

table2_new_spread
```
 
Now try the newer function *pivot_wider()*.  Note that this function was introduced in version 1.0 of the *tidyr* package.

```{r}
table2_new_spread2 <- table2_new %>% select(country, year, rate) %>%
  pivot_wider(id_cols = country, names_from = year, values_from = rate)

table2_new_spread2
```

Are the two data frames the same?  Let's give the *comparedf()* function a try.  It is from the *arsenal* R package.

```{r}
library(arsenal)

comparedf(table2_new_spread, table2_new_spread2)
```

**Alternative Solution:**

Can we use *spread()* from the beginning?  Yes.

```{r}
table2 %>% spread(key = type, value = count) %>%
  mutate(rate = cases / population * 10000) %>%
  select(-cases, -population) %>%
  spread(key = year, value = rate)
```

Or, using *pivot_wider()* for both steps:


```{r}
table2 %>% pivot_wider(names_from = type, values_from = count) %>%
  mutate(rate = cases / population * 10000) %>%
  select(-cases, -population) %>%
  pivot_wider(names_from = year, values_from = rate)
```

2. Now make a clustered bar graph. Which table is the one to use, *table2_new* or *table2_new_spread*?
 
**Answer:**  Use the table that is in tidy format, which is *table2_new*.  Note the use of the *as.factor()* function, which is our next topic of discussion.
 
```{r}
table2_new %>% ggplot(aes(x = country, y = rate, fill = as.factor(year))) +
  geom_col(position = "dodge") +
  theme_light()
```
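
To see why *as.factor()* is used here, it may help to look at what it does to *year*. The sketch below assumes the *table2* that ships with *tidyr*; the name `year_factor` is illustrative only.

```{r}
library(tidyverse)

# year is stored as a number, so ggplot2 would treat it as a continuous variable
is.numeric(table2$year)

# as.factor() converts it to a discrete variable with one level per distinct year
year_factor <- as.factor(table2$year)
levels(year_factor)
```

With a discrete variable mapped to *fill*, each year gets its own color and its own bar within a cluster.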
 
Or you can make the plot using year to group the bars.
 
```{r}
table2_new %>% ggplot(aes(x = as.factor(year), y = rate, fill = country)) +
  geom_col(position = "dodge") +
  theme_light()
```