Using Problem 12.2.1 Exercise 2 as a guide, use the ideas from Chapter 13 to answer the questions for table2.
- Compute the rate and include it in a final dataframe with the years as columns.
Answer:
The first answer approaches the problem by splitting the dataset into two and then joining the two dataset.
library(tidyverse)
table2
table2 %>% arrange(type)
table2_cases <- table2 %>% filter(type == "cases") %>%
select(country, year, count) %>%
rename(cases = count)
table2_cases
library(stringr)
table2_pop <- table2 %>% filter(type == "population") %>%
select(country, year, count) %>%
rename(population = count)
table2_pop
Now join the two datasets using two variables as the unique key.
table2_join <- table2_cases %>% inner_join(table2_pop, by=c("country", "year"))
table2_join
Create the new column.
table2_new <- table2_join %>% mutate(rate = cases / population * 10000)
table2_new
Now spread the data out into two columns.
table2_new_spread <- table2_new %>% select(country, year, rate) %>%
spread(year, rate)
table2_new_spread
Now try the new function pivot_wider(). Note new this function is from the tidyr 1.0 package.
table2_new_spread2 <- table2_new %>% select(country, year, rate) %>%
pivot_wider(country, names_from = year, values_from = rate)
table2_new_spread2
Are the two files the same. Lets give the comparedf() function a try. It is from the arsenal R package.
library(arsenal)
comparedf(table2_new_spread, table2_new_spread2)
Compare Object
Function Call:
comparedf(x = table2_new_spread, y = table2_new_spread2)
Shared: 3 non-by variables and 3 observations.
Not shared: 0 variables and 0 observations.
Differences found in 0/3 variables compared.
0 variables compared have non-identical attributes.
Anternative Solution:
Can we use spread from the beginning? Yes.
table2 %>% spread(key = type, value = count) %>%
mutate(rate = cases/population) %>%
select(-cases, -population) %>%
spread(key = year, value = rate)
Or
table2 %>% pivot_wider(names_from = type, values_from = count) %>%
mutate(rate = cases/population) %>%
select(-cases, -population) %>%
pivot_wider(names_from = year, values_from = rate)
- Now make a clustered bar graph. Question, which table is the one to use, table2_new or table2_new_spread?
Answer: The one to use is in tidy format. So table2_new. Note the use of as.factor() function. This is our next topic of discussion.
table2_new %>% ggplot(aes(x = country, y = rate, fill = as.factor(year))) +
geom_bar(stat = "identity", position = "dodge") +
theme_light()
![]()
Or you can make the plot using year to group the bars.
table2_new %>% ggplot(aes(x = as.factor(year), y = rate, fill = country)) +
geom_bar(stat = "identity", position = "dodge") +
theme_light()
![]()
```
LS0tCnRpdGxlOiAiUHJhY3RpY2UgZm9yIHRoZSBwcmFjdGljZSBRdWl6IgpvdXRwdXQ6CiAgd29yZF9kb2N1bWVudDogZGVmYXVsdAogIHBkZl9kb2N1bWVudDogZGVmYXVsdAogIGh0bWxfbm90ZWJvb2s6IGRlZmF1bHQKLS0tCgpVc2luZyBQcm9ibGVtIDEyLjIuMSBFeGVyY2lzZSAyIGFzIGEgZ3VpZGUsIHVzZSB0aGUgaWRlYXMgZnJvbSBDaGFwdGVyIDEzIHRvIGFuc3dlciB0aGUgcXVlc3Rpb25zIGZvciAqdGFibGUyKi4KCjEuIENvbXB1dGUgdGhlIHJhdGUgYW5kIGluY2x1ZGUgaXQgaW4gYSBmaW5hbCBkYXRhZnJhbWUgd2l0aCB0aGUgeWVhcnMgYXMgY29sdW1ucy4gIAoKKipBbnN3ZXI6KioKClRoZSBmaXJzdCBhbnN3ZXIgYXBwcm9hY2hlcyB0aGUgcHJvYmxlbSBieSBzcGxpdHRpbmcgdGhlIGRhdGFzZXQgaW50byB0d28gYW5kIHRoZW4gam9pbmluZyB0aGUgdHdvIGRhdGFzZXQuCgpgYGB7ciBtZXNzYWdlPUZBTFNFfQpsaWJyYXJ5KHRpZHl2ZXJzZSkKCnRhYmxlMgpgYGAKCmBgYHtyfQp0YWJsZTIgJT4lIGFycmFuZ2UodHlwZSkKYGBgCgoKCmBgYHtyfQp0YWJsZTJfY2FzZXMgPC0gdGFibGUyICU+JSBmaWx0ZXIodHlwZSA9PSAiY2FzZXMiKSAlPiUgCiAgc2VsZWN0KGNvdW50cnksIHllYXIsIGNvdW50KSAlPiUKICByZW5hbWUoY2FzZXMgPSBjb3VudCkKdGFibGUyX2Nhc2VzCmBgYAoKCmBgYHtyfQpsaWJyYXJ5KHN0cmluZ3IpCgoKdGFibGUyX3BvcCA8LSB0YWJsZTIgJT4lIGZpbHRlcih0eXBlID09ICJwb3B1bGF0aW9uIikgJT4lIAogIHNlbGVjdChjb3VudHJ5LCB5ZWFyLCBjb3VudCkgJT4lCiAgcmVuYW1lKHBvcHVsYXRpb24gPSBjb3VudCkKdGFibGUyX3BvcApgYGAKCk5vdyBqb2luIHRoZSB0d28gZGF0YXNldHMgdXNpbmcgdHdvIHZhcmlhYmxlcyBhcyB0aGUgdW5pcXVlIGtleS4KCmBgYHtyfQp0YWJsZTJfam9pbiA8LSB0YWJsZTJfY2FzZXMgJT4lIGlubmVyX2pvaW4odGFibGUyX3BvcCwgYnk9YygiY291bnRyeSIsICJ5ZWFyIikpIAoKdGFibGUyX2pvaW4KYGBgCiAKIENyZWF0ZSB0aGUgbmV3IGNvbHVtbi4KIApgYGB7cn0KdGFibGUyX25ldyA8LSB0YWJsZTJfam9pbiAlPiUgbXV0YXRlKHJhdGUgPSBjYXNlcyAvIHBvcHVsYXRpb24gKiAxMDAwMCkKCnRhYmxlMl9uZXcKYGBgCgpOb3cgc3ByZWFkIHRoZSBkYXRhIG91dCBpbnRvIHR3byBjb2x1bW5zLgogCmBgYHtyfQp0YWJsZTJfbmV3X3NwcmVhZCA8LSB0YWJsZTJfbmV3ICU+JSBzZWxlY3QoY291bnRyeSwgeWVhciwgcmF0ZSkgJT4lCiAgc3ByZWFkKHllYXIsIHJhdGUpCgp0YWJsZTJfbmV3X3NwcmVhZApgYGAKIApOb3cgdHJ5IHRoZSBuZXcgZnVuY3Rpb24gKnBpdm90X3dpZGVyKCkqLiAgTm90ZSBuZXcgdGhpcyBmdW5jdGlvbiBpcyBmcm9tIHRoZSAqdGlkeXIqIDEuMCBwYWNrYWdlLgoKYGBge3J9CnRhYmxlMl9uZXdfc3ByZWFkMiA8LSB0YWJsZTJfbmV3ICU+JSBzZWxlY3QoY291bnRyeSwgeWVhciwgcmF0ZSkgJT4lCiAgcGl2b3Rfd2lkZXIoY291bnRyeSwgbmFtZXNfZnJvbSA9IHllYXIsIHZhbHVlc19mcm9tID0gcmF0ZSkKCnRhYmxlMl9uZXdfc3ByZWFkMgpgYGAKCkFyZSB0aGUgdHdvIGZpbGVzIHRoZSBzYW1lLiAgTGV0cyBnaXZlIHRoZSAqY29tcGFyZWRmKCkqIGZ1bmN0aW9uIGEgdHJ5LiAgSXQgaXMgZnJvbSB0aGUgKmFyc2VuYWwqIFIgcGFja2FnZS4gCgpgYGB7cn0KbGlicmFyeShhcnNlbmFsKQoKY29tcGFyZWRmKHRhYmxlMl9uZXdfc3ByZWFkLCB0YWJsZTJfbmV3X3NwcmVhZDIpCmBgYAoKKipBbnRlcm5hdGl2ZSBTb2x1dGlvbjoqKgoKQ2FuIHdlIHVzZSBzcHJlYWQgZnJvbSB0aGUgYmVnaW5uaW5nPyAgWWVzLgoKYGBge3J9CnRhYmxlMiAlPiUgc3ByZWFkKGtleSA9IHR5cGUsIHZhbHVlID0gY291bnQpICU+JQogIG11dGF0ZShyYXRlID0gY2FzZXMvcG9wdWxhdGlvbikgJT4lCiAgc2VsZWN0KC1jYXNlcywgLXBvcHVsYXRpb24pICU+JQogIHNwcmVhZChrZXkgPSB5ZWFyLCB2YWx1ZSA9IHJhdGUpCmBgYAoKT3IKCgpgYGB7cn0KdGFibGUyICU+JSBwaXZvdF93aWRlcihuYW1lc19mcm9tID0gdHlwZSwgdmFsdWVzX2Zyb20gPSBjb3VudCkgJT4lCiAgbXV0YXRlKHJhdGUgPSBjYXNlcy9wb3B1bGF0aW9uKSAlPiUKICBzZWxlY3QoLWNhc2VzLCAtcG9wdWxhdGlvbikgJT4lCiAgcGl2b3Rfd2lkZXIobmFtZXNfZnJvbSA9IHllYXIsIHZhbHVlc19mcm9tID0gcmF0ZSkKYGBgCgoyLiBOb3cgbWFrZSBhIGNsdXN0ZXJlZCBiYXIgZ3JhcGguIFF1ZXN0aW9uLCB3aGljaCB0YWJsZSBpcyB0aGUgb25lIHRvIHVzZSwgdGFibGUyX25ldyBvciB0YWJsZTJfbmV3X3NwcmVhZD8KIAogKipBbnN3ZXI6KiogIFRoZSBvbmUgdG8gdXNlIGlzIGluIHRpZHkgZm9ybWF0LiBTbyB0YWJsZTJfbmV3LiAgTm90ZSB0aGUgdXNlIG9mIGFzLmZhY3RvcigpIGZ1bmN0aW9uLiAgVGhpcyBpcyBvdXIgbmV4dCB0b3BpYyBvZiBkaXNjdXNzaW9uLgogCmBgYHtyfQp0YWJsZTJfbmV3ICU+JSBnZ3Bsb3QoYWVzKHggPSBjb3VudHJ5LCB5ID0gcmF0ZSwgZmlsbCA9IGFzLmZhY3Rvcih5ZWFyKSkpICsKICBnZW9tX2JhcihzdGF0ID0gImlkZW50aXR5IiwgcG9zaXRpb24gPSAiZG9kZ2UiKSArCiAgdGhlbWVfbGlnaHQoKQpgYGAKIApPciB5b3UgY2FuIG1ha2UgdGhlIHBsb3QgdXNpbmcgeWVhciB0byBncm91cCB0aGUgYmFycy4KIApgYGB7cn0KdGFibGUyX25ldyAlPiUgZ2dwbG90KGFlcyh4ID0gYXMuZmFjdG9yKHllYXIpLCB5ID0gcmF0ZSwgZmlsbCA9IGNvdW50cnkpKSArCiAgZ2VvbV9iYXIoc3RhdCA9ICJpZGVudGl0eSIsIHBvc2l0aW9uID0gImRvZGdlIikgKwogIHRoZW1lX2xpZ2h0KCkKYGBgCmBgYAogCiA=