Prof. Eric A. Suess

So how should you complete your homework for this class?

- First thing to do is type all of your information about the problems you do in the text part of your R Notebook.
- Second thing to do is type all of your R code into R chunks that can be run.
- If you load the tidyverse in an R Notebook chunk, be sure to include "message = FALSE" in the {r}, so {r message = FALSE}.
- Last thing is to spell check your R Notebook. Edit > Check Spelling... or hit the F7 key.

Homework 8:

 Read: Chapter 12

 Do 12.2.1 Exercises 1, 2
 Do 12.3.3 Exercise 4
 Do 12.4.3 Exercise 1
library(tidyverse)

12.2.1

1.

Using prose, describe how the variables and observations are organised in each of the sample tables.

Answer:

In table1, each row is a (country, year) combination, with the variables cases and population each in their own column.

table1

In table2, each row is a (country, year, type) combination, where type is either “cases” or “population”, and the count column holds the numeric value for that combination.

In table3, each row is a (country, year) combination, and the rate column holds the ratio of cases to population as a character string in the format “cases/population”.

Table 4 is split into two tables, one table for each variable: table4a is the table for cases, while table4b is the table for population. Within each table, each row is a country, each column is a year, and the cells are the value of the variable for the table.

table4a
table4b

2.

Compute the rate for table2, and table4a + table4b. You will need to perform four operations:

Extract the number of TB cases per country per year.
Extract the matching population per country per year.
Divide cases by population, and multiply by 10000.
Store back in the appropriate place.

Which representation is easiest to work with? Which is hardest? Why?

Answer:

Using table2 and some join code from Chapter 13 (Relational data)
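
The four operations for table2 can be sketched as follows. This is one possible approach, not the only one; the object names table2_cases, table2_pop, table2_rate and table2_with_rate are chosen here for illustration. Storing the result "back in the appropriate place" is interpreted as appending new rows with type equal to "rate", which is an assumption about what the exercise intends.

```r
library(tidyverse)

# Step 1: TB cases per country per year.
table2_cases <- table2 %>%
  filter(type == "cases") %>%
  select(country, year, cases = count)

# Step 2: matching population per country per year.
table2_pop <- table2 %>%
  filter(type == "population") %>%
  select(country, year, population = count)

# Step 3: join on the two key columns, then compute the rate per 10,000.
table2_rate <- table2_cases %>%
  inner_join(table2_pop, by = c("country", "year")) %>%
  mutate(rate = cases / population * 10000)

# Step 4: store the rate back in table2's own layout,
# as extra rows with type = "rate" (an assumed interpretation).
table2_with_rate <- bind_rows(
  table2,
  transmute(table2_rate, country, year, type = "rate", count = rate)
) %>%
  arrange(country, year, type)

table2_with_rate
```

The join is needed because cases and population sit in different rows of table2, so they cannot be divided until they are lined up side by side.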

Using table4a and table4b
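
A minimal sketch of the same calculation for table4a and table4b. The names table4_joined, table4_rate, rate_1999 and rate_2000 are illustrative choices. Because both tables have columns named 1999 and 2000, the join adds the default suffixes .x (from table4a, cases) and .y (from table4b, population).

```r
library(tidyverse)

# Join the cases table (x) to the population table (y) by country.
table4_joined <- inner_join(table4a, table4b, by = "country")

# One line per year column; back-ticks are required because
# the column names start with a digit.
table4_rate <- table4_joined %>%
  mutate(
    rate_1999 = `1999.x` / `1999.y` * 10000,
    rate_2000 = `2000.x` / `2000.y` * 10000
  ) %>%
  select(country, rate_1999, rate_2000)

table4_rate
```

Arguably neither representation is easy. table2 needs two filters and a join before the division can happen, while table4a and table4b need one hand-written line per year, which would not scale to many years. With the tidy table1, the whole calculation is a single mutate().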

12.3.3

4.

Tidy the simple tibble below. Do you need to spread or gather it? What are the variables?

Answer:

We need to gather it. The male and female columns are gathered into two new columns, sex and count, so the variables are pregnant, sex, and count.

preg <- tribble(
  ~pregnant, ~male, ~female,
  "yes",     NA,    10,
  "no",      20,    12
)
preg
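
The gathering step described in the answer can be sketched like this. The name preg_tidy is an illustrative choice, and the tribble is repeated so the block runs on its own.

```r
library(tidyverse)

preg <- tribble(
  ~pregnant, ~male, ~female,
  "yes",     NA,    10,
  "no",      20,    12
)

# Gather the male and female columns into a key column (sex)
# and a value column (count); pregnant is left as it is.
preg_tidy <- preg %>%
  gather(male, female, key = "sex", value = "count")

preg_tidy
```

The result has one row per (pregnant, sex) combination. The NA for pregnant males is kept by default; adding na.rm = TRUE to gather() would drop that row if it is not wanted.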

12.4.3

1.

What do the extra and fill arguments do in separate()? Experiment with the various options for the following two toy datasets.

tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>% 
  separate(x, c("one", "two", "three"))

tibble(x = c("a,b,c", "d,e", "f,g,i")) %>% 
  separate(x, c("one", "two", "three"))
Expected 3 pieces. Missing pieces filled with `NA` in 1 rows [2].

Examples:

tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>%
  separate(x, c("one", "two", "three"), extra = "drop")
tibble(x = c("a,b,c", "d,e,f,g", "h,i,j")) %>%
  separate(x, c("one", "two", "three"), extra = "merge")
tibble(x = c("a,b,c", "d,e", "f,g,i")) %>%
  separate(x, c("one", "two", "three"), fill = "right")
tibble(x = c("a,b,c", "d,e", "f,g,i")) %>%
  separate(x, c("one", "two", "three"), fill = "left")
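
To summarise what the examples show: extra controls what separate() does when a row has too many pieces, and fill controls what it does when a row has too few. Both default to "warn", which behaves like "drop" or "right" respectively but also emits a warning. The sketch below stores each result so the affected row can be inspected; the names too_many, too_few, cols, dropped, merged, right and left are illustrative.

```r
library(tidyverse)

too_many <- tibble(x = c("a,b,c", "d,e,f,g", "h,i,j"))  # row 2 has 4 pieces
too_few  <- tibble(x = c("a,b,c", "d,e", "f,g,i"))      # row 2 has 2 pieces
cols <- c("one", "two", "three")

# extra = "drop": the surplus piece "g" is discarded without a warning.
dropped <- separate(too_many, x, cols, extra = "drop")

# extra = "merge": splitting stops after three pieces,
# so the remainder "f,g" stays together in the last column.
merged <- separate(too_many, x, cols, extra = "merge")

# fill = "right": the missing piece becomes NA in the last column.
right <- separate(too_few, x, cols, fill = "right")

# fill = "left": the missing piece becomes NA in the first column,
# and "d", "e" shift into columns two and three.
left <- separate(too_few, x, cols, fill = "left")

# Compare the second row of each result.
bind_rows(
  drop = dropped[2, ], merge = merged[2, ],
  right = right[2, ], left = left[2, ],
  .id = "option"
)
```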