---
title: "Chapter 13 - Relational Data"
output:
  pdf_document: default
  word_document: default
  html_notebook: default
---

```{r message=FALSE}
library(tidyverse)
library(nycflights13)
```

# Airlines

```{r}
flights
airlines
airports
planes
weather
```

![](http://r4ds.had.co.nz/diagrams/relational-nycflights.png)

Check whether any tail numbers are duplicated in `planes`. An empty result means `tailnum` uniquely identifies each plane.

```{r}
planes %>%
  count(tailnum) %>%
  filter(n > 1)
```

Check whether `year`, `month`, `day`, `hour`, and `origin` together uniquely identify each row of `weather`.

```{r}
weather %>%
  count(year, month, day, hour, origin) %>%
  filter(n > 1)
```

Run the same check on `flights`. Both the date plus `flight` and the date plus `tailnum` are duplicated, so neither combination uniquely identifies a flight. This may be a problem when joining, because `flights` has no explicit primary key.

```{r}
flights %>%
  count(year, month, day, flight) %>%
  filter(n > 1)

flights %>%
  count(year, month, day, tailnum) %>%
  filter(n > 1)
```

Join the airline names to the flights data.

```{r}
# Keep a narrower table so the joined columns are easy to see
flights2 <- flights %>%
  select(year:day, hour, origin, dest, tailnum, carrier)
flights2

flights2 %>%
  select(-origin, -dest) %>%
  left_join(airlines, by = "carrier")
```

Simple examples: two small tables that share a `key` column.

```{r}
x <- tribble(
  ~key, ~val_x,
     1, "x1",
     2, "x2",
     3, "x3"
)
x

y <- tribble(
  ~key, ~val_y,
     1, "y1",
     2, "y2",
     4, "y3"
)
y
```

```{r}
# Keeps only the keys present in both tables
x %>%
  inner_join(y, by = "key")
```

Duplicate keys in one table.

```{r}
x <- tribble(
  ~key, ~val_x,
     1, "x1",
     2, "x2",
     2, "x3",
     1, "x4"
)
x

y <- tribble(
  ~key, ~val_y,
     1, "y1",
     2, "y2"
)
y

left_join(x, y, by = "key")
```

Duplicate keys in both tables.

```{r}
x <- tribble(
  ~key, ~val_x,
     1, "x1",
     2, "x2",
     2, "x3",
     3, "x4"
)
x

y <- tribble(
  ~key, ~val_y,
     1, "y1",
     2, "y2",
     2, "y3",
     3, "y4"
)
y

left_join(x, y, by = "key")
```
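
When both tables have duplicate keys, as in the final example, a join returns every combination of the matching rows for each key. The sketch below is one way to confirm that by predicting the row count before joining. It rebuilds the same `x` and `y` so that it runs on its own; the names `x_counts`, `y_counts`, `expected`, and `joined` are illustrative and not part of the original notes. Recent dplyr versions may warn about a many-to-many relationship here, which is the behaviour being demonstrated.

```r
library(dplyr)
library(tibble)

x <- tribble(
  ~key, ~val_x,
     1, "x1",
     2, "x2",
     2, "x3",
     3, "x4"
)

y <- tribble(
  ~key, ~val_y,
     1, "y1",
     2, "y2",
     2, "y3",
     3, "y4"
)

# Number of rows per key in each table
x_counts <- count(x, key, name = "n_x")
y_counts <- count(y, key, name = "n_y")

# Each key should contribute n_x * n_y rows to the joined result
expected <- x_counts %>%
  inner_join(y_counts, by = "key") %>%
  mutate(n_joined = n_x * n_y)
expected

# The actual join; key 2 appears twice in both tables
joined <- left_join(x, y, by = "key")

# Compare the predicted and actual row counts
sum(expected$n_joined)
nrow(joined)
```

If the two counts differ from what you intended, the `count() %>% filter(n > 1)` checks used earlier for `planes` and `weather` show which keys are duplicated.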