---
title: 'Stat. 450 Section 1 or 2: Homework 7'
output:
  word_document: default
  html_notebook: default
  pdf_document: default
  html_document:
    df_print: paged
---

**Prof. Eric A. Suess**

So how should you complete your homework for this class?

- First thing to do is type all of your information about the problems you do in the text part of your R Notebook.
- Second thing to do is type all of your R code into R chunks that can be run.
- If you load the tidyverse in an R Notebook chunk, be sure to include "message = FALSE" in the chunk header, so {r message = FALSE}.
- Last thing is to spell check your R Notebook. Edit > Check Spelling... or hit the F7 key.

Homework 7:

Read: Chapter 9, Chapter 10, Chapter 11

Do 10.5 Exercises 1, 2

Do 11.2.2 Exercise 2

Do 11.3.5 Exercises 6, 7

```{r message=FALSE}
library(tidyverse)
```

# 10.5

## 1.

At the Console, all of the variables of a data.frame are printed out. Note the labeling of the rows. In a notebook, data.frames are printed in the same way as a tibble, but the row labels are not printed.

You can use the is_tibble() and class() functions to check whether a data.frame is a tibble.

```{r}
library(tidyverse)
is_tibble(mtcars)   # FALSE: mtcars is a plain data.frame
class(mtcars)
mtcars
as_tibble(mtcars)   # as.tibble() is deprecated; use as_tibble()
```

```{r}
library(nycflights13)
is_tibble(flights)
is_tibble(planes)
is_tibble(airports)
is_tibble(weather)
class(flights)
```

## 2.

The main difference is that `$` on a data.frame does partial matching: a variable can be referenced by only the first letters of its name, and the rest are assumed. This could lead to problems because more than one variable name may start with the same letters.

With `[`, a tibble returns a tibble all of the time, regardless of whether one column or more than one column is selected. With a data.frame, if a single column is selected a vector is returned; otherwise a data.frame is returned. This inconsistent behavior could cause problems.

```{r}
df <- data.frame(abc = 1, xyz = "a")
df$x                    # partial matching: returns the xyz column
df[, "xyz"]             # one column: returns a vector, not a data.frame
df[, c("abc", "xyz")]   # two columns: returns a data.frame
```

Converting the data.frame to a tibble.
```{r}
df <- tibble(abc = 1, xyz = "a")
df$x                    # no partial matching: NULL, with a warning
df[, "xyz"]             # one column: still a tibble
df[, c("abc", "xyz")]   # two columns: a tibble
```

# 11.2.2

## 2.

According to the help files, read_csv() and read_tsv() appear to have all of the same options:

- col_names = TRUE
- col_types = NULL
- locale = default_locale()
- na = c("", "NA")
- quoted_na = TRUE
- quote = "\""
- trim_ws = TRUE
- n_max = Inf
- guess_max = min(1000, n_max)
- progress = show_progress()

```{r}
?read_csv
?read_tsv
# All argument names that appear in either function
union(names(formals(read_csv)), names(formals(read_tsv)))
# Argument names the two functions have in common
intersect(names(formals(read_csv)), names(formals(read_tsv)))
```

# 11.3.5

## 6.

These solutions are from the [R for Data Science Solutions](https://jrnold.github.io/r4ds-exercise-solutions/data-import.html#exercise-11.3.5.6). Note the problem number has changed.

UTF-8 is standard now, and ASCII has been around forever.

For Asian languages, Arabic and Vietnamese have ISO and Windows standards. The other major Asian scripts have their own:

- Japanese: JIS X 0208, Shift JIS, ISO-2022-JP
- Chinese: GB 2312, GBK, GB 18030
- Korean: KS X 1001, EUC-KR, ISO-2022-KR

## 7.

Generate the correct format strings.

```{r}
d1 <- "January 1, 2010"
d2 <- "2015-Mar-07"
d3 <- "06-Jun-2017"
d4 <- c("August 19 (2015)", "July 1 (2015)")
d5 <- "12/30/14" # Dec 30, 2014
t1 <- "1705"
t2 <- "11:15:10.12 PM"
```

```{r}
parse_date(d1, "%B %d, %Y")
parse_date(d2, "%Y-%b-%d")
parse_date(d3, "%d-%b-%Y")
parse_date(d4, "%B %d (%Y)")
parse_date(d5, "%m/%d/%y")
parse_time(t1, "%H%M")
```

```{r}
parse_time(t2, "%H:%M:%OS %p")
```
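One way to double check the date format strings in Exercise 7 is to compare each parsed value with the date it is supposed to represent. The chunk below is only a sketch of that idea: the expected dates are my own reading of the strings (not part of the exercise), and the inputs are repeated so the chunk can be run by itself. If a format string were wrong, stopifnot() would stop with an error.

```{r}
library(readr)

# Inputs repeated from the exercise so this chunk is self-contained.
d1 <- "January 1, 2010"
d2 <- "2015-Mar-07"
d3 <- "06-Jun-2017"
d4 <- c("August 19 (2015)", "July 1 (2015)")
d5 <- "12/30/14"
t1 <- "1705"

# Expected values are assumptions based on reading each string by eye.
stopifnot(
  parse_date(d1, "%B %d, %Y")  == as.Date("2010-01-01"),
  parse_date(d2, "%Y-%b-%d")   == as.Date("2015-03-07"),
  parse_date(d3, "%d-%b-%Y")   == as.Date("2017-06-06"),
  parse_date(d4, "%B %d (%Y)") == as.Date(c("2015-08-19", "2015-07-01")),
  parse_date(d5, "%m/%d/%y")   == as.Date("2014-12-30"),
  # parse_time() returns seconds since midnight: 17:05 is 17 * 3600 + 5 * 60
  as.numeric(parse_time(t1, "%H%M")) == 17 * 3600 + 5 * 60
)
```

If the chunk runs without an error, every format string produced the expected value.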