---
title: 'Stat. 450 Section 1: Homework 10'
author: "Prof. Eric A. Suess"
output:
  word_document: default
  html_document:
    df_print: paged
  html_notebook: default
  pdf_document: default
---

So how should you complete your homework for this class?

- First, type all of your information about the problems you do in the text part of your R Notebook.
- Second, type all of your R code into R chunks that can be run.
- If you load the tidyverse in an R Notebook chunk, be sure to include "message = FALSE" in the {r}, so {r message = FALSE}.
- Last, spell check your R Notebook. Edit > Check Spelling... or hit the F7 key.

Upload one file to Blackboard.

Homework 10:

Read: Chapter 14

Exercises:

Do 14.2.5 Exercises 1, 3

Do 14.3.2.1 Exercise 2 (This problem should not be turned in because str_view() prevents your notebook from knitting.)

Do 14.4.6.1 Exercise 1

```{r message = FALSE}
library(pacman)
p_load("tidyverse", "stringr")
```

# 14.2.5

## 1.

In code that doesn’t use stringr, you’ll often see paste() and paste0(). What’s the difference between the two functions? What stringr function are they equivalent to? How do the functions differ in their handling of NA?

**Answer:**

The paste() function separates its arguments with a space by default (sep = " "), while paste0() joins them with no separator.

```{r}
paste("first", "second")
paste0("first", "second")
```

The stringr function that could be used in place of paste0() is str_c(). To get the behavior of paste(), use str_c() with sep = " ".

```{r}
str_c("first", "second")
```

The str_c() function returns NA if any of the strings are NA. This is different from paste(), which converts NA to the string "NA" and pastes it like any other string.

```{r}
paste("first", NA)
str_c("first", NA)
```

## 3.

Use str_length() and str_sub() to extract the middle character from a string. What will you do if the string has an even number of characters?


**Answer:**

If the string has an odd number of characters, take the single middle character. If it has an even number of characters, one option is to return the two middle characters.

```{r}
# Odd length: "first" has 5 characters, so the middle character is at position 3.
len <- str_length("first")
str_sub("first", ceiling(len / 2), ceiling(len / 2))

# Even length: "firost" has 6 characters, so return the two middle characters (positions 3 and 4).
len <- str_length("firost")
str_sub("firost", len / 2, len / 2 + 1)

# The %% operator computes the remainder of a division. If the remainder is 0, the length is even.
middle_letter <- function(x) {
  len <- str_length(x)  # use the length of the argument, not a global variable
  if (len %% 2 == 0) {
    result <- str_sub(x, len / 2, len / 2 + 1)
  } else {
    result <- str_sub(x, ceiling(len / 2), ceiling(len / 2))
  }
  return(result)
}

x <- c("first")
middle_letter(x)
```

# 14.3.2.1

## 2.

Given the corpus of common words in stringr::words, create regular expressions that find all words that:

- Start with “y”.
- End with “x”.
- Are exactly three letters long. (Don’t cheat by using str_length()!)
- Have seven letters or more.

Since this list is long, you might want to use the match argument to str_view() to show only the matching or non-matching words.

**Answer:**

```{r}
words

# Words that start with "y".
str_view(words, "^y", match = TRUE)

# Words that end with "x".
str_view(words, "x$", match = TRUE)

# The way we are not supposed to do this: using str_length().
words_three <- as.data.frame(words) %>%
  mutate(words_length = str_length(words)) %>%
  filter(words_length == 3)

words_three
dim(words_three)

# Exactly three letters, using a regular expression anchored at both ends.
words_three2 <- str_view(words, "^...$", match = TRUE)
words_three2

# Seven letters or more: any word containing seven consecutive characters.
str_view(words, ".......", match = TRUE)
```

# 14.4.6.1

## 1.

Split up a string like "apples, pears, and bananas" into individual components.

**Answer:**

I think the best answer to this question uses the helper function boundary("word"). If you gave ", +(and +)?" in place of boundary("word"), you should be sure you understand what it is doing.

```{r}
x <- c("apples, pears, and bananas")

# Splitting on ", " leaves "and bananas" as a single piece.
str_split(x, ", ")[[1]]

# Splitting on word boundaries returns each word, including "and".
str_split(x, boundary("word"))[[1]]
```
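
For comparison, here is a minimal sketch of what the ", +(and +)?" pattern mentioned in the answer does when passed to str_split(). The variable name parts is only for illustration and is not part of the exercise.

```{r}
library(stringr)

x <- "apples, pears, and bananas"

# ", +"       matches a comma followed by one or more spaces
# "(and +)?"  optionally also consumes the word "and" and the spaces after it
parts <- str_split(x, ", +(and +)?")[[1]]
parts
```

Unlike boundary("word"), this pattern treats "and" as part of the separator, so the result contains only the three fruit names.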