Prof. Eric A. Suess

So how should you complete your homework for this class?

Homework 7:

 Read: Chapter 9, Chapter 10, Chapter 11
 Do 10.5 Exercises 1, 2
 Do 11.2.2 Exercise 2
 Do 11.3.5 Exercises 6, 7
library(tidyverse)

10.5

1.

When a data.frame is printed at the Console, all of the variables are printed out. Note the labeling of the rows.

In a notebook, data.frames are printed in the same way as a tibble, but the row labels are not printed.

You can use the is_tibble() and class() functions to check whether an object is a tibble or a plain data.frame.

library(tidyverse)
is_tibble(mtcars)
[1] FALSE
class(mtcars)
[1] "data.frame"
mtcars
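as_tibble(mtcars)  # as.tibble() is deprecated in favor of as_tibble()
library(nycflights13)  # provides the flights tibble
class(flights)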
[1] "tbl_df"     "tbl"        "data.frame"

2.

The main difference is partial matching: with a data.frame, $ accepts a prefix of a variable name (even just the first letter) and the rest is assumed. This can lead to problems: more than one variable name may start with the same letters, and a shortened or mistyped name can silently return a column you did not intend.

A tibble always returns a tibble when subset with [, whether one column or more than one column is selected. With a data.frame, selecting a single column with [ returns a vector; otherwise a data.frame is returned. This inconsistency can cause problems in code that expects one type or the other.
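The two data.frame behaviors described above can be sketched with a small example. This is only an illustration in base R; the data frame and its column names (abc, xyz) are made up for the purpose.

```r
# Illustrative data.frame; the column names abc and xyz are assumptions.
df <- data.frame(abc = 1, xyz = "a")

# `$` partially matches column names: "x" silently resolves to "xyz".
df$x

# Selecting one column with `[` drops the result to a vector ...
class(df[, "xyz"])

# ... while selecting two columns keeps a data.frame.
class(df[, c("abc", "xyz")])

# drop = FALSE keeps the data.frame structure for a single column.
class(df[, "xyz", drop = FALSE])
```

Passing drop = FALSE is the usual base R way to get a consistent return type from [ when the number of selected columns is not known in advance.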

Converting the data.frame to a tibble avoids both problems: $ no longer does partial matching, and [ always returns a tibble.
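A minimal sketch of the same operations on a tibble, assuming the tibble package (loaded as part of the tidyverse) is installed. The object name tb and the column names are illustrative.

```r
library(tibble)

# Illustrative tibble; the column names abc and xyz are assumptions.
tb <- tibble(abc = 1, xyz = "a")

# `$` does no partial matching on a tibble:
# this returns NULL and warns about an unknown column.
tb$x

# `[` always returns a tibble, even for a single column.
class(tb[, "xyz"])
class(tb[, c("abc", "xyz")])

# Use `[[` (or `$` with the full name) to extract a column as a vector.
tb[["xyz"]]
```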

11.2.2

2.

According to the help files, read_csv() and read_tsv() appear to have all of the same options:

  • col_names = TRUE
  • col_types = NULL
  • locale = default_locale()
  • na = c("", "NA")
  • quoted_na = TRUE
  • quote = "\""
  • trim_ws = TRUE
  • n_max = Inf
  • guess_max = min(1000, n_max)
  • progress = show_progress()
intersect(names(formals(read_csv)), names(formals(read_tsv)))
 [1] "file"      "col_names" "col_types" "locale"    "na"        "quoted_na" "quote"     "comment"  
 [9] "trim_ws"   "skip"      "n_max"     "guess_max" "progress" 
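One way to double-check that nothing differs between the two functions is to look at the set differences of their argument names as well as the intersection. This is a sketch that assumes readr is installed; the result depends on the installed readr version, so no output is shown.

```r
library(readr)

csv_args <- names(formals(read_csv))
tsv_args <- names(formals(read_tsv))

# Arguments that appear in one function but not in the other.
setdiff(csv_args, tsv_args)
setdiff(tsv_args, csv_args)

# TRUE only if both functions take the same arguments in the same order.
identical(csv_args, tsv_args)
```

If both setdiff() calls return character(0), the two functions share every argument and differ only in the delimiter they assume.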

11.3.5

6.

These solutions are from the R for Data Science Solutions. Note that the problem number has changed there.

UTF-8 is now the standard encoding, and ASCII has been in use since the 1960s.

Among Asian languages, Arabic and Vietnamese have ISO and Windows standard encodings. The other major Asian scripts have their own:

  • Japanese: JIS X 0208, Shift JIS, ISO-2022-JP
  • Chinese: GB 2312, GBK, GB 18030
  • Korean: KS X 1001, EUC-KR, ISO-2022-KR
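To read text stored in one of these encodings, readr lets you pass the encoding through the locale argument. The sketch below reuses the two example byte strings from the R for Data Science chapter on parsing; it assumes readr is installed and that the encoding names "Latin1" and "Shift-JIS" are recognized on your system.

```r
library(readr)

# Example byte strings: x1 is Latin1 encoded, x2 is Shift-JIS encoded.
x1 <- "El Ni\xf1o was particularly bad this year"
x2 <- "\x82\xb1\x82\xf1\x82\xc9\x82\xbf\x82\xcd"

# Tell the parser which encoding the bytes are in; the result is UTF-8.
p1 <- parse_character(x1, locale = locale(encoding = "Latin1"))
p2 <- parse_character(x2, locale = locale(encoding = "Shift-JIS"))

p1
p2
```

The same locale(encoding = ...) argument can be supplied to read_csv() and the other read_*() functions when a whole file uses a non-UTF-8 encoding.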

7.

Generate the correct format strings.

d1 <- "January 1, 2010"
d2 <- "2015-Mar-07"
d3 <- "06-Jun-2017"
d4 <- c("August 19 (2015)", "July 1 (2015)")
d5 <- "12/30/14" # Dec 30, 2014
t1 <- "1705"
t2 <- "11:15:10.12 PM"
parse_date(d1, "%B %d, %Y")
[1] "2010-01-01"
parse_date(d2, "%Y-%b-%d")
[1] "2015-03-07"
parse_date(d3, "%d-%b-%Y")
[1] "2017-06-06"
parse_date(d4, "%B %d (%Y)")
[1] "2015-08-19" "2015-07-01"
parse_date(d5, "%m/%d/%y")
[1] "2014-12-30"
parse_time(t1, "%H%M")
17:05:00
parse_time(t2, "%I:%M:%OS %p")  # readr recommends %I (12-hour clock) with AM/PM
23:15:10.12
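The format string matters most for all-numeric dates such as d5, where the same digits can be read in several ways. As a small illustration (the input string is the ambiguous example used in the R for Data Science text, and readr is assumed to be installed), the same value parses to three different dates depending on the format supplied:

```r
library(readr)

# An ambiguous all-numeric date.
ambiguous <- "01/02/15"

mdy <- parse_date(ambiguous, "%m/%d/%y")  # month/day/year
dmy <- parse_date(ambiguous, "%d/%m/%y")  # day/month/year
ymd <- parse_date(ambiguous, "%y/%m/%d")  # year/month/day

mdy
dmy
ymd
```

For d5 the comment in the code (Dec 30, 2014) settles the question, which is why "%m/%d/%y" is the right choice there.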