September 4, 2018

About me

Prof. Eric A. Suess, CSU East Bay

I am a long-time R user. I started using S in 1988 and used S+ in graduate school in the 1990s. I wrote a book with my co-author, published in 2010: Probability Simulations and Gibbs Sampling (webpage).

Continuing to evolve as an R user with RStudio.

Love using R. I teach R as a Professor. I encourage the use of R. I encourage others to encourage others to use R.

  • Joe Rickert, RStudio, BARUG
  • Gabriella, IBM, R-Ladies
  • Navdeep, H2O, AutoML

Ford GoBikes Oakland

I am going to walk you through my initial analysis of the 2018 Ford GoBike data.

Goal 0: Use R, RStudio, and the tidyverse to produce a reproducible analysis.

Goal 1: Replicate the station map in Oakland.

Goal 2: See if there is a difference in bike usage between women and men.

Goal 3: Try to see how the bikes are used. How far are the bikes ridden? How long are they ridden?

Goal 4: Try to implement the same analysis pulling data from the GBFS API using the gbfs R package.
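
Goals 2 and 3 can be sketched in a few lines of dplyr. This is a minimal sketch, not the analysis from the talk. It assumes the trip tables carry the columns duration_sec, member_gender, and the start/end station latitude and longitude under the names used below; haversine_km() and summarise_by_gender() are hypothetical helper names. The trip files give no ridden distance, so the sketch substitutes the straight-line (great-circle) distance between the start and end stations, which understates the real route and is zero for round trips.

```r
library(dplyr)

# Great-circle (haversine) distance in kilometres between two lat/lon points.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Ride counts, median duration, and median station-to-station distance
# by gender. Column names are assumed, not confirmed by the slides.
summarise_by_gender <- function(trips) {
  trips %>%
    mutate(
      duration_min = duration_sec / 60,
      distance_km = haversine_km(
        start_station_latitude, start_station_longitude,
        end_station_latitude, end_station_longitude
      )
    ) %>%
    group_by(member_gender) %>%
    summarise(
      rides = n(),
      median_duration_min = median(duration_min, na.rm = TRUE),
      median_distance_km = median(distance_km, na.rm = TRUE),
      .groups = "drop"
    )
}
```

Medians are used rather than means because a few very long rentals can dominate an average.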

Ford GoBikes Oakland

Driving around Oakland in the past few years, you have probably seen the Ford GoBikes.

I went to the website and wanted to replicate the station map in a static visualization.

This led me to the System Data.

And to their download page for the .csv files.

General Bikeshare Feed Specification (GBFS)

Then I was led to the General Bikeshare Feed Specification (GBFS) on GitHub.

Finally, to the R package gbfs.
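
To show what a GBFS feed looks like, here is a minimal sketch that parses a station_information document with jsonlite. It does not use the gbfs package's own functions. The JSON below is a made-up two-station example in the GBFS layout (stations under data$stations with station_id, name, lat, lon), and read_station_information() is a hypothetical helper name.

```r
library(jsonlite)

# A made-up, minimal station_information payload in the GBFS layout.
# Real feeds carry more fields; both stations here are placeholders.
example_feed <- '{
  "last_updated": 1536019200,
  "ttl": 10,
  "data": {
    "stations": [
      {"station_id": "1", "name": "Example Station A", "lat": 37.80, "lon": -122.27},
      {"station_id": "2", "name": "Example Station B", "lat": 37.81, "lon": -122.26}
    ]
  }
}'

# Parse a GBFS station_information document (URL, file path, or JSON string)
# into a data frame with one row per station.
read_station_information <- function(src) {
  feed <- fromJSON(src)
  feed$data$stations[, c("station_id", "name", "lat", "lon")]
}

stations <- read_station_information(example_feed)
```

Because fromJSON() also accepts a URL, the same helper should work on a live feed once the station_information address has been looked up in the system's gbfs.json.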

First I tested out the .csv files

Many files with the same file name structure, one per month.

 library(readr)  # provides read_csv()
 fordgobike201801 <- read_csv(file="./data/201801-fordgobike-tripdata.csv")
 fordgobike201802 <- read_csv(file="./data/201802-fordgobike-tripdata.csv")

How to download them with a loop?
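
One possible answer, as a sketch under stated assumptions: build the file names from "YYYYMM" keys and loop over them. The helper names trip_file(), download_months(), and read_months() are hypothetical. The base_url argument is a placeholder for wherever the download page serves the files, and this assumes they are served as plain .csv under the same names as the local copies.

```r
# File name for one month, e.g. "201801" -> "201801-fordgobike-tripdata.csv".
trip_file <- function(yyyymm) {
  paste0(yyyymm, "-fordgobike-tripdata.csv")
}

# Download each month's file into dest_dir, skipping files already present.
# base_url is an assumption: supply the real address of the download page.
download_months <- function(base_url, months, dest_dir = "./data") {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  for (m in months) {
    dest <- file.path(dest_dir, trip_file(m))
    if (!file.exists(dest)) {
      download.file(paste0(base_url, "/", trip_file(m)),
                    destfile = dest, mode = "wb")
    }
  }
  invisible(file.path(dest_dir, trip_file(months)))
}

# Read the monthly files into a named list of data frames.
read_months <- function(months, data_dir = "./data") {
  paths <- file.path(data_dir, trip_file(months))
  names(paths) <- months
  lapply(paths, readr::read_csv)
}

# Keys for the two files shown above.
months_2018 <- sprintf("2018%02d", 1:2)
# trips_by_month <- read_months(months_2018)
```

Keeping the months in a named list makes it easy to inspect each table before stacking them. Stacking with dplyr::bind_rows() requires the column types to agree across months.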

Ford GoBike stations

Oakland

My map of Oakland stations.

Bay Area stations

My map of Bay Area stations.
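
A static station map along these lines can be sketched with dplyr and ggplot2. This is not the code behind the maps in the talk. The column names are assumed from the trip .csv files, the helper names are hypothetical, and the bounding box in the usage comment is a rough hand-picked guess rather than an official city boundary.

```r
library(dplyr)
library(ggplot2)

# Distinct start stations with short column names.
# A station recorded with two different coordinates appears twice.
station_table <- function(trips) {
  trips %>%
    distinct(start_station_id, start_station_name,
             start_station_latitude, start_station_longitude) %>%
    rename(id = start_station_id, name = start_station_name,
           lat = start_station_latitude, lon = start_station_longitude)
}

# Keep stations inside a latitude/longitude bounding box.
in_box <- function(stations, lat_range, lon_range) {
  stations %>%
    filter(between(lat, lat_range[1], lat_range[2]),
           between(lon, lon_range[1], lon_range[2]))
}

# Scatter plot of station locations with an approximate map aspect ratio.
plot_stations <- function(stations) {
  ggplot(stations, aes(x = lon, y = lat)) +
    geom_point() +
    coord_quickmap() +
    labs(x = "Longitude", y = "Latitude")
}

# Usage (the box is a rough guess around Oakland, not a real boundary):
# trips %>% station_table() %>%
#   in_box(lat_range = c(37.75, 37.90), lon_range = c(-122.35, -122.15)) %>%
#   plot_stations()
```

Skipping the in_box() step gives the plot of all Bay Area stations.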

Wrangling

To make this visualization, I

  • had to change the type of one variable from character to integer in two of the monthly files, June and July, because of missing values
  • had to remove errors in the geolocation data
  • noticed some riders listed as 120+ years old, so removed them
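
The three steps above can be sketched as follows. Several details are assumptions, since the slides do not give them: which variable needed recasting (start_station_id is used as a stand-in), the column names, the rough Bay Area bounding box used to catch bad coordinates, and the choice to keep rides with a missing birth year. fix_id_type() and clean_trips() are hypothetical helper names.

```r
library(dplyr)

# Coerce an ID column to integer so the monthly tables share one type.
# Entries that are not numbers (for example a literal "NULL") become NA.
fix_id_type <- function(trips, col = "start_station_id") {
  trips[[col]] <- suppressWarnings(as.integer(trips[[col]]))
  trips
}

# Drop rides with implausible start coordinates or implausible rider ages.
# The bounding box is a rough guess; rides with unknown birth year are kept.
clean_trips <- function(trips, data_year = 2018, age_cutoff = 120,
                        lat_range = c(37, 38.5), lon_range = c(-123, -121)) {
  trips %>%
    mutate(age = data_year - member_birth_year) %>%
    filter(
      between(start_station_latitude, lat_range[1], lat_range[2]),
      between(start_station_longitude, lon_range[1], lon_range[2]),
      is.na(age) | age < age_cutoff
    )
}
```

Applying fix_id_type() to the affected months before stacking the tables avoids a type clash between character and integer columns.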

Discovered the R package gbfs

Further Work

This is a good start. There are many more things to do.

  • Answer some more questions about the durations of the rides.
  • I have become curious whether it is possible to see how many rides start and end at the same station.
  • Make a dynamic plot of rides per month.
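
As a starting point for the last two items, here is a minimal sketch that counts same-station rides and rides per month. It assumes the columns start_station_id, end_station_id, and start_time (with values beginning "YYYY-MM-DD"); the helper names are hypothetical, and the plotting itself is left out.

```r
library(dplyr)

# Number and share of rides that start and end at the same station.
# Rides with a missing station ID are counted as not matching.
same_station_share <- function(trips) {
  trips %>%
    summarise(
      rides = n(),
      same_station = sum(start_station_id == end_station_id, na.rm = TRUE),
      share = same_station / rides
    )
}

# Rides per calendar month, keyed by the first 7 characters ("YYYY-MM")
# of start_time. Works for character and date-time columns alike.
rides_per_month <- function(trips) {
  trips %>%
    mutate(month = substr(as.character(start_time), 1, 7)) %>%
    count(month, name = "rides")
}
```

The monthly counts returned by rides_per_month() are the table a rides-per-month plot would be drawn from.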

Thank you

Thank you, Allan, for the invitation.