Department of Statistics and Biostatistics

California State University, East Bay 

Winter 2015


Statistics 4970/6610: Data Visualization


Week 11:

Here are some other things that we have not gotten to in the class.

Website Spotlight: datavis.ca

Website Spotlight: Edward Tufte

Website Spotlight: ddpy

 

Thank you to the following former students.

1. I wanted to talk about social media and networks, but we did not get there.  Try gephi, an open-source software package for visualizing networks.

Check out SoMe Lab and Jeff Hemsley's website.  Thank you, Jeff, for showing me some of what you are interested in.

2. Rladies:  Gabriela de Queiroz.  Gabriela introduced me to FlowingData and encouraged me to keep learning more about visualization.  Everyone should join Rladies!

3. Analytics Hackathon:  Charles Twitchell.  Charles introduced me to more of the Bay Area MeetUps.  His interest in finance has led me to use many more R libraries for finance and to learn python.

Networks

SAS and Teradata.

It turns out that students can now use SAS on basically any platform with SAS University.  I suggest everyone give it a try.  SAS Visual Analytics software is also now part of Teradata University.

python

Some further things that I wanted to discuss related to python.

Next Quarter: APIs and cloud computing for Machine Learning

 


Week 10:

  • Final: next Wednesday, 3/18/2015, 8:00-9:50 pm.  (Correction: the final starts at 8:00 pm.)

 

 

 

 

  • Activity Two: (Optional.  This is for those interested in python.)
  • Using python
  • Try my updated version of the author's python code.  My code does not currently work as well because it does not use Google's API.  geocode-locations3.py

 

  • Activity Three:
  • Using R
  • Try the R code for the first part of Ch. 8:  mapping.R

 

 

 

Last Day: Going Forward

Software

More D3 and js

Microsoft tools for Visualization

SAS tools for Visualization

 

SAS University

blogs

 

Further Education

 

 


Week 9:

 

  • Homework 8 has been posted.  Note that an additional activity, Activity Five, has been added.  See below.

 

 

  • Project 2:  Pick your favorite plot from the book.  Find a dataset that you can use to make such a plot.  Produce the plot.  Explain what the data and variables represent.  Explain how the graph visualizes the information in the data.

 

 

 

 

  • Activity Three:
  • Download the data from the Book website for Chapter 7. 
  • Make the Star Charts and Nightingale Charts  for the Crime data.

 

  • Activity Four:
  • Download the data from the Book website for Chapter 7. 
  • Make the Parallel Coordinates Charts for the Education data.
  • The library ggparallel can be used in R to make parallel coordinate plots.  parallel_examples.R
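A parallel coordinates chart draws one vertical axis per variable and one line per observation, and the variables are often rescaled to a common range first so that no single axis dominates. The following is a minimal Python sketch of that rescaling step only, not the code in parallel_examples.R. The variable names and values are made up for illustration (they are not the Chapter 7 Education data), and `rescale` is a hypothetical helper.

```python
def rescale(column):
    """Min-max rescale a list of numbers to the range 0..1."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

# Made-up values for three hypothetical variables (NOT the Education data).
table = {
    "reading": [480, 520, 500, 560],
    "math": [470, 540, 510, 580],
    "dropout_rate": [6.0, 3.5, 4.0, 2.0],
}

scaled = {name: rescale(col) for name, col in table.items()}

# Each observation becomes one polyline: (axis index, scaled height) pairs.
n_rows = len(table["reading"])
polylines = [
    [(axis, scaled[name][i]) for axis, name in enumerate(table)]
    for i in range(n_rows)
]
for line in polylines:
    print(line)
```

Lines that cross between two neighboring axes suggest a negative relationship between those two variables; lines that run roughly parallel suggest a positive one.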

 

  • Software Spotlight - ggobi

 

  • Activity Five:
  • Download the data from the Book website for Chapter 7. 
  • Make the plots for the Education data using Multidimensional Scaling and using Clustering.

 


Week 8:

 

 

 

  • Homework 7 has been posted.

 

 

  • Activity One:
  • From last week's Correlation presentation, work with the book price data to show the effect of Simpson's Paradox.  Use Minitab to fit the model.  Use tableau to fit the model.
  • Do you see how to plot the individual lines on the scatterplot?  Do you see how to fit the model?

 

  • Activity Two:
  • Download the data from the Book website for Chapter 6. 
  • Make the histogram for the birth-rate data.
  • Try the options breaks=5 and breaks=20.
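To see why the number of bins matters, here is a minimal Python sketch that counts the same values into 5 and into 20 equal-width bins. It is only an illustration: the `values` list is made up (it is not the birth-rate data) and `bin_counts` is a hypothetical helper. Note also that R's hist() treats a single number given for breaks as a suggestion, so R may not use exactly that many bins.

```python
def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value belongs to the last bin, not a bin past the end.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

# Made-up illustrative values (NOT the Chapter 6 birth-rate data).
values = [8, 9, 10, 11, 12, 13, 14, 15, 18, 20, 22, 25, 30, 35, 40, 48]

coarse = bin_counts(values, 5)
fine = bin_counts(values, 20)
print("5 bins: ", coarse)
print("20 bins:", fine)
```

With few bins the overall shape (here, right-skewed) is easy to read; with many bins individual gaps appear and the shape gets noisier.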

 

 

 

  • Activity Five:
  • Try the code in plot-tv-sizes.R  Note the use of the par() function in R.

 

  • Activity Six:
  • Try the hts01.R example code.

 

 


Week 7:

 

  • Activity One:
  • From today's Correlation presentation, try the code for Simpson's Paradox.
  • Do you see how the direction of the correlation changes with the confounding variable?
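The sign change can be reproduced with a few numbers. The following is a minimal Python sketch, not the presentation's code: the two groups are made-up data chosen so that the correlation is negative within each group but positive when the groups are pooled, and `pearson` is a small helper written for this example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: the group label plays the role of the confounding variable.
group_a_x, group_a_y = [1, 2, 3, 4], [10, 9, 8, 7]
group_b_x, group_b_y = [6, 7, 8, 9], [20, 19, 18, 17]

r_a = pearson(group_a_x, group_a_y)
r_b = pearson(group_b_x, group_b_y)
r_all = pearson(group_a_x + group_b_x, group_a_y + group_b_y)

print("within group A:", round(r_a, 3))
print("within group B:", round(r_b, 3))
print("pooled:        ", round(r_all, 3))
```

Ignoring the group makes x and y look positively related, even though the relationship is negative inside every group.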

 

  • Activity Two:
  • Download the data from the Book website for Chapter 6. 
  • Make the scatterplot matrix for the crime data.

 

  • Activity Three:
  • Make the correlogram for the crime data.

 

  • Activity Four:
  • Try to recreate the dataset needed to make the plot from the Wall Street Journal graph, Chinese Labor Rifts Deepen by using the website Chinese Labor Bulletin Strike Map
  • Be sure to use area in your Bubble plot.
  • What do you see on the website that might be a problem?
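Using area in a bubble plot means the circle's area, not its radius, should be proportional to the value, so the radius has to grow with the square root of the value. The following is a minimal Python sketch of that scaling. The counts are made up (they are not taken from the strike map), and `radius_for_area` and the maximum radius of 20 are assumptions for illustration.

```python
from math import pi, sqrt

def radius_for_area(value, max_value, max_radius):
    """Radius such that the circle's AREA is proportional to value."""
    return max_radius * sqrt(value / max_value)

# Made-up counts (NOT data from the strike map).
counts = [1, 4, 16]
radii = [radius_for_area(c, max(counts), 20.0) for c in counts]
areas = [pi * r ** 2 for r in radii]

for c, r, a in zip(counts, radii, areas):
    print(f"count={c:2d}  radius={r:5.1f}  area={a:7.1f}")
```

If the radius were scaled directly instead, the largest bubble here would cover 256 times the area of the smallest, exaggerating a 16-fold difference.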

 

 

 

 

 


Week 6:

  • I have figured out how to create an Assignment in Blackboard under Course Materials.  Please try to turn in your Homework 5 in Blackboard.
  • Stat. 6610 Extra-Credit

 

  • Proportions2
  • Quiz 1 is next Wednesday, Feb. 11.  The quiz will cover Chapters 1-5 in the book.
  • Today we will be in the computer lab.
  • We will use the Chapter 5 data from the Book website, at the bottom of the page.
  • FlowingData - Tutorial D3.js
  • FlowingData - Tutorial Treemap

 

  • Activity One: Protovis
  • Open donut.html.  Right-click and choose Inspect Element.  Click on Sources and download the donut.html (double-click on the file name) and protovis-r3.2.js files to your computer.  In the directory where you downloaded donut.html, make a directory named js and save protovis-r3.2.js to that directory.
  • Open donut.html in a text editor such as NoteTab Light.  You will need to remove the / before js/protovis-r3.2.js.
  • Replace the data in the donut.html file with the following data
    • var data = [200,150,100,90,80,70,50,20,10,1];
    • And find the color line:
    • var depthColors = pv.Scale.linear(0, 200)

 

  • Activity Two: Protovis
  • Open stacked-bar.html
  • Click on Sources and download the stacked-bar.html (double-click on the file name) and protovis-r3.2.js files to your computer.  In the directory where you downloaded stacked-bar.html, make a directory named js and save protovis-r3.2.js to that directory.
  • Open stacked-bar.html in a text editor such as NoteTab Light.  You will need to remove the / before js/protovis-r3.2.js.
  • Change the first bar in the stacked-bar.html to 50, 45, 5
  • Change the title of the graph to
  • <h1>Approval Ratings for President Obama</h1>
  • Go to 0to255.com, pick 3 new colors, and implement the change.
  • Change the labels from "white" to "black".

 

  • Activity Three: D3.js mbostock/d3 gallery D3.js new gallery
  • Open Simple Scatter Chart Example  Click on the Open in a New Window link.  Right-click and Inspect Element.  Click on Sources and download the index.html and scatterchart.js files to your computer. 
  • Use a text editor to view the index.html file and the scatterchart.js file.  Replace the data in the scatterchart.js file with the following data
    • var data = [[4,9], [10,7], [15,2], [2,12]]

 

 

  • Activity Five: tableau
  • Take the data from the donut.html example and enter it into tableau.  donut.xlsx
  • var data = [172,136,135,101,80,68,50,29,19,41];
    var cats = ["Statistics", "Design", "Business", "Cartography", "Information Science", "Web Analytics", "Programming", "Engineering", "Mathematics", "Other"];
  • Make a pie chart.
  • What do the colors look like?
  • Change the color palette to Orange.
  • Change the transparency to 50%.
  • In tableau, create a Story and add a title and a lead-in sentence.
  • An example, donut.twb
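Before drawing the pie chart it can help to check what share of the whole each category represents. The following is a minimal Python sketch that turns the `data` and `cats` values listed above into percentages and slice angles; the variable names `total`, `shares`, and `angles` are introduced here for illustration.

```python
# Values and category labels from the donut.html example above.
data = [172, 136, 135, 101, 80, 68, 50, 29, 19, 41]
cats = ["Statistics", "Design", "Business", "Cartography",
        "Information Science", "Web Analytics", "Programming",
        "Engineering", "Mathematics", "Other"]

total = sum(data)
shares = [value / total for value in data]   # fraction of the whole
angles = [share * 360 for share in shares]   # slice angle in degrees

for cat, share, angle in zip(cats, shares, angles):
    print(f"{cat:20s} {share:6.1%} {angle:6.1f} degrees")
```

Several of the slices are close in size, which is one reason a pie or donut chart of these data can be hard to read.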

 

  • Activity Six: tableau
  • Make a Treemap for the post-data.txt data that looks as close as possible to the picture in the book on page 161.

 

  • Activity Seven: tableau
  • Make time plots for the expenditures.txt data that look as close as possible to the picture in the book on page 172.

 

 

  • Hint:  If you ever run into problems running your R code, it might be due to objects left over in your workspace.  Try removing all of the saved objects with rm( list = ls() )  (Thank you Emily.)

 


Week 5:

 

  • TimeSeries2
  • Today we will be in the computer lab.
  • We will discuss the first Project, the Quiz, and Midterm this week.  (Current plan is for a short quiz next week on Wednesday.  The Midterm the following Wednesday.  The first project given out next week.)
  • We will use the Chapter 4 data from the Book website, at the bottom of the page.

 

  • Activity One: Hot Dog Eating - bar graphs
  • Activity Two: FlowingData subscribers - time plots
  • Activity Three: World Population - time plots
  • Activity Four: US Postage - step plot
  • Activity Five: Unemployment Data - LOESS
  • Activity Six: Johnson & Johnson stock price - decompose( ) function in R, ACF, PACF
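For Activity Six, the autocorrelation function (ACF) measures how strongly a series is correlated with a lagged copy of itself. The following is a minimal Python sketch of the usual sample ACF, in which each lag's autocovariance is divided by the lag-0 autocovariance. The series is made up (it is not the Johnson & Johnson data) and `acf` here is a helper written for this example, not R's acf() function.

```python
def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    result = []
    for lag in range(max_lag + 1):
        num = sum(
            (series[t] - mean) * (series[t + lag] - mean)
            for t in range(n - lag)
        )
        result.append(num / denom)
    return result

# Made-up quarterly-looking series with a trend and a repeating pattern.
series = [10, 12, 15, 11, 13, 15, 18, 14, 16, 18, 21, 17]

values = acf(series, 4)
for lag, r in enumerate(values):
    print(f"lag {lag}: {r: .3f}")
```

The lag-0 value is always 1; a seasonal series tends to show a bump at the seasonal lag (lag 4 for quarterly data).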

 

 


Week 4:

 

 

  • Tools2
  • Today we will be in the computer lab.

 

 

 

 

 

 

  • Activity Five:
    • Try plot.ly
    • Try a heatmap example.

 

 


Week 3:


Week 2:

 

 

 

  • Monday will be the first day of class that we will be meeting in the computer lab SSc 146, not in the classroom.
  • All twitter suggestions are optional.
  • DataScraping
  • Spotlight Website - Data Driven Journalism
  • Spotlight Website - Automatic  (Thank you Josh.  This is so cool!!!)

 


Week 1:

  • Getting Started
  • Spotlight Software - Gapminder World Offline
  • My time series plot of Chevron stock prices.  Made with tableau  cvx
  • Spotlight Software - tableau public
  • To find the videos about getting started with tableau, go to the tableau public website and click on the top link, HOW IT WORKS.  There is a nice video on that page.  To the right, the fourth link is Training, which takes you to many videos about how to use tableau.  Please note the online training this Friday at 9:30 AM.  I hope you can find time to register and attend.

 

  • For the first day of class we will be meeting in NSc 207, not in the computer lab.
  • Welcome
  • Spotlight Software - clicktale