JSM 2018: Poster 181 - Classroom Demonstration: Deep Learning for Classification and Regression, Introduction to GPU Computing
Author
Prof. Eric A. Suess
Example: Compare Simple Linear Regression to a single-layer NN.
The cars dataset in R contains two variables: speed, the speed of the car in mph, and dist, the stopping distance in feet. Using speed to predict stopping distance, two models are fit. See the R code.
What function is used to normalize the data?
What percentage of the data is used for training? What percentage of the data is used for testing?
What is the fitted linear regression model?
What is the correlation between the linear regression predicted values and the values from the test data?
Sketch the NN model that is used to model stopping distance.
What kind of activation function was used in the ANN? Sketch a picture of what the activation function looks like.
What is the correlation between the ANN predicted values and the values from the test data?
Examine the scatterplot of speed by distance with the fitted models. Is the NN fitting a near linear function?
Which model would you use for prediction? Explain.
Answer:
Read in data and examine structure.
library("tidyverse")
cars <- as_tibble(cars)
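The preprocessing and model-fitting code referenced by the questions above does not appear in this excerpt. A minimal sketch of the likely pipeline, assuming scale() for normalization, a 75/25 split, and the neuralnet package (the seed, split proportion, and hidden-node count are assumptions, not taken from the poster):

```r
library(neuralnet)  # assumed: the NN package used to fit cars_model below

# normalize both variables with scale() (applied to the full dataset here,
# which the side note below points out is not best practice)
cars_norm <- as.data.frame(scale(cars))

# 75/25 train/test split (the split proportion and seed are assumptions)
set.seed(12345)
idx <- sample(nrow(cars_norm), floor(0.75 * nrow(cars_norm)))
cars_train <- cars_norm[idx, ]
cars_test  <- cars_norm[-idx, ]

# simple linear regression fit on the training data
cars_lm <- lm(dist ~ speed, data = cars_train)

# single-layer NN with one hidden node (node count is an assumption)
cars_model <- neuralnet(dist ~ speed, data = cars_train, hidden = 1)
```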
Side note: This is not done using best practices. The scale() function should be applied only to the training data, not the entire dataset; scaling before splitting leaks information from the test set into training. This shortcut is common in many machine learning books, and it should be corrected.
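A corrected version, in which the centering and scaling constants are estimated from the training rows only and then applied to both sets, might look like this (the seed and 75/25 split are assumptions):

```r
# split the raw data first
set.seed(12345)
idx <- sample(nrow(cars), floor(0.75 * nrow(cars)))
train_raw <- cars[idx, ]
test_raw  <- cars[-idx, ]

# estimate centering/scaling constants from the training data only
ctr <- sapply(train_raw, mean)
scl <- sapply(train_raw, sd)

# apply the same constants to both sets
cars_train <- as.data.frame(scale(train_raw, center = ctr, scale = scl))
cars_test  <- as.data.frame(scale(test_raw,  center = ctr, scale = scl))
```

With this approach the test rows are standardized using quantities the model could legitimately know at training time, so no information leaks from the holdout set.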
library(NeuralNetTools)
par(mar = numeric(4), family = 'serif')
plotnet(cars_model, alpha = 0.6)
Predict on the test data and compare the predicted values with the holdout values.
model_results <- compute(cars_model, cars_test[1])
predicted_dist <- model_results$net.result
# examine the correlation between predicted and actual values
cor(predicted_dist, cars_test$dist)
[,1]
[1,] 0.8033258
Plot the fitted models.
ggplot(data = cars_test, aes(x = speed, y = dist)) +
  geom_point(size = 4) +
  geom_smooth(method = 'lm', formula = y ~ x, fill = NA) +
  geom_line(aes(y = predicted_dist)) +
  ggtitle("Test Data Fitted with a Linear Model (blue) and NN (black)") +
  scale_x_continuous(limits = c(-2.2, 2)) +
  scale_y_continuous(limits = c(-2, 3))
Example: Compare Simple Linear Regression to a deep learning, multilayer neural network.
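As in the first example, the fitting code is not shown in this excerpt. A minimal sketch, assuming the same scale() preprocessing and the neuralnet package with two hidden layers of 5 and 3 nodes (the layer sizes, seed, and split proportion are assumptions, not taken from the poster):

```r
library(neuralnet)  # assumed NN package, as in the first example

# same preprocessing as before: normalize, then 75/25 split (assumed)
cars_norm <- as.data.frame(scale(cars))
set.seed(12345)
idx <- sample(nrow(cars_norm), floor(0.75 * nrow(cars_norm)))
cars_train <- cars_norm[idx, ]
cars_test  <- cars_norm[-idx, ]

# multilayer ("deep") NN: two hidden layers of 5 and 3 nodes (sizes assumed)
cars_model <- neuralnet(dist ~ speed, data = cars_train, hidden = c(5, 3))
```

The extra hidden layer lets the network fit a more flexible, nonlinear curve than the single-node model, which is what drives the higher test-set correlation reported below.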
par(mar = numeric(4), family = 'serif')
plotnet(cars_model, alpha = 0.6)
Predict on the test data and compare the predicted values with the holdout values.
model_results <- compute(cars_model, cars_test[1])
predicted_dist <- model_results$net.result
# examine the correlation between predicted and actual values
cor(predicted_dist, cars_test$dist)
[,1]
[1,] 0.857052
Plot the fitted models.
ggplot(data = cars_test, aes(x = speed, y = dist)) +
  geom_point(size = 4) +
  geom_smooth(method = 'lm', formula = y ~ x, fill = NA) +
  geom_line(aes(y = predicted_dist)) +
  ggtitle("Test Data Fitted with a Linear Model (blue) and NN (black)") +
  scale_x_continuous(limits = c(-2.2, 2)) +
  scale_y_continuous(limits = c(-2, 3))