---
title: "NN Regression"
author: "JSM 2018: Poster 181 - Classroom Demonstration: Deep Learning for Classification and Regression, Introduction to GPU Computing"
output:
  pdf_document:
    latex_engine: xelatex
  html_notebook: default
---

\begin{center}
Eric A. Suess \\
Department of Statistics and Biostatistics \\
CSU East Bay \\
eric.suess@csueastbay.edu
\end{center}

# Example: Compare Simple Linear Regression to a single-layer NN

The **cars** dataset in R contains two variables: the *speed* of cars in mph and the stopping distance *dist* in feet. Using speed to predict stopping distance, two models are fit. See the R code.

a. What function is used to normalize the data?
b. What percentage of the data is used for *training*? What percentage of the data is used for *testing*?
c. What is the fitted linear regression model?
d. What is the correlation between the linear regression predicted values and the values from the test data?
e. Sketch the NN model that is used to model stopping distance.
f. What kind of activation function was used in the ANN? Sketch a picture of what the activation function looks like.
g. What is the correlation between the ANN predicted values and the values from the test data?
h. Examine the scatterplot of speed by distance with the fitted models. Is the NN fitting a near-linear function?
i. Which model would you use for prediction? Explain.

**Answer:**

Read in the data and examine its structure.

```{r}
suppressMessages(library("tidyverse"))
```

```{r}
cars <- as_tibble(cars)  # as.tibble() is deprecated; use as_tibble()
cars
str(cars)

cars %>% ggplot(aes(x = speed, y = dist)) +
  geom_point(size = 4) +
  ggtitle("Cars data")
```

Apply scaling to the entire data frame.

```{r}
cars_norm <- cars %>% mutate(speed = scale(speed), dist = scale(dist))
cars_norm
str(cars_norm)

cars_norm %>% ggplot(aes(x = speed, y = dist)) +
  geom_point(size = 4) +
  ggtitle("Scaled cars data") +
  scale_x_continuous(limits = c(-2.2, 2)) +
  scale_y_continuous(limits = c(-2, 3))
```

Create training and test data.
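As the side note below explains, applying scale() to the whole dataset before splitting leaks test-set information into the normalization. A minimal sketch of the preferred approach, estimating the centering and scaling parameters from the training rows only and reusing them on the test rows (the names `cars_train2`, `cars_test2`, and the intermediate variables are illustrative, not from the source):

```{r}
set.seed(12345)
idx <- sample(1:50, 40)

# estimate scaling parameters from the TRAINING rows only
speed_center <- mean(cars$speed[idx]); speed_scale <- sd(cars$speed[idx])
dist_center  <- mean(cars$dist[idx]);  dist_scale  <- sd(cars$dist[idx])

# apply the training-set parameters to both subsets
cars_train2 <- data.frame(
  speed = (cars$speed[idx] - speed_center) / speed_scale,
  dist  = (cars$dist[idx]  - dist_center)  / dist_scale
)
cars_test2 <- data.frame(
  speed = (cars$speed[-idx] - speed_center) / speed_scale,
  dist  = (cars$dist[-idx]  - dist_center)  / dist_scale
)
```

With this approach the training columns have mean 0 and standard deviation 1 exactly, while the test columns are only approximately standardized, which is the honest situation at prediction time.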
**Side note:** This is not best practice: the scale() function should be applied to the training data only, and the resulting scaling parameters then reused on the test data, rather than scaling the entire dataset before splitting. Scaling the full dataset is nevertheless common in many machine learning books. This should be corrected.

```{r}
set.seed(12345)
idx <- sample(1:50, 40)

cars_train <- cars_norm[idx, ]
str(cars_train)

cars_test <- cars_norm[-idx, ]
str(cars_test)

cars_train %>% ggplot(aes(x = speed, y = dist)) +
  geom_point(size = 4) +
  ggtitle("Training Data") +
  scale_x_continuous(limits = c(-2.2, 2)) +
  scale_y_continuous(limits = c(-2, 3))

cars_test %>% ggplot(aes(x = speed, y = dist)) +
  geom_point(size = 4) +
  ggtitle("Test Data") +
  scale_x_continuous(limits = c(-2.2, 2)) +
  scale_y_continuous(limits = c(-2, 3))
```

Fit a simple linear regression: train the model, predict the test data, and compare the predicted values with the holdout values.

```{r}
cars_lm <- cars_train %>% lm(dist ~ speed, data = .)
summary(cars_lm)

predicted_lm_dist <- predict(cars_lm, cars_test)

# examine the correlation between predicted and actual values
cor(predicted_lm_dist, cars_test$dist)
```

Fit a NN: train a neural network model. Compare the R code with the linear regression code above; it is very similar.

```{r}
library(neuralnet)

set.seed(12345)

cars_model <- cars_train %>%
  neuralnet(formula = dist ~ speed,
            act.fct = "logistic",
            hidden = 3,
            linear.output = TRUE)

plot(cars_model)
```

A nicer plot, using the plotnet() function.

```{r}
library(NeuralNetTools)

par(mar = numeric(4), family = 'serif')
plotnet(cars_model, alpha = 0.6)
```

Predict the test data. Compare the predicted values with the holdout values.

```{r}
model_results <- compute(cars_model, cars_test[1])
predicted_dist <- model_results$net.result

# examine the correlation between predicted and actual values
cor(predicted_dist, cars_test$dist)
```

Plot the fitted models.
```{r}
ggplot(data = cars_test, aes(x = speed, y = dist)) +
  geom_point(size = 4) +
  geom_smooth(method = 'lm', formula = y ~ x, fill = NA) +
  geom_line(aes(y = predicted_dist)) +
  ggtitle("Test Data Fitted with a Linear Model (blue) and NN (black)") +
  scale_x_continuous(limits = c(-2.2, 2)) +
  scale_y_continuous(limits = c(-2, 3))
```

# Example: Compare Simple Linear Regression to a multilayer (deep learning) neural network

a. Do you think this model will overfit?
b. What does parsimonious mean?
c. Suggest a better measure of goodness-of-fit.

```{r}
cars_model <- cars_train %>%
  neuralnet(formula = dist ~ speed,
            act.fct = "logistic",
            hidden = c(10, 5),
            linear.output = TRUE)

plot(cars_model)
```

A nicer plot, using the plotnet() function.

```{r}
par(mar = numeric(4), family = 'serif')
plotnet(cars_model, alpha = 0.6)
```

Predict the test data. Compare the predicted values with the holdout values.

```{r}
model_results <- compute(cars_model, cars_test[1])
predicted_dist <- model_results$net.result

# examine the correlation between predicted and actual values
cor(predicted_dist, cars_test$dist)
```

Plot the fitted models.

```{r}
ggplot(data = cars_test, aes(x = speed, y = dist)) +
  geom_point(size = 4) +
  geom_smooth(method = 'lm', formula = y ~ x, fill = NA) +
  geom_line(aes(y = predicted_dist)) +
  ggtitle("Test Data Fitted with a Linear Model (blue) and NN (black)") +
  scale_x_continuous(limits = c(-2.2, 2)) +
  scale_y_continuous(limits = c(-2, 3))
```
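On question c above: correlation measures only the linear association between predictions and holdout values, so it ignores systematic bias and scale. One better goodness-of-fit measure is the test-set root mean squared error. A minimal sketch (the helper name `rmse()` is ours, not from the source):

```{r}
# Root mean squared error: penalizes prediction errors directly,
# unlike correlation, which is insensitive to bias and scaling.
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# In this notebook one could compare the two models on the held-out
# test set, e.g. rmse(cars_test$dist, predicted_lm_dist) versus
# rmse(cars_test$dist, predicted_dist).
rmse(c(0, 1, 2), c(0.5, 1, 1.5))  # small worked example, sqrt(1/6) ~ 0.408
```

A lower test-set RMSE indicates better predictive fit, and unlike correlation it would expose a model whose predictions track the truth but are shifted or rescaled.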