--- title: "Stat. 654 Quiz keras" author: "Your Name Here" date: "" format: html: embed-resources: true --- ## Feed-Forward Neural Network for the Google Tensorflow Playground XOR Data ### Clone the [TFPlayground](https://github.com/hyounesy/TFPlaygroundPSA) Github repository into your R Project folder. To clone the repository you can use RStudio File > New Project > Version Control > Git and paste the URL of the repository into the Git Repository URL box. Then select a folder to clone the repository into. Click the Green button and copy the ulr: *https://github.com/hyounesy/TFPlaygroundPSA.git* Then paste the URL into the Git Repository URL box. Select a folder to clone the repository into. Click the Create Project button. Use the data in *../data/tiny/xor_25/input.txt* to create a feed-forward neural network to classify the data. Use the *keras* package to create the model. ### Load the required libraries ```{r} library(tidyverse) library(tidymodels) library(readr) library(janitor) library(forcats) library(keras) ``` ### Load the data ```{r} input <- read_delim("data/tiny/xor_25/input.txt", delim = "\t", escape_double = FALSE, trim_ws = TRUE) input <- input |> clean_names() |> mutate(label = if_else(label == -1, 0, 1) ) |> # or use 0L, !L mutate(label = as.integer(label)) |> tibble() head(input) ``` ### Split the data into training and testing sets ```{r} n <- nrow(input) input_parts <- input |> initial_split(prop = 0.8) train <- input_parts |> training() test <- input_parts |> testing() list(train, test) |> map_int(nrow) ``` ### Visualize the data ```{r} train |> ggplot(aes(x = x1, y = x2, color = factor(label))) + geom_point() ``` ```{r} test |> ggplot(aes(x = x1, y = x2, color = factor(label))) + geom_point() ``` ### Using keras and tensorflow **Note** that the functions in the keras package are expecting the data to be in a matrix object and not a tibble. So as.matrix is added at the end of each line. Do not forget to remove the ID variable pid. ```{r} x_train <- train %>% select(-pid, -label) |> select(x1, x2) |> as.matrix() y_train <- train %>% select(label) %>% as.matrix() %>% to_categorical() x_test <- test %>% select(-pid, -label) |> select(x1, x2) |> as.matrix() y_test <- test %>% select(label) %>% as.matrix() %>% to_categorical() dim(x_train) dim(x_test) dim(y_train) dim(y_test) ``` ### Set Architecture With the data in place, we now set the architecture of our neural network. keras [activation](https://search.r-project.org/CRAN/refmans/keras/html/activation_relu.html) ```{r} model <- keras_model_sequential() model |> layer_dense(units = 8, activation = 'relu', input_shape = 2) |> layer_dense(units = 3, activation = 'relu') |> layer_dense(units = 2, activation = 'softmax') model |> summary() ``` Next, the architecture set in the model needs to be compiled. ```{r} model %>% compile( optimizer = "rmsprop", loss = "binary_crossentropy", metrics = "accuracy" ) ``` ### Train the Artificial Neural Network Lastly we fit the model and save the training progress in the *history* object. **Try** changing the *validation_split* from 0 to 0.2 to see the *validation_loss*. ```{r} history <- model %>% fit( x = x_train, y = y_train, epochs = 400, batch_size = 20, validation_split = 0.2 ) plot(history) + ggtitle("Training a neural network based classifier on the iris data set") + theme_bw() ``` ### Evaluate Network Performance The final performance can be obtained like so. 
### Evaluate Network Performance

The final performance can be obtained like so, first on the training set and then on the test set.

```{r}
perf <- model %>% evaluate(x_train, y_train)
print(perf)
```

```{r}
perf <- model %>% evaluate(x_test, y_test)
print(perf)
```
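### Inspect the Test Predictions

The accuracy reported by *evaluate()* does not show which points are misclassified. The sketch below is one way to look at that, assuming the *model*, *x_test*, and *test* objects created above; converting the softmax probabilities to class labels with *max.col()* is simply one convenient option.

```{r}
# Predicted class for each test observation: the softmax output has one
# column per class, so the index of the most probable column (minus 1)
# recovers the 0/1 label.
pred_probs <- predict(model, x_test)
pred_class <- max.col(pred_probs) - 1

# Cross-tabulate predicted against true labels.
table(predicted = pred_class, actual = test$label)
```

The same vectors can also be turned into factors and passed to *yardstick* metrics if a tidymodels-style summary is preferred.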