---
title: "Stat. 654 Quiz keras python"
author: "Your Name Here"
date: ""
format: 
  html:
    embed-resources: true
---

## Feed-Forward Neural Network for the Google TensorFlow Playground XOR Data

### Clone the [TFPlayground](https://github.com/hyounesy/TFPlaygroundPSA) GitHub repository into your RStudio project folder.

To clone the repository with RStudio, first open the repository page on GitHub, click the green Code button, and copy the URL: *https://github.com/hyounesy/TFPlaygroundPSA.git*

Then, in RStudio, choose

File > New Project > Version Control > Git

and paste the URL into the Repository URL box. Select a folder to clone the repository into and click the Create Project button.

Use the data in *data/tiny/xor_25/input.txt* (relative to the cloned project folder) to build a feed-forward neural network that classifies the data. Use the *keras* package to create the model.

### Load the required libraries


```{python}
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
```


### Load the data

```{python}
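# Read the tab-separated file; the first row holds the column names.
# Note: the name `input` shadows Python's built-in input() in this session.
input = pd.read_csv("data/tiny/xor_25/input.txt",
                    sep="\t", escapechar="\\",
                    header=0)

print(input.head())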
```

```{python}
# Strip stray whitespace from the column names
input = input.rename(columns=lambda x: x.strip())
# Drop the ID column; it carries no predictive information
input = input.drop(columns=['pid'])
# Recode the label: -1 becomes 0, everything else becomes 1
input['label'] = np.where(input['label'] == -1, 0, 1).astype(int)
print(input.head())
```

### Split the data into training and testing sets

```{python}
# Hold out 20% of the rows for testing; the fixed seed makes the split reproducible
train, test = train_test_split(input, test_size=0.2, random_state=42)
print("Number of rows in training set: ", train.shape[0])
print("Number of rows in testing set: ", test.shape[0])
```


### Visualize the data

```{python}
sns.scatterplot(data=train, x='X1', y='X2', hue='label')
plt.show()
```


```{python}
sns.scatterplot(data=test, x='X1', y='X2', hue='label')
plt.show()
```

### Using keras and tensorflow

**Note** that the model is fitted on NumPy arrays rather than pandas data frames, so `.values` is appended to each line below.

The ID variable `pid` must not be used as a feature. It was already dropped when the data were cleaned above.

```{python}
print(train.head())

x_train = train.drop(columns=['label']).values
y_train = train['label'].values
x_test = test.drop(columns=['label']).values
y_test = test['label'].values
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)
```

### Set Architecture

With the data in place, we now set the architecture of our neural network.

See the keras [activation](https://search.r-project.org/CRAN/refmans/keras/html/activation_relu.html) reference (R documentation) for the available activation functions.

Use all 7 input variables, which is why the first layer is given `input_shape=(7,)`.

```{python}
model = keras.Sequential()
model.add(keras.layers.Dense(units=8, activation='relu', input_shape=(7,)))
model.add(keras.layers.Dense(units=3, activation='relu'))
model.add(keras.layers.Dense(units=1, activation='sigmoid'))
model.summary()
```
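
As a quick check on the output of `model.summary()`, the number of trainable parameters in a stack of dense layers can be worked out by hand: each layer has one weight per input-output pair plus one bias per unit. The sketch below assumes only the layer sizes used above (7 inputs, then 8, 3, and 1 units); the helper name `dense_param_count` is made up for illustration.

```python
# Layer sizes taken from the model above: 7 inputs -> 8 -> 3 -> 1.
layer_sizes = [7, 8, 3, 1]

def dense_param_count(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# Pair each layer's input size with its output size.
per_layer = [dense_param_count(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer)      # [64, 27, 4]
print(total_params)   # 95
```

If the summary reports a different total, the most likely cause is that the number of feature columns does not match `input_shape`.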


Next, the architecture set in the model needs to be compiled.

```{python}
model.compile(optimizer='rmsprop', 
              loss='binary_crossentropy', 
              metrics=['accuracy'])
```
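
To make the loss concrete, here is a minimal NumPy sketch of binary cross-entropy, the mean of $-[y \log p + (1 - y) \log(1 - p)]$ over the observations. It is an illustration, not the Keras implementation: the function name, the clipping constant `eps`, and the example numbers are all assumptions.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip so that log(0) never occurs; eps is an assumed small constant.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    per_obs = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(per_obs))

# Made-up labels and predicted probabilities, for illustration only.
y_true = [0, 1, 1, 0]
y_prob = [0.1, 0.9, 0.8, 0.3]
loss = binary_crossentropy(y_true, y_prob)
print(round(loss, 4))
```

Confident correct predictions contribute almost nothing to the loss, while confident wrong predictions are penalised heavily, which is why this loss pairs naturally with a sigmoid output unit.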

### Train the Artificial Neural Network

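Lastly, we fit the model and save the training progress in the *history* object.

**Try** switching *validation_split* between 0 and 0.2. With a nonzero value, Keras holds out that fraction of the training data and reports *val_loss* and *val_accuracy* after each epoch.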

```{python}
history = model.fit(x_train, y_train, 
                    epochs=400, 
                    batch_size=20, 
                    validation_split=0.2)
```
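
The Keras documentation states that `validation_split` takes the held-out rows from the end of the training arrays, before any shuffling. The sketch below mimics that behaviour with NumPy to show how many rows are left for fitting and how many gradient steps one epoch takes. The helper `holdout_split` and the row count of 100 are hypothetical; the real count depends on the data file.

```python
import math
import numpy as np

def holdout_split(x, y, validation_split):
    """Keep the first rows for fitting and the last fraction for validation."""
    split_at = int(math.floor(len(x) * (1.0 - validation_split)))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

n_rows = 100  # hypothetical number of training rows
x = np.arange(n_rows * 2).reshape(n_rows, 2)  # placeholder features
y = np.arange(n_rows) % 2                     # placeholder 0/1 labels

(x_fit, y_fit), (x_val, y_val) = holdout_split(x, y, validation_split=0.2)

batch_size = 20  # same batch size as in the fit call above
steps_per_epoch = math.ceil(len(x_fit) / batch_size)

print(len(x_fit), len(x_val), steps_per_epoch)
```

Because the held-out rows come from the end of the arrays, it matters that `train_test_split` has already shuffled the rows; otherwise the validation set could be unrepresentative.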


### Evaluate Network Performance

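The final performance is obtained with `evaluate`, which returns the loss followed by the accuracy. It is computed first on the training set and then on the test set.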

```{python}
perf = model.evaluate(x_train, y_train)
print(perf)
```


```{python}
perf = model.evaluate(x_test, y_test)
print(perf)
```
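
The accuracy reported above comes from thresholding the sigmoid output at 0.5. As a rough sketch of what that number summarises, the code below computes the accuracy and the confusion counts from predicted probabilities with NumPy. The function names and the example values are assumptions for illustration; with the fitted model one would pass `y_test` and `model.predict(x_test)` instead.

```python
import numpy as np

def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of thresholded predictions that match the 0/1 labels."""
    y_true = np.asarray(y_true).ravel()
    y_pred = (np.asarray(y_prob).ravel() > threshold).astype(int)
    return float(np.mean(y_pred == y_true))

def confusion_counts(y_true, y_prob, threshold=0.5):
    """Counts of true/false negatives and positives at the given threshold."""
    y_true = np.asarray(y_true).ravel()
    y_pred = (np.asarray(y_prob).ravel() > threshold).astype(int)
    return {
        "tn": int(np.sum((y_true == 0) & (y_pred == 0))),
        "fp": int(np.sum((y_true == 0) & (y_pred == 1))),
        "fn": int(np.sum((y_true == 1) & (y_pred == 0))),
        "tp": int(np.sum((y_true == 1) & (y_pred == 1))),
    }

# Made-up labels and predicted probabilities, for illustration only.
y_true = [0, 1, 1, 0, 1]
y_prob = [0.2, 0.7, 0.4, 0.6, 0.9]

acc = binary_accuracy(y_true, y_prob)
counts = confusion_counts(y_true, y_prob)
print(acc, counts)
```

Comparing the training and test accuracy side by side gives a first indication of overfitting, and the confusion counts show whether the errors are concentrated in one class.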