---
title: "Stat. 654 Quiz keras python"
author: "Your Name Here"
date: ""
format:
  html:
    embed-resources: true
---

## Feed-Forward Neural Network for the Google TensorFlow Playground XOR Data

### Clone the [TFPlayground](https://github.com/hyounesy/TFPlaygroundPSA) GitHub repository into your R Project folder

On the repository's GitHub page, click the green button and copy the URL:

*https://github.com/hyounesy/TFPlaygroundPSA.git*

In RStudio, choose File > New Project > Version Control > Git and paste the URL into the Git Repository URL box. Select a folder to clone the repository into, then click the Create Project button.

Use the data in *../data/tiny/xor_25/input.txt* to create a feed-forward neural network to classify the data. Use the *keras* package to create the model.

### Load the required libraries

```{python}
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
```

### Load the data

```{python}
# Tab-separated file with a header row. The path is relative to the working
# directory; adjust it if this document is not rendered from the repository root.
xor_df = pd.read_csv("data/tiny/xor_25/input.txt", sep="\t", escapechar="\\", header=0)
print(xor_df.head())
```

```{python}
# Strip stray whitespace from the column names
xor_df = xor_df.rename(columns=lambda x: x.strip())
# Drop the ID variable, which is not a predictor
xor_df = xor_df.drop(columns=['pid'])
# Recode label -1 as 0 and every other value as 1
xor_df['label'] = np.where(xor_df['label'] == -1, 0, 1).astype(int)
print(xor_df.head())
```

### Split the data into training and testing sets

```{python}
# Hold out 20% of the rows for testing; the fixed seed makes the split reproducible
train, test = train_test_split(xor_df, test_size=0.2, random_state=42)
print("Number of rows in training set: ", train.shape[0])
print("Number of rows in testing set: ", test.shape[0])
```

### Visualize the data

```{python}
# Training set, colored by class label
sns.scatterplot(data=train, x='X1', y='X2', hue='label')
plt.show()
```

```{python}
# Testing set, colored by class label
sns.scatterplot(data=test, x='X1', y='X2', hue='label')
plt.show()
```

### Using keras and tensorflow

**Note** that the data are passed to keras as NumPy arrays rather than pandas DataFrames, so `.values` is added at the end of each line. The ID variable `pid` was already removed when the data were loaded.

```{python}
print(train.head())

# Predictors: every column except the label
x_train = train.drop(columns=['label']).values
y_train = train['label'].values
x_test = test.drop(columns=['label']).values
y_test = test['label'].values

print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)
```

### Set Architecture

With the data in place, we now set the architecture of our neural network. See the keras [activation](https://search.r-project.org/CRAN/refmans/keras/html/activation_relu.html) documentation for the available activation functions.

Use all 7 variables.

```{python}
model = keras.Sequential()
# Two hidden layers with ReLU activations; the input has 7 features
model.add(keras.layers.Dense(units=8, activation='relu', input_shape=(7,)))
model.add(keras.layers.Dense(units=3, activation='relu'))
# A single sigmoid unit outputs the probability of class 1
model.add(keras.layers.Dense(units=1, activation='sigmoid'))
model.summary()
```

Next, the architecture set in the model needs to be compiled.

```{python}
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
```

### Train the Artificial Neural Network

Lastly, we fit the model and save the training progress in the *history* object.

With *validation_split* set to 0.2, keras holds out the last 20% of the training rows and reports the validation loss (*val_loss*) after each epoch. **Try** changing *validation_split* between 0 and 0.2 to see the difference.

```{python}
history = model.fit(x_train, y_train, epochs=400, batch_size=20, validation_split=0.2)
```

### Evaluate Network Performance

The final performance can be obtained like so. Each call returns the loss followed by the accuracy.

```{python}
# Performance on the training set
perf = model.evaluate(x_train, y_train)
print(perf)
```

```{python}
# Performance on the held-out testing set
perf = model.evaluate(x_test, y_test)
print(perf)
```
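
The accuracy reported by `model.evaluate` comes from thresholding the sigmoid output at 0.5. As a rough sketch of that step, the block below turns predicted probabilities into class labels and counts the four confusion-matrix cells. The helper `confusion_counts` and the example arrays are hypothetical and are not part of the quiz; the arrays only stand in for `y_test` and `model.predict(x_test)`.

```python
import numpy as np

def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tn, fp, fn, tp) after thresholding predicted probabilities."""
    y_true = np.asarray(y_true).ravel().astype(int)
    # A probability strictly above the threshold is predicted as class 1
    y_pred = (np.asarray(y_prob).ravel() > threshold).astype(int)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return tn, fp, fn, tp

# Hypothetical values standing in for y_test and model.predict(x_test);
# the second array has shape (n, 1), like the output of a single sigmoid unit.
example_labels = np.array([0, 1, 1, 0, 1])
example_probs = np.array([[0.10], [0.80], [0.40], [0.70], [0.95]])

tn, fp, fn, tp = confusion_counts(example_labels, example_probs)
accuracy = (tp + tn) / (tn + fp + fn + tp)
print("tn, fp, fn, tp:", (tn, fp, fn, tp))
print("accuracy:", accuracy)
```

With the fitted model from above, the same helper could be called as `confusion_counts(y_test, model.predict(x_test))` to see which kinds of errors lie behind the test accuracy.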