---
title: "Building a simple neural network using Keras and TensorFlow"
author: "Prof. Eric A. Suess"
toc: true
format: 
  html:
    embed-resources: true
---

Thank you
----------

A big thank you to Leon Jessen for posting his code on GitHub.

[Building a simple neural network using Keras and Tensorflow](https://github.com/leonjessen/keras_tensorflow_on_iris/blob/master/README.md)

I have forked his project on GitHub and put his code into an R Notebook so we can run it in class.

Motivation
----------

The following is a minimal example for building your first simple artificial neural network using Keras and TensorFlow for R.

[Keras for R (`keras3`), maintained by Posit, lives here](https://keras3.posit.co/articles/getting_started.html).

Getting started - Install Keras and TensorFlow for R
----------------------------------------------------

You can install the Keras for R package from CRAN as follows:
```{r}
#| eval: false
# install.packages("keras3")
```

TensorFlow is the default backend engine. TensorFlow and Keras can be installed as follows:

```{r}
#| eval: false
# library(keras3)
# install_keras()
```

Naturally, we will also need the `tidyverse`:

```{r}
#| eval: false
# install.packages("tidyverse")
```

Once installed, we simply load the libraries:

```{r}
library("keras3")
library("tidyverse")
```

Artificial Neural Network Using the Iris Data Set
-------------------------------------------------

Right, let's get to it!

### Data

The famous (Fisher's or Anderson's) `iris` data set contains a total of 150 observations of 4 input features, `Sepal.Length`, `Sepal.Width`, `Petal.Length` and `Petal.Width`, and 3 output classes, `setosa`, `versicolor` and `virginica`, with 50 observations in each class. The distributions of the feature values look like so:

```{r}
iris |> 
  as_tibble() |> 
  pivot_longer(
    cols = -Species, 
    names_to = "feature", 
    values_to = "value"
  ) |>
  ggplot(aes(x = feature, y = value, fill = Species)) +
  geom_violin(alpha = 0.5, scale = "width") +
  theme_bw()
```

Our aim is to connect the 4 input features to the correct output class using an artificial neural network. For this task, we have chosen the following simple architecture with one input layer with 4 neurons (one for each feature), one hidden layer with 4 neurons and one output layer with 3 neurons (one for each class), all fully connected:

![architecture_visualisation.png](./img/architecture_visualisation.png)

Our artificial neural network will have a total of 35 parameters: 4 weights for each of the 4 input neurons connected to the hidden layer, plus 4 for the associated first bias neuron, and 3 weights for each of the 4 hidden neurons connected to the output layer, plus 3 for the associated second bias neuron. That is, $4 \times 4 + 4 + 4 \times 3 + 3 = 35$.
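
The count can be double-checked with a few lines of base R. This is only a small sketch; the helper `count_dense_params()` is a hypothetical name introduced here and is not part of `keras3`:

```{r}
# Hypothetical helper: number of parameters in one fully connected layer,
# i.e. one weight per input-output pair plus one bias per output neuron.
count_dense_params <- function(n_inputs, n_units) {
  n_inputs * n_units + n_units
}

n_hidden_params <- count_dense_params(n_inputs = 4, n_units = 4)  # 4 * 4 + 4
n_output_params <- count_dense_params(n_inputs = 4, n_units = 3)  # 4 * 3 + 3
n_hidden_params + n_output_params                                 # 35
```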

### Prepare data

We start by lightly wrangling the iris data set: we rename and scale the features and convert the `Species` factor to zero-based numeric labels (`0`, `1`, `2`).

```{r}
set.seed(265509)
nn_dat <- iris |> 
  as_tibble() |>
  mutate(sepal_length = as.numeric(scale(Sepal.Length)),  # scale() returns a matrix; keep a plain vector
         sepal_width  = as.numeric(scale(Sepal.Width)),
         petal_length = as.numeric(scale(Petal.Length)),
         petal_width  = as.numeric(scale(Petal.Width)),
         class_label  = as.numeric(Species) - 1) |>        # zero-based labels: 0, 1, 2
  select(sepal_length, sepal_width, petal_length, petal_width, class_label)

nn_dat |> head(3)
```
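
For reference, `scale()` with its default arguments centres a feature on its mean and divides by its sample standard deviation. A minimal base R sketch of the same computation for one feature (the names `raw_feature` and `manual_scaled` are illustrative only):

```{r}
raw_feature <- iris$Sepal.Length

# Standardise by hand: subtract the mean, divide by the sample standard deviation
manual_scaled <- (raw_feature - mean(raw_feature)) / sd(raw_feature)

# scale() returns a one-column matrix, so convert it to a plain vector to compare
all.equal(manual_scaled, as.numeric(scale(raw_feature)))

# After scaling, the feature has mean 0 (up to rounding) and standard deviation 1
c(mean = mean(manual_scaled), sd = sd(manual_scaled))
```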

Then, we create indices for splitting the iris data into a training and a test data set. We set aside 20% of the data for testing:

```{r}
test_fraction   <- 0.20
n_total_samples <- nrow(nn_dat)
n_train_samples <- ceiling((1 - test_fraction) * n_total_samples)
train_indices   <- sample(n_total_samples, n_train_samples)
n_test_samples  <- n_total_samples - n_train_samples
test_indices    <- setdiff(seq_len(n_total_samples), train_indices)  # every row not used for training
```
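
A valid split should be disjoint and, taken together, cover every row exactly once. The following self-contained sketch illustrates those checks. It uses illustrative names (`demo_n`, `demo_train`, `demo_test`) and an arbitrary seed, so it does not touch the objects created above:

```{r}
set.seed(1)  # arbitrary seed, used only for this illustration
demo_n     <- 150                                     # number of rows in iris
demo_train <- sample(demo_n, ceiling(0.8 * demo_n))   # 80% of the row indices
demo_test  <- setdiff(seq_len(demo_n), demo_train)    # every remaining row

length(demo_train)                                    # size of the training set
length(demo_test)                                     # size of the test set
length(intersect(demo_train, demo_test))              # 0 if the two sets are disjoint
setequal(c(demo_train, demo_test), seq_len(demo_n))   # TRUE if every row is used
```

Drawing the test indices from the full range `seq_len(demo_n)` means the two sets always add up to the total number of rows.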

Based on the indices, we can now create the training and test data:

```{r}
x_train <- nn_dat |> 
  select(-class_label) |> 
  as.matrix() |> 
  (\(m) m[train_indices, ])()

y_train <- nn_dat |> 
  slice(train_indices) |> 
  pull(class_label) |> 
  to_categorical(num_classes = 3)

x_test <- nn_dat |> 
  select(-class_label) |> 
  as.matrix() |> 
  (\(m) m[test_indices, ])()

y_test <- nn_dat |> 
  slice(test_indices) |> 
  pull(class_label) |> 
  to_categorical(num_classes = 3)
```
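
`to_categorical()` turns the integer labels `0`, `1`, `2` into one-hot rows, which is the target format that a categorical cross-entropy loss expects. A base R sketch of the same idea, using a hypothetical label vector `demo_labels`:

```{r}
demo_labels <- c(0, 2, 1, 0)  # hypothetical zero-based class labels

# Row i of the 3 x 3 identity matrix is the one-hot code for class i - 1,
# so indexing with label + 1 picks the matching row for every observation.
demo_one_hot <- diag(3)[demo_labels + 1, ]
demo_one_hot

# Each row contains exactly one 1, so every row sums to 1
rowSums(demo_one_hot)
```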

### Set Architecture

With the data in place, we now set the architecture of our artificial neural network:

```{r}
model <- keras_model_sequential(input_shape = 4)     # 4 input features
model |> 
  layer_dense(units = 4, activation = 'relu') |>     # hidden layer with 4 neurons
  layer_dense(units = 3, activation = 'softmax')     # output layer, one neuron per class
model |> summary()
```
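
To make the two layers concrete, here is a base R sketch of the forward pass for a single observation. The weights are made-up numbers chosen purely for illustration (a trained model learns its own), and `demo_relu()`, `demo_softmax()` and the other `demo_` objects are names introduced here rather than `keras3` functions:

```{r}
demo_relu <- function(z) pmax(z, 0)  # zero out negative values
demo_softmax <- function(z) {
  e <- exp(z - max(z))               # subtract the maximum for numerical stability
  e / sum(e)                         # normalise so the outputs sum to 1
}

demo_x  <- c(-0.9, 1.0, -1.3, -1.3)                      # one scaled observation (made up)
demo_w1 <- matrix((seq_len(16) - 8.5) / 10, nrow = 4)    # 4 inputs x 4 hidden neurons
demo_b1 <- rep(0.1, 4)                                   # one bias per hidden neuron
demo_w2 <- matrix((seq_len(12) - 6.5) / 10, nrow = 4)    # 4 hidden x 3 output neurons
demo_b2 <- rep(0, 3)                                     # one bias per output neuron

demo_hidden <- demo_relu(as.numeric(demo_x %*% demo_w1) + demo_b1)
demo_probs  <- demo_softmax(as.numeric(demo_hidden %*% demo_w2) + demo_b2)

demo_probs       # three class probabilities
sum(demo_probs)  # softmax output sums to 1

# Weights and biases together give the 35 parameters counted earlier
length(demo_w1) + length(demo_b1) + length(demo_w2) + length(demo_b2)
```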


Next, the architecture set in the model needs to be compiled:

```{r}
model |> compile(
  loss      = 'categorical_crossentropy',
  optimizer = optimizer_rmsprop(),
  metrics   = c('accuracy')
)
```
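
The `categorical_crossentropy` loss compares a one-hot target with the predicted class probabilities. For a single observation it is, in its basic textbook form, the negative log of the probability assigned to the true class. A small base R sketch with made-up values (`demo_cross_entropy()`, `demo_target`, `demo_pred_good` and `demo_pred_bad` are illustrative names, and this is not the Keras implementation itself):

```{r}
# Hypothetical helper: cross-entropy between a one-hot target and predicted probabilities
demo_cross_entropy <- function(target, pred) -sum(target * log(pred))

demo_target    <- c(0, 1, 0)           # the true class is the second one
demo_pred_good <- c(0.05, 0.90, 0.05)  # confident and correct
demo_pred_bad  <- c(0.70, 0.20, 0.10)  # confident and wrong

demo_cross_entropy(demo_target, demo_pred_good)  # small loss, equals -log(0.90)
demo_cross_entropy(demo_target, demo_pred_bad)   # larger loss, equals -log(0.20)
```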

### Train the Artificial Neural Network

Lastly, we fit the model and save the training progress in the `history` object:

```{r}
history <- model |> fit(
  x = x_train, y = y_train,
  epochs = 200,
  batch_size = 20,
  validation_split = 0
)
plot(history) +
  ggtitle("Training a neural network based classifier on the iris data set") +
  theme_bw()
```
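
The `epochs` and `batch_size` arguments together determine how many weight updates the optimizer performs. A back-of-the-envelope sketch in base R, assuming 120 training rows (80% of the 150 iris observations); the `demo_` names are illustrative:

```{r}
demo_n_train    <- 120  # assumed number of training rows (80% of 150)
demo_batch_size <- 20
demo_epochs     <- 200

# One epoch is one pass over the training data, processed batch by batch
demo_batches_per_epoch <- ceiling(demo_n_train / demo_batch_size)
demo_batches_per_epoch                # weight updates per epoch
demo_batches_per_epoch * demo_epochs  # weight updates over the whole run
```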

### Evaluate Network Performance

The final performance on the held-out test set can be obtained like so:

```{r}
perf <- model |> evaluate(x_test, y_test)
print(perf)
```
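
The `accuracy` metric is the fraction of observations whose most probable predicted class matches the true class. A base R sketch of that computation on made-up probabilities (`demo_prob_matrix`, `demo_true` and `demo_pred` are hypothetical and are not model output):

```{r}
# Hypothetical predicted probabilities: one row per observation, one column per class
demo_prob_matrix <- rbind(
  c(0.90, 0.05, 0.05),
  c(0.10, 0.70, 0.20),
  c(0.20, 0.50, 0.30),
  c(0.05, 0.15, 0.80)
)
demo_true <- c(0, 1, 2, 2)  # zero-based true classes

# Most probable class per row, shifted to zero-based labels
demo_pred <- max.col(demo_prob_matrix, ties.method = "first") - 1

mean(demo_pred == demo_true)  # proportion of correct predictions
```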

```{r}
classes <- iris |> 
  as_tibble() |> 
  pull(Species) |> 
  unique()

y_pred <- model |> 
  predict(x_test)  |> 
  op_argmax(axis = -1) |>
  as.numeric() - 1

y_true <- nn_dat |> 
  slice(test_indices) |> 
  pull(class_label)

tibble(
  y_true  = classes[y_true + 1], 
  y_pred  = classes[y_pred + 1],
  Correct = factor(ifelse(y_true == y_pred, "Yes", "No"))
) |>
  ggplot(aes(x = y_true, y = y_pred, colour = Correct)) +
  geom_jitter() +
  theme_bw() +
  ggtitle(label = "Classification Performance of Artificial Neural Network",
          subtitle = str_c("Accuracy = ",round(perf$accuracy,3)*100,"%")) +
  xlab(label = "True iris class") +
  ylab(label = "Predicted iris class")
```


```{r}
library(gmodels)

CrossTable(y_pred, y_true,
           prop.chisq = FALSE, prop.t = FALSE, prop.r = FALSE,
           dnn = c('predicted', 'actual'))

```
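
If `gmodels` is not installed, base R's `table()` produces the same counts as a plain confusion matrix. A minimal sketch on made-up label vectors (`cm_pred`, `cm_true` and `cm` are hypothetical names):

```{r}
cm_pred <- c(0, 1, 1, 2, 2, 0)  # hypothetical predicted classes
cm_true <- c(0, 1, 2, 2, 2, 0)  # hypothetical true classes

# Rows are predicted classes, columns are actual classes;
# correct predictions sit on the diagonal
cm <- table(predicted = cm_pred, actual = cm_true)
cm

sum(diag(cm)) / sum(cm)  # overall accuracy from the confusion matrix
```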


### Conclusion

I hope this illustrated just how easy it is to get started building artificial neural networks using Keras and TensorFlow in R. With relative ease, we created a 3-class predictor that reached an accuracy of 100% on the test set in the original run; the exact figure can vary between runs because the network weights are initialised randomly. This was a basic minimal example. The network can be expanded into Deep Learning networks, and the entire TensorFlow API is also available.

Enjoy and Happy Learning!

Leon

**Thanks again Leon, this was awesome!!!**