---
title: "Getting Started"
author: "Prof. Eric A. Suess"
format:
  revealjs:
    self-contained: true
---

## Getting Started

In Chapters 2, 3, and 4 the *network architecture*, *compilation*, and *fitting* steps are discussed. The first examples are feedforward neural networks.

In the network architecture step, `layer_dense()` and the *activation function* are used. In the compilation step, the *loss*, the *optimizer*, and the *metric* are specified.

## Tensors

Tensors are multidimensional arrays:

- 1D
- 2D
- 3D
- 4D

## Examples of Tensors

- Vector data - **2D tensors** of shape (samples, features)
- Timeseries data or sequence data - **3D tensors** of shape (samples, timesteps, features)
- Images - **4D tensors** of shape (samples, height, width, channels) or (samples, channels, height, width)
- Video - **5D tensors** of shape (samples, frames, height, width, channels) or (samples, frames, channels, height, width)

## Array reshaping

R fills arrays **column by column** (column-major order), while TensorFlow reads data **row by row** (row-major order), so `array_reshape()` is used to prepare data for use with keras/tensorflow.

## A geometric interpretation of deep learning

Be sure to read this section. The description of what neural networks try to do is very interesting. See Figure 5.9.

## Gradients

Derivatives in calculus measure the rate of change of a function. In multidimensional space, the vector of partial derivatives is called the gradient. Gradients are used to minimize/maximize functions.

## Stochastic Gradient Descent

Five-step process:

1. Draw a random batch of training samples x and corresponding targets y.
2. Run the network on x to obtain predictions y_pred.
3. Compute the loss of the network on the batch, a measure of the mismatch between y_pred and y.
4. Compute the gradient of the loss with regard to the network's parameters (a backward pass).
5. Move the parameters a little in the opposite direction from the gradient, for example `W = W - (step * gradient)`, thus reducing the loss on the batch a bit.
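## SGD sketch in R

The five steps above can be sketched in plain base R, using mini-batch gradient descent on a simple linear model (all data and variable names here are made up for illustration; keras does this for you internally):

```r
# Toy data: y = X %*% w_true + noise
set.seed(42)
n <- 200; p <- 3
X <- matrix(rnorm(n * p), nrow = n)
w_true <- c(2, -1, 0.5)
y <- X %*% w_true + rnorm(n, sd = 0.1)

w <- rep(0, p)        # parameters to learn
step <- 0.1           # learning rate
batch_size <- 32

for (epoch in 1:100) {
  idx <- sample(n, batch_size)                   # 1. draw a random batch
  y_pred <- X[idx, ] %*% w                       # 2. run the model on x
  err <- y_pred - y[idx]                         # 3. loss is mean(err^2)
  grad <- 2 * t(X[idx, ]) %*% err / batch_size   # 4. gradient of the loss
  w <- w - step * grad                           # 5. W = W - (step * gradient)
}
```

After training, `w` is close to `w_true`; a real network repeats exactly this loop, with backpropagation supplying the gradient in step 4.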
## Backpropagation Algorithm

A neural network is composed of multiple tensor operations. The backpropagation algorithm uses the **Chain Rule** to compute the gradient of the loss with respect to each of the network's parameters.

## Examples in Chapter 4

Try to get the examples from Chapter 4 to run.

1. Binary classification, movie reviews: The data preparation uses the Bag-of-Words model. All words in the reviews form the dictionary, or Bag-of-Words, and each word is given a number value.
2. Multi-class classification, newswires.
3. Regression, predicting house prices.
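## Bag-of-Words encoding sketch

The Bag-of-Words preparation for the movie-review example can be sketched in base R: each review is a sequence of word-index numbers, and it is turned into a binary (multi-hot) vector as long as the dictionary. The toy reviews and the tiny `dimension = 10` dictionary below are illustrative only:

```r
# Turn a list of integer sequences into a binary matrix:
# one row per review, one column per dictionary word.
vectorize_sequences <- function(sequences, dimension = 10) {
  results <- matrix(0, nrow = length(sequences), ncol = dimension)
  for (i in seq_along(sequences))
    results[i, sequences[[i]]] <- 1   # put a 1 at every word index present
  results
}

reviews <- list(c(1, 3, 5), c(2, 3))  # two toy "reviews" of word indices
x <- vectorize_sequences(reviews)
```

The resulting matrix of 0s and 1s is the 2D tensor of shape (samples, features) that feeds the first `layer_dense()`.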