2024-01-17
This course is about data science: using statistical learning, machine learning, and artificial intelligence to analyze data.
Our focus will be on the standard machine learning setup, in which the data are split into training, validation, and test sets.
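As a rough illustration of that setup, here is a minimal base R sketch that splits a data set into three parts. The built-in iris data and the 60/20/20 proportions are assumptions chosen only for the example, not something fixed by the course.

```r
# Illustrative only: the iris data and the 60/20/20 split are assumptions.
set.seed(1)                        # make the random split reproducible
n   <- nrow(iris)                  # iris has 150 rows
idx <- sample(n)                   # random permutation of the row indices

n_train <- round(0.6 * n)          # rows used to fit models
n_valid <- round(0.2 * n)          # rows used to compare or tune models

train <- iris[idx[1:n_train], ]
valid <- iris[idx[(n_train + 1):(n_train + n_valid)], ]
test  <- iris[idx[(n_train + n_valid + 1):n], ]   # held out for the final check
```

Each row lands in exactly one of the three sets, so the test set is never seen while fitting or tuning.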
For classification problems we will assess accuracy using confusion tables (also called confusion matrices).
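A confusion table cross-tabulates predicted classes against true classes, and accuracy is the share of cases on its diagonal. The sketch below uses base R only; the label vectors are made up for illustration.

```r
# Made-up true and predicted labels for a two-class problem (illustrative only).
lv        <- c("no", "yes")
truth     <- factor(c("yes", "yes", "no", "no", "yes", "no",  "no", "yes"), levels = lv)
predicted <- factor(c("yes", "no",  "no", "no", "yes", "yes", "no", "yes"), levels = lv)

# Rows are predicted classes, columns are true classes.
conf_tab <- table(Predicted = predicted, Truth = truth)
print(conf_tab)

# Accuracy: correctly classified cases (the diagonal) over all cases.
accuracy <- sum(diag(conf_tab)) / sum(conf_tab)
print(accuracy)
```

Here six of the eight made-up predictions match the truth, so the accuracy is 0.75. The off-diagonal cells show which kind of error was made.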
We will fit the models we learn about in R, using both traditional R packages and the modern tidyverse and tidymodels packages.
We will be using H2O.ai and H2O Driverless AI.
We will be using TensorFlow for R and Keras for R.
We may discuss Spark for R.
Classifiers and Regression/Prediction
Clustering
Dimension Reduction
There are many excellent references that will be useful for this class.
There are some references provided on the syllabus.
There will be many links given on the website.
My current favorite podcast about data science is DataFramed.