Today we are going to learn about the automated exploration of the variables in a data set. This type of data analysis is usually called Automated Exploratory Data Analysis (AutoEDA).
AutoEDA gives a quick view of the high-level details of a data set. When you are presented with a new data set, running a few AutoEDA algorithms is a fast way to answer questions such as:
- How many rows of data are in the data set?
- Which variables are numeric? Which are categorical and need to be coded as factors in R?
- How many missing observations are there in the data set as a whole, and in each variable? Is imputation possible?
- What are the distributions of each variable? Are there outliers?
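
Before reaching for a dedicated AutoEDA package, it can help to see how these questions map onto plain base R. The following is a minimal sketch, not a full AutoEDA workflow. It assumes the built-in `airquality` data set as a stand-in for your own data, and the choice to treat `Month` as a categorical variable is an illustrative assumption.

```r
# Minimal base-R sketch of the checks listed above.
# Assumption: `airquality` (shipped with R) stands in for your own data frame.
df <- airquality

# Month is stored as an integer but behaves like a category,
# so we code it as a factor (an illustrative choice).
df$Month <- factor(df$Month)

# How many rows (and columns) are in the data set?
n_rows <- nrow(df)
n_cols <- ncol(df)

# Which variables are numeric, and which are not (factor or character)?
is_num <- vapply(df, is.numeric, logical(1))
numeric_vars <- names(df)[is_num]
categorical_vars <- names(df)[!is_num]

# How many missing observations are there overall and in each variable?
total_missing <- sum(is.na(df))
missing_by_var <- colSums(is.na(df))

# What do the distributions look like? summary() gives quartiles and means
# for numeric columns, level counts for factors, and NA counts.
print(summary(df))

# A simple outlier flag for one variable: values beyond 1.5 * IQR
# from the hinges, as used by boxplot().
ozone_outliers <- boxplot.stats(df$Ozone)$out
```

Inspecting `n_rows`, `categorical_vars`, `missing_by_var`, and `ozone_outliers` answers the first pass of each question. Whether imputation is reasonable still depends on why the values are missing, which no summary can decide for you.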