---
title: "Features"
author: "Eric A. Suess"
date: "2/15/2021"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Time Series Features

Statistics computed on time series data are called time series features.

```{r load-packages}
library(pacman)
p_load(tidyverse, fpp3)
```

We will examine the time series in the `tourism` dataset.

```{r}
data(tourism)
tourism

# Quantiles of Trips, computed separately for each series
tourism %>%
  features(Trips, quantile)
```

Passing a named list of functions to `features()` computes several statistics at once.

```{r}
tourism %>%
  features(Trips, list(mean = mean, median = median, sd = sd, min = min, max = max))
```

Let's look at the first year of the data.

```{r}
tourism %>%
  filter(Quarter <= yearquarter("1998 Q4")) %>%
  features(Trips, quantile)
```

## ACF features

The autocorrelations of a series at each lag are also time series features.

```{r}
tourism %>%
  ACF(Trips)
```

```{r}
# Pick out a single series: holiday trips to Adelaide
adelaide_holiday <- tourism %>%
  filter(Region == "Adelaide", State == "South Australia", Purpose == "Holiday")

# Autocorrelations as a table
adelaide_holiday %>%
  ACF(Trips)

# Correlogram
adelaide_holiday %>%
  ACF(Trips) %>%
  autoplot()

# Time plot of the series itself
adelaide_holiday %>%
  autoplot(Trips)
```

## STL decomposition features

```{r}
tourism %>%
  features(Trips, feat_stl)
```

```{r}
adelaide_holiday %>%
  features(Trips, feat_stl)
```

## Use the features to identify a time series

We can then plot these features to identify which series are strongly trended and which are strongly seasonal.

```{r}
tourism %>%
  features(Trips, feat_stl) %>%
  ggplot(aes(x = trend_strength, y = seasonal_strength_year, col = Purpose)) +
  geom_point() +
  facet_wrap(vars(State))
```

Find the series with the maximum seasonal strength and plot it.

```{r}
tourism %>%
  features(Trips, feat_stl) %>%
  filter(seasonal_strength_year == max(seasonal_strength_year)) %>%
  left_join(tourism, by = c("State", "Region", "Purpose")) %>%
  ggplot(aes(x = Quarter, y = Trips)) +
  geom_line() +
  facet_grid(vars(State, Region, Purpose))
```

## Full feature set

```{r}
tourism_features <- tourism %>%
  features(Trips, feature_set(pkgs = "feasts"))
tourism_features
```

For the homework we will look at the PBS data.

```{r}
PBS %>%
  features(Cost, list(mean = mean, sd = sd))
```
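
As a reference for the `trend_strength` and `seasonal_strength_year` columns used in the plots above, here is a sketch of how the two strengths are usually defined from an STL decomposition. The notation is an assumption that is not in these notes: it follows the fpp3 textbook, with the series written as $y_t = T_t + S_t + R_t$ (trend, seasonal, and remainder components).

$$
F_T = \max\left(0,\; 1 - \frac{\operatorname{Var}(R_t)}{\operatorname{Var}(T_t + R_t)}\right),
\qquad
F_S = \max\left(0,\; 1 - \frac{\operatorname{Var}(R_t)}{\operatorname{Var}(S_t + R_t)}\right)
$$

Both values lie between 0 and 1. A value near 1 means the remainder varies little compared with the trend (or seasonal) component, so the series is strongly trended (or strongly seasonal). A value near 0 means that component is weak.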