Stat. 674 Midterm

Author

<<< Put your name here >>>

Instructions:

The midterm investigates three time series datasets: simulated white noise, hh_budget, and aus_retail. The questions cover the ACF, decomposition methods, forecasting, and the use of training and test datasets to measure forecast accuracy.

library(pacman)
p_load(tidyverse, fpp3)

Question 1

Simulate a white noise time series with 250 data points. Plot the time series and its ACF. Is there a trend? Is there a seasonal pattern? Are there any meaningful, statistically significant correlations?

Use a seed of 1234.

Answer

<<< Write your answer here. >>>

Provide your code here.
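
As a starting point, here is a minimal sketch of one way to set up the simulation. The object name `wn` and the column names `t` and `y` are illustrative choices, not part of the assignment, and the sketch assumes Gaussian white noise.

```r
library(fpp3)

# Fix the random number generator so the simulation is reproducible
set.seed(1234)

# 250 independent standard normal draws, indexed 1..250
# (wn, t, and y are hypothetical names chosen for this sketch)
wn <- tsibble(t = 1:250, y = rnorm(250), index = t)

# Time plot of the simulated series
wn %>% autoplot(y)

# Sample ACF; the dashed bounds are approximately +/- 1.96 / sqrt(250)
wn %>% ACF(y) %>% autoplot()
```

With 250 observations the approximate 95% bounds sit near 0.124 in absolute value, which is the yardstick to use when judging whether an individual spike is meaningful.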

Question 2

Try the X11, SEATS, and STL decomposition methods on the Household Budget data, hh_budget, to estimate the trends in Wealth for the different countries in the dataset.

  1. Which methods work? If a method fails, why does it fail?
  2. Is there a seasonal component in these time series?

Answer

<<< Write your answer here. >>>

Provide your code here.

data(hh_budget)
head(hh_budget)
# A tsibble: 6 x 8 [1Y]
# Key:       Country [1]
  Country    Year  Debt    DI Expenditure Savings Wealth Unemployment
  <chr>     <dbl> <dbl> <dbl>       <dbl>   <dbl>  <dbl>        <dbl>
1 Australia  1995  95.7  3.72        3.40   5.24    315.         8.47
2 Australia  1996  99.5  3.98        2.97   6.47    315.         8.51
3 Australia  1997 108.   2.52        4.95   3.74    323.         8.36
4 Australia  1998 115.   4.02        5.73   1.29    339.         7.68
5 Australia  1999 121.   3.84        4.26   0.638   354.         6.87
6 Australia  2000 126.   3.77        3.18   1.99    350.         6.29

Hint: Plot the time series, set up the model, etc.
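
The hint can be sketched roughly as follows. This is only a starting skeleton showing the plotting step and one model specification; the object name `fit` and the model label `stl` are illustrative, and the X11 and SEATS specifications are left for you to add.

```r
library(fpp3)

# Time plot of Wealth, one line per country
hh_budget %>% autoplot(Wealth)

# Check the observation interval; the tsibble header above shows [1Y]
interval(hh_budget)

# Set up an STL decomposition of Wealth for each country
# (fit and stl are hypothetical names chosen for this sketch)
fit <- hh_budget %>% model(stl = STL(Wealth))

# Print the table of fitted models and check for warnings or null models
fit
```

By default `model()` reports a failed fit as a warning and a null model rather than stopping, so reading the printed result and any warnings is a reasonable way to see which methods work.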

Question 3

Try different forecasting methods to forecast Turnover in the liquor industry in New South Wales, Australia, 12 steps into the future, using the aus_retail dataset.

Try the following: MEAN, RW, TSLM(Turnover ~ trend()), TSLM(Turnover ~ trend() + season()), NAIVE, SNAIVE

  1. Try all of the methods and determine the best method by visual inspection of the forecasts for one year.
  2. Now split the data into training and testing subsets. Use the data until Jan 2017 as the training data. Using the method you have selected, measure its error in forecasting the testing data, which is the 2018 data.

Hint: Read Section 5.8
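
The training and test split in part 2 might look something like the sketch below. The names `nsw_liquor`, `train`, `fit`, and `fc` are illustrative, SNAIVE appears only as a placeholder for whichever method you select, and the horizon of 23 months assumes the series ends in December 2018.

```r
library(fpp3)

# New South Wales liquor series, using the same filter as Step 1 below
nsw_liquor <- aus_retail %>%
  filter(State == "New South Wales", str_detect(Industry, "^L"))

# Training data: everything up to and including January 2017
train <- nsw_liquor %>% filter(Month <= yearmonth("2017 Jan"))

# Placeholder model; swap in the method chosen in part 1
fit <- train %>% model(snaive = SNAIVE(Turnover))

# Forecast from Feb 2017 through Dec 2018 (11 + 12 = 23 months)
fc <- fit %>% forecast(h = 23)

# Keep only the 2018 forecasts and compare them with the observed values
fc %>%
  filter(year(Month) == 2018) %>%
  accuracy(nsw_liquor)
```

Passing the full series to `accuracy()` lets it match each forecast month to its observed value, so the reported measures such as RMSE and MAE refer to the test period only.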

Answer

<<< Write your answer here. >>>

Provide your code here.

data(aus_retail)
head(aus_retail)
# A tsibble: 6 x 5 [1M]
# Key:       State, Industry [1]
  State                        Industry            `Series ID`    Month Turnover
  <chr>                        <chr>               <chr>          <mth>    <dbl>
1 Australian Capital Territory Cafes, restaurants… A3349849A   1982 Apr      4.4
2 Australian Capital Territory Cafes, restaurants… A3349849A   1982 May      3.4
3 Australian Capital Territory Cafes, restaurants… A3349849A   1982 Jun      3.6
4 Australian Capital Territory Cafes, restaurants… A3349849A   1982 Jul      4  
5 Australian Capital Territory Cafes, restaurants… A3349849A   1982 Aug      3.6
6 Australian Capital Territory Cafes, restaurants… A3349849A   1982 Sep      4.2
# Step 1 Tidy
aus_retail_sw <- aus_retail %>% filter(State == "New South Wales" & str_detect(Industry, "^L"))

# Step 2 Visualize


# Step 3 Specify


# Step 4 Evaluate


# Hint: fit %>% gg_tsresiduals()

# Step 5 Visualize
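
One possible way to fill in the specify and visualize steps for part 1 is sketched here. The model labels on the left of each `=` are illustrative names, and the sketch repeats Step 1 so that it runs on its own.

```r
library(fpp3)

# Step 1 repeated so the sketch is self-contained
aus_retail_sw <- aus_retail %>%
  filter(State == "New South Wales" & str_detect(Industry, "^L"))

# Fit all of the candidate methods in one mable
# (the labels mean, rw, trend, etc. are hypothetical names)
fit <- aus_retail_sw %>%
  model(
    mean = MEAN(Turnover),
    rw = RW(Turnover),
    trend = TSLM(Turnover ~ trend()),
    trend_season = TSLM(Turnover ~ trend() + season()),
    naive = NAIVE(Turnover),
    snaive = SNAIVE(Turnover)
  )

# Forecast 12 months ahead with every method
fc <- fit %>% forecast(h = 12)

# Overlay the point forecasts on the observed series;
# level = NULL hides the prediction intervals to reduce clutter
fc %>% autoplot(aus_retail_sw, level = NULL)
```

Restricting the plot to recent years, for example with `filter_index()`, may make the differences between the forecasts easier to see.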