library(pacman)
p_load(tidyverse, fpp3)
Stat. 674 Midterm
Instructions:
The midterm investigates three time series datasets: simulated white noise, hh_budget, and aus_retail. The questions ask about the ACF, decomposition methods, forecasting, and the use of training and test datasets to measure forecast accuracy.
Question 1
Simulate a white noise time series with 250 data points. Plot the time series and the ACF of the time series. Is there a trend? Is there a seasonal pattern? Are there any meaningful, statistically significant correlations?
Use a seed of 1234.
Answer
<<< Write your answer here. >>>
Provide your code here.
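A minimal sketch of the mechanics for this question, not a model answer. It assumes Gaussian white noise (the question does not name a distribution), and the names `step`, `wn`, `wn_ts`, `wn_acf`, and `bound` are illustrative only. The approximate 95% significance bounds for the ACF of white noise are ±1.96/sqrt(T), where T is the series length.

```r
library(fpp3)

set.seed(1234)  # seed requested in the question

# 250 independent standard-normal draws stored as a tsibble.
# Assumption: Gaussian noise; `step` is a plain integer time index.
wn_ts <- tsibble(step = 1:250, wn = rnorm(250), index = step)

# Time plot of the simulated series
wn_ts %>% autoplot(wn) + labs(title = "Simulated white noise")

# Sample autocorrelations and their plot
wn_acf <- wn_ts %>% ACF(wn)
wn_acf %>% autoplot()

# Approximate 95% bounds for white noise: +/- 1.96 / sqrt(T)
bound <- 1.96 / sqrt(nrow(wn_ts))

# Number of lags whose sample autocorrelation falls outside the bounds
sum(abs(wn_acf$acf) > bound)
```

With 95% bounds, roughly 1 in 20 lags can be expected to fall outside the bounds by chance alone, which is worth keeping in mind when judging whether a spike is meaningful.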
Question 2
Try the X11, SEATS, and STL decomposition methods on the household budget data, hh_budget, to estimate the trends in Wealth for the different countries in the dataset.
- Which methods work? If a method fails, why does it fail?
- Is there a seasonal component in these time series?
Answer
<<< Write your answer here. >>>
Provide your code here.
data(hh_budget)
head(hh_budget)
# A tsibble: 6 x 8 [1Y]
# Key: Country [1]
Country Year Debt DI Expenditure Savings Wealth Unemployment
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Australia 1995 95.7 3.72 3.40 5.24 315. 8.47
2 Australia 1996 99.5 3.98 2.97 6.47 315. 8.51
3 Australia 1997 108. 2.52 4.95 3.74 323. 8.36
4 Australia 1998 115. 4.02 5.73 1.29 339. 7.68
5 Australia 1999 121. 3.84 4.26 0.638 354. 6.87
6 Australia 2000 126. 3.77 3.18 1.99 350. 6.29
Hint: Plot the time series, set up the model, etc.
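The hint can be sketched as follows for the STL case only; X11 and SEATS are left for you to try. This is a starting point under stated assumptions, not a model answer: the model name `stl` and the object `dcmp` are illustrative, and the sketch assumes `STL()` accepts annual data with its default settings and returns a `trend` column.

```r
library(fpp3)

# Plot the Wealth series, one line per country
hh_budget %>% autoplot(Wealth)

# Fit an STL decomposition to each country's Wealth series.
# Assumption: default STL() settings are acceptable for annual data.
dcmp <- hh_budget %>%
  model(stl = STL(Wealth)) %>%
  components()

# Inspect which components were estimated
names(dcmp)

# Plot the estimated trend for each country
dcmp %>%
  as_tibble() %>%
  ggplot(aes(x = Year, y = trend, colour = Country)) +
  geom_line() +
  labs(y = "Estimated trend in Wealth")
```

Checking `names(dcmp)` shows directly whether a seasonal component was estimated for these series.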
Question 3
Try different forecasting methods to forecast Turnover in the Liquor industry in New South Wales, Australia, 12 steps into the future, using the aus_retail dataset.
Try the following: MEAN, RW, TSLM(Turnover ~ trend()), TSLM(Turnover ~ trend() + season()), NAIVE, SNAIVE
- Try all of the methods and determine a best method by visual inspection of forecasts for one year.
- Now split the data into training and testing subsets. Use the data until Jan 2017 as the training data. Using the method you have selected, measure its error for forecasting the testing data, which is the 2018 data.
Hint: Read Section 5.8
Answer
<<< Write your answer here. >>>
Provide your code here.
data(aus_retail)
head(aus_retail)
# A tsibble: 6 x 5 [1M]
# Key: State, Industry [1]
State Industry `Series ID` Month Turnover
<chr> <chr> <chr> <mth> <dbl>
1 Australian Capital Territory Cafes, restaurants… A3349849A 1982 Apr 4.4
2 Australian Capital Territory Cafes, restaurants… A3349849A 1982 May 3.4
3 Australian Capital Territory Cafes, restaurants… A3349849A 1982 Jun 3.6
4 Australian Capital Territory Cafes, restaurants… A3349849A 1982 Jul 4
5 Australian Capital Territory Cafes, restaurants… A3349849A 1982 Aug 3.6
6 Australian Capital Territory Cafes, restaurants… A3349849A 1982 Sep 4.2
# Step 1 Tidy
aus_retail_sw <- aus_retail %>% filter(State == "New South Wales" & str_detect(Industry, "^L"))
aus_retail_sw
# Step 2 Visualize
# Step 3 Specify
# Step 4 Evaluate
# Hint: fit %>% gg_tsresiduals()
# Step 5 Visualize
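The five steps above can be sketched end to end as shown here. This is one possible reading of the question, not a definitive solution: it assumes training data ends at Jan 2017, forecasts through Dec 2018 (23 months), and scores only the 2018 forecasts. The object names `liquor_nsw`, `train`, `fit`, `fc`, and `acc` are illustrative, and the series is selected by the industry name "Liquor retailing".

```r
library(fpp3)

# Step 1 Tidy: NSW liquor retailing turnover
liquor_nsw <- aus_retail %>%
  filter(State == "New South Wales", Industry == "Liquor retailing")

# Step 2 Visualize
liquor_nsw %>% autoplot(Turnover)

# Training data: everything up to and including Jan 2017 (assumed reading)
train <- liquor_nsw %>% filter(Month <= yearmonth("2017 Jan"))

# Step 3 Specify: the candidate methods listed in the question
fit <- train %>%
  model(
    mean       = MEAN(Turnover),
    rw         = RW(Turnover),
    naive      = NAIVE(Turnover),
    snaive     = SNAIVE(Turnover),
    lm_trend   = TSLM(Turnover ~ trend()),
    lm_trend_s = TSLM(Turnover ~ trend() + season())
  )

# Step 4 Evaluate: forecast Feb 2017 to Dec 2018, then score the 2018 months
fc <- fit %>% forecast(h = 23)
acc <- fc %>%
  filter(year(Month) == 2018) %>%
  accuracy(liquor_nsw)
acc %>% select(.model, RMSE, MAE, MAPE, MASE)

# Step 5 Visualize: forecasts against recent observations
fc %>% autoplot(liquor_nsw %>% filter(year(Month) >= 2010), level = NULL)
```

Passing the full series to `accuracy()` lets it match forecasts to observations by month and gives scaled measures such as MASE access to the training data. Residual checks with `gg_tsresiduals()` need a single model, for example `fit %>% select(snaive) %>% gg_tsresiduals()`.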