Models2

Author

Prof. Eric A. Suess

Published

February 17, 2025

Model Google’s stock price.

library(pacman)
p_load(tidyverse, fpp3)

Re-index based on trading days

google_stock <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) >= 2015) |>
  mutate(day = row_number()) |>
  update_tsibble(index = day, regular = TRUE)

google_stock

Filter the year of interest

google_2015 <- google_stock |> filter(year(Date) == 2015)
google_2015 

Fit the models

google_fit <- google_2015 |>
  model(
    Mean = MEAN(Close),
    `Naïve` = NAIVE(Close),
    Drift = NAIVE(Close ~ drift())
  )
google_fit 
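
As a rough illustration of what the three benchmark methods compute, the sketch below reproduces their point forecasts by hand in base R. The series `y` and the horizon `h` are made-up values chosen for the example, not the GOOG data, and the code does not call fable itself.

```r
# Toy series standing in for closing prices (made-up values, not GOOG data)
y <- c(10, 12, 11, 13, 15)
h <- 3            # forecast horizon (assumed for illustration)
n <- length(y)

# Mean method: every forecast is the average of the observed values
mean_fc <- rep(mean(y), h)

# Naive method: every forecast is the last observed value
naive_fc <- rep(y[n], h)

# Drift method: extend the straight line joining the first and last observations
drift_fc <- y[n] + seq_len(h) * (y[n] - y[1]) / (n - 1)

mean_fc   # 12.2 12.2 12.2
naive_fc  # 15 15 15
drift_fc  # 16.25 17.50 18.75
```

The drift forecasts grow by the average historical change per period, which is why they form a sloped line in the forecast plot while the other two are flat.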

Produce forecasts for the trading days in January 2016

google_jan_2016 <- google_stock |>
  filter(yearmonth(Date) == yearmonth("2016 Jan"))

google_fc <- google_fit |> 
  forecast(new_data = google_jan_2016)
google_fc

Plot the forecasts

google_fc |>
  autoplot(google_2015, level = NULL) +
  autolayer(google_jan_2016, Close, color = "black") +
  labs(x = "Day", y = "Closing Price (US$)",
       title = "Google stock prices (Jan 2015 - Jan 2016)") +
  guides(colour = guide_legend(title = "Forecast"))

augment(google_fit)

Residual diagnostics

A good forecasting method will yield innovation residuals with the following properties:

  1. The innovation residuals are uncorrelated. If there are correlations between innovation residuals, then there is information left in the residuals which should be used in computing forecasts.
  2. The innovation residuals have zero mean. If they have a mean other than zero, then the forecasts are biased.
  3. The innovation residuals have constant variance.
  4. The innovation residuals are normally distributed.

autoplot(google_2015, Close) +
  labs(x = "Day", y = "Closing Price (US$)",
       title = "Google Stock in 2015")

Fit the naïve model (each forecast is simply the last observed value) and augment the dataset with the fitted values, residuals, and innovation residuals.

aug <- google_2015 |>
  model(NAIVE(Close)) |>
  augment()
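
For the naïve method the residuals are just the day-to-day changes in the series, so the first observation has no residual. A minimal base R sketch of that relationship, using made-up values rather than the GOOG data:

```r
# Toy series (made-up values, not GOOG data)
y <- c(10, 12, 11, 13, 15)

# Naive fitted value at time t is the observation at time t - 1;
# the first fitted value is missing because nothing precedes it
fitted_naive <- c(NA, head(y, -1))

# Residual = observed - fitted, i.e. the first differences with a leading NA
resid_naive <- y - fitted_naive

resid_naive                          # NA  2 -1  2  2
all.equal(resid_naive[-1], diff(y))  # TRUE
```

That single leading `NA` is the likely source of the "Removed 1 row" warnings in the plots of these residuals.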

autoplot(aug, .innov) +
  labs(x = "Day", y = "Residual",
       title = "Residuals from naïve method")
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).

Are the residuals normally distributed?

aug |>
  ggplot(aes(x = .innov)) +
  geom_histogram() +
  labs(title = "Histogram of residuals")
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Warning: Removed 1 row containing non-finite outside the scale range
(`stat_bin()`).

aug |>
  ACF(.innov) |>
  autoplot() +
  labs(title = "ACF of residuals")

google_2015 |>
  model(NAIVE(Close)) |>
  gg_tsresiduals()
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 1 row containing non-finite outside the scale range
(`stat_bin()`).

Portmanteau test

The name comes from a French word for a suitcase or coat rack that carries several items of clothing.

A portmanteau test checks whether at least one of the first few lagged autocorrelations is different from zero, considering them as a group rather than one at a time.

Box-Pierce

aug |> features(.innov, box_pierce, lag = 10)

Ljung-Box

aug |> features(.innov, ljung_box, lag = 10)
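
To show what these two statistics measure, here is a sketch that computes them by hand in base R and compares the results with `Box.test()` from the stats package. The residuals `e` are simulated white noise, not the GOOG residuals, and the variable names are chosen for the example.

```r
set.seed(1)
e <- rnorm(100)   # simulated white noise standing in for residuals
n <- length(e)
lag_max <- 10     # number of lags tested, matching lag = 10 above

# Sample autocorrelations r_1, ..., r_lag_max (drop lag 0, which is always 1)
r <- acf(e, lag.max = lag_max, plot = FALSE)$acf[-1]

# Box-Pierce: Q = n * sum of squared autocorrelations
bp <- n * sum(r^2)

# Ljung-Box: Q* = n (n + 2) * sum of r_k^2 / (n - k)
lb <- n * (n + 2) * sum(r^2 / (n - seq_len(lag_max)))

# Under the null of no autocorrelation, both are compared with a
# chi-squared distribution with lag_max degrees of freedom
p_bp <- pchisq(bp, df = lag_max, lower.tail = FALSE)
p_lb <- pchisq(lb, df = lag_max, lower.tail = FALSE)

# Cross-check against the built-in test
all.equal(bp, unname(Box.test(e, lag = lag_max, type = "Box-Pierce")$statistic))
all.equal(lb, unname(Box.test(e, lag = lag_max, type = "Ljung-Box")$statistic))
```

A large p-value means the residual autocorrelations are consistent with white noise; a small one suggests some autocorrelation remains in the residuals.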