Seasonal Plots

Author

Eric A. Suess

Published

January 27, 2025

Today we are going to look at several of the time series datasets (tsibbles) used in the fpp3 book that show seasonal patterns.

A seasonal pattern is one that is exhibited over and over again at regular intervals.
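As a small illustration of this definition, the following base R sketch builds a made-up quarterly series and averages it by quarter. The values in `pattern`, the noise level, and the start year are assumptions chosen only for illustration; they are not data from the book.

```r
# Hypothetical quarterly series: one fixed 4-quarter pattern repeated for 10 years
set.seed(1)
pattern <- c(10, 4, 6, 12)                      # assumed seasonal shape
y <- ts(rep(pattern, times = 10) + rnorm(40, sd = 0.5),
        start = c(2015, 1), frequency = 4)

# cycle() gives the quarter (1 to 4) of each observation, so averaging by
# quarter should land close to the pattern that was repeated
quarter_means <- tapply(y, cycle(y), mean)
round(quarter_means, 1)
```

Because the same shape repeats every four observations, the quarterly averages stay close to the values in `pattern`. That regular repetition is what the seasonal plots in these notes are designed to reveal.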

library(fpp3)
Registered S3 method overwritten by 'tsibble':
  method               from 
  as_tibble.grouped_df dplyr
── Attaching packages ──────────────────────────────────────────── fpp3 1.0.0 ──
✔ tibble      3.2.1     ✔ tsibble     1.1.5
✔ dplyr       1.1.4     ✔ tsibbledata 0.4.1
✔ tidyr       1.3.1     ✔ feasts      0.3.2
✔ lubridate   1.9.3     ✔ fable       0.3.4
✔ ggplot2     3.5.1     ✔ fabletools  0.4.2
── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
✖ lubridate::date()    masks base::date()
✖ dplyr::filter()      masks stats::filter()
✖ tsibble::intersect() masks base::intersect()
✖ tsibble::interval()  masks lubridate::interval()
✖ dplyr::lag()         masks stats::lag()
✖ tsibble::setdiff()   masks base::setdiff()
✖ tsibble::union()     masks base::union()

In the book, the author uses the following code to create the a10 tsibble.

Most people consider the code here to be “sort of bad” code. Why?

Answer: In R, assignment to a new object is usually done at the start of a pipeline, not at the end. An assignment at the end is easy to miss.

PBS |>
  filter(ATC2 == "A10") |>
  select(Month, Concession, Type, Cost) |>
  summarise(TotalC = sum(Cost)) |>
  mutate(Cost = TotalC / 1e6) -> a10
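Both assignment directions produce the same object, so the difference is purely one of readability. Here is a minimal base R sketch, using a made-up vector `x` and hypothetical object names:

```r
x <- c(3, 1, 2)

# Right assignment: the name of the result only appears at the very end
x |> sort() |> rev() -> right_assigned

# Left assignment: the name of the result is the first thing the reader sees
left_assigned <- x |> sort() |> rev()

# Same result either way
identical(right_assigned, left_assigned)
```

With a long pipeline, the left-assigned version makes it easier to scan a script for the place where an object is created.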

Beware of this in the book. It may be hard to find the code you are looking for because of it. Better practice:

a10 <- PBS |>
  filter(ATC2 == "A10") |>
  select(Month, Concession, Type, Cost) |>
  summarise(TotalC = sum(Cost)) |>
  mutate(Cost = TotalC / 1e6)    # 1e6 = 1000000
a10
a10 |> autoplot()
Plot variable not specified, automatically selected `.vars = TotalC`

write.csv(a10, "a10.csv")
a10 |> gg_season()  # no variable given, so the first measured variable (TotalC) is plotted
Plot variable not specified, automatically selected `y = TotalC`

a10 |>
  gg_season(Cost, labels = "both") +
  labs(y = "$ million",
       title = "Seasonal plot: antidiabetic drug sales")

a10 |> gg_subseries()  # no variable given, so the first measured variable (TotalC) is plotted
Plot variable not specified, automatically selected `y = TotalC`

a10 |>
  gg_subseries(Cost) +
  labs(y = "$ million",
       title = "Seasonal subseries plot: antidiabetic drug sales")

Notice that there is an upward trend in the data, so the year is an important predictor of TotalC.

Also notice the effect of the trend on the ACF: the autocorrelations at small lags are large and positive, and they decay slowly.

a10 |> gg_tsdisplay(TotalC)
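To see how a trend alone shapes the ACF, here is a rough base R sketch on simulated data. The slope, noise level, and series length are assumptions, and the series is not a10. It compares the autocorrelations of a trending series with those of the same series after a fitted straight line is removed.

```r
set.seed(42)
time_index <- 1:100
trending <- 0.5 * time_index + rnorm(100)        # assumed linear trend plus noise
detrended <- residuals(lm(trending ~ time_index)) # remove the fitted straight line

# Sample autocorrelations at lags 1 to 8 (the lag-0 value is dropped)
acf_trend <- acf(trending, lag.max = 8, plot = FALSE)$acf[-1]
acf_detr <- acf(detrended, lag.max = 8, plot = FALSE)$acf[-1]

round(rbind(trending = acf_trend, detrended = acf_detr), 2)
```

In this simulation the trending series has autocorrelations that start close to 1 and fall off slowly, while the detrended series has autocorrelations that hover around zero.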

Tourism

Filter the data to only look at trips where the purpose was a Holiday.

holidays <- tourism |>
  filter(Purpose == "Holiday") |>
  group_by(State) |>
  summarise(Trips = sum(Trips))

holidays
holidays |> autoplot(Trips) +
  labs(y = "thousands of trips",
       title = "Australian domestic holiday nights")

Compare two of the time series.

Victoria is very well-behaved quarterly data.

holidays |> filter(State == "Victoria") |> 
  gg_tsdisplay()
Plot variable not specified, automatically selected `y = Trips`

holidays |> filter(State == "Victoria") |> write.csv("victoria.csv")

holidays |> filter(State == "Victoria") |> ACF()
Response variable not specified, automatically selected `var = Trips`
holidays |> filter(State == "Victoria") |> ACF() |> autoplot()
Response variable not specified, automatically selected `var = Trips`

ACT is not so well behaved.

holidays |> filter(State == "ACT") |> 
  gg_tsdisplay()
Plot variable not specified, automatically selected `y = Trips`

Scatterplots

Now sum over all of the different Purposes within each State.

visitors <- tourism |>
  group_by(State) |>
  summarise(Trips = sum(Trips))

visitors 

Consider the cross-correlation between the different time series from different States.

visitors |>
  ggplot(aes(x = Quarter, y = Trips)) +
  geom_line() +
  facet_grid(vars(State), scales = "free_y") +
  labs(y = "Overnight trips each quarter (thousands)")

The ggpairs() function expects each time series to be in its own column. Note that pivot_wider() changes the tidy (long) tsibble into a wide, non-tidy one.

visitors |>
  pivot_wider(values_from=Trips, names_from=State) |>
  GGally::ggpairs(columns = 2:9)
Registered S3 method overwritten by 'GGally':
  method from   
  +.gg   ggplot2
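The long-to-wide step, and the correlations that a scatterplot matrix summarises, can be sketched in base R on made-up data. The region names `A`, `B`, and `C` and all of the values below are hypothetical. `unstack()` is used here only as a dependency-free stand-in for `pivot_wider()`.

```r
set.seed(7)
# Hypothetical long-format data: 20 quarters for each of three made-up regions
common <- rnorm(20)
long <- data.frame(
  Region = rep(c("A", "B", "C"), each = 20),
  Trips = c(common + rnorm(20, sd = 0.3),   # A and B share a common component
            common + rnorm(20, sd = 0.3),
            rnorm(20))                      # C is unrelated to A and B
)

# unstack() spreads the values into one column per region (long -> wide)
wide <- unstack(long, Trips ~ Region)

# Pairwise correlations between the columns of the wide data
round(cor(wide), 2)
```

In this sketch, A and B are strongly correlated because they were built from the same component, while C is not. A scatterplot matrix shows the same relationships graphically, one pair of columns per panel.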

Now consider the autocorrelation within a single time series.

Can you see the autocorrelation every 4 Quarters in the Victoria time series?

visitors |>
  pivot_wider(values_from=Trips, names_from=State) |> 
  select("Victoria") |> 
  gg_lag(geom = "point")
Plot variable not specified, automatically selected `y = Victoria`

visitors |>
  pivot_wider(values_from=Trips, names_from=State) |> 
  select("Victoria") |> 
  ACF() |> 
  autoplot()
Response variable not specified, automatically selected `var = Victoria`
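The idea behind a lag plot can be sketched in base R: pair each observation with the one k quarters earlier and see how strongly the pairs are related. The series below is simulated, so the pattern and noise level are assumptions, and `lag_cor()` is a hypothetical helper, not a function from fpp3. It computes a plain Pearson correlation of the lagged pairs, which is close to the ACF value at that lag but uses a slightly different estimator.

```r
set.seed(3)
pattern <- c(10, 4, 6, 12)                           # assumed quarterly shape
y <- rep(pattern, times = 10) + rnorm(40, sd = 0.5)  # 10 years of quarterly data

# Hypothetical helper: correlation between the series and itself k steps earlier
lag_cor <- function(y, k) {
  n <- length(y)
  cor(y[(k + 1):n], y[1:(n - k)])
}

# Lagged correlations for lags 1 to 8; lags 4 and 8 line up with the season
round(sapply(1:8, function(k) lag_cor(y, k)), 2)
```

For a strongly seasonal quarterly series like this one, the correlations at lags 4 and 8 are high, while the lag-2 correlation is negative because peaks are paired with troughs.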

Can you see that the seasonal pattern is not as clear in the ACT time series?

visitors |>
  pivot_wider(values_from=Trips, names_from=State) |> 
  select("ACT") |> 
  gg_lag(geom = "point")
Plot variable not specified, automatically selected `y = ACT`

visitors |>
  pivot_wider(values_from=Trips, names_from=State) |> 
  select("ACT") |> 
  ACF() |> 
  autoplot()
Response variable not specified, automatically selected `var = ACT`