Energy Demand Prediction - Hourly

This notebook aims to predict the aggregate hourly electricity demand from a selected list of locations in the City of Helsinki using Facebook Prophet.

Table of contents

  1. Data Fetching
  2. Data Wrangling
  3. Exploratory Data Analysis
  4. Models: Prophet
    1. Loaded modules and preparing data
    2. Prophet: Description
    3. Default Prophet model
    4. Better Prophet model
      1. Fitting the model
      2. Cross-validation
    5. Model selection

Data Fetching

Data Wrangling

The first thing to do is to load the hourly data that was pulled using the scripts in the data fetching section.
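For illustration, the loading step might look like the following sketch, assuming the fetching scripts saved one CSV per location under data/hourly/ with a timestamp index column (the paths and layout are assumptions):

```python
from pathlib import Path

import pandas as pd

# Read each location's hourly CSV into its own dataframe, keyed by
# location name, with the timestamp column parsed as the index.
locs = {}
for path in sorted(Path("data/hourly").glob("*.csv")):
    locs[path.stem] = pd.read_csv(path, index_col=0, parse_dates=True)
```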

Next, we clean the data and restructure it into a more suitable format.

Next, we further clean the data. As there may be locations whose data are not at hourly resolution, we resample those dataframes so that all of them are at hourly resolution.
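One way to do this with pandas, assuming the readings represent average power so that sub-hourly values are averaged (hours with no data become NaN):

```python
# Resample every location onto an hourly grid; sub-hourly readings
# are averaged, and empty hours are filled with NaN.
for name, frame in locs.items():
    locs[name] = frame.resample("h").mean()
```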

Next, we combine all dataframes in locs into one big dataframe df, in which each column is a location.
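Assuming each per-location dataframe holds a single value column, the combination is a column-wise concatenation:

```python
# Stack the per-location series side by side, aligned on timestamps;
# each column of df is one location.
df = pd.concat(
    {name: frame.iloc[:, 0] for name, frame in locs.items()}, axis=1
)
```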

Now, we can already spot NaN entries in the dataframe; in other words, for those locations at those timestamps, no data was available. Although the threshold is somewhat arbitrary, we decided to keep only the columns (locations) with fewer than 100 missing entries.
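In pandas, this filter is a one-liner:

```python
# Keep only the locations with fewer than 100 missing hourly entries.
df = df.loc[:, df.isna().sum() < 100]
```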

However, our data might still have missing timestamps, so we insert rows whose indices are the missing timestamps.
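Reindexing onto a complete hourly range achieves this:

```python
# Insert rows for missing timestamps; the new rows are filled with NaN.
full_index = pd.date_range(df.index.min(), df.index.max(), freq="h")
df = df.reindex(full_index)
```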

As one solution to the missing-entries problem, we decided to interpolate the data linearly: for each column (location), the missing entries are filled with values that lie on a straight line between the neighboring known entries. There are other interpolation methods as well, but we chose this one. The interpolation should not affect the inference much, since each column has at most 100 missing entries while the dataframe has more than 25,000 rows.
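With pandas:

```python
# Fill the remaining gaps in each column by linear interpolation.
df = df.interpolate(method="linear")
```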

After the interpolation step, we sum over all columns (locations) as we only wish to predict the hourly "total" electricity demand.

As a final step in this section, we save the aggregated data.
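The final two steps, with a hypothetical output path:

```python
# Sum across locations to get the total hourly demand, then save it.
total = df.sum(axis=1).rename("y")
total.to_csv("data/hourly_total.csv")  # hypothetical output path
```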

Exploratory Data Analysis

We start by reading in the data.

First, we look at some descriptive statistics about the data.
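Reusing the hypothetical path from the wrangling step:

```python
import pandas as pd

# Load the aggregated hourly demand series.
series = pd.read_csv(
    "data/hourly_total.csv", index_col=0, parse_dates=True
)["y"]

# Count, mean, standard deviation, min, quartiles, and max.
series.describe()
```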

Note that this data is at the hourly resolution; therefore, the max value is much lower than the average (or even the minimum) value of the daily data. Next, we try to visualize the time series to gain some insights.

We can already identify some general patterns from the data above. For example, we notice that the demand peaks in winter and spring, then bottoms out during summer. We should zoom in on different parts of the series.
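Zooming in amounts to plotting a slice of the series; the dates below are illustrative:

```python
# Plot roughly a month around the turn of the year.
series["2017-12-15":"2018-01-15"].plot()
```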

It looks like there is low electricity demand during the holidays (winter and New Year). In addition, the demand is lower on weekends than on weekdays. Next, we investigate the summer months.

Compared to the plot above, it is clear that the demand in the summer is lower than the other seasons. There is a decreasing trend in demand in the first half of the year, and there is an increasing trend in the second half of the year. This situation should be the same for 2017:

Indeed it is. Next, we zoom in on a random week.

The plot above further emphasizes the weekly trend and also reveals the daily pattern: the demand tends to peak locally around noon (Finnish time). The trends and seasonalities above (except for the daily seasonality) are also explained in the Exploratory Data Analysis section for the daily data.

Since this is a time series, we should look at its autocorrelation (and partial autocorrelation) function plots to identify any patterns that could potentially help us build common models such as ARIMA.
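A sketch of these plots with statsmodels; the lag window is an assumption (three days of lags is enough to expose the 24-hour cycle):

```python
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Autocorrelation and partial autocorrelation up to 72 lags (3 days).
plot_acf(series, lags=72)
plot_pacf(series, lags=72)
```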

As identified above and further confirmed by these two plots, this time series has multiple, complex seasonality patterns. For this kind of time series, ARIMA (a common time series model) will not work very well.

Here, we also decompose the time series with the STL method. The period is $24\times 7$ hours because most of the plots above (and the previous EDA section on daily data) show a weekly cycle, and there are $24\times 7=168$ hours in a week. Of course, there are other seasonalities in this data, but this period should demonstrate the decomposition well.
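With statsmodels:

```python
from statsmodels.tsa.seasonal import STL

# STL decomposition with a weekly period of 24 * 7 = 168 hours.
stl_result = STL(series, period=24 * 7).fit()
stl_result.plot()
```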

Now, we are fairly confident that the hourly electricity demand time series consists of three main components: a (yearly) trend that reaches its bottom in July, a complex seasonality component (with multiple seasonality sub-components), and holiday effects.

Models: Prophet

Loaded modules and preparing data

First, we load in the hourly data created in the wrangling step.

Next, we rename the columns for use in the models below, which expect the data to have a ds column (the timestamps) and a y column (the values of the time series).
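Loading and renaming, again with the hypothetical path from the wrangling step:

```python
import pandas as pd

# Prophet expects a 'ds' column of timestamps and a 'y' column of values.
data = pd.read_csv("data/hourly_total.csv", parse_dates=[0])
data.columns = ["ds", "y"]
```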

We divide the data into three sets: train, validation, and test. The test set starts at the timestamp 2019-12-01. Of the remaining portion of the data, the first 80% is used for training and the last 20% for validation. However, we may skip generating the validation set when it is not needed.
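One way to carve out the three sets:

```python
# Everything from 2019-12-01 onward is the test set; the rest is split
# chronologically 80/20 into train and validation sets.
test = data[data["ds"] >= "2019-12-01"]
rest = data[data["ds"] < "2019-12-01"]
cutoff = int(len(rest) * 0.8)
train, valid = rest.iloc[:cutoff], rest.iloc[cutoff:]
```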

Prophet: Description

Prophet is an open-source project from Facebook that aims to model time series data and produce reliable forecasts. Essentially, a Prophet model is an additive model with several components: a seasonal component with multiple levels (e.g., yearly, weekly, daily, and user-defined seasonalities), a trend component that can be highly non-linear, and holiday effects. Prophet fits models with Stan, a probabilistic programming framework, and estimates the parameters by maximum a posteriori (MAP) estimation by default.

The model can typically be expressed with the following formula:

$$ \begin{aligned} y_t=g(t)+s(t)+h(t) \end{aligned} $$

where:

  * $g(t)$ is the trend component,
  * $s(t)$ is the (multi-level) seasonal component,
  * $h(t)$ represents the holiday effects.

For more information, please see the original article (Taylor & Letham, 2018).

Default Prophet model

First, we try the default Prophet model; that is, we do not specify any further options and use the default arguments for its fit method. For an hourly time series, a default Prophet model includes a yearly, a weekly, and a daily seasonality. The growth is linear, and the components are additive.

After fitting model_default on the train set, we use it to make forecasts on the validation set.
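A minimal sketch of these two steps (the variable names are assumptions):

```python
from prophet import Prophet

# Default Prophet: linear growth, additive components, and automatic
# yearly, weekly, and daily seasonalities.
model_default = Prophet()
model_default.fit(train)

# Forecast over the validation timestamps.
forecast = model_default.predict(valid[["ds"]])
```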

The plot below shows the fitted values (blue curve) against the true data (black dots). The blue bands are 95% uncertainty intervals.

As we can see, the fit is not satisfactory: the model underestimates and overestimates the data at regular intervals (the blue line is consistently lower or higher than the black "curve" of dots), even though the 95% intervals cover the black dots.

We should zoom in on a time period to investigate the fits.

Additionally, we investigate the autocorrelation plot of the residuals, i.e., true value minus predicted value.
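Continuing the sketch above:

```python
from statsmodels.graphics.tsaplots import plot_acf

# Residuals on the validation set: true value minus predicted value.
residuals = valid["y"].to_numpy() - forecast["yhat"].to_numpy()
plot_acf(residuals, lags=72)
```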

In a satisfactory model (such as a well-specified ARIMA), the residual autocorrelations should be very low: they should lie within the blue 95% confidence band. In the plot above, however, the autocorrelations are not only high but also lie well outside the band. Since we did not include any autoregressive terms, this poor fit is to be expected.

From all plots above, we can conclude that the model regularly underestimates as well as overestimates the data, and that the model is not satisfactory.

Next, we plot some components of the models: the trend and the seasonalities.
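Prophet provides this directly:

```python
# Plot the fitted trend and the yearly, weekly, and daily seasonalities.
model_default.plot_components(forecast)
```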

Remarks: the default model correctly identifies that, within a day, the electricity demand typically reaches its peak around noon (the timestamps are in UTC, but the data originates from the UTC+2/UTC+3 time zone). It also correctly identifies that, within a year, the demand typically reaches its lowest point in the summer and is very high in the winter. In addition, within a week, the model correctly identifies that the demand is lower on weekends and higher on weekdays.

Better Prophet model

Fitting the model

After a careful look at the data, we observe that the within-day variation in electricity demand seems to be smaller in the summer than in the winter, which suggests that this variation has a multiplicative effect. We also notice that the demand is always lower on weekends than on weekdays. Finally, within each year, there is a decreasing trend in the first half and an increasing trend in the second half.

In light of the observations above, we add custom seasonalities: one for weekends and one for weekdays within each season (winter, spring, summer, autumn); see the sketch after the formula below. We also transform the data by taking the logarithm of the values, because we want the components to have a multiplicative effect on each other, and the log of a product is a sum:

$$ \begin{aligned} y_t&=g(t)\times s(t)\times h(t)\\ \Rightarrow \log{y_t}&=\log{g(t)}+\log{s(t)}+\log{h(t)} \end{aligned} $$
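Below is a sketch of such a model. The season boundaries, the Fourier order, and the add_flags helper are illustrative assumptions, not the exact configuration:

```python
import numpy as np
import pandas as pd
from prophet import Prophet

SEASONS = {12: "winter", 1: "winter", 2: "winter",
           3: "spring", 4: "spring", 5: "spring",
           6: "summer", 7: "summer", 8: "summer",
           9: "autumn", 10: "autumn", 11: "autumn"}

def add_flags(frame):
    # Hypothetical helper: one boolean condition column per
    # (season, weekday/weekend) combination.
    frame = frame.copy()
    for season in ["winter", "spring", "summer", "autumn"]:
        for part, weekend in [("weekday", False), ("weekend", True)]:
            frame[f"{season}_{part}"] = frame["ds"].apply(
                lambda ts, s=season, w=weekend:
                    SEASONS[ts.month] == s and (ts.dayofweek >= 5) == w
            ).astype(bool)
    return frame

# Fit on the log scale so that additive components multiply on the
# original scale.
train_log = add_flags(train.assign(y=np.log(train["y"])))

model_better = Prophet(daily_seasonality=False, weekly_seasonality=False)
for flag in [c for c in train_log.columns if c not in ("ds", "y")]:
    # A conditional daily seasonality, active only where the flag holds.
    model_better.add_seasonality(
        name=flag, period=1, fourier_order=4, condition_name=flag
    )
model_better.fit(train_log)

# Future dataframes need the same flags; forecasts are mapped back to
# the original scale with np.exp.
pred = np.exp(model_better.predict(add_flags(valid[["ds"]]))["yhat"])
```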

Although the model consistently overpredicts in the summer, the forecasts are quite satisfactory in the other seasons. Judging from the last plot above, the residual autocorrelations are much lower than those of the previous model.

Cross-validation

Cross-validation on time series data differs from cross-validation on ordinary, non-time-series data because the time structure must not be broken by the random partitioning used in standard cross-validation. This is why Prophet has its own cross-validation utilities.

The cross-validation procedure is as follows. First, the model is trained on the first 8766 hours of the data (the initial window) and forecasts the following 720 hours (the horizon). The training window is then extended by 1440 hours (the period), and the model again forecasts the following 720 hours. This procedure repeats, extending the training data in 1440-hour increments and forecasting the next 720 hours each time, until there is not enough data left.
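With Prophet's diagnostics module, this procedure reads:

```python
from prophet.diagnostics import cross_validation

# Rolling-origin cross-validation: an initial 8766-hour training
# window, 1440-hour increments between cutoffs, 720-hour forecasts.
df_cv = cross_validation(
    model_better,
    initial="8766 hours",
    period="1440 hours",
    horizon="720 hours",
)
```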

We focus on the mean absolute percentage error (MAPE). The absolute percentage error is the absolute difference between the true and predicted values, divided by the true value; MAPE is the mean of these errors:

$$ \begin{aligned} \text{MAPE}=\frac{1}{n}\sum_{t=1}^n\left|\frac{y_t-\hat{y}_t}{y_t}\right| \end{aligned} $$

This error is expressed as a percentage and is thus easy to interpret. The lower this error, the better the prediction.
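Prophet computes MAPE (among other metrics) from the cross-validation results:

```python
from prophet.diagnostics import performance_metrics

# MAPE, RMSE, and other metrics per forecast-horizon bucket.
df_metrics = performance_metrics(df_cv)
df_metrics[["horizon", "rmse", "mape"]].head()
```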

Model selection

Finally, we evaluate the better model's performance on the test set after retraining it on the combination of the train set and the validation set.
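A sketch of this step, reusing the hypothetical add_flags helper from the better-model sketch above:

```python
import numpy as np
import pandas as pd
from prophet import Prophet

# Retrain on train + validation, on the log scale, with the flags.
full = pd.concat([train, valid], ignore_index=True)
full_log = add_flags(full.assign(y=np.log(full["y"])))

model_final = Prophet(daily_seasonality=False, weekly_seasonality=False)
for flag in [c for c in full_log.columns if c not in ("ds", "y")]:
    model_final.add_seasonality(
        name=flag, period=1, fourier_order=4, condition_name=flag
    )
model_final.fit(full_log)

# Forecast the test period and map back to the original scale.
pred_test = np.exp(model_final.predict(add_flags(test[["ds"]]))["yhat"])
```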

Note that for Prophet models, the wider the blue bands (relative to the predictions), the more likely the model is overfit. Here, we see that this is not the case. Now, we evaluate the model on the test set:

Visually, the model seems to fit well, although it overpredicts during the Christmas holidays. Now we need to check the error of the model on the test set.
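Continuing the sketch:

```python
# Mean absolute percentage error of the final model on the test set.
mape = np.mean(
    np.abs(test["y"].to_numpy() - pred_test.to_numpy()) / test["y"].to_numpy()
)
print(f"Test MAPE: {mape:.2%}")
```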

We could also repeat the same procedure for the default model above.

The MAPE of the better model is half that of the default model. The RMSE is also much lower. We successfully improved on the default model.

The purpose of the validation set was to help tune the hyperparameters of the Prophet models. This is not of great importance to this project, but we could try doing it in the future.