In this article, we’ll introduce the key concepts related to time series.

We’ll be using the same data set as in the previous article: Open Power System Data (OPSD) for Germany. The data can be downloaded here.

Start by importing the following packages:

### General import
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import preprocessing
import statsmodels.api as sm

### Time Series
from statsmodels.tsa.ar_model import AutoReg  # AR was deprecated in favor of AutoReg
from sklearn.metrics import mean_squared_error
from pandas.plotting import autocorrelation_plot
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.statespace.sarimax import SARIMAX

### LSTM Time Series
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
from sklearn.preprocessing import MinMaxScaler

Then, load the data:

df = pd.read_csv('opsd_germany_daily.csv', index_col=0)
df.head(10)

[Image: first 10 rows of the OPSD Germany daily data set]

Then, make sure to transform the dates into datetime format in pandas:

df.index = pd.to_datetime(df.index)

I. Key concepts and definitions

1. Auto-correlation

The auto-correlation is defined as the correlation of the series with itself over time, i.e. how much the value at time $t$ depends on the value at time $t-j$, for all $j$.

  • The auto-correlation of order 1 is: $\rho_1 = \dfrac{\text{Cov}(y_t, y_{t-1})}{\sqrt{V(y_t) V(y_{t-1})}}$
  • The auto-correlation of order $j$ is: $\rho_j = \dfrac{\text{Cov}(y_t, y_{t-j})}{\sqrt{V(y_t) V(y_{t-j})}}$
  • The auto-covariance of order 1 is: $\gamma_1 = \text{Cov}(y_t, y_{t-1})$
  • The auto-covariance of order $j$ is: $\gamma_j = \text{Cov}(y_t, y_{t-j})$

Empirically, the auto-correlation can be estimated by the sample auto-correlation:

$\hat{\rho}_j = \dfrac{\hat{\gamma}_j}{\hat{\gamma}_0}$

Where:

$\hat{\gamma}_j = \dfrac{1}{T} \sum_{t=j+1}^{T} (y_t - \bar{y})(y_{t-j} - \bar{y})$
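As a quick numerical check, pandas can compute the sample auto-correlation at a given lag directly (a minimal sketch on the Consumption column loaded above):

# Sample auto-correlation of daily consumption at a few lags
for lag in [1, 7, 180, 360]:
    print('lag %d: %.3f' % (lag, df['Consumption'].autocorr(lag=lag)))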

To plot the auto-correlation and the partial auto-correlation, we can use the statsmodels package:

fig, axes = plt.subplots(1, 2, figsize=(15,8))

fig = sm.graphics.tsa.plot_acf(df['Consumption'], lags=400, ax=axes[0])
fig = sm.graphics.tsa.plot_pacf(df['Consumption'], lags=400, ax=axes[1])

[Image: auto-correlation (left) and partial auto-correlation (right) of daily consumption, up to 400 lags]

We observe a clear seasonal pattern: the value of consumption at time $t$ is negatively correlated with the values 180 days before, and positively correlated with the values 360 days before.

2. Partial Auto-correlation

The partial auto-correlation function (PACF) gives the partial correlation of a stationary time series with its own lagged values, regressed on the values of the series at all shorter lags. In other words, it measures the correlation between $y_t$ and $y_{t-j}$ once the effect of the intermediate lags has been removed.
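To inspect the underlying values rather than the plots, statsmodels also exposes acf and pacf functions (a minimal sketch):

from statsmodels.tsa.stattools import acf, pacf

# First few auto-correlation and partial auto-correlation values
print(acf(df['Consumption'], nlags=5))
print(pacf(df['Consumption'], nlags=5))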

How can we correct auto-correlation? Take for example a regression whose error term is auto-correlated of order 1:

$y_t = \beta_0 + \beta_1 x_t + \epsilon_t$ with $\epsilon_t = \rho \epsilon_{t-1} + u_t$

Therefore, if you subtract the lagged equation from the first one, with a coefficient equal to the auto-correlation $\rho$:

$y_t - \rho y_{t-1} = \beta_0 (1 - \rho) + \beta_1 (x_t - \rho x_{t-1}) + u_t$ for $t \geq 2$

Therefore, if we want to make a regression without auto-correlation, we regress the quasi-differenced series $y_t - \rho y_{t-1}$ on $x_t - \rho x_{t-1}$.
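As an illustration, a minimal sketch of this quasi-differencing on the consumption series, assuming $\rho$ is estimated by the sample auto-correlation of order 1:

y = df['Consumption']
rho = y.autocorr(lag=1)  # estimate of the order-1 auto-correlation

# Quasi-differenced series: y_t - rho * y_{t-1}
y_tilde = (y - rho * y.shift(1)).dropna()
print(y_tilde.head())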

Why would we want to remove the auto-correlation?

  • to derive the OLS estimator of the parameters, for example
  • because there is a bias otherwise, since $\epsilon_t$ would depend on $\epsilon_{t-1}$

3. Stationarity

Stationarity of a time series is a desired property, reached when the joint distribution of $(y_t, \ldots, y_{t+k})$ does not depend on $t$. In other words, the future and the present should look quite similar. Stationary time series therefore have no underlying trend or seasonal effects.

[Image: example of a stationary vs. a non-stationary series]

What kind of events make a series non-stationary?

  • a trend, e.g. increasing sales over time
  • a seasonality, e.g. more sales during the summer than during the winter

We usually want our series to be stationary before applying any predictive model!

How can we test if a time series is stationary?

  • look at the plots (as above)
  • look at summary statistics and box plots as in the previous article. A simple trick is to cut the data set in two, look at the mean and variance of each split, and plot the distribution of values for both splits (see the sketch after this list)
  • perform statistical tests, using the (Augmented) Dickey-Fuller test
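Here is the split trick from the list above, sketched on the consumption series:

# Cut the series in two halves and compare mean and variance
split = len(df) // 2
first, second = df['Consumption'].iloc[:split], df['Consumption'].iloc[split:]

print('mean:     %.1f vs %.1f' % (first.mean(), second.mean()))
print('variance: %.1f vs %.1f' % (first.var(), second.var()))

If the two halves have clearly different means or variances, the series is unlikely to be stationary.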

Unit roots

Let’s cover the Dickey-Fuller test in more detail. To do so, we need to introduce the notion of unit root. A unit root is a stochastic trend in a time series, sometimes called a random walk with drift. If a series has a unit root, it shows a systematic pattern that makes it unpredictable.

Let’s consider an autoregressive process (we’ll dive deeper into this later):

$y_t = a_1 y_{t-1} + a_2 y_{t-2} + \cdots + a_p y_{t-p} + \epsilon_t$

We define the characteristic equation as:

$m^p - a_1 m^{p-1} - a_2 m^{p-2} - \cdots - a_p = 0$

If $m = 1$ is a root of this equation, then the process is said to have a unit root. Equivalently, the process is said to be integrated of order 1: $y_t \sim I(1)$.

In other words, there is a unit root if the previous values keep having a 1:1 impact on the current value. If we consider a simple autoregressive model AR(1): $y_t = a_1 y_{t-1} + \epsilon_t$, the process has a unit root when $a_1 = 1$.

If a process has a unit root, then it is non-stationary, i.e. the moments of the process depend on $t$.
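To see this, here is a small illustrative simulation (not from the data set above) comparing a stationary AR(1) with $a_1 = 0.5$ to a random walk with $a_1 = 1$:

np.random.seed(1)

# Simulate an AR(1) for two values of a_1
T = 1000
e = np.random.randn(T)
y_stat, y_rw = np.zeros(T), np.zeros(T)
for t in range(1, T):
    y_stat[t] = 0.5 * y_stat[t-1] + e[t]  # stationary: a_1 < 1
    y_rw[t] = 1.0 * y_rw[t-1] + e[t]      # unit root: a_1 = 1

plt.plot(y_stat, label='a_1 = 0.5 (stationary)')
plt.plot(y_rw, label='a_1 = 1 (unit root)')
plt.legend()
plt.show()

The stationary series keeps reverting to its mean, while the random walk wanders away from it.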

A process is said to be weakly dependent, also called integrated of order 0 ($I(0)$), if it is already stationary. By contrast, a process is integrated of order 1 ($I(1)$) if taking the first difference is enough to make the series stationary:

$\Delta y_t = y_t - y_{t-1} \sim I(0)$
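In pandas, taking the first difference of the series is a one-liner:

# First difference of the consumption series
diff = df['Consumption'].diff().dropna()
diff.plot(figsize=(15, 6))
plt.show()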

Dickey-Fuller Test

The Dickey-Fuller test is used to assess whether a unit root is present in an autoregressive process:

$H_0$: there is a unit root and the process is not stationary.

$H_1$: there is no unit root and the process is stationary.

For example, in an AR(1) model $y_t = a_1 y_{t-1} + \epsilon_t$, the hypotheses are:

$H_0: a_1 = 1$ vs. $H_1: a_1 < 1$

The hypothesis $a_1 > 1$ would mean an explosive process, and is therefore not considered. When $a_1 < 1$, the process is stationary.

In practice, we consider the following equation:

$\Delta y_t = \delta y_{t-1} + \epsilon_t$

We have $\delta = a_1 - 1$ and test $H_0: \delta = 0$ against $H_1: \delta < 0$.

Augmented Dickey-Fuller Test

The Augmented Dickey-Fuller Test (ADF) is an augmented version of the Dickey-Fuller test, in the sense that it can test for a more complex set of time series models. For example, consider an ADF on an AR(p) process:

$\Delta y_t = \alpha + \beta t + \delta y_{t-1} + \gamma_1 \Delta y_{t-1} + \cdots + \gamma_{p-1} \Delta y_{t-p+1} + \epsilon_t$

And the null hypothesis: $H_0: \delta = 0$.
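We can run this test on the consumption series with the adfuller function imported earlier. A p-value below 0.05, say, leads us to reject the null hypothesis of a unit root:

result = adfuller(df['Consumption'])
print('ADF statistic: %f' % result[0])
print('p-value: %f' % result[1])
for key, value in result[4].items():
    print('Critical value (%s): %.3f' % (key, value))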

4. Ergodicity

Ergodicity is the process by which we forget the initial conditions. It is reached when the auto-correlation of order $j$ tends to $0$ as $j$ tends to $\infty$.

According to the ergodic theorem, when a time series is strictly stationary and ergodic, and when $E[|y_t|] < \infty$, then:

$\dfrac{1}{T} \sum_{t=1}^{T} y_t \rightarrow E[y_t]$
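To illustrate (a simulated example, not from the data set above), the running sample mean of a stationary ergodic AR(1) process converges to its true expectation of 0:

np.random.seed(0)

# Simulate a stationary AR(1): y_t = 0.5 * y_{t-1} + e_t
T = 5000
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t-1] + np.random.randn()

# Running sample mean converges towards E[y_t] = 0
running_mean = np.cumsum(y) / np.arange(1, T + 1)
print(running_mean[-1])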

5. Exogeneity

Exogeneity describes the relation between the residuals and the explanatory variables. The exogeneity is said to be strict if:

$E[\epsilon_t \mid x_1, \ldots, x_T] = 0$ and $\text{Cov}(x_s, \epsilon_t) = 0$ for all $s$ and $t$.

The exogeneity is said to be contemporaneous when:

$\text{Cov}(x_t, \epsilon_t) = 0$

which is a weaker assumption, but is sufficient for the consistency of the estimator.

6. Long term effect

Let’s consider again a regression model, now with lagged explanatory variables:

$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \epsilon_t$

In that case, we can estimate the long term effect of $x$ on $y$ as the sum of the coefficients: $\beta_0 + \beta_1 + \beta_2$.

We can test the Granger causality using a Fisher test:

$H_0: \beta_1 = \beta_2 = 0$

Under this hypothesis, no past value of $x$ would help to predict $y$.
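statsmodels implements this kind of test via grangercausalitytests. A minimal sketch, testing whether past wind production helps to predict consumption (this column pairing is only an illustrative assumption):

from statsmodels.tsa.stattools import grangercausalitytests

# Fisher-type tests of whether the second column Granger-causes the first
data = df[['Consumption', 'Wind']].dropna()
grangercausalitytests(data, maxlag=3)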

Conclusion: I hope you found this article useful. Don’t hesitate to drop a comment if you have a question.

