In this article, we’ll introduce the key concepts related to time series.

We’ll be using the same data set as in the previous article: Open Power System Data (OPSD) for Germany. The data can be downloaded here.

Start by importing the following packages:

```
### General import
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import preprocessing
import statsmodels.api as sm
### Time Series
from statsmodels.tsa.ar_model import AR
from sklearn.metrics import mean_squared_error
from pandas.plotting import autocorrelation_plot
from statsmodels.tsa.arima_model import ARIMA
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import adfuller
# from statsmodels.tsa.statespace.sarimax import SARIMAX
### LSTM Time Series
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
from sklearn.preprocessing import MinMaxScaler
```

Then, load the data:

```
df = pd.read_csv('opsd_germany_daily.csv', index_col=0)
df.head(10)
```

Then, make sure to convert the index to pandas `datetime` format:

```
df.index = pd.to_datetime(df.index)
```

# I. Key concepts and definitions

## 1. Auto-correlation

The auto-correlation is defined as the correlation of the series with itself over time, i.e. how much the value at time $t$ depends on the value at time $t-j$, for all lags $j$.

- The auto-correlation of order 1 is: $\rho_1 = \mathrm{Corr}(y_t, y_{t-1}) = \dfrac{\mathrm{Cov}(y_t, y_{t-1})}{\sqrt{\mathrm{Var}(y_t)\,\mathrm{Var}(y_{t-1})}}$
- The auto-correlation of order $j$ is: $\rho_j = \mathrm{Corr}(y_t, y_{t-j})$
- The auto-covariance of order 1 is: $\gamma_1 = \mathrm{Cov}(y_t, y_{t-1})$
- The auto-covariance of order $j$ is: $\gamma_j = \mathrm{Cov}(y_t, y_{t-j})$

Empirically, the auto-correlation can be estimated by the sample auto-correlation:

$$\hat{\rho}_j = \frac{\sum_{t=j+1}^{T} (y_t - \bar{y})(y_{t-j} - \bar{y})}{\sum_{t=1}^{T} (y_t - \bar{y})^2}$$

where $\bar{y} = \frac{1}{T} \sum_{t=1}^{T} y_t$ is the sample mean of the series.
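
As a quick illustration of the sample auto-correlation formula, here is a minimal sketch computing it by hand; a synthetic AR(1) series stands in for real data so the snippet is self-contained:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic AR(1) series with true lag-1 auto-correlation of 0.7
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.7 * y[t - 1] + rng.normal()

def sample_autocorr(y, j):
    """Sample auto-correlation of order j (j >= 1): cross products of
    deviations from the mean, divided by the total sum of squares."""
    y_bar = y.mean()
    num = np.sum((y[j:] - y_bar) * (y[:-j] - y_bar))
    den = np.sum((y - y_bar) ** 2)
    return num / den

print(round(sample_autocorr(y, 1), 2))  # close to 0.7
```

For a pandas series, `Series.autocorr(lag=j)` gives a similar estimate (with a slightly different normalization).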

To plot the auto-correlation and the partial auto-correlation, we can use the `statsmodels` package:

```
fig, axes = plt.subplots(1, 2, figsize=(15,8))
fig = sm.graphics.tsa.plot_acf(df['Consumption'], lags=400, ax=axes[0])
fig = sm.graphics.tsa.plot_pacf(df['Consumption'], lags=400, ax=axes[1])
```

We observe a clear trend. The value of consumption at time $t$ is negatively correlated with the values from 180 days earlier, and positively correlated with the values from 360 days earlier.

## 2. Partial Auto-correlation

The partial autocorrelation function (PACF) gives the partial correlation of a stationary time series with its own lagged values, regressed on the values of the time series at all shorter lags. It is a regression of the series against its past lags.

How can we correct auto-correlation? Take for example a model whose errors follow an AR(1) process:

$$y_t = \beta x_t + \epsilon_t, \qquad \epsilon_t = \rho \epsilon_{t-1} + u_t$$

Lagging the first equation by one period gives $y_{t-1} = \beta x_{t-1} + \epsilon_{t-1}$. Therefore, if you subtract the second from the first with a coefficient equal to the auto-correlation $\rho$:

$$y_t - \rho y_{t-1} = \beta (x_t - \rho x_{t-1}) + u_t \quad \text{for } t = 2, \ldots, T$$

Therefore, if we want to make a regression without auto-correlation, we regress the quasi-differenced series $\tilde{y}_t = y_t - \rho y_{t-1}$ on $\tilde{x}_t = x_t - \rho x_{t-1}$:

$$\tilde{y}_t = \beta \tilde{x}_t + u_t$$

Why would we want to remove the auto-correlation?

- to derive the OLS estimator of the parameters, for example
- because there is a bias otherwise, since $\epsilon_t$ would depend on $\epsilon_{t-1}$
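
The quasi-differencing step can be sketched as follows, assuming $\rho$ is known (in practice it would be estimated, e.g. with the Cochrane–Orcutt procedure); all data here is simulated for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta, rho = 1000, 2.0, 0.8

x = rng.normal(size=n)
# AR(1) errors: eps_t = rho * eps_{t-1} + u_t
u = rng.normal(size=n)
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = rho * eps[t - 1] + u[t]
y = beta * x + eps

# Quasi-differencing: y_t - rho * y_{t-1} = beta * (x_t - rho * x_{t-1}) + u_t
y_star = y[1:] - rho * y[:-1]
x_star = x[1:] - rho * x[:-1]

def lag1_corr(e):
    """Correlation between a series and its first lag."""
    return np.corrcoef(e[1:], e[:-1])[0, 1]

resid_raw = y - beta * x              # still auto-correlated
resid_star = y_star - beta * x_star   # close to white noise
print(round(lag1_corr(resid_raw), 2), round(lag1_corr(resid_star), 2))
```

The residuals of the raw model inherit the AR(1) correlation (about 0.8), while the residuals of the transformed model are essentially uncorrelated.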

## 3. Stationarity

**Stationarity** of a time series is a desired property, reached when the joint distribution of $(y_t, y_{t+1}, \ldots, y_{t+k})$ does not depend on $t$. In other words, the future and the present should look statistically similar. Stationary time series therefore have no underlying trend or seasonal effects.

What kind of events make a series non-stationary?

- a trend, e.g. increasing sales over time
- a seasonality, e.g. more sales during summertime than wintertime

We usually want our series to be stationary before applying any predictive model!

How can we test if a time series is stationary?

- look at the plots (as above)
- look at summary statistics and box plots as in the previous article. A simple trick is to cut the data set in 2, look at mean and variance for each split, and plot the distribution of values for both splits.
- perform statistical tests, using the (Augmented) Dickey-Fuller test
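
The split-in-two trick can be sketched as follows; a synthetic trending series stands in for the OPSD data so the snippet is self-contained:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# Synthetic daily series with an upward trend (hence non-stationary)
s = pd.Series(np.linspace(0, 10, 730) + rng.normal(size=730))

half = len(s) // 2
first, second = s.iloc[:half], s.iloc[half:]
print("means:", round(first.mean(), 2), round(second.mean(), 2))
print("vars :", round(first.var(), 2), round(second.var(), 2))
# A large shift in mean between the two halves hints at non-stationarity
```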

### Unit roots

Let’s cover the Dickey-Fuller test in more detail. To do so, we need to introduce the notion of *unit root*. A unit root is a stochastic trend in a time series, sometimes called a random walk with drift. If a series has a unit root, it is hard to predict, as shocks to the series have a permanent effect.

Let’s consider an autoregressive process of order $p$ (we’ll dive deeper into this later):

$$y_t = a_1 y_{t-1} + a_2 y_{t-2} + \cdots + a_p y_{t-p} + \epsilon_t$$

We define the characteristic equation as:

$$m^p - a_1 m^{p-1} - a_2 m^{p-2} - \cdots - a_p = 0$$

If $m = 1$ is a root of this equation, then the process is said to have a unit root. Equivalently, the process is said to be integrated of order 1: $y_t \sim I(1)$.

In other words, there is a unit root if the previous values keep having a 1:1 impact on the current value. If we consider a simple autoregressive model AR(1): $y_t = a_1 y_{t-1} + \epsilon_t$, the process has a unit root when $a_1 = 1$.

If a process has a unit root, then it is non-stationary, i.e. the moments of the process depend on $t$.

A process is a weakly dependent process, also called integrated of order 0 ($I(0)$), if it is stationary as is. If taking the first difference of the series is enough to make it stationary, the process is integrated of order 1; for the random walk $y_t = y_{t-1} + \epsilon_t$, for example:

$$\Delta y_t = y_t - y_{t-1} = \epsilon_t$$
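
A minimal sketch of first-differencing, using a simulated random walk rather than the OPSD data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
# Random walk y_t = y_{t-1} + eps_t, an I(1) process
y = pd.Series(rng.normal(size=1000).cumsum())

# First difference: Delta y_t = y_t - y_{t-1}
dy = y.diff().dropna()
# The differenced random walk is just the white-noise increments
print(round(dy.mean(), 2), round(dy.std(), 2))
```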

### Dickey-Fuller Test

The Dickey-Fuller test is used to assess whether a unit root is present in an autoregressive process:

- $H_0$: there is a unit root and the process is **not** stationary.
- $H_1$: there is no unit root and the process is stationary.

For example, in an AR(1) model where $y_t = a_1 y_{t-1} + \epsilon_t$, the hypotheses are:

$$H_0: a_1 = 1 \qquad H_1: a_1 < 1$$

The hypothesis $a_1 > 1$ would mean an explosive process, and is therefore not considered. When $a_1 = 1$, then $y_t \sim I(1)$.

In practice, we consider the following equation:

$$\Delta y_t = \delta y_{t-1} + \epsilon_t$$

We have $\delta = a_1 - 1$ and test $H_0: \delta = 0$ against $H_1: \delta < 0$.

### Augmented Dickey-Fuller Test

The Augmented Dickey-Fuller test (ADF) is an augmented version of the Dickey-Fuller test, in the sense that it can handle a more complex set of time series models. For example, consider an ADF test on an AR(p) process:

$$\Delta y_t = \alpha + \beta t + \delta y_{t-1} + \gamma_1 \Delta y_{t-1} + \cdots + \gamma_{p-1} \Delta y_{t-p+1} + \epsilon_t$$

And the null hypothesis: $H_0: \delta = 0$.

## 4. Ergodicity

**Ergodicity** is the property by which the process forgets its initial conditions. It is reached when the auto-correlation of order $j$ tends to $0$ as $j$ tends to $\infty$.

According to the ergodic theorem, when a time series is strictly stationary and ergodic, and when $E[\,|y_t|\,] < \infty$, then:

$$\frac{1}{T} \sum_{t=1}^{T} y_t \xrightarrow{p} E[y_t]$$
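
As a rough illustration of the ergodic theorem, the time average of one long path of a stationary AR(1) process gets close to its expectation (here $E[y_t] = 0$); the series is simulated for the sake of the example:

```python
import numpy as np

rng = np.random.default_rng(5)
# Stationary AR(1): y_t = 0.5 * y_{t-1} + eps_t, so E[y_t] = 0
y = np.zeros(20_000)
for t in range(1, len(y)):
    y[t] = 0.5 * y[t - 1] + rng.normal()

# The time average over a single long path approaches the expectation
time_average = y.mean()
print(round(time_average, 3))
```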

## 5. Exogeneity

**Exogeneity** describes the relation between the residuals and the explanatory variables. The exogeneity is said to be strict if:

$$E[\epsilon_t \mid x_1, \ldots, x_T] = 0 \quad \text{for all } t$$

i.e. the residual at time $t$ is uncorrelated with past, present, and future values of the explanatory variables.

The exogeneity is said to be contemporaneous when:

$$E[\epsilon_t \mid x_t] = 0$$

which is a weaker assumption, but is enough to satisfy the consistency hypothesis.

## 6. Long term effect

Let’s consider a model with lagged explanatory variables (a distributed-lag model):

$$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \cdots + \beta_q x_{t-q} + \epsilon_t$$

In that case, we can estimate the long-term effect as:

$$\beta_0 + \beta_1 + \cdots + \beta_q$$

We can test the Granger causality of $x$ on $y$ using a Fisher test of:

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_q = 0$$

Under this hypothesis, no past value of $x$ would help predict $y$.

Conclusion: I hope you found this article useful. Don’t hesitate to drop a comment if you have a question.
