This post shows how to use Empirical Mode Decomposition (EMD), which decomposes a non-stationary univariate time series into several components of different time scales. These components are called intrinsic mode functions (IMFs).

Empirical Mode Decomposition

Empirical Mode Decomposition (EMD) is a signal processing technique that decomposes a non-stationary signal into a finite number of Intrinsic Mode Functions (IMFs). Every IMF contains the highest frequency of the signal in the previous iteration, thus enabling high-frequency noise rejection recursively.

It is assumed that a time series \(y_t\) is composed of a pure oscillation component \(c_t\) of proper rotation, with mean zero and a residual term (the trend or the baseline signal) \(r_t\) from which the oscillation component is removed.
\[\begin{align} y_t = c_t + r_t \end{align}\]
The EMD expresses the original time series \(y_t\) as the sum of the IMFs and the \(r_t\),
\[\begin{align} y_t = \sum_{i=1}^{N} IMF(i)_t + r_t \end{align}\]
IMF is defined as the function that satisfies the following two requirements:

The number of extrema and the number of zero-crossings in the dataset must either be equal or differ at most by one
The mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero.

There are several packages for the EMD, and among them, to use the PyEMD package (https://pypi.org/project/EMD-signal/), install it in the following way

pip install EMD-signal
 

Simple Usage of PyEMD package

The following code uses PyEMD library to extract the IMFs and a residual series by EMD.

#!pip install EMD-signal
from PyEMD import EMD, Visualisation 
 
import yfinance as yf
import matplotlib.pyplot as plt
plt.rcParams.update({'font.size': 14})
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
 
#------------------------------------
# monthly stock data: AAPL
#------------------------------------
yts = yf.download('AAPL','2011-01-01','2022-12-31', 
                  interval='1mo')['Adj Close']
date = yts.index.to_numpy()
y    = yts.to_numpy()
 
#------------------------------------
# Run EMD & Extract imfs and residue
#------------------------------------
emd = EMD()
emd.emd(y)
imfs, res = emd.get_imfs_and_residue()
 
#------------------------------------
# Plots of IMFs
#------------------------------------
vis = Visualisation()
vis.plot_imfs(imfs=imfs, residue=res, t=date, 
              include_residue=True)
vis.show()
 
Colored by Color Scripter
cs

The following figure shows a set of IMFs and a residual.

By default, the number of IMFs is not known a priori and must be determined from the signal itself, but the emd() function has an optional argument as max_imf (default: -1), which is the number of IMFs to decompose. A negative value means all.

In general, it is recommended that the length of the signal should be at least four times the number of IMFs that are desired. Therefore, if a signal has N data points and M IMFs are desired, then the minimum number of data required for EMD is 4M. For example, if you want to decompose a signal into 5 IMFs using EMD, then the minimum number of data points required is 20. As always, there are exceptions \(\blacksquare\)

SHLee AI Financial Model

Pages

Python: Empirical Mode Decomposition

Empirical Mode Decomposition

Simple Usage of PyEMD package

No comments:

Post a Comment