Python: Empirical Mode Decomposition

This post shows how to use Empirical Mode Decomposition (EMD), which decomposes a non-stationary univariate time series into several components of different time scales. These components are called intrinsic mode functions (IMFs).


Empirical Mode Decomposition



Empirical Mode Decomposition (EMD) is a signal processing technique that decomposes a non-stationary signal into a finite number of Intrinsic Mode Functions (IMFs). Every IMF contains the highest frequency of the signal in the previous iteration, thus enabling high-frequency noise rejection recursively.

It is assumed that a time series \(y_t\) is composed of a pure oscillation component \(c_t\) of proper rotation, with mean zero and a residual term (the trend or the baseline signal) \(r_t\) from which the oscillation component is removed.
\[\begin{align} y_t = c_t + r_t \end{align}\]
The EMD expresses the original time series \(y_t\) as the sum of the IMFs and the \(r_t\),
\[\begin{align} y_t = \sum_{i=1}^{N} IMF(i)_t + r_t \end{align}\]
IMF is defined as the function that satisfies the following two requirements:
  1. The number of extrema and the number of zero-crossings in the dataset must either be equal or differ at most by one
  2. The mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero.

There are several packages for the EMD, and among them, to use the PyEMD package (https://pypi.org/project/EMD-signal/), install it in the following way

pip install EMD-signal
 



Simple Usage of PyEMD package


The following code uses PyEMD library to extract the IMFs and a residual series by EMD.

#!pip install EMD-signal
from PyEMD import EMD, Visualisation 
 
import yfinance as yf
import matplotlib.pyplot as plt
plt.rcParams.update({'font.size'14})
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
 
#------------------------------------
# monthly stock data: AAPL
#------------------------------------
yts = yf.download('AAPL','2011-01-01','2022-12-31'
                  interval='1mo')['Adj Close']
date = yts.index.to_numpy()
y    = yts.to_numpy()
 
#------------------------------------
# Run EMD & Extract imfs and residue
#------------------------------------
emd = EMD()
emd.emd(y)
imfs, res = emd.get_imfs_and_residue()
 
#------------------------------------
# Plots of IMFs
#------------------------------------
vis = Visualisation()
vis.plot_imfs(imfs=imfs, residue=res, t=date, 
              include_residue=True)
vis.show()
 
cs



The following figure shows a set of IMFs and a residual.

Python: Empirical Mode Decomposition


By default, the number of IMFs is not known a priori and must be determined from the signal itself, but the emd() function has an optional argument as max_imf (default: -1), which is the number of IMFs to decompose. A negative value means all.

In general, it is recommended that the length of the signal should be at least four times the number of IMFs that are desired. Therefore, if a signal has N data points and M IMFs are desired, then the minimum number of data required for EMD is 4M. For example, if you want to decompose a signal into 5 IMFs using EMD, then the minimum number of data points required is 20. As always, there are exceptions \(\blacksquare\)


No comments:

Post a Comment