Index Tracking
An ETF holds a small subset of the constituents of its benchmark (BM) index in order to mimic it. Because the ETF does not hold every constituent (that is, it does not fully replicate the index), tracking error (TE) arises. Furthermore, the optimal subset is not fixed but changes with market developments, so periodic rebalancing is required.
We will use the ROI optimization package; for detailed information, refer to the following post.
The number of constituents of the BM index is often so large that full replication is impractical because of transaction costs and liquidity problems. Index tracking therefore means finding the combination of a subset of securities that minimizes the tracking error. Its objective function is formulated as follows.
\[\begin{align} TE = \frac{1}{T} \sum_{t=1}^{T} \left( \sum_{i=1}^{N} w_i r_{it} - R_t^I \right)^2 \end{align}\] Here, \(R_t^I\) and \(r_{it}\) are the time-\(t\) returns of the BM index and of its \(i\)-th constituent, respectively, and \(w_i\) is the weight of the \(i\)-th constituent.
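As a small numerical illustration, the sketch below computes the tracking error as the mean squared difference between the portfolio return \(\sum_i w_i r_{it}\) and the index return \(R_t^I\). The data are simulated, and the helper name `tracking_error` is hypothetical rather than a function from any package.

```r
# Minimal sketch: tracking error of a weight vector on simulated data.
# `tracking_error` is a hypothetical helper, not a package function.
tracking_error <- function(R, R_index, w) {
  # R       : T x N matrix of constituent returns
  # R_index : length-T vector of index returns
  # w       : length-N vector of allocation weights
  mean((as.vector(R %*% w) - R_index)^2)
}

set.seed(1)
T_obs <- 250; N <- 5
R <- matrix(rnorm(T_obs * N, sd = 0.01), T_obs, N)

w_true  <- c(0.4, 0.3, 0.2, 0.1, 0.0)   # weights that define the toy index
R_index <- as.vector(R %*% w_true)

te_exact <- tracking_error(R, R_index, w_true)         # perfect replication
te_equal <- tracking_error(R, R_index, rep(1 / N, N))  # equal weights
```

Replicating the toy index with its own weights gives a tracking error of zero, while any other weight vector, such as equal weights, gives a strictly positive value.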
Using vector-matrix notation, the above problem is reformulated with its constraints as follows. \[\begin{align} &\min_{w} \frac{1}{T} || Rw - R^I ||_2^2 \\ \text{subject to}& \\ &e^T w = 1 \\ &\eta_i Z_i \leq w_i \leq \delta_i Z_i \\ &\sum_{i=1}^{N} Z_i = K \\ &Z_i \in \{0, 1\}, \quad i=1,2,...,N \end{align}\] Here, \(N\) is the number of constituents of the BM index and \(K\) is the number of constituents of the ETF. \(R^I=(R_1^I,R_2^I,\ldots,R_T^I )^T\) is a \(T \times 1\) vector of BM index returns, and \(R=(R_1,R_2,\ldots,R_N)\) is a \(T \times N\) matrix formed by horizontally concatenating the \(T \times 1\) vectors \(R_i=(r_{i1},r_{i2},\ldots,r_{iT} )^T\). \(w=(w_1,w_2,\ldots,w_N )^T\) is an \(N \times 1\) vector of allocation weights, \(e\) is an \(N \times 1\) vector of ones, and \(\eta_i\) and \(\delta_i\) are the lower and upper bounds on \(w_i\).
Looking at the constraints, the first is the so-called budget constraint, which means that all capital is invested in the ETF portfolio. The second sets the lower and upper bounds on the allocation weights of the selected securities. The third, together with the binary condition on \(Z_i\), is the cardinality constraint: each \(Z_i\) takes the value 0 or 1 and they sum to \(K\), so only \(K\) of the \(N\) securities are held.
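To make the three constraints concrete, the following sketch checks whether a given weight vector satisfies them. The helper `is_feasible`, the tolerance, and the example bounds are all assumptions chosen for illustration.

```r
# Hypothetical helper: checks the budget, bound and cardinality constraints.
is_feasible <- function(w, K, lower, upper, tol = 1e-8) {
  z <- as.integer(abs(w) > tol)          # Z_i = 1 if security i is held
  budget_ok <- abs(sum(w) - 1) <= tol    # e'w = 1
  bounds_ok <- all(w >= lower * z - tol & w <= upper * z + tol)
  card_ok   <- sum(z) == K               # exactly K securities are held
  budget_ok && bounds_ok && card_ok
}

w <- c(0.5, 0.3, 0.2, 0, 0)
ok_k3 <- is_feasible(w, K = 3, lower = 0.05, upper = 0.6)  # three holdings
ok_k2 <- is_feasible(w, K = 2, lower = 0.05, upper = 0.6)  # wrong cardinality
```

With these inputs the vector is feasible for \(K=3\) and infeasible for \(K=2\), because three securities carry a nonzero weight.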
This problem is difficult because the cardinality constraint makes it NP-hard; in other words, \(\sum_{i=1}^{N} Z_i = K\) turns it into a high-dimensional discrete problem. Solving it exactly as a mixed-integer program amounts, in the worst case, to searching over all combinations, and the number of combinations is too large for that to be practical. Because the solution has only a few nonzero weights, this problem is also called the sparse index tracking problem. More recently, Fengmin, Xu, and Xue (2015) suggested \(L_{1/2}\) regularization for this problem.
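The size of the search space can be checked directly with base R. The values of \(K\) below are arbitrary assumptions for illustration; only the universe sizes of 30 and 386 come from this post.

```r
# Number of K-security subsets of an N-security universe.
n_small <- choose(30, 10)    # 30,045,015 subsets for a 30-stock universe
n_full  <- choose(386, 30)   # far beyond what can be enumerated
```

Even the small case already has about 30 million candidate subsets, each of which would need its own weight optimization.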
In this post, we use the sparseIndexTracking R package for sparse index tracking and the ROI.plugin.ecos R package for index tracking without a cardinality constraint, and then compare the two results.
Second-order cone programming (SOCP)
For index tracking, we use ROI and ROI.plugin.ecos, which provide a solver for second-order cone programming (SOCP).
What is a SOCP and what is the relationship between SOCP and index tracking?
A second-order cone programming (SOCP) problem is a convex optimization problem in which a linear function is minimized over the intersection of an affine linear manifold with the Cartesian product of second-order cones.
The index tracking problem can be rewritten in SOCP form, which is the input format that ROI.plugin.ecos, like other SOCP solvers, expects. Therefore, we need to transform our tracking error minimization problem into a second-order cone program.
We present the original and transformed problems in turn. The concept of SOCP is easy to see in the context of index tracking, where we try to mimic the benchmark index by minimizing the tracking error.
The original TE problem is
\[\begin{align} &\min_{w} \sqrt{\sum_{t=1}^{T} \left( R_t^I - \sum_{i=1}^{N} w_i r_{it} \right)^2 } \\ \text{subject to}& \\ &e^T w = 1 \\ &w \ge 0 \\ \end{align}\]
Here, \(w = (w_1 , w_2 , ..., w_N)^T \), and the objective equals \(|| R^I - Rw ||_2\).
The transformed TE problem as SOCP is
\[\begin{align} &\min_{w, \, t} t \\ \text{subject to}& \\ &|| R^I - Rw ||_2 \le t \\ &e^T w = 1 \\ &w \ge 0 \\ \end{align}\]
Here, the decision vector is augmented to \((w_1 , w_2 , ..., w_N, t)^T \), where the scalar \(t\) is an auxiliary variable (not the time index used in the sum above).
It is worth noting that the decision vectors of the two problems differ: the second problem also includes \(t\) as a decision variable and treats the first problem's objective function as an additional constraint. At the optimum this constraint is binding, so the minimal \(t\) equals the minimal tracking error. For convenience, both problems omit \(\frac{1}{T}\), since it is a constant, and use a square root so that the objective is a Euclidean norm, as the second-order cone requires.
Although the definition of SOCP seems somewhat difficult, we can easily observe the characteristics of SOCP from the above two formulations. The bottom line is that the convex objective function can be transformed into a constraint and the original objective function is replaced by a linear function.
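To connect this with the solver input, the norm constraint can be written as membership in a second-order cone. The expression below is a sketch that assumes ROI's convention that a conic constraint built from a matrix \(L\), a cone \(\mathcal{K}\) and a right-hand side \(b\) requires \(b - Lx \in \mathcal{K}\); this convention should be checked against the ROI documentation. With \(x = (w^T, t)^T\),

\[\begin{align} \begin{pmatrix} 0 \\ R^I \end{pmatrix} - \begin{pmatrix} 0^T & -1 \\ R & 0 \end{pmatrix} \begin{pmatrix} w \\ t \end{pmatrix} = \begin{pmatrix} t \\ R^I - Rw \end{pmatrix} \in \mathcal{K}_{T+1}, \qquad \mathcal{K}_{T+1} = \left\{ (u_0, u) \in \mathbb{R} \times \mathbb{R}^{T} : || u ||_2 \le u_0 \right\} \end{align}\]

so membership in the cone of dimension \(T+1\) is exactly \(|| R^I - Rw ||_2 \le t\). The stacked matrix and right-hand side have one row for \(t\) followed by \(T\) rows for the return data.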
R package
Using ROI and ROI.plugin.ecos, we can minimize the tracking error. In this case, however, there is no cardinality constraint, so we need to select the subset of securities in advance and write the problem in SOCP form.
The sparseIndexTracking R package, by contrast, is easy to use since it takes the data y and X directly as arguments. It also handles the cardinality constraint approximately through the regularization parameter (\(\lambda\)): the higher the \(\lambda\), the more weights are shrunk toward zero.
R code
The following R code solves the two index tracking problems. We use the data embedded in the sparseIndexTracking R package. For expositional purposes, we first restrict the universe to 30 stocks, because it is difficult to show the results as a table or figure when all 386 stocks are used. After covering the main points, we also deal with the case of all 386 stocks.
```r
#==============================================================
# Financial Econometrics & Derivatives, ML/DL
# using R, Python, Keras, Tensorflow
# by Sang-Heon Lee
#
# https://shleeai.blogspot.com
#--------------------------------------------------------------
# Index Tracking Error Minimization
# using ROI.ecos and sparseIndexTracking
#==============================================================

graphics.off()  # clear all graphs
rm(list = ls()) # remove all objects from the workspace

library(sparseIndexTracking)
library(ROI)
library(ROI.plugin.ecos)

#------------------------------------------------
# Data
#------------------------------------------------

# load stock index data
data(INDEX_2010)

y = as.vector(INDEX_2010$SP500)
X = as.matrix(INDEX_2010$X)

# comment it when full data is used
X <- X[,1:30]

nobs = length(y); nX = ncol(X)

#------------------------------------------------
# 1) Using ROI and ROI.ecos
#------------------------------------------------

#--------------------------------------------
# the original form
#--------------------------------------------
# w  = c( w1,  w2,  w3)'
# Xn = c(Xn1, Xn2, Xn3)
#
# min sqrt( (y1 - X1'*w)^2 + (y2 - X2'*w)^2
#         + (y3 - X3'*w)^2 + (y4 - X4'*w)^2
#         + (y5 - X5'*w)^2
#         )
# s.t.
#     w1 + w2 + w3 = 1
#     w1, w2, w3 >= 0
#--------------------------------------------

#--------------------------------------------
# --> Rewritten into the SOCP form
#--------------------------------------------
# w  = c( w1,  w2,  w3, t)'
#
# minimize t
# s.t.
#     sqrt( (y1 - X1'*w)^2 + (y2 - X2'*w)^2
#         + (y3 - X3'*w)^2 + (y4 - X4'*w)^2
#         + (y5 - X5'*w)^2
#         ) <= t
#     w1 + w2 + w3 = 1
#     w1, w2, w3 >= 0
#--------------------------------------------

#--------------------------------------------
# Index tracking error minimization
# using second order cone programming
#--------------------------------------------

# first row handles t, remaining rows hold the return data
A <- rbind(c(rep(0,nX), -1),
           cbind(X,0))

soc <- OP(objective   = L_objective(c(rep(0,nX), 1)),
          constraints = c(
              C_constraint(A, K_soc(nobs+1), c(0,y)),
              L_constraint(c(rep(1,nX), 0), "==", 1))
       )

soc_sol <- ROI_solve(soc, solver = "ecos")
wgt_roi <- soc_sol$solution[1:nX]

#------------------------------------------------
# 2) Using sparseIndexTracking
#------------------------------------------------

# fit portfolio under error measure ETE
# (Empirical Tracking Error)

# Unconstrained
# wgt_sps <- spIndexTrack(X, y, lambda = 1e-180, u = 1,
#                         measure = 'ete', thres = 1e-180)

# Constrained
wgt_sps <- spIndexTrack(X, y, lambda = 1e-7, u = 1,
                        measure = 'ete')

#------------------------------------------------
# 3) Comparison for allocation weights
#------------------------------------------------

round(cbind(wgt_roi, wgt_sps),4)
```
With the arguments for the effectively unconstrained problem (\(\lambda=1e-180\)) and a subset of stocks (\(n=30\)) as the assumed universe, running the above R code gives the following weight allocations from the two approaches: ROI with ROI.plugin.ecos and sparseIndexTracking. Since effectively no regularization is applied, the two results agree up to differences in the fourth decimal place.
```
> #------------------------------------------------
> # 3) Comparison for allocation weights
> #------------------------------------------------
>
> round(cbind(wgt_roi, wgt_sps),4)
                   wgt_roi wgt_sps
1436513D UN Equity  0.0270  0.0270
1500785D UN Equity  0.0220  0.0220
1518855D US Equity  0.0319  0.0319
9876566D UN Equity  0.0607  0.0607
A UN Equity         0.0149  0.0149
AA UN Equity        0.0426  0.0426
AAPL UW Equity      0.0444  0.0444
ABC UN Equity       0.0151  0.0151
ABT UN Equity       0.1330  0.1330
ADBE UW Equity      0.0114  0.0114
ADM UN Equity       0.0127  0.0127
ADP UW Equity       0.1440  0.1440
ADSK UW Equity      0.0113  0.0113
AEE UN Equity       0.0453  0.0453
AEP UN Equity       0.0158  0.0159
AES UN Equity       0.0074  0.0074
AET UN Equity       0.0132  0.0132
AFL UN Equity       0.0413  0.0413
AGN UN Equity       0.0145  0.0146
AIG UN Equity       0.0002  0.0002
AIV UN Equity       0.0452  0.0452
AIZ UN Equity       0.0202  0.0202
AKAM UW Equity      0.0000  0.0000
ALL UN Equity       0.0348  0.0348
ALTR UW Equity      0.0172  0.0172
AMAT UW Equity      0.0336  0.0336
AMGN UW Equity      0.0411  0.0411
AMP UN Equity       0.0503  0.0503
AMT UN Equity       0.0437  0.0437
AMZN UW Equity      0.0051  0.0051
```
For sparse index tracking, with the arguments for the constrained problem (\(\lambda=1e-6\)) and the same subset of stocks (\(n=30\)) as the assumed universe, running the above R code gives the following weight allocations from the two approaches: ROI with ROI.plugin.ecos and sparseIndexTracking. The sparse index tracking result shows a clear selection effect because \(\lambda=1e-6\) invokes regularization: 15 of the 30 stocks receive a zero weight.
```
> #------------------------------------------------
> # 3) Comparison for allocation weights
> #------------------------------------------------
>
> round(cbind(wgt_roi, wgt_sps),4)
                   wgt_roi wgt_sps
1436513D UN Equity  0.0270  0.0397
1500785D UN Equity  0.0220  0.0000
1518855D US Equity  0.0319  0.0379
9876566D UN Equity  0.0607  0.0656
A UN Equity         0.0149  0.0000
AA UN Equity        0.0426  0.0445
AAPL UW Equity      0.0444  0.0510
ABC UN Equity       0.0151  0.0000
ABT UN Equity       0.1330  0.1598
ADBE UW Equity      0.0114  0.0000
ADM UN Equity       0.0127  0.0000
ADP UW Equity       0.1440  0.1783
ADSK UW Equity      0.0113  0.0000
AEE UN Equity       0.0453  0.0652
AEP UN Equity       0.0158  0.0000
AES UN Equity       0.0074  0.0000
AET UN Equity       0.0132  0.0000
AFL UN Equity       0.0413  0.0473
AGN UN Equity       0.0145  0.0000
AIG UN Equity       0.0002  0.0000
AIV UN Equity       0.0452  0.0543
AIZ UN Equity       0.0202  0.0000
AKAM UW Equity      0.0000  0.0000
ALL UN Equity       0.0348  0.0418
ALTR UW Equity      0.0172  0.0000
AMAT UW Equity      0.0336  0.0507
AMGN UW Equity      0.0411  0.0499
AMP UN Equity       0.0503  0.0595
AMT UN Equity       0.0437  0.0543
AMZN UW Equity      0.0051  0.0000
```
The two figures below show the weight allocations for the two cases. When there is no regularization for the cardinality constraint, the two results are the same.
When regularization for the cardinality constraint is applied, the two results differ, since sparse index tracking selects a subset of securities from the 30-stock universe.
When we use all 386 securities, the following two figures are obtained.
With the full data set, we can observe some discrepancies in the allocation weights, but the overall distributions of weights are similar. Because there are so many variables, accumulated numerical error may account for part of the difference.
For more precise results, further investigation with varying hyperparameters (\(\lambda\) and so on) is also needed.
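One way to start such an investigation is to sweep \(\lambda\) and record how many stocks are selected and how large the resulting tracking error is. The sketch below reuses the `spIndexTrack` call from the code above; the \(\lambda\) grid and the cutoff used to count nonzero weights are arbitrary assumptions, not recommendations.

```r
# Sketch: effect of lambda on the number of selected stocks and on the
# in-sample tracking error. Grid and cutoff are illustrative assumptions.
library(sparseIndexTracking)

data(INDEX_2010)
y <- as.vector(INDEX_2010$SP500)
X <- as.matrix(INDEX_2010$X)[, 1:30]

lambdas <- c(1e-9, 1e-8, 1e-7, 1e-6, 1e-5)

sweep <- t(sapply(lambdas, function(lam) {
  w <- spIndexTrack(X, y, lambda = lam, u = 1, measure = "ete")
  c(lambda     = lam,
    n_selected = sum(abs(w) > 1e-8),                # stocks with nonzero weight
    te         = mean((as.vector(X %*% w) - y)^2))  # empirical tracking error
}))

sweep
```

Each row of `sweep` corresponds to one value of \(\lambda\), which makes the trade-off between sparsity and tracking error easy to read off or plot.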
These two approaches are complementary, because sparse index tracking selects variables on statistical rather than economic grounds. \(\blacksquare\)