R: Speed Up R Code using Rcpp and RcppArmadillo

This post demonstrates how to incorporate C++ code to enhance the running speed of R code, through the integration of Rcpp and RcppArmadillo. The Nelson-Siegel yield curve model serves as an illustrative example.




Rcpp and RcppArmadillo to use C++ code



Rcpp offers a powerful means to accelerate R code by seamlessly integrating C++ code for targeted segments. In this post, I demonstrate how to use RcppArmadillo, a specialized C++ library for linear algebra and scientific computing, to address tasks such as matrix multiplication and inversion.

Using the Nelson-Siegel yield curve estimation problem as a practical example, I illustrate the process. It's important to note that this post covers only a small fraction of the capabilities of Rcpp and RcppArmadillo due to limitations in my knowledge.


R code


The following R code implements the Nelson-Siegel model with a decay parameter of 0.06.

# We load the Rcpp package to be able to use C++ code in R.
library(Rcpp)
 
graphics.off(); rm(list = ls())
 
setwd('your folder')
 
# data is located in the following address
# 'https://github.com/Financial-Times/yield-curve-sonification/blob/7d131970377380f118fb9a0e1626f5ed1ecd35ca/yield-curve-monthly-data.csv'
 
yield <- as.matrix(read.csv('yield-curve-monthly-data.csv')[,-1])
mat <- c(1361224366084120240360)
 
 
#===========================================================
# R code
#===========================================================
 
# period-by-period Estimation
est_beta_given_la_ts <- function(la, y_all, m) {
    
    nobs <- nrow(y_all)
    beta <- matrix(NA, nobs, 3)
    
    # fixed factor loading matrix given lambda
    C <- cbind(
        rep(1,length(m)), 
        (1-exp(-la*m))/(la*m), 
        (1-exp(-la*m))/(la*m)-exp(-la*m))
    
    # period-by-period OLS estimation given lambda
    for(t in 1:nobs) {
        beta[t,] <- solve(t(C)%*%C)%*%t(C)%*%y_all[t,]
    }
    
    return(beta)
}
 
# NS yield factors given lambda
out_beta <- est_beta_given_la_ts(0.0609, yield, mat)
 
x11(width = 16/2, height = 9/2.3)
matplot(out_beta, type="l", lty=1, lwd=3, col = 1:3, main = "R")
legend("bottomright", pch = 16, col = 1:3, cex = 0.9,
       bty = "n", horiz = TRUE, legend = c("L","S","C"))
 
cs



RCpp code


The following R code implements the corresponding RCpp code for the pure R code of the Nelson-Siegel model we previously coded. We utilize the cppFunction() R function for this purpose. When using functionalities such as matrix or vector operations in RcppArmadillo, we specify depends = "RcppArmadillo". The cppFunction() function accepts the input argument as a character vector of C++ code.

#===========================================================
# RCpp code
#===========================================================
 
cppFunction(depends = "RcppArmadillo"'
  
// period-by-period Estimation  
arma::mat f_est_beta_given_la_ts(
          double la, arma::mat y_all, arma::vec m) {
 
    int nobs = y_all.n_rows;
    int nmat = m.size();
    arma::mat beta(nobs, 3);
 
    // fixed factor loading matrix given lambda
    arma::mat C(nmat, 3);
    for(int i=0; i<nmat; ++i) {
        C(i,0) = 1;
        C(i,1) = (1-exp(-la*m[i]))/(la*m[i]);
        C(i,2) = C(i,1) - exp(-la*m[i]);
    }
 
    // period-by-period OLS estimation given lambda
    for(int t=0; t<nobs; ++t) {
        arma::vec y = y_all.row(t).t();
        beta.row(t) = solve(C.t()*C , C.t()*y).t();
    }
 
    return beta;
}')
 
# NS yield factors given lambda
out_beta <- f_est_beta_given_la_ts(0.0609, yield, mat)
 
x11(width = 16/2, height = 9/2.3)
matplot(out_beta, type="l", lty=1, lwd=3, col = 1:3, main = "RCpp")
legend("bottomright", pch = 16, col = 1:3, cex = 0.9,
       bty = "n", horiz = TRUE, legend = c("L","S","C"))
 
cs


As depicted in the following figure, both codes generate identical yield curve factor estimates, represented as time series of coefficients.



Computation Time Comparison


The microbenchmark R package enables comparison of average computation time. The results indicate that the RCpp code exhibits a speed enhancement of approximately 20 times compared to the R code.

#-----------------------------------------------------------
# Runtime comparison
#-----------------------------------------------------------
library(microbenchmark)
 
microbenchmark::microbenchmark(
    "R"   = est_beta_given_la_ts(0.0609, yield, mat),
    "Cpp" = f_est_beta_given_la_ts(0.0609, yield, mat),
    times = 10^3)
 
cs


Unit: microseconds
 expr    min      lq      mean  median      uq     max neval
    R 3932.0 4394.75 5112.5933 4635.95 5100.85 65725.2  1000
  Cpp  158.3  181.55  203.0094  189.55  201.60  1224.1  1000
> 
cs


While this advantage may not be significant for tasks with already short computation times, integrating RCpp code can prove beneficial in substantially reducing computation time for computationally demanding tasks that require several hours or days to complete.

No comments:

Post a Comment