rJava with User-defined R Functions in Eclipse

This post shows how to call user-defined functions in an R script from Java in Eclipse using the rJava package. Doing so improves code readability and reduces the likelihood of errors, because many lines of R code are replaced by a single function call.



Introduction


We learned how to embed R commands in Java code in Eclipse in the previous post. But when there are many lines of R commands, overall code readability and maintainability suffer. In that case, it is preferable to use an R script that contains user-defined or built-in R functions. Essentially one line of Java code will then do the work of multiple lines of commands.

Suppose that we perform a Lasso regression in Java using an R script.
Before going into the details, note that the overall rJava setup in Eclipse is required, which is discussed in the previous post.


Functions in R Script to be Called


The R script for Lasso estimation is written as a separate file. In particular, the glmnet package is used. When a script depends on an additional package like this, one more rJava command is needed, which is discussed later.

  • shlee_RLib.R

library(glmnet)
 
test_glmnet <- function(nvar) {
    
    # artificial data
    x = matrix(rnorm(100*nvar), 100, nvar)
    y = rnorm(100)
    
    # Lasso estimation
    fit1 = glmnet(x, y, alpha = 1)
    
    # coefficient matrix with lambda = 0.01, 0.05
    sm.coef <- as.matrix(unlist(coef(fit1, s=c(0.01,0.05))))
 
    return(as.matrix(sm.coef))
}
 


With the source() command, the test_glmnet() function in shlee_RLib.R (or any file name you prefer) can be called from another R script as follows.

rm(list = ls()) # remove all objects from your workspace
 
source("D:/SHLEE/rJava/code/shlee_RLib.R")
 
test_glmnet(10)
 


The test_glmnet(10) call returns the following result, which contains the coefficient vectors of two Lasso models (\(\lambda = 0.01, 0.05\)).

> test_glmnet(10)
                      1            2
(Intercept) -0.16453256 -0.159861599
V1          -0.22338346 -0.179309971
V2          -0.01422389  0.000000000
V3           0.03766415  0.000000000
V4           0.03100075  0.000000000
V5          -0.03618045 -0.003826916
V6           0.06392980  0.021847332
V7           0.05765196  0.022836526
V8           0.02248489  0.000000000
V9          -0.09365914 -0.046769476
V10         -0.19398003 -0.180170304
> 



Calling User-defined Functions in Eclipse Java


First, let's create a Java class file in Eclipse in which R functions are called using rJava. We name it CRjava2.java, for example. In this file, write the following Java code.

  • CRjava2.java

//=========================================================================#
//Financial Econometrics & Derivatives, ML/DL using R, Python, Tensorflow  
//by Sang-Heon Lee 
//
//https://shleeai.blogspot.com
//-------------------------------------------------------------------------#
//rJava example with a user-defined function, which saves many lines of code.
//=========================================================================#
package aRjava;
 
import org.rosuda.JRI.Rengine;
import org.rosuda.JRI.REXP;
 
// Run Configurations -> Environment: copy and paste the 3 variables
// That's all there is to it; nothing else is needed.
 
public class CRjava2 {
    public static void main(String[] args) {
 
        // Launch and Start rJava
        Rengine re=new Rengine(new String[] { "--vanilla" }, false, null);
 
        //------------------------------------------------------------------
        // Very Important !!!!!!!!!!!!!!!!!!
        //------------------------------------------------------------------
        // The user-defined function uses the glmnet library.
        //
        // Without this command, Java produces the following error.
        // :: Error in library(glmnet) : 
        //    there is no package called 'glmnet'
        //
        // To avoid this error, set the library path with the command below.
        //
        // glmnet package is located 
        // at C:/Users/shlee/Documents/R/win-library/4.0
        //------------------------------------------------------------------
        re.eval(".libPaths('C:/Users/shlee/Documents/R/win-library/4.0')");
 
        // R file with its local directory
        // in which user-defined functions are written
        re.eval("source('D:/SHLEE/rJava/code/shlee_RLib.R')");
 
        // Input parameters
        int nvar = 5; int ncol = 2;
 
        // Call user-defined R function
        REXP x = re.eval("test_glmnet("+nvar+")");
 
        // 1) Result : raw output
        System.out.println("1) REXP result : raw output");
        System.out.println(x);
 
        // 2) Results : rounded output
        System.out.println("\n2) REXP result : formatted output using 2D array");
 
        // R matrix --> Java 2D array
        double[][] mout = x.asDoubleMatrix();
 
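        // The R matrix has nvar + 1 rows (intercept plus nvar coefficients),
        // so iterate over the array's own length to print all of them
        for(int i = 0; i<mout.length; i++) {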
            for(int j = 0; j<ncol; j++) {
                System.out.print(String.format("%2.5f", mout[i][j]) + "   ");
            }
            System.out.println("");
        }
 
        // end rJava
        re.end();
    }
}
 
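The printing step can also be written without hard-coded loop bounds. The matrix returned by test_glmnet(nvar) has nvar + 1 rows (the intercept plus nvar coefficients), so iterating over the array's own dimensions prints every row. The following is a minimal sketch in plain Java; the class name MatrixPrinter and the stand-in values are assumptions for illustration, and in the real program the array would come from x.asDoubleMatrix().

```java
import java.util.Locale;

final class MatrixPrinter {

    // Formats every row and column of a matrix, one row per line,
    // with the same "%2.5f" pattern and spacing as the listing above.
    static String format(double[][] m) {
        StringBuilder sb = new StringBuilder();
        for (double[] row : m) {
            for (double v : row) {
                sb.append(String.format(Locale.ROOT, "%2.5f", v)).append("   ");
            }
            sb.append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in values (hypothetical); replace with x.asDoubleMatrix()
        double[][] mout = {
            { -0.16453, -0.15986 },
            { -0.22338, -0.17931 },
            { -0.01422,  0.00000 }
        };
        System.out.print(format(mout));
    }
}
```

Because the loops follow the array itself, the same helper works when nvar or the number of lambda values changes.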



In fact, the essential part of the above Java code is the single call to test_glmnet(). How many lines we save depends on the size of the calculation or estimation that the R function wraps.

REXP x = re.eval("test_glmnet("+nvar+")");
 
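When an R function takes several numeric arguments, concatenating the command string by hand becomes error-prone. One possible approach, sketched here under the assumption that all arguments are numbers, is a small helper that assembles the call text. The class RCall and its build method are hypothetical names, not part of rJava; the resulting string would simply be passed to re.eval().

```java
final class RCall {

    // Builds the text "fname(arg1, arg2, ...)" from numeric arguments.
    // Assumption: only numbers are passed; strings would need quoting.
    static String build(String fname, Number... args) {
        StringBuilder sb = new StringBuilder(fname).append("(");
        for (int i = 0; i < args.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(args[i]);
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        int nvar = 5;
        // prints "test_glmnet(5)"
        System.out.println(build("test_glmnet", nvar));
    }
}
```

With this helper, the call in the listing would read re.eval(RCall.build("test_glmnet", nvar)).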


Point to Note


There is one important thing to know. When the R script loads packages that do not ship with base R, such as glmnet, the following rJava command is necessary so that R can find the library directory where they are installed.

re.eval(".libPaths('C:/Users/shlee/Documents/R/win-library/4.0')");
 


Without the above rJava command, Eclipse reports the error message "there is no package called 'glmnet'".
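To make this step less fragile, the command string can be built from a Windows-style path, and the library directory can be checked on the Java side before R is started. The sketch below is only an illustration: RLibPath, libPathsCommand and hasPackage are hypothetical names, and the check relies on the fact that an installed R package appears as a folder named after the package inside the library directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class RLibPath {

    // R accepts forward slashes on Windows, which avoids escaping backslashes.
    static String toRPath(String windowsPath) {
        return windowsPath.replace('\\', '/');
    }

    // Builds the text of the .libPaths() command for re.eval().
    // Assumption: the directory name contains no single quotes.
    static String libPathsCommand(String libDir) {
        return ".libPaths('" + toRPath(libDir) + "')";
    }

    // True if the library directory contains a folder for the package.
    static boolean hasPackage(Path libDir, String pkg) {
        return Files.isDirectory(libDir.resolve(pkg));
    }

    public static void main(String[] args) {
        String libDir = "C:\\Users\\shlee\\Documents\\R\\win-library\\4.0";
        System.out.println(libPathsCommand(libDir));
        System.out.println("glmnet found: " + hasPackage(Paths.get(libDir), "glmnet"));
    }
}
```

If hasPackage returns false, the library(glmnet) call inside the sourced script is likely to fail with the error quoted above.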


First Run and Errors


When we run the above Java code for the first time, we encounter errors, because the run settings for the CRjava2.java file have not been made yet.

Still, this first run is important: only after this attempt can Run Configurations (explained below) identify the new class.

Setting for New Class Java File


Two settings are necessary for the newly added class file. Right-click CRjava2.java (in the file explorer on the left: aRjava/src/aRjava/CRjava2.java) and select Run As → Run Configurations.

In the Arguments tab, fill in VM arguments as follows (copy and paste from the aRjava setting).


  • VM arguments : -Djava.library.path=C:\Users\shlee\Documents\R\win-library\4.0\rJava\jri\x64

In the Environment tab, add three variables in the following way (copy and paste from the aRjava setting using the buttons).
  • LD_LIBRARY_PATH : C:\Program Files\R\R-4.0.3\bin;C:\Program Files\R\R-4.0.3\library;C:\Users\shlee\Documents\R\win-library;
  • PATH : C:\Program Files\R\R-4.0.3\bin\x64;C:\Users\shlee\Documents\R\win-library\rJava\jri\x64;
  • R_HOME : C:\Program Files\R\R-4.0.3


Now the settings for the added class file are complete.
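Since these settings must be repeated for every new class, a quick check at the top of main() can report a missing one before the R engine is created. The following is a rough sketch under stated assumptions: RSettingsCheck and its missing method are hypothetical names, and testing whether java.library.path mentions "jri" is only a heuristic based on the VM argument shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RSettingsCheck {

    // Returns a list of human-readable problems; empty if none were found.
    static List<String> missing(Map<String, String> env, String javaLibraryPath) {
        List<String> problems = new ArrayList<>();
        String rHome = env.get("R_HOME");
        if (rHome == null || rHome.isEmpty()) {
            problems.add("R_HOME is not set");
        }
        // Heuristic: the VM argument above points at .../rJava/jri/x64
        if (javaLibraryPath == null || !javaLibraryPath.toLowerCase().contains("jri")) {
            problems.add("java.library.path does not mention a jri directory");
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> problems =
            missing(System.getenv(), System.getProperty("java.library.path"));
        if (problems.isEmpty()) {
            System.out.println("Run configuration looks complete.");
        } else {
            problems.forEach(p -> System.out.println("Check Run Configurations: " + p));
        }
    }
}
```

Passing the environment as a Map keeps the check easy to try with made-up values, without changing the real run configuration.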


Running and Results


When we rerun the CRjava2.java file, we obtain the correct results.


We can see that the results from R alone and from Eclipse with rJava are the same.

Since the overall environment setup is already done, we only have to add the two settings above for each newly added class file.

As this post shows, calling user-defined R functions keeps Java code that uses rJava compact. It reduces many lines to essentially one line and enhances code readability. \(\blacksquare\)

