6. Writing Renjin Extensions¶
This chapter can be considered as Renjin’s equivalent of the Writing R Extensions manual for GNU R. Here we discuss how you create extensions, or packages as they are referred to in R, for Renjin. Packages allow you to group logically related functions and data together in a single archive which can be reused and shared with others.
Renjin packages do not differ much from packages for GNU R. One notable difference is that Renjin treats unit tests as first class citizens, i.e. Renjin includes a package called hamcrest that provides functionality for writing unit tests right out of the box. We encourage you to include as many unit tests with your package as possible.
One feature currently missing for Renjin packages is the ability to document your R code. You can use Javadoc to document your Java classes and methods.
6.1. Package directory layout¶
The files in a Renjin package should be organized in a directory structure that adheres to the Maven standard directory layout. A directory layout that will cover most Renjin packages is as follows:
projectdir/
src/
main/
java/
...
R/
resources/
test/
java/
...
R/
resources/
NAMESPACE
pom.xml
The table Directories in a Renjin package gives a short description of the directories and files in this layout.
Directory | Description |
---|---|
src/main/java | Java source code (*.java files) |
src/main/R | R source code (*.R files) |
src/main/resources | Files and directories to be copied to the root of the generated JAR file |
src/test/java | Unit tests written in Java using JUnit |
src/test/R | Unit tests written in R using Renjin’s Hamcrest package |
src/test/resource | Files available to the unit tests (not copied into the generated JAR file) |
NAMESPACE | Almost equivalent to R’s NAMESPACE file |
pom.xml | Maven’s Project Object Model file |
The functionality of the DESCRIPTION file used by GNU R packages is replaced by a Maven pom.xml (POM) file. In this file you define the name of your package and any dependencies, if applicable. The POM file is used by Renjin’s Maven plugin to create the package. This is the subject of the next section.
6.2. Renjin Maven plugin¶
Whereas you would use the commands R CMD check
, R CMD build
, and R
CMD INSTALL
to check, build (i.e. package), and install packages for GNU R,
packages for Renjin are tested, packaged, and installed using a Maven plugin.
The following XML file can be used as a pom.xml template for all Renjin
packages:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <groupId>com.acme</groupId> <artifactId>foobar</artifactId> <version>1.0-SNAPSHOT</version> <packaging>jar</packaging> <!-- general information about your package --> <name>Package name or title</name> <description>A short description of your package.</description> <url>http://www.url.to/your/package/website</url> <licenses> <!-- add one or more licenses under which the package is released --> <license> <name>Apache License version 2.0</name> <url>http://www.apache.org/licenses/LICENSE-2.0.html</url> </license> </licenses> <properties> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> <renjin.version>3.5-beta76</renjin.version> </properties> <dependencies> <!-- the script engine is convenient even if you do not use it explicitly --> <dependency> <groupId>org.renjin</groupId> <artifactId>renjin-script-engine</artifactId> <version>${renjin.version}</version> </dependency> <!-- the hamcrest package is only required if you use it for unit tests --> <dependency> <groupId>org.renjin</groupId> <artifactId>hamcrest</artifactId> <version>${renjin.version}</version> <scope>test</scope> </dependency> </dependencies> <repositories> <repository> <id>bedatadriven</id> <name>bedatadriven public repo</name> <url>https://nexus.bedatadriven.com/content/groups/public/</url> </repository> </repositories> <pluginRepositories> <pluginRepository> <id>bedatadriven</id> <name>bedatadriven public repo</name> <url>https://nexus.bedatadriven.com/content/groups/public/</url> </pluginRepository> </pluginRepositories> <build> <plugins> <plugin> <groupId>org.renjin</groupId> <artifactId>renjin-maven-plugin</artifactId> <version>${renjin.version}</version> <executions> <execution> <id>build</id> <goals> <goal>namespace-compile</goal> </goals> <phase>process-classes</phase> </execution> <execution> <id>test</id> <goals> <goal>test</goal> </goals> <phase>test</phase> </execution> </executions> </plugin> </plugins> </build> </project>
This POM file provides a lot of information:
- fully qualified name of the package, namely com.acme.foobar;
- package version, namely 1.0-SNAPSHOT;
- package dependencies and their versions, namely the Renjin Script Engine and the hamcrest package (see the next section);
- BeDataDriven’s public repository to look for the dependencies if it can’t find them locally or in Maven Central;
Important
Package names is one area where Renjin takes a different approach to GNU R
and adheres to the Java standard of using fully qualified names. The
package in the example above must be loaded using its fully qualified name,
that is with library(com.acme.foobar)
or require(com.acme.foobar)
.
The group ID (com.acme in this example) is traditionally a domain over
which only you have control. The artifact ID should have only lower case
letters and no strange symbols. The term artifact is used by Maven to
refer to the result of a build which, in the context of this chapter, is
always a package.
Now you can use Maven to test, package, and install your package using the following commands:
- mvn test
- run the package tests (both the Java and R code tests)
- mvn package
- create a JAR file of the package (named foobar-1.0-SNAPSHOT.jar in the
example above) in the
target
folder of the package’s root directory - mvn install
- install the artifact (i.e. package) into the local repository
- mvn deploy
- upload the artifact to a remote repository (requires additional configuration)
- mvn clean
- clean the project’s working directory after a build (can also be combined
with one of the previous commands, for example:
mvn clean install
)
6.3. Package NAMESPACE file¶
Since R version 2.14, packages are required to have a NAMESPACE
file and the
same holds for Renjin. Because of dynamic searching for objects in R, the use of
a NAMESPACE
file is good practice anyway. The NAMESPACE
file is used to
explicitly define which functions should be imported into the package’s
namespace and which functions the package exposes (i.e. exports) to other
packages. Using this file, the package developer controls how his or her package
finds functions.
Usage of the NAMESPACE
in Renjin is almost exactly the same as in GNU R
with one addition: Renjin accepts the directive importClass()
for importing
Java classes into the package namespace.
Here is an overview of the namespace directives that Renjin supports:
export(f)
orexport(f, g)
- Export an object
f
(singular form) or multiple objectsf
andg
(plural form). You can add as many objects to this directive as you like. exportPattern("^[^\\.]")
- Export all objects whose name does not start with a period (‘.’).
Although any regular expression can be used in this directive, this is by
far the most common one. It is considered to be good practice not to use
this directive and to explicitly export objects using the
export()
directive. import(foo)
orimport(foo, bar)
- Import all exported objects from the package named
foo
(andbar
in the plural form). Like theexport()
directive, you can add as many objects as you like to this directive. importFrom(foo, f)
orimportFrom(foo, f, g)
- Import only object
f
(andg
in the plural form) from the package namedfoo
. S3method(print, foo)
- Register a print (S3) method for the
foo
class. This ensures that other packages understand that you provide a functionprint.foo()
that is a print method for classfoo
. Theprint.foo()
does not need to be exported. importClassesFrom(package, classA)
- import S4 classes
importMethodsFrom(package, methodA)
- import S4 methods
exportClasses(fooClass)
- export S4 classes (and Reference Classes). You can add as many classes you like to this directive
exportMethods(barMethod)
- export S4 methods. You can add as many methods you like to this directive
importClass(com.acme.myclass)
- A namespace directive which is unique to Renjin and which allows Java
classes to be imported into the package namespace. This directive is
actually a function which does the same as Renjin’s
import()
function that was introduced in the chapter Importing Java classes into R code.
To summarize: the R functions in your package have access to all R functions
defined within your package (also those that are not explicitly exported since
Java has its own mechanism to control the visibility of classes) as
well as the Java classes imported into the package names using the
importClass
directive. Other packages only have access to the R objects
that your package exports as well as to the public Java classes.
If you are creating a package that uses java code and you want to expose an R api
for handling package consumers you need to put the import statement in the R function
that access the java code. Using the example of the Customer
java class from the
Importing Java classes into R code chapter:
Lets say you want to expose a createCustomer function so that package consumers does not
have to interact directly with your java classes. E.g. instead of doing Customer$new()
we create the following factory function
createCustomer <- function(name, age) {
import(beans.Customer)
Customer$new(name = name, age = age)
}
Note that we need to put the import(beans.Customer)
directive inside the function
for this to work. We also need to add export(createCustomer)
to the NAMESPACE
file which will
enable package consumers to do:
library('com.acme:customerbean')
bobby <- createCustomer(name = "Bobby", age = 26)
6.4. Using the hamcrest package to write unit tests¶
Renjin includes a built-in package called hamcrest for writing unit tests using the R language. The package and its test functions are inspired by the Hamcrest framework. From hamcrest.org: Hamcrest is a framework for writing matcher objects allowing ‘match’ rules to be defined declaratively. The Wikipedia article on Hamcrest gives a good and short explanation of the rationale behind the framework.
If you are familiar with the ‘expectation’ functions used in the testthat package for GNU R, then you will find many similarities with the assertion and matcher functions in Renjin’s hamcrest package.
A test is a single R function with no arguments and a name that starts with
test.
. Each test function can contain one or more assertions and the test
fails if at least one of the assertions throws an error. For example, using the
package defined in the previous section:
library(hamcrest)
library(com.acme.foobar)
test.df <- function() {
df <- data.frame(x = seq(10), y = runif(10))
assertThat(df, instanceOf("data.frame"))
assertThat(dim(df), equalTo(c(10,2)))
}
Test functions are stored in R script files (i.e. files with extension .R
or .S
) in the src/test/R
folder of your package. Each file should start
with the statement library(hamcrest)
in order to attach the hamcrest
package to the search path as well as a library()
statement to load your
own package. You can put test functions in different files to group them
according to your liking.
The central function is the assertThat(actual, expected)
function which takes
two arguments: actual
is the object about which you want to make an assertion
and expected
is the matcher function that defines the rule of the assertion.
In the example above, we make two assertions about the data frame df
, namely
that it should have class data.frame and that its dimension is equal to the
vector c(10, 2)
(i.e. ten rows and two columns). The following sections
describe the available matcher functions in more detail.
6.4.1. Testing for (near) equality¶
Use equalTo()
to test if actual
is equal to expected
:
assertThat(actual, equalTo(expected))
Two objects are considered to be equal if they have the same length and if
actual == expected
is TRUE
.
Use identicalTo()
to test if actual
is identical to expected
:
assertThat(actual, identicalTo(expected))
Two objects are considered to be identical if identical(actual, expected)
is
TRUE
. This test is much stricter than equalTo()
as it also checks that the
type of the objects and their attributes are the same.
Use closeTo()
to test for near equality (i.e. with some margin of error as
defined by the delta
argument):
assertThat(actual, closeTo(expected, delta))
This assertion only accepts numeric vectors as arguments and delta
must
have length 1. The assertion also throws an error if actual
and
expected
do not have the same length. If their lengths are greater than 1,
the largest (absolute) difference between their elements may not exceed
delta
.
6.4.2. Testing for TRUE or FALSE¶
Use isTrue()
and isFalse()
to check that an object is identical to TRUE
or FALSE
respectively:
assertThat(actual, isTrue())
assertTrue(actual) # same, but shorter
assertThat(actual, identicalTo(TRUE)) # same, but longer
6.4.3. Testing for class inheritance¶
Use instanceOf()
to check if an object inherits from a class:
assertThat(actual, instanceOf(expected))
An object is assumed to inherit from a class if inherits(actual, expected)
is
TRUE
.
Tip
Renjin’s hamcrest package also exists as a GNU R package with the same name available at https://github.com/bedatadriven/hamcrest. If you are writing a package for both Renjin and GNU R, you can use the hamcrest package to check the compatibility of your code by running the test files in both Renjin and GNU R.
6.4.4. Understanding test results¶
When you run mvn test
within the directory that holds the POM file (i.e.
the root directory of your package), Maven will execute both the Java and R
unit tests and output various bits of information including the test results.
The results for the Java tests are summarized in a section marked with:
-------------------------------------------------------
T E S T S
-------------------------------------------------------
and which will summarize the test results like:
Results :
Tests run: 5, Failures: 1, Errors: 0, Skipped: 0
The results of the R tests are summarized in a section marked with:
-------------------------------------------------------
R E N J I N T E S T S
-------------------------------------------------------
The R tests are summarized per R source file which will look similar to the following example:
Running tests in /home/foobar/mypkg/src/test/R
Running function_test.R
No default packages specified
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.898
Note that the number of tests run is equal to the number of test.*
functions in the R source file + 1 as running the test file is also counted as
a test.
7. Example extension projects¶
Below is a short list of fully functioning Renjin packages:
renjin-maven-package-example A simple example showing a renjin extension (package) with java integration.
renjinSamplesAndTests Various simple examples of Renjin extensions.
xmlr A XML DOM package developed with Reference Classes using the GNU R directory layout.