The gentle art of making ... programs: Mixing Java and dynamic scripting languages

I'm done with the assembly language (so far), so it's time to publish some high(er)-level stuff.

Problem

Several years ago I was a team member of a project which implemented a framework targeting insurance systems. The framework is based on IBM's specification framework, which in turn is the basis of IAA. Among other things, the specification framework can be used to form a network of agreements and various elements attached to agreements (requests, properties, roles, calculations and rules), possibly without any coding. Many objects in the framework extend StructuredActual (e.g. Agreements and Roles), so this class is commonly found in the interfaces. All "external" components are modeled as role players (e.g. a party acting as a policy holder) and are not part of the framework per se.

Agreements are "instances" of another concept called product. A product defines the properties and other elements associated with an agreement instance. The product itself is parametrized with data from a configuration database or an XML file. This means we can build a network of agreements without writing a single line of Java, and the data can be persisted automatically without any intervention from programmers. On the other hand, things like derived properties (e.g. values relative to the current time) and rules must be coded in Java, or there must be some other language to describe their behavior. Initially we decided to go with plain Java, but left the door open for other options. The reasons behind the Java-only approach were:

Performance. With Java we can have optimal performance.
Debuggability. Since everything is in Java, it's quite easy to figure out what's going on with a debugger.
Avoidance of proprietary languages. We didn't want to invent our own yet-another programming language.
Skills. We know Java and it's easier to hire people to work with the framework since it's based on one of the most popular languages.

Basically there's nothing wrong with this approach, but there are some things which could make life easier. For example, class Agreement is a kind of chameleon class, and its properties are configured through the product without any coding work. This in turn means that the properties must be modeled as instances of class Property (carrying information such as type, default value, whether its value is mandatory or not, ...). Since these properties live inside a map, and each instance of class Agreement can have its own properties, we can't have normal, JavaBean-type accessors for properties. Instead, the properties are accessed like this:

Double value = agreement.getPropertyValue( "premium" );
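
A minimal sketch of how such a map-backed accessor might look. The class PropertyHolder and its setPropertyValue method are hypothetical stand-ins, not the framework's actual Agreement or Property classes; only the shape of the getPropertyValue( name ) call comes from the line above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a "chameleon" class whose properties are configured at run time.
class PropertyHolder {
    private final Map<String,Object> values = new HashMap<>( );

    void setPropertyValue( String name,Object value ) {
        values.put( name,value );
    }

    // The caller chooses T; a wrong choice fails with a ClassCastException at the call site.
    @SuppressWarnings( "unchecked" )
    <T> T getPropertyValue( String name ) {
        if( !values.containsKey( name ) ) {
            throw new IllegalArgumentException( "Unknown property: " + name );
        }
        return ( T ) values.get( name );
    }
}

class PropertyAccessDemo {
    public static void main( String[] args ) {
        PropertyHolder agreement = new PropertyHolder( );
        agreement.setPropertyValue( "premium",120.0 );
        Double value = agreement.getPropertyValue( "premium" );
        System.out.println( value );
    }
}
```

The generic return type keeps the call site free of casts, but the price is that type errors show up only at run time.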

It gets a little uglier when multiple properties are accessed, for example in a derived property formula (the logic behind the actual property). Here's a sample:

public class NetPriceFormula extends Formula<Double> {
    public Double calculate( StructuredActual context ) {
        return context.getPropertyValue( "premium" ) *
               context.getPropertyValue( "discountPercent" ) / 100.0;
    }
}

And if we are dealing with java.math.BigDecimal, it gets much uglier:

public class NetPriceFormula extends Formula<BigDecimal> {
    public BigDecimal calculate( StructuredActual context ) {
        return context.getPropertyValue( "premium" ).
            multiply( context.getPropertyValue( "discountPercent" ) ).
            divide( new BigDecimal( "100.0" ) );
    }
}

At the end of the day, the formula should look like this:

context.premium * context.discountPercent / 100

Also, I didn't like the idea of having zillions of really small classes. That's why I decided to try several scripting languages.

Using script languages with Java

To define a DSL for the framework, I tried Javascript, Clojure and Groovy in order to implement simple property access. They each have their strengths and weaknesses, but in this context I was only interested in syntax, datatypes, threading, performance and required skills. The results are:

Clojure

Syntax. Although I have worked with LISP in the past, I have to say I really didn't enjoy the syntax. I found it horrible.
Datatypes. Lists and some other things do not map well to Java.
Threading. Clojure seems to use unmanaged threads inside the scripting engine which makes it impossible to use it within Java EE.
Required skills. How many programmers know Clojure syntax?

JavaScript

Lack of metaprogramming. It's not possible to intercept property accessor calls with the version that is bundled with Java 6. At least not easily.
Datatypes. The type system of JavaScript does not map well to Java's and you don't have control over it.
Performance was not optimal.
Threading. Looks like JavaScript is not using unmanaged threads.
Required skills. Many developers write JavaScript anyhow, so skill-wise it could have been a perfect match.

Groovy

Metaprogramming is supported. This enabled easy property access and the addition of several well-known pseudo properties. In general, Groovy seems to be a highly extensible language.
Datatypes. Groovy uses Java objects. Most Groovy datatype extensions are done in a non-intrusive manner using metaclasses.
Threading. Looks like Groovy is not using unmanaged threads.
Excellent performance.
Required skills. The syntax and types are somewhat similar to Java's. And closures are anyhow coming with Java 8.
Groovy is object-oriented. This is irrelevant, but since I like to tantalize functional programming fundamentalists, I couldn't help mentioning it.
Groovy is led by SpringSource. I don't feel comfortable whenever I hear the word Spring.

The drawbacks of Clojure are murderous, especially the use of unmanaged threads. Uncontrolled use of threads may cause deadlocks and will certainly cause problems with application server facilities. Think about a situation where a synchronized method calls a script, which in turn makes a callback to a synchronized Java method on the same object: if the callback runs on the same thread there is no problem, but on another thread you are in a deadlock. Due to these threading issues I didn't measure the performance of Clojure at all; I decided to dump it. JavaScript performed poorly and its datatypes are somewhat limited and incompatible with Java's, so JavaScript was dumped as well. Only Groovy survived the initial filtering.
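
The callback scenario can be sketched in plain Java. This is an illustration only, with made-up names: a ReentrantLock with a timed tryLock stands in for the monitor of a synchronized method, so that the example can report the problem instead of hanging, and a single-thread executor stands in for a scripting engine that runs callbacks on a thread of its own.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class CallbackLockDemo {
    private final ReentrantLock lock = new ReentrantLock( );

    // Plays the role of the synchronized Java method that the script calls back.
    boolean callback( ) {
        try {
            if( !lock.tryLock( 200,TimeUnit.MILLISECONDS ) ) {
                return false; // a synchronized method would block here forever
            }
        }
        catch( InterruptedException e ) {
            Thread.currentThread( ).interrupt( );
            return false;
        }
        lock.unlock( );
        return true;
    }

    // Plays the role of the synchronized method that runs the script.
    boolean runScript( boolean scriptUsesOwnThread ) {
        lock.lock( );
        try {
            if( !scriptUsesOwnThread ) {
                return callback( ); // same thread: the lock is reentrant
            }
            ExecutorService scriptThread = Executors.newSingleThreadExecutor( );
            try {
                Callable<Boolean> task = this::callback;
                return scriptThread.submit( task ).get( ); // wait while holding the lock
            }
            catch( InterruptedException | ExecutionException e ) {
                throw new IllegalStateException( e );
            }
            finally {
                scriptThread.shutdown( );
            }
        }
        finally {
            lock.unlock( );
        }
    }

    public static void main( String[] args ) {
        CallbackLockDemo demo = new CallbackLockDemo( );
        System.out.println( "callback on same thread succeeded:  " + demo.runScript( false ) );
        System.out.println( "callback on other thread succeeded: " + demo.runScript( true ) );
    }
}
```

The first call succeeds because the lock is reentrant for the thread that already holds it. The second fails after the timeout because the caller keeps holding the lock while it waits for the other thread.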

Using Groovy and Java together

Metaclasses

Groovy has some excellent features which support dynamism. For example, metaclasses enable nifty property accessors (see Using invokeMethod and getProperty for more information). Here's how property access can be customized using Groovy metaclasses:

def getterClosure = { name ->
    switch( name ) {
        case "objectId" :
            return delegate.getObjectReference( )?.getId( )
        default :
            return delegate.getPropertyValue( name )
    }
}

def setterClosure = { name,value ->
    // def keeps these variables local to the closure
    def property = delegate.getPropertyOfKind( name )
    def type = property.getSpec( ).getType( )
    if( type == value.class ) {
        property.setValue( value )
    }
    else {
        property.setValue( value != null ? convert( value,type ) : null )
    }
}

Agreement.metaClass.getProperty = getterClosure
Agreement.metaClass.setProperty = setterClosure
Role.metaClass.getProperty = getterClosure
Role.metaClass.setProperty = setterClosure

This means all properties of Agreement and Role are accessed through the closures defined above. Whenever a property is read from an Agreement or Role object, Groovy calls the closure assigned to getProperty of the object's metaclass. For example, the getterClosure maps the property read agreement.premium to the method call agreement.getPropertyValue( "premium" ). Likewise, agreement.objectId is mapped to delegate.getObjectReference( )?.getId( ). Notice the use of the safe navigation operator (?.) after the getObjectReference call: if getObjectReference returns null, the expression evaluates to null and getId is not called. Now, after evaluating the script above, I can access the properties of Role and Agreement objects like this:

context.netPrice = context.premium * context.discountPercent / 100
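
The setter closure relies on a convert( value,type ) helper that is not shown in this post. A rough Java sketch of what such a helper could look like follows; the class name PropertyValues and the set of supported target types are assumptions made purely for illustration.

```java
import java.math.BigDecimal;

// Hypothetical conversion helper; the real convert( ) used by the setter closure is not shown.
final class PropertyValues {
    private PropertyValues( ) { }

    static <T> T convert( Object value,Class<T> type ) {
        if( value == null || type.isInstance( value ) ) {
            return type.cast( value ); // nothing to convert
        }
        Object converted;
        if( type == BigDecimal.class && value instanceof Number ) {
            // Go through the string form so that 0.1 stays 0.1 instead of its binary approximation.
            converted = new BigDecimal( value.toString( ) );
        }
        else if( type == Double.class && value instanceof Number ) {
            converted = ( ( Number ) value ).doubleValue( );
        }
        else if( type == Integer.class && value instanceof Number ) {
            converted = ( ( Number ) value ).intValue( );
        }
        else if( type == String.class ) {
            converted = value.toString( );
        }
        else {
            throw new IllegalArgumentException(
                "Cannot convert " + value.getClass( ).getName( ) + " to " + type.getName( ) );
        }
        return type.cast( converted );
    }

    public static void main( String[] args ) {
        System.out.println( convert( 7,Double.class ) );
        System.out.println( convert( 0.1,BigDecimal.class ) );
    }
}
```

A helper like this matters because Groovy arithmetic often produces a BigDecimal or an Integer where the property type is Double, or the other way round.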

Groovy implementation of derived properties, episode I

I started Groovy implementation of derived properties with functions. The functions looked like this:

def DerivedProperty_netPrice( context ) { context.premium * context.discountPercent / 100 }

I evaluated this generated script once and invoked it using javax.script.Invocable.invokeFunction. I ran unit tests and it was all good. I was also happy with the performance. However, when I added more tests, all of a sudden something unexpected happened: invokeFunction threw a NoSuchMethodException. Thank goodness I had the sources, which enabled me to debug what was going on. I had a hunch that it might have something to do with garbage collection, since it always happened after a certain amount of looping. By debugging GroovyScriptEngineImpl, I found my function in the globalClosures map. The entry was still there, but the map actually contained a SoftReference, which had been cleared by the JVM without any warning, and thus the lookup returned null. However, during my debugging session I noticed that classes do not behave the same way. I decided to embed the actual logic inside a class.
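
The failure mode is easy to reproduce in isolation. The sketch below is not the code of GroovyScriptEngineImpl; SoftFunctionRegistry is a made-up class that only mimics a map of soft references, and simulateMemoryPressure clears the references by hand because the real garbage collector cannot be triggered deterministically.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustration of the failure mode only; names and structure are assumptions.
class SoftFunctionRegistry {
    private final Map<String,SoftReference<Runnable>> functions = new ConcurrentHashMap<>( );

    void register( String name,Runnable function ) {
        functions.put( name,new SoftReference<>( function ) );
    }

    // Returns null both for unknown names and for entries whose referent has been cleared.
    Runnable lookup( String name ) {
        SoftReference<Runnable> reference = functions.get( name );
        return reference != null ? reference.get( ) : null;
    }

    boolean hasEntry( String name ) {
        return functions.containsKey( name );
    }

    // Stand-in for the garbage collector, which may clear soft references when memory runs low.
    void simulateMemoryPressure( ) {
        for( SoftReference<Runnable> reference : functions.values( ) ) {
            reference.clear( );
        }
    }

    public static void main( String[] args ) {
        SoftFunctionRegistry registry = new SoftFunctionRegistry( );
        Runnable function = ( ) -> System.out.println( "calculating" );
        registry.register( "DerivedProperty_netPrice",function );
        System.out.println( "found before: " + ( registry.lookup( "DerivedProperty_netPrice" ) != null ) );
        registry.simulateMemoryPressure( );
        System.out.println( "entry still in map: " + registry.hasEntry( "DerivedProperty_netPrice" ) );
        System.out.println( "found after: " + ( registry.lookup( "DerivedProperty_netPrice" ) != null ) );
    }
}
```

The map entry survives, yet the lookup comes back empty, which matches what the debugger showed: the function was "hanging there" but could no longer be invoked.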

Groovy implementation of derived properties, episode II

After having problems with functions, I changed my implementation to generate classes instead of functions. Here's a sample:

@groovy.transform.Immutable
class DependentPropertyFormula_netPrice_1_0 {
    java.lang.Double calculate( context ) {
        context.premium * context.discountPercent / 100
    }
}
new DependentPropertyFormula_netPrice_1_0( )

Again, in the beginning I evaluate the code snippet above and keep the return value. The return value of the ScriptEngine.eval call is the instance created on the last line of the generated script. I have to store the instance in order to call the method and to prevent garbage collection of my precious object. The invocation of the calculate method changed from javax.script.Invocable.invokeFunction to javax.script.Invocable.invokeMethod, but the change was trivial. The code now looks like this:

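import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptException;

public class GroovyDerivedPropertyFormula<T> implements DerivedPropertyFormula<T> {
    private final Class<T> type;
    private final Invocable invocable;
    private final Object instance;

    GroovyDerivedPropertyFormula( String kind,Class<T> type,String version,ScriptEngine engine,String code )
            throws ScriptException {
        this.invocable = ( Invocable ) engine;
        this.type = type;
        // Wrap the formula code in a generated class; the last line of the script returns an instance.
        String classCode = String.format(
            "@groovy.transform.Immutable \n" +
            "class DependentPropertyFormula_%s_%s { \n" +
            "  %s calculate( context ) { \n" +
            "    %s \n" +
            "  } \n" +
            "} \n" +
            "new DependentPropertyFormula_%s_%s( )",
            kind,version,type.getName( ),code,kind,version );
        // Keep a strong reference to the instance so that it cannot be garbage collected.
        this.instance = engine.eval( classCode );
    }

    @Override
    public T calculate( StructuredActual context ) {
        try {
            return type.cast( invocable.invokeMethod( instance,"calculate",context ) );
        }
        catch( ScriptException | NoSuchMethodException e ) {
            throw new IllegalStateException( "Evaluation of the derived property formula failed",e );
        }
    }
}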

This Java object is instantiated just once and thus it needs to be thread-safe. The same applies to the Groovy object, although the context given as a parameter doesn't have to be thread-safe.
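
One way to guarantee the "instantiated just once" part under concurrent access is a small cache keyed by kind and version. The FormulaCache class below is hypothetical and not part of the framework; it only sketches the idea with ConcurrentHashMap.computeIfAbsent, and the factory argument stands for whatever code compiles the Groovy class.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical cache: creates each formula object at most once per kind and version.
class FormulaCache<F> {
    private final ConcurrentMap<String,F> formulas = new ConcurrentHashMap<>( );

    F get( String kind,String version,Supplier<F> factory ) {
        // computeIfAbsent runs the factory atomically, so concurrent callers share one instance.
        return formulas.computeIfAbsent( kind + "_" + version,key -> factory.get( ) );
    }

    public static void main( String[] args ) {
        FormulaCache<Object> cache = new FormulaCache<>( );
        Object first = cache.get( "netPrice","1_0",Object::new );
        Object second = cache.get( "netPrice","1_0",Object::new );
        System.out.println( "same instance: " + ( first == second ) );
    }
}
```

The cache also holds a strong reference to every formula, so the objects stay reachable for as long as the cache itself does.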

Deployment of artifacts

When Java code is used to implement derived properties and other object types, the code must be deployed within the application. This may be a problem if multiple applications are using the same product model. But when the code is stored in the database together with the other data of the product model, the code is shipped automatically wherever it's needed. Cool!

A word of caution

Although it's easy to use scripting languages from Java, don't expect to be able to debug your script in its proper context. It may be possible, but it's not going to be easy. That's why I wouldn't use scripting for anything but very simple things.

Conclusion

As you can see, mixing Java and dynamic languages like Groovy is quite simple. I implemented all object types so that the actual business logic can be written in Groovy, and it took just several hours. If you, like most programmers, have any plans to implement a rule engine, why don't you give Groovy a try?

Monday, December 17, 2012