Thursday, June 21, 2012

Functional programming fundamentalism

In the near past I had a "little" discussion with a group of functional programming fundamentalists. The discussion, mainly concerning F#, lasted for months, and in fact it's still alive to some extent. Those posts actually strengthened my opinion about functional programming fundamentalists: most of them simply don't know "the base" well enough to build working solutions with OO languages (or any languages?) that utilize multiple cores. For example, volatile variables were considered mostly mysterious. They presented several code snippets based on wrong assumptions about what concurrency and parallelism mean and when parallelism can actually happen. Quite often the answer is: "it just works". But does it work most of the time because of good (or bad) luck?
While we were filling the thread with posts, I found some really "nice" examples by Googling F# activities. The search turned up several "diamonds", this being one of my favorites (from Misusing mutable state with F# Asynchronous Workflows):

"So it could be that the increment operation gets executed on two or more threads at exactly the same time. And if two threads read the variable at the same time, and then increment it before saving it back, we effectively lose one of our increment operations.


This is all fairly unlikely ..."


How about having that fella program your banking solution? Not! If you know how locks, volatile variables and atomic types work, you can see in a second what's wrong with his program.
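To make the bug concrete, here's a minimal Java sketch of exactly that lost-update race (my own example, not his code), together with the atomic fix:

import java.util.concurrent.atomic.AtomicLong;

public class LostUpdateDemo {
   static long plainCounter;                         // racy read-modify-write
   static final AtomicLong safeCounter = new AtomicLong( );

   public static void main( String... args ) throws InterruptedException {
      Thread[] threads = new Thread[ 4 ];
      for( int i = 0; i < threads.length; i++ ) {
         threads[ i ] = new Thread( new Runnable( ) {
            public void run( ) {
               for( int j = 0; j < 100000; j++ ) {
                  plainCounter++;                    // load, add, store: not atomic
                  safeCounter.incrementAndGet( );    // atomic increment
               }
            }
         } );
         threads[ i ].start( );
      }
      for( Thread t : threads ) t.join( );
      // plainCounter will very likely be less than 400000; safeCounter is exactly 400000
      System.out.println( plainCounter + " vs " + safeCounter.get( ) );
   }
}

Run it a few times: the plain counter comes up short on almost every run, which is not "fairly unlikely" at all.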


Fundamentalists swear in the name of immutability: all operations act on immutable data structures. In practice, applications need to have some side effects, like it or not. Simon Peyton-Jones, a major contributor to the functional programming language Haskell, said the following: "In the end, any program must manipulate state. A program that has no side effects whatsoever is a kind of black box. All you can tell is that the box gets hotter." Admitting that you need side effects is a good start. I think the key thing to understand is visibility: when a thread modifies a memory location, how, and when, is that change seen by other threads? (If you need more details and cannot get to sleep easily at night, search for "Memory Barriers: a Hardware View for Software Hackers".) Sooner or later you will find yourself synchronizing access to shared mutable state, and you'd better understand how it works. Let's consider a sample: a web crawler in F# by Tomas Petricek (http://fssnip.net/65). When it's time to check whether a given URL has already been visited, he falls back to an object oriented, concurrent data structure called ConcurrentQueue. But he knew the data structure had to be a synchronized one.
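Here's the classic visibility demo in Java (again my own sketch): remove the volatile keyword and the worker thread may spin forever, because nothing guarantees it ever sees the main thread's write.

public class VisibilityDemo {
   // volatile guarantees the write below becomes visible to the worker;
   // without it, the JIT may hoist the read out of the loop
   static volatile boolean stop;

   public static void main( String... args ) throws InterruptedException {
      Thread worker = new Thread( new Runnable( ) {
         public void run( ) {
            while( !stop )
               ;   // busy wait
            System.out.println( "worker saw the stop flag" );
         }
      } );
      worker.start( );
      Thread.sleep( 1000 );
      stop = true;   // volatile write: visible to the worker
      worker.join( );
   }
}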

Functional programming languages very often use a compact syntax which can make programs really hard to understand. Fundamentalists think it's cool, but to me those programs often look like phonetic symbols of puking.


All this ranting doesn't mean that I don't understand the value of functional programming principles. Yes I do, and I actually use them in many languages, but I don't believe in pure functional languages, or even hybrids, without understanding the base. Anyhow, be aware that functional features are being stolen from old languages like LISP and are arriving, now or in the near future, in modern OO languages (like Java and C#). I would say that at the end of the day, these are the big winners. For example, Scala will probably lose many followers when Java 8 is released with its (not so sexy?) lambda expressions and built-in map/reduce (via java.lang.Iterable), etc. For those of you who don't know what the lambdas in Java will look like, here's a short sample:


import java.util.*;

public class Test {
   public static void main( String... args ) {
      Collection<Integer> list = Arrays.asList( 1,2,3,4,5 );
      System.out.println( list.filter( x -> x > 2 ) );
   }
}


Using OO and functional techniques together is definitely OK, and I believe it's the approach that will eventually win. That way you can still work with your favorite mainstream language (one of the top five in the Tiobe listing) and have excellent libraries for concurrent programming. As a bonus, you don't have to struggle with version incompatibilities (like in Scala). Last but not least is the fact that the functional programming paradigm has no proper modelling support, such as UML.


So, do yourself a favor: stop playing with functional programming languages and start learning your mainstream language properly (including how to do concurrent programming with it).

Wednesday, June 20, 2012

What is this blog for?

At the beginning of 2012 I moved to the CIO office. After half a year of paperwork I decided to discharge my Microsoft Office bloat and created this blog. The posts are full of bits and bytes, so if you are looking for easy stories about social media and how everybody is having fun with it, this is not your place. On the other hand, if you are looking for practical tips about writing enterprise-class JavaScript, concurrent programming in Java, mixing Java and scripting languages, or ranting about functional programming languages or Spring, this is your place.

Adjusting time with BCE

This time we go deeper into BCE as we walk through another case: adjusting the time seen by applications. If you are not familiar with bytecode engineering (BCE), see my previous post for more information. If my previous post was too technical, please do yourself a favor and skip this one, because this time we really go deeper.

The problem


Sometimes programs want to "move" in time. For example, bills are created, and after a certain period of time it's time to check whether all bills are paid; if not, start collecting money for the unpaid ones. The period between two events can be days, weeks, months or even years. But you want to test them right away. Yeah, I know some of you are already itching to say something about unit tests, but we are not talking about small components "on the loose" here; we are talking about full-fledged Java EE applications running in a Java EE application server.


Needless to say, it's sometimes appropriate to use the current time in algorithms, and these may well be the problematic snippets of logic. If you change the system time, everything is changed at once; for example, certificates in your app server may expire (and the server stops working). It's a no-go. Another solution proposed by some of my colleagues is to use a custom type (maybe a java.util.Date derivative) which consults "something" to get the requested time offset. However, this latter approach sucks because:


  • How does it consult that something? How does it differentiate between the very same libraries embedded in different applications?
  • It requires that you own the source code and can modify it (read: you must modify it)
  • The code "flows" all the way to production, where it's not needed or wanted, and you need special protection to keep it disabled in the production environment
  • It doesn't work with 3rd party components (it's unlikely that they use your custom type)


The perfect solution would be to use just plain java.util.Date, but still be able to adjust time selectively. What do I mean by selectively? How about changing the behavior of one application in a cluster and keeping the rest intact? Enter BCE!

Using BCE to "move in time"


The idea of using BCE to implement the ability to move in time was born after a conversation with a colleague. He suggested using BCE to intercept native calls, but later on (after trying it) I realized that's not the way to go. Instead, I decided to instrument the calls "at the client side", meaning I modify all the call sites themselves instead of modifying java.lang.System.currentTimeMillis (which is a native method). This approach seemed to work and is still in use.

In this post we go through the trickiest parts of the time machine agent. We start with the easiest hook: intercepting calls to java.lang.System.currentTimeMillis. The ASM code intercepting those calls looks like this:


if( opcode == INVOKESTATIC && owner.equals( "java/lang/System") &&
    name.equals( "currentTimeMillis" ) ) {
   // Generate a currentTimeMillis call
   super.visitMethodInsn( opcode,owner,name,desc );
   // Push timeOffset on the stack
   visitLdcInsn( new Long( timeOffset ) );
   // Add the two topmost operands
   visitInsn( LADD );
   // Only the adjusted time remains on the stack
}

What's going on here? We intercept all calls which are static, are invoked against java.lang.System, and have the method name currentTimeMillis. The very first thing we do is let the actual method invocation happen. Now the top of the operand stack (TOS) is the current time. Then we push the time offset on the stack and request an addition of two longs (LADD). As a result, we have adjusted the current time by the time offset. When this instrumentation code is run on any class calling java.lang.System.currentTimeMillis, it adds the given offset to the current time. So the following line of code:

System.out.println( new Date( ) );

actually executes this bytecode:

   0:   getstatic       #32; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   invokestatic    #47; //Method java/lang/System.currentTimeMillis:()J
   6:   ldc2_w  #20; //long 18000000l
   9:   ladd
   10:  invokevirtual   #49; //Method java/io/PrintStream.println:(J)V

Note how the time offset, five hours (5 × 60 × 60 × 1000 ms = 18000000 ms), maps to the constant 18000000, which is then added to the current time (moving the "current" time five hours into the future).
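For context, here's a minimal sketch of the visitor the snippet above lives in (ASM 3.x; the class name is my own illustration, not the agent's actual source). ASM calls visitMethodInsn for every call site in the method being loaded:

import org.objectweb.asm.MethodAdapter;
import org.objectweb.asm.MethodVisitor;
import static org.objectweb.asm.Opcodes.*;

public class TimeShiftAdapter extends MethodAdapter {

   private final long timeOffset;   // milliseconds to shift "current" time by

   public TimeShiftAdapter( MethodVisitor mv, long timeOffset ) {
      super( mv );
      this.timeOffset = timeOffset;
   }

   @Override
   public void visitMethodInsn( int opcode,String owner,String name,String desc ) {
      if( opcode == INVOKESTATIC && owner.equals( "java/lang/System" ) &&
          name.equals( "currentTimeMillis" ) ) {
         // Generate the real call, then push the offset and add
         super.visitMethodInsn( opcode,owner,name,desc );
         visitLdcInsn( new Long( timeOffset ) );
         visitInsn( LADD );
         return;
      }
      // Everything else passes through untouched
      super.visitMethodInsn( opcode,owner,name,desc );
   }
}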


However, programmers often use a more abstract way to ask for the current time, for example by constructing an instance of java.util.Date or by calling java.util.Calendar.getInstance. To adjust the time returned by the default constructor of java.util.Date, we do it like this:

if( opcode == INVOKESPECIAL && owner.equals( "java/util/Date" ) &&
    name.equals( "<init>" ) && desc.equals( "()V" ) ) {
   // Bytecode NEW (and the compiler's own DUP) is already executed, so the
   // stack holds references to the uninitialized java.util.Date.
   // Note that the constructor call is not made yet.
   // Dup the reference so one copy survives the constructor call
   dup( );
   // Now let a constructor call happen (consumes one slot)
   super.visitMethodInsn( opcode,owner,name,desc );
   // Dup the reference
   dup( );
   // Invoke getTime method
   visitMethodInsn( INVOKEVIRTUAL,"java/util/Date","getTime", "()J");
   // Push offset to the stack
   visitLdcInsn( new Long( timeOffset ) );
   // Add the operands
   visitInsn( LADD );
   // And finally call setTime method
   visitMethodInsn( INVOKEVIRTUAL,"java/util/Date","setTime", "(J)V");
   // One reference to java.util.Date remains on the stack
}

As you can see, this hook is much more complicated. The comments along the lines tell what's going on, so I won't repeat them here. And now this line of Java code:

System.out.println( new Date( ) );

produces bytecode:

   0:   getstatic       #32; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   new     #14; //class java/util/Date
   6:   dup
   7:   dup
   8:   invokespecial   #15; //Method java/util/Date."<init>":()V
   11:  dup
   12:  invokevirtual   #19; //Method java/util/Date.getTime:()J
   15:  ldc2_w  #20; //long 18000000l
   18:  ladd
   19:  invokevirtual   #25; //Method java/util/Date.setTime:(J)V
   22:  invokevirtual   #38; //Method java/io/PrintStream.println:(Ljava/lang/Object;)V

A full implementation of the time machine also requires interception of java.util.Calendar.getInstance. However, since it's basically similar to the java.util.Date constructor case, I won't go through it in detail here.
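For the curious, the hook might look roughly like this (my sketch, not taken verbatim from the agent source; it shifts the returned Calendar via getTimeInMillis/setTimeInMillis):

if( opcode == INVOKESTATIC && owner.equals( "java/util/Calendar" ) &&
    name.equals( "getInstance" ) && desc.equals( "()Ljava/util/Calendar;" ) ) {
   // Let the real getInstance call happen; TOS is now the Calendar
   super.visitMethodInsn( opcode,owner,name,desc );
   // Two extra copies of the reference: one for getTimeInMillis,
   // one for setTimeInMillis
   dup( );
   dup( );
   // Read the current millis from the Calendar
   visitMethodInsn( INVOKEVIRTUAL,"java/util/Calendar","getTimeInMillis","()J" );
   // Push the offset and add
   visitLdcInsn( new Long( timeOffset ) );
   visitInsn( LADD );
   // Write the adjusted millis back
   visitMethodInsn( INVOKEVIRTUAL,"java/util/Calendar","setTimeInMillis","(J)V" );
   // One adjusted Calendar reference remains on the stack
}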


Controlling which classes to instrument


In the beginning I was preaching about selectivity. Now it's time to prove my words.


My Java agent, like any other Java agent, can take a single argument. My agent uses this string to define one or more regular expressions, which determine whether a class is instrumented (match) or not (no match). So, right now the javaagent-switch is defined like this:


-javaagent:c:\temp\tm.jar=paci/.*


This results in instrumentation of classes which are in a package hierarchy starting with "paci". Here's our test program: 


package paci;

import java.util.Date;

import org.joda.time.DateTime;

public class TimemachineTest {
   public static void main(String[] args) {
      System.out.println( new Date( ) );
      System.out.println( new DateTime( ) );
   }
}

Because of this "selection" or restriction, this is what happens: 


Wed Jun 20 21:45:12 EEST 2012
2012-06-20T16:45:12.279+03:00


The plain java.util.Date call gets shifted five hours ahead, but Joda Time (org.joda.time.DateTime) still shows the real time. Let's modify our instrumentation targets to also include the Joda Time packages. The javaagent-switch now looks like this:


-javaagent:c:\temp\tm.jar=paci/.*:org/joda/.*

Voilà, all of a sudden we start getting correct results with Joda Time, too. This is what gets printed out:


Wed Jun 20 21:55:34 EEST 2012
2012-06-20T21:55:34.417+03:00


What if you have the same library in multiple applications, but only want to instrument some of them? The answer is to use the java.security.ProtectionDomain parameter of the transform method. From the protection domain you can get the code source and its location. All you have to do is parse the URL and decide whether to instrument or not. Here's a sample code source URL printed out by my agent:


file:/C:/programs/joda-time-2.1-dist/joda-time-2.1/joda-time-2.1.jar
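As a sketch (hypothetical names, not the actual agent source), the check inside the transformer could look like this:

import java.lang.instrument.ClassFileTransformer;
import java.security.CodeSource;
import java.security.ProtectionDomain;

public class SelectiveTransformer implements ClassFileTransformer {

   public byte[] transform( ClassLoader loader,String className,
         Class<?> classBeingRedefined,ProtectionDomain pd,
         byte[] classfileBuffer ) {
      if( pd != null ) {
         CodeSource cs = pd.getCodeSource( );
         if( cs != null && cs.getLocation( ) != null ) {
            String url = cs.getLocation( ).toString( );
            // Instrument Joda Time only when it's loaded from this
            // particular application's copy of the jar
            if( url.endsWith( "/joda-time-2.1.jar" ) )
               return instrument( classfileBuffer );
         }
      }
      return null;   // null means "leave the class as-is"
   }

   private byte[] instrument( byte[] classfile ) {
      // The ASM rewriting shown earlier would run here
      return classfile;   // placeholder
   }
}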

But it breaks my application server!



Oh yeah, that's what happened to me. Since WebSphere Application Server, or actually some of its components, uses ASM for its own purposes, it's possible to blow up the server with a version mismatch. Avoiding this in turn means classloader hacking. It's doable and not hard to implement, but that's another story.



The full source of the agent can be found here.


Okie dokie, we are set with our time machine exercise.







Monday, June 18, 2012

Real world bytecode engineering (BCE)


I'm a big fan of bytecode engineering. You can do wonderful things with it if you just know how. Here's a case, the first episode in a series to come. There may be other ways to solve this problem, but since I didn't find any other way to catch floating point exceptions in Java, I decided to implement it with BCE.

The problem


A group of people is developing a mathematical program which is quite complex and large. For performance reasons, the calculations are done with normal IEEE 754 double precision numbers (i.e. doubles). In addition, the operations of java.lang.Math are not available without using primitive types. With java.math.BigDecimal these floating point exceptions (FPEs) would actually blow up the application immediately, but that's not the case with primitive types. Consider the following fragment:


public class BigDecimalArithmetic {
   public static void main( String... args ) {
      double a = 5.0, b = 0.0, c = 10.0;
      System.out.println( c / ( a / b + 5 ) );
   }
}



Magically this program prints out 0.0 without any exceptions. Why? Since 5.0 / 0.0 results in infinity, infinity plus 5 is still infinity, and finally 10.0 divided by infinity is 0.0. I would say this is quite often not what you want. Now consider 100 000 lines of code full of this kind of logic. Houston, we have a problem.


How to find problematic places?


To catch the problem we can test if the calculation resulted in infinity. That's what we do:



public class BigDecimalArithmetic {
   public static void main( String... args ) {
      double a = 5.0, b = 0.0, c = 10.0;
      double r1 = a / b;
      if( Double.isInfinite( r1 ) )
         throw new ArithmeticException( "division by zero" );
      System.out.println( c / ( r1 + 5 ) );
   }
}



Now we know if the calculation results in infinity, and if so, we throw an ArithmeticException. However, we still don't catch the situation where the calculation results in a NaN. A NaN can be produced, for example, by taking the square root of -1. To catch the NaN situation, we modify our program once again. This is what we have now:



public class BigDecimalArithmetic {
   public static void main( String... args ) {
      double a = 5.0, b = 0.0, c = 10.0;
      double r1 = a / b;
      if( Double.isInfinite( r1 ) )
         throw new ArithmeticException( "division by zero" );
      if( Double.isNaN( r1 ) )
         throw new ArithmeticException( "result is NaN" );
      System.out.println( c / ( r1 + 5 ) );
   }
}


We have completed one division operation and can be sure we are still doing OK with the results. Now consider executing some real world mathematics. The number of operations is huge, and knowing exactly where you went wrong means excessive if-blocks; the code would be bloated with exception handling logic. And after all, we would have to modify our code base to catch those special situations.


Finding problems without touching the code?



This really sounds good, since the number of lines of code is simply too large to go through manually. How do we modify behavior without touching the code? The simple answer is bytecode engineering. We modify the bytecode of classes at runtime, just after the classes are loaded but before the bytecode is verified. Once a class is loaded and its bytecode changed on the fly, it behaves just as if those checks (isNaN and isInfinite) were ubiquitously present. Bytecode engineering (BCE) is not business as usual for many Java developers, and without proper tools it's really hard. My favorite library for BCE is ASM from ObjectWeb. I'm also using Andrey Loskutov's Bytecode Outline, which runs as an Eclipse plugin; I think it's a must-have tool for ASM developers. Remember, I never said BCE is easy, but with a little assembler or C background you can probably manage. See ObjectWeb for more information. And now we finally dive into BCE. Fasten your seatbelts!


Diving into wonderful world of bytecode engineering



Now we know what we should do after each FP operation, and we have selected our BCE library. So let's start producing some bytecode with ASM. We start by introducing a private method which generates the bytecode for the checks. Keep in mind that this method is called only at class loading time, not after that. The method looks like this:


private void generateChecks( String operation ) {
   dup2( );
   // Is it infinite?
   mv.visitMethodInsn( INVOKESTATIC,"java/lang/Double","isInfinite","(D)Z" );
   Label label1 = new Label( );
   mv.visitJumpInsn( IFEQ,label1 );
   mv.visitLdcInsn( operation );
   mv.visitMethodInsn( INVOKESTATIC,"paci/samples/bce/FPEListener",
      "resultIsInfinite","(Ljava/lang/String;)V" );
   mv.visitLabel( label1 );

   dup2( );
   // Is it NaN?
   mv.visitMethodInsn( INVOKESTATIC,"java/lang/Double","isNan","(D)Z" );
   Label label2 = new Label( );
   mv.visitJumpInsn( IFEQ,label2 );
   mv.visitLdcInsn( operation );
   mv.visitMethodInsn( INVOKESTATIC,"paci/samples/bce/FPEListener",
      "resultIsInfinite","(Ljava/lang/String;)V" );
   mv.visitLabel( label2 );
}


During class loading, ASM calls the visitInsn method for each and every instruction in the class file. In my overridden version of visitInsn I have a simple switch-case statement which generates some extra bytecode after selected operations. My visitInsn method looks like this:


@Override
public void visitInsn( int opcode ) {
   mv.visitInsn( opcode );
   switch( opcode ) {
      case DDIV :
         generateChecks( "DIVIDE" );
         break;
      case DMUL :
         generateChecks( "MULTIPLY" );
         break;
      case DADD :
         generateChecks( "ADD" );
         break;
      case DSUB :
         generateChecks( "SUBTRACT" );
         break;
   }
}


First I let ASM generate the operation itself (e.g. ddiv, dmul, ...). If the operation deals with double precision arithmetic, I call my already familiar generateChecks method; remember, this only happens at class loading time. It generates the checks right after each operation. Now, if a class dividing two doubles is instrumented with our bytecode modifier, at runtime the bytecode looks like this:


21:  ddiv
22:  dup2
23:  invokestatic    #22; //Method java/lang/Double.isInfinite:(D)Z
26:  ifeq    34
29:  ldc     #24; //String DIVIDE
31:  invokestatic    #30; //Method paci/samples/bce/FPEListener.resultIsInfinite:(Ljava/lang/String;)V
34:  dup2
35:  invokestatic    #33; //Method java/lang/Double.isNaN:(D)Z
38:  ifeq    46
41:  ldc     #24; //String DIVIDE
43:  invokestatic    #36; //Method paci/samples/bce/FPEListener.resultIsNaN:(Ljava/lang/String;)V



Line 21 performs the actual division operation (ddiv), which pushes the result on top of the stack; this is how JVM operations generally work. Then we duplicate the top two stack words, because we cannot let our java.lang.Double.isInfinite call at line 23 consume the result of the computation. If the result of isInfinite is zero (false), we jump to line 34 and duplicate the top two stack words again, for the same reason as before. The next lines perform a similar check for NaN. If the computation didn't cause any floating point exception, the top of the stack still contains the result of the double division. Note that the bytecode produced by my bytecode modifier is not strictly representable in Java! This is my best guess for the same behavior:


private static void test( ) {
   double a = 5, b = 6;
   double result = a / b;
   if( Double.isInfinite( result ) )
      FPEListener.resultIsInfinite( "DIVIDE" );
   if( Double.isNaN( result ) )
      FPEListener.resultIsNaN( "DIVIDE" );
}




But if we look at the bytecode generated by the compiler, we can see it's not using dup2 as I do with my ASM code. Instead, it has replaced the dup2's with local variable store and load operations (dstore and dload). Here's the bytecode for method test:


private static void test();
  Code:
   0:   ldc2_w  #15; //double 5.0d
   3:   dstore_0
   4:   ldc2_w  #17; //double 6.0d
   7:   dstore_2
   8:   dload_0
   9:   dload_2
   10:  ddiv
   11:  dstore  4
   13:  dload   4
   15:  invokestatic    #19; //Method java/lang/Double.isInfinite:(D)Z
   18:  ifeq    26
   21:  ldc     #25; //String DIVIDE
   23:  invokestatic    #27; //Method paci/samples/bce/FPEListener.resultIsInfinite:(Ljava/lang/String;)V
   26:  dload   4
   28:  invokestatic    #33; //Method java/lang/Double.isNaN:(D)Z
   31:  ifeq    39
   34:  ldc     #25; //String DIVIDE
   36:  invokestatic    #36; //Method paci/samples/bce/FPEListener.resultIsNaN:(Ljava/lang/String;)V
   39:  return

Since the mathematicians also use some methods of java.lang.Math, I decided to add support for every method in java.lang.Math which takes a single double as a parameter and returns a double. This was easy to do with ASM, since it provides a hook to handle all method calls. My overridden method looks like this:



@Override
public void visitMethodInsn( int opcode,String owner,String name,String desc ) {
   super.visitMethodInsn( opcode,owner,name,desc );
   // Intercept all java.lang.Math calls which are static, take one
   // double as an argument and return double
   if( opcode == INVOKESTATIC && owner.equals( "java/lang/Math" ) && 
       desc.equals( "(D)D" ) ) {
      generateChecks(  name.toUpperCase( ) );
   }
}
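For completeness, here's a rough sketch of how the visitors might be wired together in the agent (hypothetical names; the full source linked below is authoritative). A ClassAdapter hands every method body to the MethodAdapter containing the visitInsn and visitMethodInsn overrides shown above, here called FPECheckAdapter for illustration:

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

import org.objectweb.asm.ClassAdapter;
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;

public class FPEAgent {

   public static void premain( String agentArgs,Instrumentation inst ) {
      inst.addTransformer( new ClassFileTransformer( ) {
         public byte[] transform( ClassLoader loader,String className,
               Class<?> classBeingRedefined,ProtectionDomain pd,
               byte[] classfileBuffer ) {
            ClassReader reader = new ClassReader( classfileBuffer );
            // COMPUTE_MAXS because the extra dup2's deepen the operand stack
            ClassWriter writer = new ClassWriter( ClassWriter.COMPUTE_MAXS );
            reader.accept( new ClassAdapter( writer ) {
               @Override
               public MethodVisitor visitMethod( int access,String name,
                     String desc,String signature,String[] exceptions ) {
                  // Wrap every method with the checks-generating adapter
                  return new FPECheckAdapter(
                     super.visitMethod( access,name,desc,signature,exceptions ) );
               }
            }, 0 );
            return writer.toByteArray( );
         }
      } );
   }
}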

Verification


After this somewhat mysterious coding session we can now verify the results. If we run, for example, these lines:



double a = 50.0, b = 10.0, c = 0.0;
double result = a / b / c * 4;



it will call the FPEListener, which in turn throws a java.lang.ArithmeticException. The stack trace looks like this:



Exception in thread "main" java.lang.ArithmeticException: Floating point operation DIVIDE resulted infinite
           at paci.samples.bce.FPEListener.resultIsInfinite(FPEListener.java:9)
           at paci.Liukuuliukuu.main(Liukuuliukuu.java:30)


The reason I did it this way (instead of just throwing the ArithmeticException straight from the generated bytecode) is that now it's easier to change the error handling behavior, e.g. whether it throws an exception or just prints out the stack trace.
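For reference, a minimal FPEListener consistent with the stack trace above could look like this (a sketch; the full source linked below is authoritative):

package paci.samples.bce;

public class FPEListener {

   public static void resultIsInfinite( String operation ) {
      throw new ArithmeticException(
         "Floating point operation " + operation + " resulted infinite" );
   }

   public static void resultIsNaN( String operation ) {
      throw new ArithmeticException(
         "Floating point operation " + operation + " resulted NaN" );
   }
}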

That's it. The full code can be found here. I used ASM 3.1 with it, and it may have some problems with ASM 4.0.

The next story about BCE will be dealing with selectively adjusting time seen by the applications.