The gentle art of making ... programs: Adjusting time with BCE

This time we go deeper into BCE while working through another case: adjusting the time seen by applications. If you are not familiar with bytecode engineering (BCE), see my previous post for more information. If my previous post was too technical, please do yourself a favor and skip this post, because this time we really go deeper.

The problem

Sometimes programs want to "move" in time. For example, bills are created, and after a certain period it's time to check whether all bills are paid and, if not, start collecting money for the unpaid ones. The period between the two events can be days, weeks, months or even years. But you want to test them right away. Yeah, I know some of you are already itching to say something about unit tests, but we are not talking about small components "on the loose" here; we are talking about full-fledged Java EE applications running in a Java EE application server.

Needless to say, it's sometimes appropriate to use the current time in algorithms, and these may well be the problematic snippets of logic. If you change the system time, everything changes at once, and, for example, certificates in your app server may expire (and the server stops working). It's a no-go. Another solution proposed by some of my colleagues is to use a custom type (maybe a java.util.Date derivative) which consults "something" to get the requested offset. However, this latter approach sucks because:

How does it consult something? How does it differentiate the very same libraries embedded in applications?
It requires that you own the source code and can modify it (read: you must modify it)
The code "flows" all the way to production where it's not needed/wanted and you need special protection to keep it disabled in production environment
It doesn't work with 3rd party components (it's unlikely that they are using your custom type)

The perfect solution would be to use just plain java.util.Date, but still be able to adjust time selectively. What do I mean by selectively? How about changing the behavior for one application in a cluster and keeping the rest intact? Enter BCE!

Using BCE to "move in time"

The idea of using BCE to move in time was born after a conversation with a colleague. He suggested using BCE to intercept native calls, but later on (after trying it) I realized it's not the way to go. Instead, I decided to instrument calls "at the client side", meaning I modify the call sites themselves instead of modifying java.lang.System.currentTimeMillis (which is a native method). This approach seemed to work and is still in use.

In this posting we go through the trickiest parts of the time machine agent. We start with the easiest hook: intercepting calls to java.lang.System.currentTimeMillis. The ASM code intercepting those calls looks like this:
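For orientation, here is a minimal sketch of the scaffolding such an agent needs around its hooks: a premain method that registers a ClassFileTransformer. The class name TimeMachineAgentSketch and the hard-coded five-hour offset are assumptions made for illustration, not taken from the actual agent, and the ASM rewriting step is only indicated by a comment.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical skeleton; names are illustrative, not the real agent's.
class TimeMachineAgentSketch implements ClassFileTransformer {

    // Offset in milliseconds that the bytecode hooks would add to the time.
    private final long timeOffset;

    TimeMachineAgentSketch(long timeOffset) {
        this.timeOffset = timeOffset;
    }

    long getTimeOffset() {
        return timeOffset;
    }

    // Called by the JVM before main() when started with -javaagent:<jar>=<args>.
    // The jar manifest must name this class in its Premain-Class attribute.
    public static void premain(String agentArgs, Instrumentation inst) {
        // Assumption: a fixed five-hour offset, just to have a value.
        inst.addTransformer(new TimeMachineAgentSketch(5L * 60 * 60 * 1000));
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // A real agent would rewrite classfileBuffer with ASM here and
        // return the new bytes. Returning null keeps the class unchanged.
        return null;
    }
}
```

Returning null from transform is the documented way to say "no change", so a sketch like this is safe to load even before any rewriting logic exists.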

if( opcode == INVOKESTATIC && owner.equals( "java/lang/System" ) &&
    name.equals( "currentTimeMillis" ) ) {
    // Generate the original currentTimeMillis call
    super.visitMethodInsn( opcode,owner,name,desc );
    // Push timeOffset on the stack
    visitLdcInsn( Long.valueOf( timeOffset ) );
    // Add the two topmost operands
    visitInsn( LADD );
    // Only the adjusted current time remains on the stack
}

What's going on here? We intercept all calls that are static, target java.lang.System and have the method name currentTimeMillis. The very first thing we do is emit the actual method call, so the top of the operand stack (TOS) is now the current time. Then we push the time offset on the stack and request addition of two longs (LADD). As a result, the current time is adjusted by the time offset. When this instrumentation code is run on any class calling java.lang.System.currentTimeMillis, it adds the given offset to the current time. So the following line of code:

System.out.println( System.currentTimeMillis( ) );

actually executes this bytecode:

0: getstatic #32; //Field java/lang/System.out:Ljava/io/PrintStream;
3: invokestatic #47; //Method java/lang/System.currentTimeMillis:()J
6: ldc2_w #20; //long 18000000l
9: ladd
10: invokevirtual #49; //Method java/io/PrintStream.println:(J)V

Note how the time offset, five hours, maps to the constant 18000000l (5 × 60 × 60 × 1000 milliseconds), which is then added to the current time (moving "current" time five hours into the future).

However, programmers often use a more abstract way to ask for the current time, for example by constructing an instance of java.util.Date or calling java.util.Calendar.getInstance. To adjust the time returned by the default constructor of java.util.Date, we do it like this:
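At source level, the rewritten call site behaves roughly like the sketch below. The class and method names are made up for illustration; the real agent works on bytecode and never generates Java source like this.

```java
import java.util.concurrent.TimeUnit;

class OffsetClockSketch {
    // Five hours in milliseconds: 5 * 60 * 60 * 1000 = 18,000,000
    static final long TIME_OFFSET = TimeUnit.HOURS.toMillis(5);

    // Source-level equivalent of:
    //   invokestatic System.currentTimeMillis; ldc2_w TIME_OFFSET; ladd
    static long adjustedNow() {
        return System.currentTimeMillis() + TIME_OFFSET;
    }

    public static void main(String[] args) {
        System.out.println(TIME_OFFSET); // prints 18000000
        System.out.println(adjustedNow());
    }
}
```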

if( opcode == INVOKESPECIAL && owner.equals( "java/util/Date" ) &&
    name.equals( "<init>" ) && desc.equals( "()V" ) ) {
    // Bytecode NEW is already executed
    // The stack now contains a reference to java.util.Date
    // Note that the constructor call is not made yet.
    // Dup the reference (one more slot used)
    dup( );
    // Now let the constructor call happen (consumes one slot)
    super.visitMethodInsn( opcode,owner,name,desc );
    // Dup the reference
    dup( );
    // Invoke getTime method
    visitMethodInsn( INVOKEVIRTUAL,"java/util/Date","getTime", "()J");
    // Push offset to the stack
    visitLdcInsn( Long.valueOf( timeOffset ) );
    // Add the operands
    visitInsn( LADD );
    // And finally call setTime method
    visitMethodInsn( INVOKEVIRTUAL,"java/util/Date","setTime", "(J)V");
    // One reference to java.util.Date remains on the stack
}

As you can see, the hook is now much more complicated. The comments in the code tell what's going on, so I won't repeat them here. And now this line of Java code:

System.out.println( new Date( ) );

produces bytecode:

0: getstatic #32; //Field java/lang/System.out:Ljava/io/PrintStream;
3: new #14; //class java/util/Date
6: dup
7: dup
8: invokespecial #15; //Method java/util/Date."<init>":()V
11: dup
12: invokevirtual #19; //Method java/util/Date.getTime:()J
15: ldc2_w #20; //long 18000000l
18: ladd
19: invokevirtual #25; //Method java/util/Date.setTime:(J)V
22: invokevirtual #38; //Method java/io/PrintStream.println:(Ljava/lang/Object;)V
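Read back as Java source, the instrumented constructor call corresponds roughly to the sketch below. ShiftedDateSketch and shiftedNow are hypothetical names used only to show the effect of the bytecode listed above.

```java
import java.util.Date;

class ShiftedDateSketch {
    // Source-level view of the instrumented "new Date()" call site.
    static Date shiftedNow(long timeOffset) {
        // new, dup, dup, invokespecial <init>
        Date date = new Date();
        // dup, getTime, ldc2_w, ladd, setTime
        date.setTime(date.getTime() + timeOffset);
        // the single reference left on the stack
        return date;
    }

    public static void main(String[] args) {
        System.out.println(shiftedNow(18000000L));
    }
}
```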

Full implementation of the time machine also requires intercepting java.util.Calendar.getInstance. However, since it's basically similar to the java.util.Date constructor case, I won't go through it here.
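For the curious, the effect such a Calendar hook is after can be sketched at source level as follows. This is an assumption about what the hook achieves, not the agent's actual code; the names are invented. Since getInstance is a static factory, there is no NEW/constructor pair to juggle: the returned reference only needs to be duplicated, read and written back.

```java
import java.util.Calendar;

class ShiftedCalendarSketch {
    // Assumed source-level effect of instrumenting Calendar.getInstance().
    static Calendar shiftedInstance(long timeOffset) {
        Calendar calendar = Calendar.getInstance();
        calendar.setTimeInMillis(calendar.getTimeInMillis() + timeOffset);
        return calendar;
    }

    public static void main(String[] args) {
        System.out.println(shiftedInstance(18000000L).getTime());
    }
}
```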

Controlling which classes to instrument

In the beginning I was preaching about selectivity. Now it's time to back up my words.

My Java agent, like any other Java agent, takes one parameter. My agent uses this single string to define one or more regular expressions, which define whether a class is instrumented (match) or not (no match). So, right now the javaagent-switch is defined like this:

-javaagent:c:\temp\tm.jar=paci/.*

This results in instrumentation of classes which are in a package hierarchy starting with "paci". Here's our test program:

package paci;

import java.util.Date;
import org.joda.time.DateTime;

public class TimemachineTest {
    public static void main(String[] args) {
        System.out.println( new Date( ) );
        System.out.println( new DateTime( ) );
    }
}

Because of this "selection" or restriction, this is what happens:

Wed Jun 20 21:45:12 EEST 2012
2012-06-20T16:45:12.279+03:00

The plain java.util.Date call is shifted five hours ahead, but Joda Time (org.joda.time.DateTime) still reports the real time, because its classes were not instrumented. Let's modify our instrumentation targets to include the Joda Time packages as well. The javaagent switch now looks like this:

-javaagent:c:\temp\tm.jar=paci/.*:org/joda/.*

Voilà, all of a sudden we start getting correct results with Joda Time, too. This is what gets printed out:

Wed Jun 20 21:55:34 EEST 2012
2012-06-20T21:55:34.417+03:00
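The class filter behind the two runs above could be sketched as follows. ClassFilterSketch is a hypothetical name, and splitting the argument on ":" is inferred from the switch shown above rather than taken from the agent source. One consequence is that a regular expression containing ":" cannot be passed this way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class ClassFilterSketch {
    private final List<Pattern> patterns = new ArrayList<>();

    // agentArgs is the text after '=' in -javaagent:tm.jar=paci/.*:org/joda/.*
    ClassFilterSketch(String agentArgs) {
        if (agentArgs != null && !agentArgs.isEmpty()) {
            for (String regex : agentArgs.split(":")) {
                patterns.add(Pattern.compile(regex));
            }
        }
    }

    // className is in JVM internal form, e.g. "org/joda/time/DateTime"
    boolean shouldInstrument(String className) {
        if (className == null) {
            return false;
        }
        for (Pattern pattern : patterns) {
            if (pattern.matcher(className).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

The names are matched with slashes because the transform method receives class names in internal form, which is also why the patterns on the command line use "/" instead of ".".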

What if you have the same library in multiple applications, but only want to instrument some of them? The answer is to use the java.security.ProtectionDomain parameter of the transform method. From the protection domain you can get the code source and its location. All you have to do is parse the URL and decide whether to instrument or not. Here's a sample code source URL printed out by my agent:

file:/C:/programs/joda-time-2.1-dist/joda-time-2.1/joda-time-2.1.jar
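A minimal sketch of such a check, assuming a simple substring match on the location is enough. CodeSourceFilterSketch and loadedFrom are hypothetical names; the real agent may parse the URL differently.

```java
import java.net.URL;
import java.security.CodeSource;
import java.security.ProtectionDomain;

class CodeSourceFilterSketch {
    // True when the class comes from a location containing pathFragment.
    static boolean loadedFrom(ProtectionDomain domain, String pathFragment) {
        if (domain == null) {
            return false; // the transform method may receive no domain at all
        }
        CodeSource source = domain.getCodeSource();
        if (source == null) {
            return false;
        }
        URL location = source.getLocation();
        return location != null && location.toString().contains(pathFragment);
    }
}
```

Both the code source and its location can be null, so the check treats "unknown origin" as "do not instrument".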

But it breaks my application server!

Oh yeah, that's what happened to me. Since WebSphere Application Server, or actually some of its components, uses ASM for its own purposes, it's possible to blow up the server with a version mismatch. This in turn means classloader hacking. It's doable and not hard to implement, but it's another story.

Full source of the agent can be found here

Okie dokie, we are set with our time machine exercise.

Wednesday, June 20, 2012
