Monday, December 17, 2012

Mixing Java and dynamic scripting languages


I'm done with the assembly language (so far), so it's time to publish some high(er)-level stuff.

Problem


Several years ago I was a team member of a project which implemented a framework targeting insurance systems. The framework is based on IBM's specification framework, which in turn is the basis of IAA. Among other things, the specification framework can be used to form a network of agreements and various elements attached to agreements (requests, properties, roles, calculations and rules), possibly without any coding. Many objects in the framework extend StructuredActual (e.g. Agreements and Roles), so this class is commonly found in the interfaces. All "external" components are modeled as role players (e.g. a party acting as a policy holder) and are not part of the framework per se.

Agreements are "instances" of another concept called product. Product defines properties and other elements associated with an instance of agreement. The product itself is parametrized with the data from configuration database or XML file. This means we can build a network of agreements without coding a single line of Java. Data can be automatically persisted without any intervention of programmers. On the other hand, things like derived properties (e.g. things relative to current time) and rules must be coded in Java or there must be some other language to describe their behavior. Initially we decided to go with plain Java, but left the door open for other options. The reasons behind Java-only-approach were:


  • Performance. With Java we can have optimal performance.
  • Debuggability. Since everything is in Java, it's quite easy to figure out what's going on with a debugger.
  • Avoidance of proprietary languages. We didn't want to invent our own yet-another programming language.
  • Skills. We know Java and it's easier to hire people to work with the framework since it's based on one of the most popular languages.


Basically there's nothing wrong with this approach, but there are some things which could make life easier. For example, class Agreement is a kind of chameleon class, the properties of which are configured through the product without any coding work. This in turn means that the properties must be modeled as instances of class Property (carrying information such as type, default value, whether its value is mandatory or not, ...). Since these properties live inside a map, and each instance of class Agreement can have its own properties, we can't have normal, JavaBean-style accessors for properties. Instead, the properties are accessed like this:

Double value = agreement.getPropertyValue( "premium" );
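
To make the setup more concrete, here's a rough sketch of what such a map-backed accessor might look like. This is not the actual framework code; the Property fields and method names are just assumptions for illustration:

import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical version of a StructuredActual-like class
public class StructuredActualSketch {

   // Property metadata and value live in a map instead of real fields
   static class Property {
      final Class<?> type;
      final boolean mandatory;
      Object value;

      Property( Class<?> type,boolean mandatory,Object defaultValue ) {
         this.type = type;
         this.mandatory = mandatory;
         this.value = defaultValue;
      }
   }

   private final Map<String,Property> properties = new HashMap<String,Property>( );

   // The product configuration decides which properties exist
   void defineProperty( String name,Class<?> type,boolean mandatory,Object defaultValue ) {
      properties.put( name,new Property( type,mandatory,defaultValue ) );
   }

   // Generic accessor: no compile-time type safety, just a cast at the call site
   @SuppressWarnings( "unchecked" )
   public <T> T getPropertyValue( String name ) {
      Property property = properties.get( name );
      return property == null ? null : ( T ) property.value;
   }
}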

It gets a little bit uglier when multiple properties are accessed, for example in a derived-property formula (the logic behind an actual property). Here's a sample:

public class NetPriceFormula extends Formula<Double> {
   public Double calculate( StructuredActual context ) {
      return context.getPropertyValue( "premium" ) *
             context.getPropertyValue( "discountPercent" ) / 100.0;
   }
}

And if we are dealing with java.math.BigDecimal, it gets much uglier:


public class NetPriceFormula extends Formula<BigDecimal> {
   public BigDecimal calculate( StructuredActual context ) {
      return context.getPropertyValue( "premium" ).
             multiply( context.getPropertyValue( "discountPercent" ) ).
             divide( new BigDecimal( "100.0" ) );
   }
}


At the end of the day, the formula should look like this:

context.premium * context.discountPercent / 100

Also, I didn't like the idea of having zillions of really small classes. That's why I decided to give several scripting languages a try.

Using scripting languages with Java


To define a DSL for the framework, I tried Javascript, Clojure and Groovy, implementing simple property access with each. They each have their strengths and weaknesses, but in this context I was only interested in syntax, datatypes, threading, performance and required skills. The results are:

Clojure


  • Syntax. Although I have worked with LISP in the past, I have to say I really didn't enjoy the syntax. I found it horrible.
  • Datatypes. Lists and some other things do not map well to Java.
  • Threading. Clojure seems to use unmanaged threads inside the scripting engine which makes it impossible to use it within Java EE.
  • Required skills. How many programmers know Clojure syntax?


Javascript


  • Lack of metaprogramming. It's not possible to intercept property accessor calls with the version that's bundled with Java 6. At least not easily.
  • Datatypes. The type system of Javascript doesn't match well with Java's, and you don't have control over it.
  • Performance was not optimal.
  • Threading. Looks like Javascript is not using unmanaged threads.
  • Required skills. Many developers write Javascript anyhow, so skill-wise it could have been a perfect match.


Groovy


  • Metaprogramming is supported. This enabled easy property access and the addition of several well-known pseudo properties. In general, Groovy seems to be a highly extensible language.
  • Datatypes. Groovy uses Java objects. Most Groovy datatype extensions are done in a non-intrusive manner using metaclasses.
  • Threading. Looks like Groovy is not using unmanaged threads.
  • Excellent performance.
  • Required skills. The syntax and types are somewhat similar to Java's. And closures are anyhow coming with Java 8.
  • Groovy is object-oriented. This is irrelevant, but since I like to taunt functional programming fundamentalists, I couldn't help mentioning it.
  • Groovy is led by SpringSource. I don't feel comfortable whenever I hear the word Spring.


The drawbacks of Clojure are murderous, especially the use of unmanaged threads. Uncontrolled use of threads may cause deadlocks and will certainly cause problems with application server facilities. Think about a situation where you have a synchronized method calling a script which in turn makes a callback to a synchronized Java method; if the thread is the same it causes no problems, but with another thread you are in a deadlock. Due to these threading issues I didn't measure the performance of Clojure at all. I decided to dump it. Javascript was performing poorly and its datatypes are somewhat limited and incompatible with Java, so Javascript was also dumped. Only Groovy survived the initial filtering.
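
To see why the unmanaged threads are so dangerous, here's a minimal Java sketch of the callback scenario described above, with a single-threaded executor standing in for the scripting engine; no real script engine is needed to reproduce the deadlock:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CallbackDeadlock {
   // Stands in for the scripting engine's own (unmanaged) thread
   private final ExecutorService scriptEngineThread = Executors.newSingleThreadExecutor( );

   // Simulates Java code holding the monitor while a "script" runs on another thread
   public synchronized void runScript( ) throws Exception {
      // The "script" runs elsewhere and calls back into this object
      scriptEngineThread.submit( new Runnable( ) {
         public void run( ) {
            callback( );
         }
      } ).get( ); // waits forever
   }

   // The callback needs the same monitor, which runScript() still holds
   public synchronized void callback( ) {
      System.out.println( "never reached" );
   }

   public static void main( String[] args ) throws Exception {
      new CallbackDeadlock( ).runScript( ); // deadlocks
   }
}

If the engine ran the callback on the calling thread instead, the monitor would simply be re-entered and nothing bad would happen; that's exactly the difference the unmanaged threads make.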

Using Groovy and Java together


Metaclasses


Groovy has some excellent features which support dynamism. For example, metaclasses enable nifty property accessors (see Using invokeMethod and getProperty for more information). Here's how property access can be customized using Groovy metaclasses:

def getterClosure = { name ->
   switch( name ) {
      case "objectId" :
         return delegate.getObjectReference( )?.getId( )
      default :
         return delegate.getPropertyValue( name )
   }
}

def setterClosure = { name,value ->
   def property = delegate.getPropertyOfKind( name )
   def type = property.getSpec( ).getType( )
   if( type == value.class ) {
      property.setValue( value )
   }
   else {
      property.setValue( value != null ? convert( value,type ) : null )
   }
}

Agreement.metaClass.getProperty = getterClosure
Agreement.metaClass.setProperty = setterClosure
Role.metaClass.getProperty = getterClosure
Role.metaClass.setProperty = setterClosure

This means all properties of Agreement and Role are accessed through the closures defined above. Whenever a property is read from an Agreement or Role object, Groovy calls the closure assigned to getProperty on the object's metaclass. For example, the getterClosure maps the property read agreement.premium to the method call agreement.getPropertyValue( "premium" ). Likewise agreement.objectId is mapped to delegate.getObjectReference( )?.getId( ). Notice the use of the safe navigation operator (?.) after the getObjectReference call: it returns null and doesn't call getId if the return value of getObjectReference is null. Now, after evaluating the script above, I can access properties of Role and Agreement objects like this:

context.netPrice = context.premium * context.discountPercent / 100

Groovy implementation of derived properties, episode I


I started the Groovy implementation of derived properties with functions. The functions looked like this:

def DerivedProperty_netPrice( context ) { context.premium * context.discountPercent / 100 }

I evaluated this generated script once and invoked it using javax.script.Invocable.invokeFunction. I ran unit tests and all was good. I was also happy with the performance. However, when I added more tests, something unexpected happened: invokeFunction threw a NoSuchMethodException. Thank goodness I had the sources, which enabled me to debug what was going on. I had a hunch it might have something to do with garbage collection, since it always happened after a certain amount of looping. By debugging GroovyScriptEngineImpl, I found my function in the globalClosures map. It was hanging there, but the map actually contained a SoftReference, which was cleared by the JVM without any warning and thus returned null. However, during my debugging session I noticed that classes do not behave the same way. I decided to embed the actual logic inside a class.
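
For reference, the episode-I invocation went roughly like this. It's a simplified sketch which assumes the Groovy JSR-223 engine is on the classpath, and a plain Map stands in for the real context object (Groovy maps property reads on a Map to its keys):

import java.util.HashMap;
import java.util.Map;
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class FunctionBasedFormula {
   public static void main( String[] args ) throws Exception {
      ScriptEngine engine = new ScriptEngineManager( ).getEngineByName( "groovy" );

      // Evaluate the generated function once...
      engine.eval( "def DerivedProperty_netPrice( context ) { context.premium * context.discountPercent / 100 }" );

      Map<String,Object> context = new HashMap<String,Object>( );
      context.put( "premium",100.0 );
      context.put( "discountPercent",10.0 );

      // ...and invoke it later by name. This is the call that started throwing
      // NoSuchMethodException once the SoftReference holding the closure was cleared.
      Object result = ( ( Invocable ) engine ).invokeFunction( "DerivedProperty_netPrice",context );
      System.out.println( result ); // 10.0
   }
}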

Groovy implementation of derived properties, episode II


After having problems with functions, I changed my implementation to generate classes instead of functions. Here's a sample:

@groovy.transform.Immutable
class DependentPropertyFormula_netPrice_1_0 { 
   java.lang.Double calculate( context ) {  
      context.premium * context.discountPercent / 100
   }
}
new DependentPropertyFormula_netPrice_1_0( )

Again, I evaluate the code snippet above once in the beginning, and keep the return value. The return value of the ScriptEngine.eval call is the instance returned on the last line of the generated script. I have to store the instance in order to call the method and to prevent garbage collection of my precious object. The invocation of the method calculate changed from javax.script.Invocable.invokeFunction to javax.script.Invocable.invokeMethod, but the change was trivial. The code now looks like this:

   public class GroovyDerivedPropertyFormula<T> implements DerivedPropertyFormula<T> {
      private final Class<T> type;
      private final Invocable invocable;
      private final Object instance;

      GroovyDerivedPropertyFormula( String kind,Class<T> type,String version,ScriptEngine engine,String code ) {
         this.invocable = ( Invocable ) engine;
         this.type = type;
         String classCode = String.format(
            "@groovy.transform.Immutable \n" +
            "class DependentPropertyFormula_%s_%s { \n" +
            "   %s calculate( context ) { \n" +
            "      context.premium * context.discountPercent / 100 \n" +
            "   } \n" +
            "} \n" +
            "new DependentPropertyFormula_%s_%s( )",
            kind,version,type.getName( ),kind,version );
         this.instance = engine.eval( classCode );
      }

      @Override
      public T calculate( StructuredActual context ) {
         return type.cast( invocable.invokeMethod( instance,"calculate",context ) );
      } 
   }

This Java object is instantiated just once and thus it needs to be thread-safe. The same applies to the Groovy object, although the context given as a parameter doesn't have to be thread-safe.

Deployment of artifacts


When Java code is used to implement derived properties and other object types, the code must be deployed within the application. This may be a problem if multiple applications are using the same product model. But when the code is stored in the database together with the other data of the product model, the code is automatically shipped wherever it's needed. Cool!


A word of caution


Although it's easy to use scripting languages from Java, don't expect to be able to debug your scripts in their proper context. It may be possible, but it's not going to be easy. That's why I wouldn't use scripting for anything but very simple things.


Conclusion


As you can see, mixing Java and dynamic languages like Groovy is quite simple. I implemented all object types so that it's possible to implement the actual business logic using Groovy, and it took just a few hours. If you, like most programmers, have plans to implement a rule engine, why don't you give Groovy a try?

Friday, December 7, 2012

Intel x86 opcodes: a few samples

08.12.2012: updated a couple of bad misspellings.

This is actually just an addendum to my previous blog entry and doesn't make any sense if you are not familiar with Intel assembly language. Very, very low-level stuff.

Why CALL EAX is encoded to FFD0?


That's a very good question, indeed. Since this was one of the hardest things for me to understand within the context of hotpatching, I decided to write an additional note about instruction encoding. Let's start with a picture from the Intel Architecture Software Developer’s Manual:

[Image: CALL instruction reference (primary opcode FF) from the Intel manual]

The mnemonic we are interested in is CALL. As can be seen from the reference, the primary opcode of CALL is FF. But wait, there are six other mnemonics with the same primary opcode. We need to tell the processor somehow that it's a specific CALL we are requesting. This is achieved via 3 bits in the following ModR/M byte. Since we want to call a 32-bit address, the opcode extension bits in the ModR/M byte must be set to 2 (010). We also set the first two bits of the ModR/M byte to ones, to tell the processor that the R/M bits name the register which contains the actual address for our call. Now we have the bit sequence 11010000, and this in turn happens to be the mysterious D0 in our byte sequence FFD0. Finally, we can verify from the picture below that the value zero (000) in the R/M part indeed means EAX when the MOD bits are ones. That's also why the register itself (EAX) doesn't add anything to the value of the ModR/M byte: its encoding is zero. If we had CALL EDX, the corresponding byte sequence would have been FFD2.

[Image: ModR/M byte encoding table from the Intel manual]
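
If you prefer code over bit fiddling on paper, here's a tiny Java sketch (not from the manual, just an illustration) that composes the ModR/M byte for CALL <reg32> the same way:

// Composes the encoding of CALL <reg32>: primary opcode FF, then the ModR/M byte
public class CallEncoding {

   // Register numbers as used in the R/M field: EAX=0, ECX=1, EDX=2, EBX=3, ...
   static int[] encodeCallRegister( int registerNumber ) {
      int mod = 0b11;  // "register direct": R/M names the register holding the target address
      int reg = 0b010; // the /2 opcode extension selects CALL within the FF group
      int rm  = registerNumber & 0b111;
      int modRm = ( mod << 6 ) | ( reg << 3 ) | rm;
      return new int[] { 0xFF, modRm };
   }

   public static void main( String[] args ) {
      int[] callEax = encodeCallRegister( 0 );
      int[] callEdx = encodeCallRegister( 2 );
      System.out.printf( "CALL EAX -> %02X %02X%n", callEax[ 0 ], callEax[ 1 ] ); // FF D0
      System.out.printf( "CALL EDX -> %02X %02X%n", callEdx[ 0 ], callEdx[ 1 ] ); // FF D2
   }
}
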
Why MOV EAX,<ADDRESS> is encoded to B8?


In the Intel Architecture Software Developer’s Manual, page 3-402, the 32-bit MOV operation is described as follows:

[Image: MOV instruction reference (opcode B8+rd with a 32-bit immediate) from the Intel manual]

But what does the +rd mean in this context? Again, from the Intel Architecture Software Developer’s Manual we can see that the encoding of register EAX in the rd nomenclature is zero, and that's exactly what we are trying to tell the processor. Had it been MOV ECX,<imm32>, the instruction encoding would have been B9, as shown in the table below.

[Image: register encodings for the +rd opcode form from the Intel manual]
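
Again as a quick sketch (an illustration of mine, not manual material): the +rd part simply means the register number is added to the base opcode B8.

// MOV <reg32>,imm32 is encoded as (B8 + register number) followed by the 4-byte immediate
public class MovEncoding {

   static int encodeMovImmOpcode( int registerNumber ) {
      return 0xB8 + ( registerNumber & 0b111 );
   }

   public static void main( String[] args ) {
      System.out.printf( "MOV EAX,imm32 -> %02X%n", encodeMovImmOpcode( 0 ) ); // B8
      System.out.printf( "MOV ECX,imm32 -> %02X%n", encodeMovImmOpcode( 1 ) ); // B9
   }
}
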
Does this answer your question, Abu? From now on, you are ready to throw all languages (especially functional ones) into the dumpster and do all your coding directly in machine language ;-)

Saturday, December 1, 2012

Hotpatching, episode II

Update 2.12.2012: changed links to red so that they can be seen, and modified some text to be more meaningful.

After publishing my first hotpatching article, I started to think about its concurrency issues. Since I'm creating a thread and changing the values of global variables, it means possible happens-before problems. These changes must be visible to all threads immediately after the change, since the pointers to the original functions may otherwise point to NULL and cause an access violation. I also started to think about the dangers of using DllMain. After reading Drew Benton's article about a more complete DLL injection, I decided to make another version of my DLL injection based on his ideas.

I used Drew's code as a basis and modified it as needed. It took a while to really get into the code. Meanwhile I was enjoying several access violations caused by my almost-correct "assembly code". Since the "shippable" code is written using raw x86 opcodes, even assembly language can be considered high-level compared to it. This actually reminded me of Java bytecode engineering. If you want to understand what's going on in the code, this reference may be helpful. And if you found my previous hotpatching article too technical, I don't recommend proceeding with the code.

To me this was a real eye-opener: it's relatively easy to send whatever code you want to be executed by a given process. Just use a couple of x86 opcodes to break into the process by calling your own function in the DLL. It opens endless possibilities. (Whether I'll ever need them is a different story.)

The thread-safe DLL code is here and the code doing the actual injection is here. Have fun with them ;-) And by the way, don't ever use CriticalSections the way I did in the DLL. I'm using GNU C++ and thus didn't have structured exception handling available, so I decided to leave out exception handling completely.

Thursday, November 29, 2012

How's it hanging?


Once again I found myself going deeper into the world of low-level programming. This all began with a hung process caused systematically by the "collaboration" of two products. Let the offender be anonymous, while the other party was BMC's Control-M, running as a Windows service.

Finding a correct location


How did I find what was wrong? It all started by using Process Explorer from Sysinternals. From the threads view I could see that a certain DLL was always stuck in pretty much the same place. Actually it wasn't completely frozen, but for some reason it couldn't proceed. This was a good start, but it still didn't lead me deeper into the problem at hand. I only knew that a thread was stuck, and in the middle of the call stack I found a FindWindow call, which was from then on my primary suspect. I googled FindWindow within a Windows service, but I couldn't find any clues about the hanging behavior. I decided to take a look into the DLL I found in the call stack. I searched for FindWindow and found this:

 L30003277:
  push 00000000h
  call [KERNEL32.dll!Sleep]
  push edi
  push 00000000h
  call [USER32.dll!FindWindowW]
  mov esi,eax
  test esi,esi
  jz L30003277

Wow! Although I'd say I found it by accident, I instantly knew this is it (or shit): a code snippet looping until a certain window handle is found. And since this code runs under a noninteractive window station, it will never find the window. However, I wanted more evidence and I downloaded API Monitor from Rohitab.com. Some people say it has its deficiencies, but at least in this case the tool was extremely valuable. I ran the code with it and voilà: I caught the looping FindWindow code red-handed. From the call stack I saw the very same address I had found with the disassembler. The case seemed to be closed. We actually got a patch during the same day, but it wasn't because of my discoveries. The case had already been reported by someone else.

Anyway, a few days later I was discussing the case with a colleague, who asked if I knew how API Monitor does its magic. I had to say I had no idea, but I immediately knew this wouldn't be a long-lasting answer. I had to find out how they do it! Enter hotpatches.

Hotpatches


Hotpatching allows you to patch a running process without stopping it. This mechanism can be used to implement API spying, software cracking and malware, just to name a few uses. What makes it possible is a sequence of five bytes just before the function and two bytes (doing basically nothing) at the beginning of the function. The instructions before the function are NOPs, while the very first instruction in the function is mov edi,edi. Here are some examples from USER32.DLL:

Disassembly of MessageBoxW looks like this:


7DCBFECA  90 db 90h;   '?'
7DCBFECB  90 db 90h;   '?'
7DCBFECC  90 db 90h;   '?'
7DCBFECD  90 db 90h;   '?'
7DCBFECE  90 db 90h;   '?'
7DCBFECF                           MessageBoxW:
7DCBFECF  8BFF mov edi,edi
7DCBFED1  55 push ebp
7DCBFED2  8BEC mov ebp,esp


while MessageBoxA looks like this:


7DCBFEA9  9090909090 Align 2
7DCBFEAE                           MessageBoxA:
7DCBFEAE  8BFF mov edi,edi
7DCBFEB0  55         push ebp
7DCBFEB1  8BEC mov ebp,esp


However, the point is that either way there are five NOPs (0x90) before the actual function entry. And these bytes can be rewritten to perform a jump. But how do we install our hotpatch into a given process?

DLL injection


There's a really neat way to inject code into a given process by using CreateRemoteThread. With this function we can start a thread in the process and execute our code there. The mechanism, called DLL injection, is actually quite simple and I was able to do it even after many years of C++ hibernation. Although it's definitely possible to hotpatch an already running process, I will describe how to start a process so that hotpatching is active right from the beginning. This means I am creating the process. The basic principle goes like this:

1. Start a process so that dwCreationFlags are OR'ed with CREATE_SUSPENDED (CreateProcess).
2. Grab a module handle to KERNEL32.DLL (GetModuleHandle).
3. Retrieve the address of LoadLibraryA from kernel32 (GetProcAddress).
4. Allocate memory for the DLL path in the remote process (VirtualAllocEx) and write the DLL path to it (WriteProcessMemory).
5. Create a remote thread in the remote process (CreateRemoteThread) and set its starting address to the previously retrieved address of LoadLibraryA.
- The injected DLL is now loaded and its DllMain is run. This is where the hotpatching is done.
6. Wait for the thread to terminate.
7. Resume the primary thread of the remote process.
8. Clean up things in the calling process.

If the process is already running, one might try to pause it by calling SuspendThread to calm down activities, but I think it's not mandatory.

The source code of the injected DLL can be found here and the initiator of the injection is here. Beware that the level of error handling is next to nothing.

Monday, November 5, 2012

Bit twiddling


Recently I had an interesting problem at hand: we needed to port a kind of hashing (or scaling) algorithm from mainframe assembler to another platform which doesn't have bit shifting operations. I started my journey by verifying what the assembler routine actually does. Although I've done some x86 assembler in the past and even some assembler for z machines, it turned out to be an interesting session. In the beginning I was totally lost with the assembler source code, for the following reasons:


  • Registers in the mainframe have really nice names: they are just numbers from 0 to 15. Which operand represents a value and which represents a register?
  • Instructions in mainframe assembler do not use too many letters: N stands for AND, XR stands for XOR, ...
  • Some instructions manipulate multiple registers at once. For example, SRDL 2,16 shifts the contents of register 2 into register 3 by 16 bits.


Soon I realized I had to consult a colleague to figure out what was going on. During a half-hour session with him, I ported the logic to Java just to understand how the algorithm works. In the end the algorithm turned out to be quite simple: shift the base value to the left by 4 bits, reverse the bits and shift the result to the left again by 3 bits. By shifting the bit sequence one position less to the left after reversing, we can guarantee that the hash value never goes negative (the 32nd bit is always zero). The lowest 4 bits are used for a certain purpose which is not relevant to this story. The resulting C++ routine is quite short, but performance-wise far from optimal. The initial code looks like this:

#include <cstdint>

uint32_t reverseBits( uint32_t num );

int32_t hash( uint32_t base,uint32_t* hash ) {
   if( base > 0x07FFFFFF )
      return -1;
   *hash = reverseBits( base << 4 ) << 3;
   return 0;
}

uint32_t reverseBits( uint32_t num ) {
   uint32_t reverse_num = 0;
   for( int i = 0; i < 32; i++ ) {
      if( num & ( 1u << i ) )
         reverse_num |= 1u << ( ( 32 - 1 ) - i );
   }
   return reverse_num;
}

And here's a sample, showing bit twiddling with base value 25:

Base (dec 25): 0000 0000 0000 0000 0000 0000 0001 1001
After 1st shift: 0000 0000 0000 0000 0000 0001 1001 0000
After reversing: 0000 1001 1000 0000 0000 0000 0000 0000
After 2nd shift: 0100 1100 0000 0000 0000 0000 0000 0000

Since almost all platforms can use DLLs/shared objects, I decided to port my naive implementation to C++ and compile it into a DLL. The results were identical on Windows, but being paranoid, I also compiled the module on z/OS in 64-bit AMODE. Again, the results were identical, even though the program was now 64-bit and the byte order on the z/OS machine is big-endian.

I didn't invent the bit-reversing algorithm; I just found it on the Internet. For some reason, I decided to arrange a "race" between the Java and C++ implementations. In the beginning the results were shocking; even with full optimizations the C++ implementation was more than eight times slower. But isn't C++ supposed to be faster than Java? After looking deeper under the hood, it was clear what made the difference: looping. Modifying the way the bits are reversed gave an incredible boost. The better bit-reversing routine looks like this:

uint32_t reverseBits( uint32_t num ) {
   num = ( num & 0x55555555 ) << 1 | ( num >> 1 ) & 0x55555555;
   num = ( num & 0x33333333 ) << 2 | ( num >> 2 ) & 0x33333333;
   num = ( num & 0x0F0F0F0F ) << 4 | ( num >> 4 ) & 0x0F0F0F0F;
   num = ( num << 24 ) | ( ( num & 0xFF00 ) << 8 ) | ( ( num >> 8 ) & 0xFF00 ) | ( num >> 24 );
   return num;
}

However, since the target platform actually runs on multiple operating systems (at least Windows and AIX), using C/C++ would require compilation on multiple platforms. For this reason a colleague of mine implemented another solution which doesn't need bit shifting operations and thus can be implemented directly within the target platform. Of course, the performance of the routine dropped again. To keep the number of languages to a minimum, I'm showing this approach again in C++.

#include <cmath>
#include <cstdint>

uint32_t reverseBits( uint32_t base );

int32_t hash( uint32_t base,uint32_t* hash ) {
   if( base > 0x07FFFFFF )
      return -1;
   *hash = reverseBits( base * 16 ) * 8;
   return 0;
}

uint32_t reverseBits( uint32_t base ) {
   uint32_t reversed_num = 0;
   for( int bitpos = 32; base != 0; ) {
      bitpos--;
      if( base % 2 == 1 )
         reversed_num += ( uint32_t ) pow( 2.0,( double ) bitpos );
      base /= 2;
   }
   return reversed_num;
}

Finally, being a Java fella, I'll conclude this episode with a Java implementation of the hash routine. This implementation is the easiest and performance-wise comparable to C++ (which is not a big surprise, since the algorithm is the same). Here you are:

Integer.reverse( base << 4 ) << 3;

Now the big question is: what is all this bit twiddling good for? The reason the bits are reversed is the way DB2 for z/OS locks work, especially in data sharing mode. Since the number under manipulation is used as a clustering key, DB2 for z/OS tries to place consecutive numbers close to each other, possibly on the same page. This in turn means that inserts with consecutive numbers hit the same page at the same time and disturb each other due to exclusive locks on the page. The page becomes "hot". But after the bit reversal, consecutive numbers are not consecutive anymore and thus are not placed on the same page (if there is more than one page, of course). Let's take a simple example with numbers 100, 101 and 102. The hashed values are 318 767 104, 1 392 508 928 and 855 638 016, respectively. The differences are huge and thus the data lands evenly all around the pages. The new platform will not need this functionality per se, but it needs to be compatible with the numbers in the legacy system. That's it.
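
A quick way to check those numbers yourself, using the Java one-liner from above:

public class HashSpread {

   // Same algorithm as above: shift left by 4, reverse the 32 bits, shift left by 3
   static int hash( int base ) {
      return Integer.reverse( base << 4 ) << 3;
   }

   public static void main( String[] args ) {
      for( int base : new int[] { 100, 101, 102 } ) {
         System.out.println( base + " -> " + hash( base ) );
      }
      // Prints: 100 -> 318767104, 101 -> 1392508928, 102 -> 855638016
   }
}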

Sunday, November 4, 2012

REST


Long time no see. My brains have been in REST for a while, and I decided to share my thoughts about it. Beware RESTafarians.

SOAP vs REST WebServices


There are certain differences between these two approaches. Here's a short list to name just a few:


  • SOAP is about services while REST is about resources. 
  • REST interface is standardized by HTTP (GET, POST, PUT, DELETE, ...) and that's what (according to RESTafarians) makes it easy. SOAP places no restrictions on the interface, but service producer must design and describe each and every SOAP interface.
  • SOAP services have a description language (WSDL) while REST services do not. Some RESTafarians may disagree with me and praise WADL, but here's what wikipedia says about WADL: "WADL was submitted to the World Wide Web Consortium by Sun Microsystems on 31 August 2009, but the consortium has no current plans to standardize it and it is not yet widely supported. WADL is the REST equivalent of SOAP's Web Services Description Language (WSDL), which can also be used to describe REST web services." And as we know, wikipedia is not wrong. Ever. What this means is that you should not expect any proxy code generation based on the description of the service at hand.
  • Browsers can invoke REST services natively while SOAP services require major hassle. For REST based services this is a big plus.
  • SOAP webservices are based on XML while REST services use content negotiation and should be able to produce multiple representations (like JSON).
  • REST responses are meant to be cached and the caching infrastructure is well understood and widely available. With SOAP you are on your own.
  • REST security is quite simple and is purely based on HTTP and things like SPNEGO. Don't expect anything beyond what plain HTTP/HTTPS provides.
  • Since SOAP is not tied to HTTP, you can build and consume SOAP WebServices without fully understanding HTTP. With REST services you'd better know the crucial parts of HTTP (e.g. response codes) by heart.


REST used directly from a browser 


I have absolutely nothing against using REST directly from a browser. Browsers implement all the necessary functionality to invoke REST services correctly. They are built to deal with HTTP messages, and understanding various HTTP response codes is business as usual for them. Similarly, passing user identity to REST services is mostly taken care of by the browser. And if you want to have SSO, the chances are good that you have a working solution already. For example, if your clients have a SPNEGO-enabled browser and your application server also knows SPNEGO, implementing SSO is a piece of cake; the identity of the user is passed all the way down to your REST service. Browsers also know how to handle different representations, be it JSON, XML, HTML or almost whatever.

REST used in application-to-application communication


Communication between applications with REST is something I just don't buy. While it's relatively easy to build REST services, consuming them may not be so simple. For example, consuming a REST service from WebSphere Application Server (and from most Java EE application servers) means you have to manage user identity by yourself; nobody's propagating user identity for you. And although appending a couple of parameters to a URL is simple, handling the response may not be that simple. First of all, you must understand the HTTP response code. You also must understand the resource representation returned, and soon you will end up having all kinds of content parsers within your application (e.g. some kind of shitty open-source JSON parser). While these kinds of parsers probably exist, I wouldn't expect quality even close to JAXB. Are you now wondering why I'm foaming about things like JSON parsers, since REST services can also produce XML? Why not just use XML and JAXB? Of course you can use XML and JAXB, but what does it actually mean? It means you end up building by yourself the very same things SOAP toolkits/proxies/... provide for you.
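
As a concrete (and intentionally bare-bones) illustration of what handling the response yourself means, here's a sketch of a hand-rolled REST call using nothing but the JDK. The URL is made up, and there's still no JSON parsing in sight:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class HandRolledRestCall {
   public static void main( String[] args ) throws Exception {
      // Hypothetical resource URL
      URL url = new URL( "http://example.com/api/customers/42" );
      HttpURLConnection connection = ( HttpURLConnection ) url.openConnection( );
      connection.setRequestProperty( "Accept", "application/json" );

      // You have to know your HTTP response codes: 200, 204, 304, 404, 500, ...
      int responseCode = connection.getResponseCode( );
      if( responseCode != HttpURLConnection.HTTP_OK ) {
         throw new IllegalStateException( "Unexpected response code: " + responseCode );
      }

      // And then you still have to parse whatever representation comes back
      StringBuilder body = new StringBuilder( );
      BufferedReader reader = new BufferedReader(
         new InputStreamReader( connection.getInputStream( ), StandardCharsets.UTF_8 ) );
      String line;
      while( ( line = reader.readLine( ) ) != null ) {
         body.append( line );
      }
      reader.close( );
      System.out.println( body );
   }
}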

Conclusion


I think the simplicity of REST is vastly overrated. You need to know this and that to make it work. Localization/internationalization is a good example: how do you request representations in multiple languages? Are the URIs the same or not? Is the "Accept-Language" header the way to go or not? Search the web and you'll find pretty nasty debates going on about this.

So, at the end of the day I'm still in favour of using SOAP between applications, and REST if the service is consumed directly by the browser. RESTafarians will of course ask the big question: "how would you know who is consuming your service, and why would you even care?"

Wednesday, August 1, 2012

Enterprise-class Javascript, part III


The third episode deals with global variables and encapsulation. Let's start with globals. Global variables have been a curse in almost every programming language for decades, but many Javascript developers (most of those I know) still use global variables and functions all the time. Some of them may accidentally define functions which are not global (especially with jQuery), but don't really understand the difference. Why do they use globals? I think there are multiple reasons. First of all, many web developers don't have a real programming background; instead they were once forced to widen their view of the world from plain HTML to Javascript. Secondly, JavaScript has implied global variables and you may be using global variables by accident: if you do not explicitly declare a variable, a global variable is implicitly declared for you. And last but not least, Javascript has simply been written this way in the past; writing Javascript was considered just a mandatory plague and typically it was written by copycats without straining their brains too much.

In Javascript the global object is the global namespace that holds the top-level functions and global variables. Variables which are not explicitly defined are implied global variables, and their names are also kept in the global object. Writing functions into the global scope is roughly the same thing as writing Java classes into the default package or into C#'s global namespace. This inevitably leads to collisions, and that's why many programming languages and markup languages have packages (e.g. Java) or namespaces (C#, XML, C++, ...). It's also the reason why global functions do not lead to reusable code which could be published as a library. In reality you must use some global variables and functions, but don't put your own stuff there. I found a really nice depiction of the global scope by Dmitry Baranovskiy. He compared JavaScript’s global scope to a public toilet: “you can’t avoid going in there, but try to limit your contact with surfaces when you do.”


As an example, let's assume for a second that jQuery and Underscore had been implemented as sets of global functions. They share so many function names that you couldn't use them together. Everything would blow up instantly. Fortunately they reserve nothing but one or two slots in the global object: jQuery exposes 'jQuery' and '$', underscore.js just '_'. It doesn't matter how the functions and variables are named within the context of the exposed global variable. If you are looking for trouble, try using '$' as your global variable name. On the other hand, (sh)it also happens sooner or later when you use names like database, db, application, app and the like; the moment you start using another ill-behaving library (in addition to yours) you are screwed.


In addition to keeping your stuff out of the global scope, you should also embrace encapsulation. One particularly easy way to achieve this is to use RequireJS, which implements the module pattern for you. Now it's time to show how to do something without global variables and keep your module internals hidden. First we define a module without any global vars. Here it is:

define( function( ) {
    var counter = 0;

    function incr( ) {
        counter++;
    }

    return {
        getAndIncrement : function( ) {
            var tmp = counter;
            incr( );
            return tmp;
        },
        reset : function( ) {
            counter = 0;
        }
    };
} );

Why aren't the functions and variables in the example above global? Because they are defined in the scope of the anonymous factory function, the one and only parameter to define. They cannot be referenced from outside of their enclosing scope. The object literal returned from the factory function is going to be the public interface of your module, and thus the consumer side of our module could look like this:

<script type="text/javascript">
    require( [ 'counter' ],function( cnt ) {
        alert( cnt.getAndIncrement( ) );
        alert( cnt.getAndIncrement( ) );
        cnt.reset( );
        alert( cnt.getAndIncrement( ) );
    } );
</script>

The series of calls shows values 0, 1, 0, leaving the counter at value 1. If you use the same module again (maybe in another script block later in the same page), you can see that the counter variable is shared between those scripts. The reason is that the module loader executes the factory function (the function passed to define) just once, and thus the counter variable declared inside the module is allocated and initialized just once. If this is not what we want, we can modify our module a little bit so that it returns a constructor function instead. This makes it possible to create multiple instances of our world-famous counter object. The new version now looks like this:

define( function( ) {
    function incr( self ) {
        self.counter++;
    }

    function Counter( ) {
        this.counter = 0;
    }

    Counter.prototype.getAndIncrement = function( ) {
        var tmp = this.counter;
        incr( this );
        return tmp;
    };

    Counter.prototype.reset = function( ) {
        counter = 0;
    };

    // Return constructor function
    return Counter;
} );

while the consumer could look like this:

<script type="text/javascript">
    require( [ 'counter2' ],function( Counter ) {
        var c1 = new Counter( );
        var c2 = new Counter( );
        alert( c1.getAndIncrement( ) );
        alert( c2.getAndIncrement( ) );
        alert( c1.getAndIncrement( ) );
        alert( c2.getAndIncrement( ) );
        c1.reset( );
        alert( c1.getAndIncrement( ) );
    } );
</script>

This time the series of alerts shows values 0, 0, 1, 1, 2. Oops, what just happened? Why didn't the reset function work? Because the code is missing the this keyword in front of counter. Also, because there's no var keyword, it actually sets a global variable called counter to the value 0. To fix it we just add the this keyword, and now the code of the reset function looks like this:

Counter.prototype.reset = function( ) {
    this.counter = 0;
};

The series of alerts now shows values 0, 0, 1, 1, 0. That's what we expected.

In conclusion, even if you are writing nothing but random shit, I would suggest you do it right. It doesn't mean huge overhead in either performance or productivity, and you are on the right track from the beginning. Read Global domination and JavaScript module pattern in depth for more information.

At least one more episode is still coming, and it's about optimizations.

Tuesday, July 31, 2012

Enterprise-class Javascript, part II


My previous post was about modularity and dependency management using the AMD-based RequireJS. While modularity and dependency management form a basis for reusable software development, they are not all. In the world of web-based applications there's one special feature: if your application runs in a browser, all your HTML, CSS and Javascript files must be accessible in source format. Especially with Javascript it means your precious source code doesn't have any theft protection. Legal or not, the source code can be stolen just like that, and I know people are doing it a lot. Now you may think "thank god we only allow clients who have signed on", but it really means "only signed-on clients can steal your source code". How do you control which of your clients has stolen your potentially valuable source code? If you don't care, how about donating all your server-side code, too? On the other hand, most Javascript files I've seen are such big piles of steaming shit that nobody actually wants them even for free (wait for episode 3 of this series). I've contributed to those piles too, before I decided to get some real understanding of Javascript. And even if you don't mind your source code being stolen, having your source code accessible and in readable format also opens up security vulnerabilities. For example, one can check your field validation rules from the source code and use that information to enter illegal data.

So, if this is a real problem, somebody must have solved it already, right? Yes, it has been solved, but it's not used as much as one would expect. Since I'm still in the middle of studying RequireJS, right now I'm suggesting Google's Closure Compiler. You can find a web version here. Google Closure Compiler will optimize and mangle your code. One of the episodes to come will dive into optimizations, but for now we are just fine with the mangling. I use my "fancy" memory game as a sample. It contains just a couple of placeholder divs and mangled, optimized Javascript code. It's not optimized as much as it could be, but it's a good start. You can find the associated Javascript content here.

Now a simple function like this:

function funnel( n,fn ) {
    return function( ) {
        if( --n == 0 )
            fn.apply( null,arguments );
    };
}

has become this:

function l(b,a){return function(){--b==0&&a.apply(null,arguments)}}

That's not very far from the original, so let's try a little bit more complex function like this:

function animateText( $objects,text,then,delay,classesToRemove,classesToAdd ) {
    if( delay != undefined )
        $objects.delay( delay );
    $objects.fadeOut( delayInTurning,function( ) {
        var $this = $(this);
        $this.text( text );
        if( classesToRemove != undefined )
            $.each( classesToRemove,function( idx,cssClass ) {
                $this.removeClass( cssClass );
            } );
        if( classesToAdd != undefined )
            $.each( classesToAdd,function( idx,cssClass ) {
                $this.addClass( cssClass );
            } );
    } ).fadeIn( delayInTurning,then );
}

It becomes this:

function n(b,a,c,f,d,e){f!=h&&b.delay(f);b.fadeOut(p,function(){var b=$(this);b.text(a);d!=h&&$.each(d,function(a,c){b.removeClass(c)});e!=h&&$.each(e,function(a,c){b.addClass(c)})}).fadeIn(p,c)}

While you still have all the functionality in the mangled version, can you understand it? Can you make any changes to it? What if you have lots of functions which all look like this? At the very least it makes this kind of code a lot harder to understand and maintain. How about reversing the mangling, getting the original sources back with another tool? Is it possible? If you put a donkey into the mincer, can you reincarnate the donkey by putting the minced meat into another machine?

Monday, July 30, 2012

Enterprise-class Javascript, part I



Recently a friend of mine asked my opinion about a library called RequireJS. I have done lots of Javascript, but still I hadn't even heard about it. Shame on me. As an excuse, I'm just saying I'm mostly a server-side programmer and Javascript is just my hobby. After going through the RequireJS documentation I noticed that it's not just a regular Javascript library, but a Javascript module loader. I've always overlooked CommonJS and the like, and thus I didn't even know about AMD (Asynchronous Module Definition). OMG, how is it possible I didn't know about it? So far I've thought that jQuery, maybe mated with underscore.js, is enough. Just use things like namespaces, avoid globals at any cost, add JSDoc comments to your code and mangle it through Google Closure Compiler. With this approach you can have some kind of intellisense, faster loading, and your code is theft-protected at least to some extent. But something was missing ...


After the initial shock I did some tests with RequireJS. It worked as expected and I was really happy, since I used to take care of namespaces and the Javascript module pattern by myself. No more, since RequireJS takes care of it on my behalf. It was just what I had been looking for for a while. Somehow it reminded me of OSGi. And just like OSGi, it kind of forces you to write modular code. Also, if you follow good programming practices you can hide your dependencies right in the place where they belong and keep your private functions private. Let's take an example. If a module is dependent on, let's say, underscore.js, I really don't care as long as it fulfills its contract. Or let's say a module is using backbone.js, which in turn is using underscore.js and jQuery. With the traditional approach I have to have zillions of script tags, in correct order, in every page using any of those libraries. With RequireJS my module now looks like this:

// file uploader.js
define( ['jquery'],function( $ ) {
// Do this and that with jQuery
} );

and my module consumer now looks like this:

<script src="require.js"></script>
<script>
require( ['uploader'],function( Uploader ) {
var u = new Uploader( ... );
...
} );
</script>

Notice that I only have one script tag (besides my own inline script), which is loading require.js. If my consumer-side code is using jQuery, I would also have jquery as a dependency. Now it looks like this:

<script src="require.js"></script>
<script>
require( ['uploader','jquery' ],function( Uploader,$ ) {
$(document).ready( function( ) {
var u = new Uploader( ... );
...
} );
} );
</script>

But why load jQuery again? Isn't jQuery already loaded? Yes, maybe it is, but even though my uploader is using jQuery, that's just an implementation detail. You should always list all your dependencies, and RequireJS takes care of asynchronously loading each just once (unless you want multiple versions).


Hiding dependencies where they belong is a really important thing. Think about adding a dependency to common code referenced by hundreds of pages. With RequireJS it's just a matter of declaring the dependency and dropping a Javascript file into the correct location.


And finally, if you think about how seriously you should take AMD, look inside your jQuery 1.7+. The AMD module definition is there.


This was just the beginning; more will come about manglers, optimizers, using globals, ... Meanwhile you should visit the RequireJS site.

Thursday, June 21, 2012

Functional programming fundamentalism

In the near past I had a "little" discussion with a group of functional programming fundamentalists. The discussion, concerning mainly F#, lasted for months and in fact it's still alive to some extent. Those posts actually strengthened my opinion about functional programming fundamentalists: most of them simply don't know "the basics" well enough to build any working solutions with OO languages (or any languages?) utilizing multiple cores. For example, volatile variables were considered mostly mysterious. They presented several code snippets which were based on wrong assumptions about what concurrency and parallelism mean and when parallelism can actually happen. Quite often the answer is: "it just works". But does it work most of the time because of good (or bad) luck?
While we were filling the thread with posts, I found some really "nice" examples by googling F# activities. The search resulted in several "diamonds", this being one of my favorites (from Misusing mutable state with F# Asynchronous Workflows):

"So it could be that the increment operation gets executed on two or more threads at exactly the same time. And if two threads read the variable at the same time, and then increment it before saving it back, we effectively lose one of our increment operations.


This is all fairly unlikely ..."


How about having that fella program your banking solution? Not! If you know how locks, volatile variables and atomic types work, you can see in a second what's wrong with his program.
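
For the record, here's the same mistake and its fix in plain Java. It's a minimal sketch of mine, not code from the discussion; run it a few times and the plain int counter will come up short, while the AtomicInteger never does:

import java.util.concurrent.atomic.AtomicInteger;

public class LostUpdates {
   static int unsafeCounter = 0;                                  // "fairly unlikely" to go wrong...
   static final AtomicInteger safeCounter = new AtomicInteger( ); // ...this one never loses an update

   public static void main( String[] args ) throws InterruptedException {
      Runnable work = new Runnable( ) {
         public void run( ) {
            for( int i = 0; i < 1000000; i++ ) {
               unsafeCounter++;                // read-modify-write, not atomic
               safeCounter.incrementAndGet( );
            }
         }
      };
      Thread t1 = new Thread( work );
      Thread t2 = new Thread( work );
      t1.start( );
      t2.start( );
      t1.join( );
      t2.join( );
      System.out.println( "unsafe: " + unsafeCounter );      // usually less than 2000000
      System.out.println( "safe:   " + safeCounter.get( ) ); // always 2000000
   }
}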


Fundamentalists swear in the name of immutability; all operations act on immutable data structures. In practice, applications need to have some side effects, like it or not. Simon Peyton-Jones, a major contributor to the functional programming language Haskell, said the following: "In the end, any program must manipulate state. A program that has no side effects whatsoever is a kind of black box. All you can tell is that the box gets hotter." Admitting that you need side effects is a good start. I think the key thing to understand is visibility: when a thread modifies a memory location, how, and when, is that change seen by other threads? (If you need more details and cannot get to sleep easily at night, search for "Memory Barriers: a Hardware View for Software Hackers".) Sooner or later you will find yourself synchronizing access to shared mutable state, and you'd better understand how it works. Let's consider a sample: a web crawler in F# by Tomas Petricek (http://fssnip.net/65). When it's time to check whether a given URL is already visited, he falls back to an object-oriented, concurrent data structure called ConcurrentQueue. But at least he knew the data structure must be a synchronizing one.

Functional programming languages very often use a compact syntax which actually makes them really hard to understand. Fundamentalists think it's cool, but to me those programs often look like phonetic symbols of puking.


All this ranting doesn't mean that I don't understand the value of functional programming principles. Yes I do, and I actually use them in many languages, but I don't believe in pure functional languages, or even hybrids, without understanding the basics. Anyhow, be aware that functional features are being stolen from old languages like LISP and are now, or will be in the near future, in modern OO languages (like Java and C#). I would say that at the end of the day, these are the big winners. For example, Scala will probably lose many followers when Java 8 is released with its (not so sexy?) lambda expressions, built-in map/reduce (via java.lang.Iterable), etc. For those of you who don't know what lambdas in Java will look like, here's a short sample:


import java.util.*;

public class Test {
    public static void main( String... args ) {
        Collection<Integer> list = Arrays.asList( 1,2,3,4,5 );
        System.out.println( list.filter( x -> x > 2 ) );
    }
}


Using OO and functional techniques together is definitely OK, and I believe it is the approach that will eventually win. You can then still work with your favorite mainstream language (one of the top five in the Tiobe listing) and have excellent libraries for concurrent programming. As a bonus, you don't have to struggle with version incompatibilities (like in Scala). Last but not least is the fact that the functional programming paradigm has no proper modelling support, such as UML.


So, do yourself a favor: stop playing with functional programming languages and start learning your mainstream language properly (including how to do concurrent programming with it).

Wednesday, June 20, 2012

What is this blog for?

At the beginning of 2012 I moved to the CIO office. After half a year of paperwork I decided to discharge my Microsoft Office bloat and created this blog. The posts are full of bits and bytes, so if you are looking for easy stories about social media and how everybody is having fun with it, this is not your place. On the other hand, if you are looking for practical tips about writing enterprise-class Javascript, concurrent programming in Java, mixing Java and scripting languages, or ranting about functional programming languages or Spring, this is your place.

Adjusting time with BCE

This time we go deeper into BCE while we go through another case: adjusting the time seen by applications. If you are not familiar with bytecode engineering (BCE), see my previous post for more information. If my previous post was too technical, please do yourself a favor and skip this post, because this time we really go deeper.

The problem


Sometimes programs want to "move" in time. For example, bills are created and after a certain period of time it's time to check whether all bills are paid, and if not, start collecting money for the unpaid bills. The period between two events can be days, weeks, months or even years. But you want to test them right away. Yeah, I know some of you are already itching to say something about unit tests, but we are not talking about small components "on the loose" here; we are talking about full-fledged Java EE applications running in a Java EE application server.


Needless to say, it's sometimes appropriate to use the current time in algorithms, and these may well be the problematic snippets of logic. If you change the system time, everything is changed at once, and for example certificates in your appserver may expire (and the server stops working). It's a no-go. Another solution proposed by some of my colleagues is to use a custom type (maybe a java.util.Date derivative) which consults "something" to get the requested date offset. However, this latter approach sucks because:


  • How does it consult something? How does it differentiate between the very same libraries embedded in different applications?
  • It requires that you own the source code and can modify it (read: you must modify it)
  • The code "flows" all the way to production where it's not needed/wanted and you need special protection to keep it disabled in production environment
  • It doesn't work with 3rd party components (it's unlikely that they are using your custom type)


The perfect solution would be to use just plain java.util.Date, but still be able to adjust the time selectively. What do I mean by selectively? How about changing the behavior for one application in a cluster and keeping the rest intact? Enter BCE!

Using BCE to "move in time"


The idea of using BCE to implement the ability to move in time was born after a conversation with a colleague. He was suggesting BCE to intercept native calls, but later on (after trying it) I realized it's not the way to go. Instead, I decided to instrument calls "at the client side", meaning I decided to modify all the calls themselves instead of just modifying java.lang.System.currentTimeMillis (which is a native method). This approach seemed to work and is still in use.

In this post we go through the trickiest parts of the time machine agent. We start with the easiest hook: intercepting calls to java.lang.System.currentTimeMillis. The ASM code intercepting those calls looks like this:


if( opcode == INVOKESTATIC && owner.equals( "java/lang/System") &&
    name.equals( "currentTimeMillis" ) ) {
   // Generate a currentTimeMillis call
   super.visitMethodInsn( opcode,owner,name,desc );
   // Push timeOffset on the stack
   visitLdcInsn( new Long( timeOffset ) );
   // Add the two topmost operands
   visitInsn( LADD );
   // Only current time remains on the stack
}

What's going on here? We intercept all calls which are static, are invoked against the class java.lang.System and have the method name currentTimeMillis. The very first thing we do is invoke the actual method. Now the top of the operand stack (TOS) is the current time. Then we push the time offset onto the stack and request an addition of two longs (LADD). As a result, we have now adjusted the current time by the time offset. When this instrumentation code is run on any class calling java.lang.System.currentTimeMillis, it will add the given offset to the current time. So the following line of code:

System.out.println( new Date( ) );

actually executes this bytecode:

   0:   getstatic       #32; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   invokestatic    #47; //Method java/lang/System.currentTimeMillis:()J
   6:   ldc2_w  #20; //long 18000000l
   9:   ladd
   10:  invokevirtual   #49; //Method java/io/PrintStream.println:(J)V

Note how the time offset, five hours, maps to the constant 18000000l, which is then added to the current time (moving the "current" time five hours into the future).


However, programmers often use a more abstract way to ask for the current time, for example by constructing an instance of java.util.Date or calling java.util.Calendar.getInstance. To adjust the time returned by the default constructor of java.util.Date, we do it like this:

if( opcode == INVOKESPECIAL && owner.equals( "java/util/Date" ) &&
    name.equals( "<init>" ) && desc.equals( "()V" ) ) {
   // Bytecode NEW is already executed
   // Stack contains now a reference to java.util.Date (slots reserved 1)
   // Note that the constructor call is not made yet.
   // Dup the reference (slots used +1)
   dup( );
   // Now let the constructor call happen (consumes one slot)
   super.visitMethodInsn( opcode,owner,name,desc );
   // Dup the reference
   dup( );
   // Invoke getTime method
   visitMethodInsn( INVOKEVIRTUAL,"java/util/Date","getTime", "()J");
   // Push offset to the stack
   visitLdcInsn( new Long( timeOffset ) );
   // Add the operands
   visitInsn( LADD );
   // And finally call setTime method
   visitMethodInsn( INVOKEVIRTUAL,"java/util/Date","setTime", "(J)V");
   // One reference to java.util.Date remains on the stack
}

As you can see, the method is now much more complicated. The comments along the lines tell what's going on, so I won't repeat it here. And now this line of Java code:

System.out.println( new Date( ) );

produces bytecode:

   0:   getstatic       #32; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   new     #14; //class java/util/Date
   6:   dup
   7:   dup
   8:   invokespecial   #15; //Method java/util/Date."<init>":()V
   11:  dup
   12:  invokevirtual   #19; //Method java/util/Date.getTime:()J
   15:  ldc2_w  #20; //long 18000000l
   18:  ladd
   19:  invokevirtual   #25; //Method java/util/Date.setTime:(J)V
   22:  invokevirtual   #38; //Method java/io/PrintStream.println:(Ljava/lang/Object;)V

A full implementation of the time machine also requires interception of java.util.Calendar.getInstance. However, since it's basically similar to the java.util.Date constructor case, I won't go through it here.


Controlling which classes to instrument


In the beginning I was preaching about selectivity. Now it's time to prove my words.


My Java agent, like any other Java agent, takes one parameter. My agent uses this single string to define one or more regular expressions, which determine whether a class is instrumented (match) or not (no match). So, right now the javaagent switch is defined like this:


-javaagent:c:\temp\tm.jar=paci/.*


This results in instrumentation of classes which are in a package hierarchy starting with "paci". Here's our test program: 


package paci;

import java.util.Date;

import org.joda.time.DateTime;

public class TimemachineTest {
   public static void main(String[] args) {
      System.out.println( new Date( ) );
      System.out.println( new DateTime( ) );
   }
}

Because of this "selection" or restriction, this is what happens: 


Wed Jun 20 21:45:12 EEST 2012
2012-06-20T16:45:12.279+03:00


The plain java.util.Date call works, but Joda Time (org.joda.time.DateTime) doesn't. Let's modify our instrumentation targets to also include the Joda Time packages. The javaagent switch now looks like this:


-javaagent:c:\temp\tm.jar=paci/.*:org/joda/.*

Voilà, all of a sudden we start getting correct results with Joda Time, too. This is what gets printed out:


Wed Jun 20 21:55:34 EEST 2012
2012-06-20T21:55:34.417+03:00


What if you have the same library in multiple applications, but only want to instrument some of them? The answer is to use the java.security.ProtectionDomain parameter of the transform method. From the protection domain you can get the code source and its location. All you have to do is parse the URL and decide whether to instrument or not. Here's a sample code source URL printed out by my agent:


file:/C:/programs/joda-time-2.1-dist/joda-time-2.1/joda-time-2.1.jar
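
Here's a sketch of how a transform method might combine the two filters (the class-name regular expressions from the agent argument and the code source location). The field and method names are my own guesses for illustration, not the actual agent code:

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.CodeSource;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.regex.Pattern;

public class TimeMachineTransformer implements ClassFileTransformer {
   private final List<Pattern> classPatterns; // parsed from the agent argument, e.g. paci/.* and org/joda/.*
   private final Pattern codeSourcePattern;   // e.g. .*joda-time.*\.jar — which jars/applications to touch

   public TimeMachineTransformer( List<Pattern> classPatterns,Pattern codeSourcePattern ) {
      this.classPatterns = classPatterns;
      this.codeSourcePattern = codeSourcePattern;
   }

   @Override
   public byte[] transform( ClassLoader loader,String className,Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,byte[] classfileBuffer )
         throws IllegalClassFormatException {
      if( !matchesAnyPattern( className ) || !matchesCodeSource( protectionDomain ) ) {
         return null; // returning null means "leave the class untouched"
      }
      return instrument( classfileBuffer ); // run the ASM visitors shown earlier in this post
   }

   private boolean matchesAnyPattern( String className ) {
      // className comes in internal form, e.g. paci/TimemachineTest, so the slash-style regexes match
      for( Pattern pattern : classPatterns ) {
         if( pattern.matcher( className ).matches( ) ) {
            return true;
         }
      }
      return false;
   }

   private boolean matchesCodeSource( ProtectionDomain protectionDomain ) {
      if( protectionDomain == null || codeSourcePattern == null ) {
         return true; // bootstrap classes have no protection domain
      }
      CodeSource codeSource = protectionDomain.getCodeSource( );
      return codeSource == null || codeSource.getLocation( ) == null
             || codeSourcePattern.matcher( codeSource.getLocation( ).toString( ) ).matches( );
   }

   private byte[] instrument( byte[] classfileBuffer ) {
      // ... ClassReader / ClassWriter plus the MethodVisitor hooks from this post ...
      return classfileBuffer;
   }
}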

But it breaks my application server!



Oh yeah, that's what happened to me. Since WebSphere Application Server, or actually some of its components, uses ASM for its own purposes, it's possible to blow up the server with a version mismatch. This in turn means classloader hacking. It's doable and not hard to implement, but it's another story.



The full source of the agent can be found here.


Okie dokie, we are set with our time machine exercise.