Monday, December 17, 2012

Mixing Java and dynamic scripting languages


I'm done with the assembly language (so far), so it's time to publish some high(er)-level stuff.

Problem


Several years ago I was a team member of a project which implemented a framework targeting insurance systems. The framework is based on IBM's specification framework, which in turn is the basis of IAA. Among other things, the specification framework can be used to form a network of agreements and various elements attached to agreements (requests, properties, roles, calculations and rules), possibly without any coding. Many objects in the framework extend StructuredActual (e.g. Agreements and Roles), so this class is commonly found in the interfaces. All "external" components are modeled as role players (e.g. a party acting as a policy holder) and are not part of the framework per se.

Agreements are "instances" of another concept called a product. A product defines the properties and other elements associated with an instance of an agreement. The product itself is parametrized with data from a configuration database or an XML file. This means we can build a network of agreements without coding a single line of Java. Data can be automatically persisted without any intervention from programmers. On the other hand, things like derived properties (e.g. values relative to the current time) and rules must be coded in Java, or there must be some other language to describe their behavior. Initially we decided to go with plain Java, but left the door open for other options. The reasons behind the Java-only approach were:


  • Performance. With Java we can have optimal performance.
  • Debuggability. Since everything is in Java, it's quite easy to figure out what's going on with a debugger.
  • Avoidance of proprietary languages. We didn't want to invent our own yet-another programming language.
  • Skills. We know Java and it's easier to hire people to work with the framework since it's based on one of the most popular languages.


Basically there's nothing wrong with this approach, but there are some things which could make life easier. For example, class Agreement is a kind of chameleon class, and its properties are configured through the product without any coding work. This in turn means that the properties must be modeled as instances of class Property (carrying information such as type, default value, whether its value is mandatory or not, ...). Since these properties live inside a map, and each instance of class Agreement can have its own properties, we can't have normal JavaBean-style accessors for the properties. Instead, the properties are accessed like this:

Double value = agreement.getPropertyValue( "premium" );

It gets a little bit uglier when multiple properties are accessed, for example in a derived property formula (the logic behind an actual property). Here's a sample:

public class NetPriceFormula extends Formula<Double> {
   public Double calculate( StructuredActual context ) {
      return context.getPropertyValue( "premium" ) *
             context.getPropertyValue( "discountPercent" ) / 100.0;
   }
}

And if we are dealing with java.math.BigDecimal, it gets much uglier:


public class NetPriceFormula extends Formula<BigDecimal> {
   public BigDecimal calculate( StructuredActual context ) {
      return context.getPropertyValue( "premium" ).
                multiply( context.getPropertyValue( "discountPercent" ) ).
                   divide( new BigDecimal( "100.0" ) );
   }
}


At the end of the day, the formula should look like this:

context.premium * context.discountPercent / 100

Also, I didn't like the idea of having zillions of really small classes. That's why I decided to give several scripting languages a try.

Using script languages with Java


To define a DSL for the framework, I tried Javascript, Clojure and Groovy to implement simple property access. They each have their strengths and weaknesses, but in this context I was only interested in syntax, datatypes, threading, performance and required skills. The results were:

Clojure


  • Syntax. Although I have worked with LISP in the past, I have to say I really didn't enjoy the syntax. I found it horrible.
  • Datatypes. Lists and some other things do not map well to Java.
  • Threading. Clojure seems to use unmanaged threads inside the scripting engine which makes it impossible to use it within Java EE.
  • Required skills. How many programmers know Clojure syntax?


Javascript


  • Lack of metaprogramming. It's not possible to intercept property accessor calls with the version that's bundled with Java 6. At least not easily.
  • Datatypes. The typing system of Javascript doesn't match well with Java, and you don't have control over it.
  • Performance was not optimal.
  • Threading. Looks like Javascript is not using unmanaged threads.
  • Required skills. Many developers write Javascript anyhow, so skill-wise it could have been a perfect match.


Groovy


  • Metaprogramming is supported. This enabled easy property access and the addition of several well-known pseudo properties. In general, Groovy seems to be a highly extensible language.
  • Datatypes. Groovy uses Java objects. Most Groovy datatype extensions are done in a non-intrusive manner using metaclasses.
  • Threading. Looks like Groovy is not using unmanaged threads.
  • Excellent performance.
  • Required skills. The syntax and types are somewhat similar to Java's. And closures are coming in Java 8 anyway.
  • Groovy is object-oriented. This is irrelevant, but since I like to tease functional programming fundamentalists, I couldn't help mentioning it.
  • Groovy is led by SpringSource. I don't feel comfortable whenever I hear the word Spring.


The drawbacks of Clojure are murderous, especially the use of unmanaged threads. Uncontrolled use of threads may cause deadlocks and will certainly cause problems with application server facilities. Think about a situation where a synchronized method calls a script which in turn makes a callback to another synchronized Java method; if the callback runs on the same thread it causes no problems, but on another thread you are in a deadlock (the sketch below shows the shape of the problem). Due to these threading issues I didn't measure the performance of Clojure at all and decided to dump it. Javascript was performing poorly and its datatypes are somewhat limited and incompatible with Java, so Javascript was also dumped. Only Groovy survived the initial filtering.
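
Here's that deadlock scenario as a contrived, plain-Java sketch (not code from the actual project): a synchronized method hands work off to another thread, which then calls back into a synchronized method on the same object.

public class CallbackDeadlock {

   public synchronized void evaluate( ) throws InterruptedException {
      // Stand-in for "the script engine runs the callback on its own thread".
      Thread scriptThread = new Thread( new Runnable( ) {
         public void run( ) {
            callbackFromScript( );
         }
      } );
      scriptThread.start( );
      scriptThread.join( );   // we keep holding the monitor while waiting...
   }

   public synchronized void callbackFromScript( ) {
      // ...and this thread blocks forever trying to acquire the same monitor.
      System.out.println( "never reached" );
   }

   public static void main( String[] args ) throws InterruptedException {
      new CallbackDeadlock( ).evaluate( );   // hangs
   }
}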

Using Groovy and Java together


Metaclasses


Groovy has some excellent features which support dynamism. For example, metaclasses enable nifty property accessors (see Using invokeMethod and getProperty for more information). Here's how property access can be customized using Groovy metaclasses:

def getterClosure = { name ->
   switch( name ) {
      case "objectId" :
         return delegate.getObjectReference( )?.getId( )
      default :
         return delegate.getPropertyValue( name )
   }
}

def setterClosure = { name,value ->
   def property = delegate.getPropertyOfKind( name )
   def type = property.getSpec( ).getType( )
   if( type == value.class ) {
      property.setValue( value )
   }
   else {
      property.setValue( value != null ? convert( value,type ) : null )
   }
}

Agreement.metaClass.getProperty = getterClosure
Agreement.metaClass.setProperty = setterClosure
Role.metaClass.getProperty = getterClosure
Role.metaClass.setProperty = setterClosure

This means all properties of Agreement and Role are accessed through the closures defined above. Whenever a property is read from an Agreement or Role object, Groovy calls the closure assigned to getProperty on the object's metaclass. For example, the getterClosure maps the property read agreement.premium to the method call agreement.getPropertyValue( "premium" ). Likewise agreement.objectId is mapped to delegate.getObjectReference( )?.getId( ). Notice the use of the safe navigation operator (?.) after the getObjectReference call: it returns null and doesn't call getId if the return value of getObjectReference is null. Now, after evaluating the script above, I can access properties of Role and Agreement objects like this:

context.netPrice = context.premium * context.discountPercent / 100

Groovy implementation of derived properties, episode I


I started the Groovy implementation of derived properties with functions. The functions looked like this:

def DerivedProperty_netPrice( context ) { context.premium * context.discountPercent / 100 }

I evaluated this generated script once and invoked it using javax.script.Invocable.invokeFunction (roughly as in the sketch below). I ran the unit tests and it was all good. I was also happy with the performance. However, when I added more tests, all of a sudden something unexpected happened: invokeFunction threw a NoSuchMethodException. Thank goodness I had the sources, which enabled me to debug what was going on. I had a hunch that it might have something to do with garbage collection, since it always happened after a certain amount of looping. By debugging GroovyScriptEngineImpl, I found my function in the globalClosures map. It was hanging there, but the map actually contained a SoftReference, which was cleared by the JVM without any warning and thus returned null. However, during my debugging session I noticed that classes do not behave the same way. I decided to embed the actual logic inside a class.
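
Here's a rough reconstruction of that wiring (not the original code; it assumes the Groovy JSR-223 engine is on the classpath, and a plain Map stands in for the real StructuredActual context):

import java.util.HashMap;
import java.util.Map;
import javax.script.Invocable;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class FunctionBasedFormula {
   public static void main( String[] args ) throws Exception {
      ScriptEngine engine = new ScriptEngineManager( ).getEngineByName( "groovy" );
      engine.eval( "def DerivedProperty_netPrice( context ) { context.premium * context.discountPercent / 100 }" );

      Map<String,Object> context = new HashMap<String,Object>( );
      context.put( "premium",200.0 );
      context.put( "discountPercent",10.0 );

      // Works fine until the SoftReference holding the closure gets cleared,
      // after which invokeFunction starts throwing NoSuchMethodException.
      Invocable invocable = ( Invocable ) engine;
      Object netPrice = invocable.invokeFunction( "DerivedProperty_netPrice",context );
      System.out.println( netPrice );   // 20.0
   }
}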

Groovy implementation of derived properties, episode II


After having problems with functions, I changed my implementation to generate classes instead of functions. Here's a sample:

@groovy.transform.Immutable
class DependentPropertyFormula_netPrice_1_0 { 
   java.lang.Double calculate( context ) {  
      context.premium * context.discountPercent / 100
   }
}
new DependentPropertyFormula_netPrice_1_0( )

Again, in the beginning I evaluate the code snippet above and keep the return value. The return value of the ScriptEngine.eval call is the instance returned on the last line of the generated script. I have to store the instance in order to call the method on it, and to prevent garbage collection of my precious object. The invocation of the calculate method changed from javax.script.Invocable.invokeFunction to javax.script.Invocable.invokeMethod, but the change was trivial. The code now looks like this:

   public class GroovyDerivedPropertyFormula<T> implements DerivedPropertyFormula<T> {
      private final Class<T> type;
      private final Invocable invocable;
      private final Object instance;

      GroovyDerivedPropertyFormula( String kind,Class<T> type,String version,ScriptEngine engine,String code ) throws ScriptException {
         this.invocable = ( Invocable ) engine;
         this.type = type;
         String classCode = String.format(
            "@groovy.transform.Immutable \n" +
            "class DependentPropertyFormula_%s_%s { \n" +
            "   %s calculate( context ) { \n" +
            "      %s \n" +
            "   } \n" +
            "} \n" +
            "new DependentPropertyFormula_%s_%s( )",
            kind,version,type.getName( ),code,kind,version );
         this.instance = engine.eval( classCode );
      }

      @Override
      public T calculate( StructuredActual context ) {
         try {
            return type.cast( invocable.invokeMethod( instance,"calculate",context ) );
         }
         catch( ScriptException | NoSuchMethodException e ) {
            throw new IllegalStateException( "Formula evaluation failed",e );
         }
      }
   }

This Java object is instantiated just once and thus it needs to be thread-safe. The same applies to the Groovy object, although the context given as a parameter doesn't have to be thread-safe.

Deployment of artifacts


When Java code is used to implement derived properties and other object types, the code must be deployed with the application. This may be a problem if multiple applications are using the same product model. But when the code is stored in the database together with the other data of the product model, the code is shipped automatically wherever it's needed. Cool!


A word of caution


Although it's easy to use scripting languages from Java, don't expect to be able to debug your script in its proper context. It may be possible, but it's not going to be easy. That's why I wouldn't use scripting for anything but very simple things.


Conclusion


As you can see, mixing Java and dynamic languages like Groovy is quite simple. I implemented all the object types so that it's possible to implement the actual business logic using Groovy, and it took just a few hours. If you, like most programmers, have any plans to implement a rule engine, why don't you give Groovy a try?

Friday, December 7, 2012

Intel x86 opcodes: a few samples

08.12.2012: updated a couple of bad misspellings.

This is actually just an addendum to my previous blog entry and doesn't make any sense if you are not familiar with Intel assembly language. Very, very low-level stuff.

Why CALL EAX is encoded to FFD0?


That's a very good question, indeed. Since this was one of the hardest things for me to understand in the context of hotpatching, I decided to make an additional note about instruction encoding. Let's start with a picture from the Intel Architecture Software Developer's Manual:

[Figure: the CALL instruction's opcode forms, from the Intel manual]

The mnemonic we are interested in is CALL. As can be seen from the reference, the primary opcode of CALL is FF. But wait, there are six other mnemonics with the same primary opcode. We need to tell the processor somehow that it's specifically CALL we are requesting. This is achieved via three bits in the following MOD R/M byte. Since we want to call a 32-bit address, the opcode-extension bits in the MOD R/M byte must be set to 2 (010). We also set the first two bits of the MOD R/M byte to ones in order to tell the processor that the R/M bits name the register which contains the actual address for our call. Now we have the bit sequence 11010000, and this in turn happens to be the mysterious D0 in our byte sequence FFD0. Finally, we can verify from the ModR/M table below that value zero (000) in the R/M part indeed means EAX when the MOD bits are ones. Since EAX encodes to zero, the register itself appears to have no effect on the value of the MOD R/M byte; if we had CALL EDX, the corresponding byte sequence would have been FFD2.
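
To make the bit arithmetic concrete, here's a tiny Java sketch of mine (not from the manual) which composes the MOD R/M byte for the FF /2 form of CALL exactly as described above:

// Composes the FF /2 encoding for CALL <reg32>.
// Register numbers: EAX=0, ECX=1, EDX=2, EBX=3, ESP=4, EBP=5, ESI=6, EDI=7.
public class CallEncoder {

   static int[] encodeCallReg( int reg ) {
      int mod = 0x3;             // MOD = 11: the operand is a register
      int opcodeExtension = 0x2; // /2 selects CALL within the FF group
      int modrm = ( mod << 6 ) | ( opcodeExtension << 3 ) | reg;
      return new int[] { 0xFF,modrm };
   }

   public static void main( String[] args ) {
      int[] callEax = encodeCallReg( 0 );   // EAX
      int[] callEdx = encodeCallReg( 2 );   // EDX
      System.out.printf( "CALL EAX -> %02X %02X%n",callEax[ 0 ],callEax[ 1 ] );   // FF D0
      System.out.printf( "CALL EDX -> %02X %02X%n",callEdx[ 0 ],callEdx[ 1 ] );   // FF D2
   }
}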

[Figure: 32-bit addressing forms with the ModR/M byte, from the Intel manual]

Why MOV EAX,<ADDRESS> is encoded to B8?


In the Intel Architecture Software Developer’s Manual, page 3-402, the 32-bit MOV operation is described as follows:

[Figure: MOV r32, imm32 is encoded as opcode B8+rd followed by the 32-bit immediate]

But what does the +rd mean in this context? Again, from the Intel Architecture Software Developer’s Manual we can see that the encoding of register EAX in the rd nomenclature is zero, and that's exactly what we are trying to tell the processor. Had it been MOV ECX,<imm32>, the instruction encoding would have been B9, as shown in the register table from the manual below.
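
And the same exercise as a quick sketch of mine (again, not from the manual): the opcode byte is B8 plus the register code, followed by the 32-bit immediate in little-endian byte order:

// MOV r32, imm32: opcode byte B8+rd followed by the immediate, least significant byte first.
public class MovEncoder {

   static int[] encodeMovRegImm32( int rd,int imm32 ) {
      return new int[] {
         0xB8 + rd,
         imm32 & 0xFF,( imm32 >> 8 ) & 0xFF,( imm32 >> 16 ) & 0xFF,( imm32 >> 24 ) & 0xFF
      };
   }

   public static void main( String[] args ) {
      printBytes( encodeMovRegImm32( 0,0x12345678 ) );   // B8 78 56 34 12  (MOV EAX,12345678h)
      printBytes( encodeMovRegImm32( 1,0x12345678 ) );   // B9 78 56 34 12  (MOV ECX,12345678h)
   }

   static void printBytes( int[] bytes ) {
      for( int b : bytes ) {
         System.out.printf( "%02X ",b );
      }
      System.out.println( );
   }
}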

[Table: register codes used with +rd (EAX=0, ECX=1, EDX=2, EBX=3, ESP=4, EBP=5, ESI=6, EDI=7), from the Intel manual]

Does this satisfy your question, Abu? From now on, you are ready to throw all languages (especially functional ones) into the dumpster and do all your coding directly in machine language ;-)

Saturday, December 1, 2012

Hotpatching, episode II

Update 2.12.2012: changed the links to red so that they can be seen, and modified some text to be more meaningful.

After publishing my first hotpatching article, I started to think about concurrency issues with it. Since I'm creating a thread and changing the values of global variables, there may be happens-before problems. These changes must be visible to all threads immediately after the change, since the pointers to the original functions may otherwise still point to NULL and cause an access violation. I also started to think about the dangers of using DllMain. After reading the article about more complete DLL injection by Drew Benton, I decided to make another version of my DLL injection based on his ideas.

I used Drew's code as a basis and modified it as needed. It took a while to really get into the code. Meanwhile I was enjoying several access violations caused by my almost-correct "assembly code". Since the "shippable" code is written using raw x86 opcodes, even assembly language can be considered high-level compared to it. This actually reminded me of Java bytecode engineering. If you want to understand what's going on in the code, this reference may be helpful. And if you found my previous hotpatching article too technical, I don't recommend proceeding with the code.

To me this was a real eye-opener: it's relatively easy to send whatever code you want to be executed by a given process. Just use a couple of x86 opcodes to break into the process by calling your own function in the DLL. It opens endless possibilities. (Whether I ever need them is a different story.)

The thread-safe DLL code is here and the code doing the actual injection is here. Have fun with them ;-) And by the way, don't ever use CriticalSections the way I did in the DLL. I'm using GNU C++ and thus didn't have structured exception handling available, so I decided to leave out exception handling completely.

Thursday, November 29, 2012

How's it hanging?


Once again I found myself going deeper into the world of low-level programming. This all began with a hung process caused systematically by the "collaboration" of two products. Let the offender be anonymous, while the other party was BMC's Control-M, running as a Windows service.

Finding a correct location


How did I find out what was wrong? It all started by using Process Explorer from Sysinternals. From the threads view I could see that a certain DLL was always stuck in pretty much the same place. Actually it wasn't completely frozen, but for some reason it couldn't proceed. This was a good start, but it still didn't lead me deeper into the problem at hand. I only knew that a thread was stuck, and in the middle of the call stack I found a FindWindow call, which was from now on my primary suspect. I googled FindWindow within a Windows service, but I couldn't find any clues about the hanging behavior. I decided to take a look into the DLL I found in the call stack. I searched for FindWindow and found this:

 L30003277:
  push 00000000h
  call [KERNEL32.dll!Sleep]
  push edi
  push 00000000h
  call [USER32.dll!FindWindowW]
  mov esi,eax
  test esi,esi
  jz L30003277

Wow! Although I'd say I found it by accident, I instantly knew this is it (or shit); a code snippet looping until a certain window handle is found. And since this code is run under a noninteractive window station, it will never find the window. However, I wanted more evidence, so I downloaded API Monitor from Rohitab.com. Some people say it has its deficiencies, but at least in this case this tool was extremely valuable. I ran the code with it and voilà: I caught the looping FindWindow code red-handed. In the call stack I saw the very same address I had found with the disassembler. The case seemed to be closed. We actually got a patch during the same day, but it wasn't because of my discoveries; the case had already been reported by someone else.

Anyway, a few days later I was discussing the case with a colleague, who asked if I knew how API Monitor does its magic. I had to admit I had no idea, but I immediately knew this wouldn't be a long-lasting answer. I had to find out how they do it! Enter hotpatches.

Hotpatches


Hotpatching allows you to patch a running process without stopping it. This mechanism can be used to implement API spying, software cracking and malware, just to name a few uses. What makes it possible is a sequence of five bytes just before the function and two bytes (doing basically nothing) at the beginning of the function. The instructions before the function are NOPs, while the very first instruction in the function is mov edi,edi. Here are some examples from USER32.DLL:

Disassembly of MessageBoxW looks like this:


7DCBFECA  90 db 90h;   '?'
7DCBFECB  90 db 90h;   '?'
7DCBFECC  90 db 90h;   '?'
7DCBFECD  90 db 90h;   '?'
7DCBFECE  90 db 90h;   '?'
7DCBFECF                           MessageBoxW:
7DCBFECF  8BFF mov edi,edi
7DCBFED1  55 push ebp
7DCBFED2  8BEC mov ebp,esp


while MessageBoxA looks like this:


7DCBFEA9  9090909090 Align 2
7DCBFEAE                           MessageBoxA:
7DCBFEAE  8BFF mov edi,edi
7DCBFEB0  55         push ebp
7DCBFEB1  8BEC mov ebp,esp


However, the thing is that either way there are five NOPs (0x90) before the actual function entry, and these bytes can be rewritten to jump to our own code; the two bytes at the function entry are then overwritten with a short jump back onto that five-byte jump (see the sketch below). But how do we install our hotpatch into a given process?
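
Here's a small Java sketch of the byte arithmetic (mine, not the code of any real patcher; the padding address is the MessageBoxW one from the listing above, and the detour address is made up). One common way is to write an E9 relative jump over the five NOPs and the two-byte short jump EB F9 over mov edi,edi, which lands back on that E9:

public class HotpatchBytes {

   // E9 <rel32>, where rel32 is relative to the end of the 5-byte instruction.
   static int[] relativeJump( int from,int to ) {
      int rel32 = to - ( from + 5 );
      return new int[] {
         0xE9,
         rel32 & 0xFF,( rel32 >> 8 ) & 0xFF,( rel32 >> 16 ) & 0xFF,( rel32 >> 24 ) & 0xFF
      };
   }

   public static void main( String[] args ) {
      int padding = 0x7DCBFECA;   // the five NOPs before MessageBoxW
      int detour  = 0x10001000;   // hypothetical address of our own replacement function

      int[] longJump  = relativeJump( padding,detour );
      int[] shortJump = { 0xEB,0xF9 };   // jmp -7: from MessageBoxW+2 back onto the padding

      print( "bytes written over the NOPs    : ",longJump );
      print( "bytes written over mov edi,edi : ",shortJump );
   }

   static void print( String label,int[] bytes ) {
      System.out.print( label );
      for( int b : bytes ) {
         System.out.printf( "%02X ",b );
      }
      System.out.println( );
   }
}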

DLL injection


There's a really neat way to inject code into a given process by using CreateRemoteThread. With this function we can start a thread in the process and execute our code there. The mechanism, called DLL injection, is actually quite simple and I was able to do it even after many years of C++ hibernation. Although it's definitely possible to hotpatch a running process, I will describe how to start a process so that hotpatching is active right from the beginning. This means I am creating the process. The basic principle goes like this:

1. Start a process so that dwCreationFlags are OR'ed with CREATE_SUSPENDED (CreateProcess).
2. Grab a module handle to KERNEL32.DLL (GetModuleHandle).
3. Retrieve the address of LoadLibraryA from kernel32 (GetProcAddress).
4. Allocate memory for the DLL path in the remote process (VirtualAllocEx) and write the DLL path to it (WriteProcessMemory).
5. Create a remote thread in the remote process (CreateRemoteThread) and set its starting address to the previously retrieved address of LoadLibraryA.
- The injected DLL is now loaded and its DllMain is run. This is where the hotpatching is done.
6. Wait for the thread to terminate.
7. Resume the primary thread of the remote process.
8. Clean up things in the calling process.

If the process is already running, one might try to pause it by calling SuspendThread to calm down activities, but I don't think it's mandatory.

The source code of the injected DLL can be found here and the initiator of the injection is here. Beware that the level of error handling is next to nothing.

Monday, November 5, 2012

Bit twiddling


Recently I had an interesting problem at hand: we needed to port a kind of hashing (or scaling) algorithm from mainframe assembler to another platform which doesn't have bit shifting operations. I started my journey by verifying what the assembler routine actually does. Although I've done some x86 assembler in the past and even some assembler for z-machines, it turned out to be an interesting session. In the beginning I was totally lost with the assembler source code for the following reasons:


  • Registers in mainframe have really nice names: they are just numbers from 0 to 15. Which operand represents a value and which represents a register?
  • Instructions in mainframe assembler do not use too many letters: N stands for AND, XR stands for XOR, ...
  • Some instructions are manipulating multiple registers at once. For example SRDL 2,16 shifts contents of register 2 to register 3 by 16 bits.


Soon I realized I had to consult a colleague to figure out what was going on here. During a half-hour session with him, I ported the logic to Java just to understand how the algorithm works. In the end the algorithm turned out to be quite simple: shift the base value to the left by 4 bits, reverse the bits and shift the result again to the left by 3 bits. By shifting the bit sequence one position less to the left after reversing, we can guarantee that the hash value never gets negative (the 32nd bit is always zero). The lowest 4 bits were used for a certain purpose which is not relevant to this story. The resulting C++ routine is quite short, but performance-wise far from optimal. The initial code looks like this:

#include <cstdint>

uint32_t reverseBits( uint32_t num );

int32_t hash( uint32_t base,uint32_t* hash ) {
    if( base > 0x07FFFFFF )
        return -1;
    *hash = reverseBits( base << 4 ) << 3;
    return 0;
}

uint32_t reverseBits( uint32_t num ) {
    uint32_t reverse_num = 0;
    for ( int i = 0; i < 32; i++ ) {
        if( num & ( 1 << i ) )
            reverse_num |= 1 << ( ( 32 - 1 ) - i );
    }
    return reverse_num;
}

And here's a sample, showing bit twiddling with base value 25:

Base (dec 25): 0000 0000 0000 0000 0000 0000 0001 1001
After 1st shift: 0000 0000 0000 0000 0000 0001 1001 0000
After reversing: 0000 1001 1000 0000 0000 0000 0000 0000
After 2nd shift: 0100 1100 0000 0000 0000 0000 0000 0000

Since almost all platforms can use DLLs/shared objects, I decided to port my naive implementation to C++ and compile it into a DLL. The results were identical on Windows, but being paranoid, I also compiled the module on z/OS in 64-bit AMODE. Again, the results were identical, even though the program was now 64-bit and the byte order on the z/OS machine is big-endian.

I didn't invent the bit-reversing algorithm; I just found it on the Internet. For some reason, I decided to arrange a "race" between the Java and C++ implementations. In the beginning the results were shocking; even after full optimizations the C++ implementation was more than eight times slower. But isn't C++ supposed to be faster than Java? After looking deeper under the hood, it was clear what made the difference: looping. Modifying the way bits are reversed gave an incredible boost. A better bit-reversing routine looks like this:

uint32_t reverseBits( uint32_t num ) {
    num = ( num & 0x55555555 ) << 1 | ( num >> 1 ) & 0x55555555;
    num = ( num & 0x33333333 ) << 2 | ( num >> 2 ) & 0x33333333;
    num = ( num & 0x0F0F0F0F ) << 4 | ( num >> 4 ) & 0x0F0F0F0F;
    num = ( num << 24 ) | ( ( num & 0xFF00 ) << 8 ) | ( ( num >> 8 ) & 0xFF00 ) | ( num >> 24 );
    return num;
}

However, since the target platform actually runs on multiple operating systems (at least Windows and AIX), using C/C++ requires compilation on multiple platforms. For this reason a colleague of mine implemented another solution which doesn't need bit shifting operations and thus can be implemented directly on the target platform. Of course, the performance of the routine dropped again. To keep the number of languages to a minimum, I'm showing this approach again in C++.

#include <cmath>
#include <cstdint>

uint32_t reverseBits( uint32_t base );

int32_t hash( uint32_t base,uint32_t* hash ) {
    if( base > 0x07FFFFFF )
        return -1;
    *hash = reverseBits( base * 16 ) * 8;
    return 0;
}

uint32_t reverseBits( uint32_t base ) {
    uint32_t reversed_num = 0;
    for( int bitpos = 32; base != 0; ) {
        bitpos--;
        if( base % 2 == 1 )
            reversed_num += ( uint32_t ) pow( 2.0,( double ) bitpos );
        base /= 2;
    }
    return reversed_num;
}

Finally, being a Java fella, I conclude this episode with a Java implementation of the hash routine. This implementation is the easiest and performance-wise comparable to C++ (which is not a big surprise, since the algorithm is the same). Here you are:

Integer.reverse( base << 4 ) << 3;
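
For completeness, here's the one-liner wrapped with the same range check as the C++ versions (my sketch, nothing new algorithm-wise), printing the values for the examples used in this post:

public class ReverseHash {

   // Integer.reverse does the 32-bit bit reversal, so the whole hash is one line.
   static int hash( int base ) {
      if( base < 0 || base > 0x07FFFFFF ) {
         throw new IllegalArgumentException( "base out of range: " + base );
      }
      return Integer.reverse( base << 4 ) << 3;
   }

   public static void main( String[] args ) {
      System.out.println( hash( 25 ) );    // 1275068416 (the worked example above)
      System.out.println( hash( 100 ) );   // 318767104
      System.out.println( hash( 101 ) );   // 1392508928
      System.out.println( hash( 102 ) );   // 855638016
   }
}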

Now the big question is: what is all this bit twiddling good for? The reason why the bits are reversed is the way DB2 for z/OS locks work, especially in data sharing mode. Since the number under manipulation is used as a clustering key, DB2 for z/OS tries to place consecutive numbers close to each other, possibly in the same page. This in turn means that inserts with consecutive numbers hit the same page at the same time and disturb each other due to exclusive locks on the page; the page becomes "hot". But after the bit reversal, consecutive numbers are not consecutive anymore and thus are not placed into the same page (if there is more than one page, of course). Let's take a simple example with the numbers 100, 101 and 102. The hashed values are 318 767 104, 1 392 508 928 and 855 638 016, respectively. The differences are huge and thus the data lands evenly all around the pages. The new platform will not need this functionality per se, but it needs to be compatible with the numbers in the legacy system. That's it.

Sunday, November 4, 2012

REST


Long time no see. My brains have been in REST for a while, and I decided to share my thoughts about it. Beware RESTafarians.

SOAP vs REST WebServices


There are certain differences between these two approaches. Here's a short list to name just a few:


  • SOAP is about services while REST is about resources. 
  • REST interface is standardized by HTTP (GET, POST, PUT, DELETE, ...) and that's what (according to RESTafarians) makes it easy. SOAP places no restrictions on the interface, but service producer must design and describe each and every SOAP interface.
  • SOAP services have a description language (WSDL) while REST services do not. Some RESTafarians may disagree with me and praise WADL, but here's what wikipedia says about WADL: "WADL was submitted to the World Wide Web Consortium by Sun Microsystems on 31 August 2009, but the consortium has no current plans to standardize it and it is not yet widely supported. WADL is the REST equivalent of SOAP's Web Services Description Language (WSDL), which can also be used to describe REST web services.". And as we know, wikipedia is not wrong. Ever. What this means, is that you should not expect any proxy code generation based on description of the service in hand.
  • Browsers can invoke REST services natively while SOAP services require major hassle. For REST based services this is a big plus.
  • SOAP webservices are based on XML while REST services use content negotiation and should be able to produce multiple representations (like JSON).
  • REST responses are meant to be cached and the caching infrastructure is well understood and widely available. With SOAP you are on your own.
  • REST security is quite simple and is purely based on HTTP and things like SPNEGO. Don't expect anything beyond what plain HTTP/HTTPS has to offer.
  • Since SOAP is not tied to HTTP, you can build and consume SOAP WebServices without fully understanding HTTP. With REST services you'd better know the crucial parts of HTTP (e.g. response codes) by heart.


REST used directly from a browser 


I have absolutely nothing against using REST directly from a browser. Browsers implement all the necessary functionality to invoke REST services correctly. They are built to deal with HTTP messages, and understanding various HTTP response codes is business as usual for them. Similarly, passing the user identity to REST services is mostly taken care of by the browser. And if you want SSO, the chances are good that you have a working solution already. For example, if your clients have a SPNEGO-enabled browser and your application server also knows SPNEGO, implementing SSO is a piece of cake; the identity of the user is passed all the way down to your REST service. Browsers also know how to handle different representations, be it JSON, XML, HTML or almost whatever.

REST used in application-to-application communication


Communication between applications with REST is something I just don't buy. While it's relatively easy to build REST services, consuming them may not be so simple. For example, consuming a REST service from WebSphere Application Server (and from most Java EE application servers) means you have to manage user identity by yourself; nobody's propagating the user identity for you. And although appending a couple of parameters to a URL is simple, handling the response may not be that simple. First of all, you must understand the HTTP response code. You also must understand the resource representation returned, and soon you will end up having all kinds of content parsers within your application (e.g. some kind of shitty open-source JSON parser). While these kinds of parsers probably exist, I wouldn't expect quality even close to JAXB. Are you now wondering why I'm foaming about things like JSON parsers, since REST services can also produce XML? Why not just use XML and JAXB? Of course you can, but what does it actually mean? It means you end up building by yourself the very same things SOAP toolkits/proxies/... provide for you. Below is a rough sketch of what the consumer side has to deal with.
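
This is a bare-bones sketch (mine; the URL and resource are made up) of consuming a REST service with nothing but HttpURLConnection: content negotiation, status codes and body parsing are all on you.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class RestClientSketch {
   public static void main( String[] args ) throws Exception {
      URL url = new URL( "http://example.com/agreements/42" );
      HttpURLConnection connection = ( HttpURLConnection ) url.openConnection( );
      connection.setRequestProperty( "Accept","application/json" );

      int status = connection.getResponseCode( );
      if( status == HttpURLConnection.HTTP_OK ) {
         BufferedReader reader = new BufferedReader(
            new InputStreamReader( connection.getInputStream( ),"UTF-8" ) );
         try {
            StringBuilder body = new StringBuilder( );
            String line;
            while( ( line = reader.readLine( ) ) != null ) {
               body.append( line );
            }
            // Now hand the raw JSON over to whatever parser you chose...
            System.out.println( body );
         }
         finally {
            reader.close( );
         }
      }
      else if( status == HttpURLConnection.HTTP_NOT_FOUND ) {
         System.out.println( "No such agreement" );
      }
      else {
         throw new IllegalStateException( "Unexpected HTTP status: " + status );
      }
   }
}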

Conclusion


I think the simplicity of REST is vastly overvalued. You need to know this and that to make it work. Localization/internationalization is a good example: how do you request representations in multiple languages? Are the URIs the same or not? Is the "Accept-Language" header the way to go or not? Search the web and you'll find pretty nasty debates going on about this.

So, at the end of the day I'm still in favour of using SOAP between applications, and REST if the service is consumed directly by the browser. RESTafarians will of course ask the big question: "how would you know who is consuming your service and why would you even care?"

Wednesday, August 1, 2012

Enterprise-class Javascript, part III


The third episode deals with global variables and encapsulation. Let's start with globals. Global variables have been a curse in almost every programming language for decades, but many Javascript developers (most of the ones I know) still use global variables and functions all the time. Some of them may accidentally define functions which are not global (especially with jQuery), but don't really understand the difference. Why do they use globals? I think there are multiple reasons. First of all, many web developers don't have a real programming background; instead they were once forced to widen their view of the world from plain HTML to Javascript. Secondly, Javascript has implied global variables and you may be using global variables by accident: if you do not explicitly declare a variable, a global variable is implicitly declared for you. And last but not least, Javascript has been written this way in the past; writing Javascript was considered just a mandatory plague and typically it was written by copycats without straining their brains too much.

In Javascript the global object is the global namespace that holds the top-level functions and global variables. Variables which are not explicitly defined are implied globals, and their names are also kept in the global object. Writing functions into the global scope is roughly the same thing as writing Java classes into the default package or C# classes into the global namespace. This inevitably leads to collisions, and that's why many programming and markup languages have packages (e.g. Java) or namespaces (C#, XML, C++, ...). It's also the reason why global functions do not lead to reusable code which could be published as a library. In reality you must touch some global variables and functions, but don't put your own stuff there. I found a really nice depiction of the global scope by Dmitry Baranovskiy. He compared JavaScript’s global scope to a public toilet: “you can’t avoid going in there, but try to limit your contact with surfaces when you do.”


As an example, let's assume for a second that jQuery and Underscore had been implemented as sets of global functions. They share so many function names that you couldn't use them together; everything would blow up instantly. Fortunately they reserve nothing but one or two slots in the global object: jQuery exposes 'jQuery' and '$', underscore.js just '_'. It doesn't matter how the functions and variables are named within the context of the exposed global variable. If you are looking for trouble, try using '$' as your global variable name. On the other hand, (sh)it also happens sooner or later when you use names like database, db, application, app and the like; the moment you start using another ill-behaving library (in addition to yours) you are screwed.


In addition to keeping your stuff out of the global scope, you should also embrace encapsulation. One particularly easy way to achieve this is to use RequireJS, which implements the module pattern for you. Now it's time to show how to do something without global variables and keep your module internals hidden. First we define a module without any global vars. Here it is:

define( function( ) {
   var counter = 0;

   function incr( ) {
      counter++;
   }

   return {
      getAndIncrement : function( ) {
         var tmp = counter;
         incr( );
         return tmp;
      },
      reset : function( ) {
         counter = 0;
      }
   };
} );

Why aren't the functions and variables in the example above global? Because they are defined in the scope of the anonymous factory function, the one and only parameter to define. They cannot be referenced from outside of their enclosing scope. The object literal returned from the factory function is going to be the public interface of your module, and thus the consumer side of our module could look like this:

<script type="text/javascript">
   require( [ 'counter' ],function( cnt ) {
      alert( cnt.getAndIncrement( ) );
      alert( cnt.getAndIncrement( ) );
      cnt.reset( );
      alert( cnt.getAndIncrement( ) );
   } );
</script>

The series of calls shows the values 0, 1, 0, leaving the counter at 1. If you use the same module again (maybe in another script block later on the same page), you will see that the counter variable is shared between those scripts. The reason is that the module loader executes the factory function (the function passed to define) just once, and thus the counter variable declared inside the module is allocated and initialized just once. If this is not what we want, we can modify our module a little bit so that it returns a constructor function instead. This makes it possible to create multiple instances of our world-famous counter object. The new version now looks like this:

define( function( ) {
   function incr( self ) {
      self.counter++;
   }

   function Counter( ) {
      this.counter = 0;
   }

   Counter.prototype.getAndIncrement = function( ) {
      var tmp = this.counter;
      incr( this );
      return tmp;
   };

   Counter.prototype.reset = function( ) {
      counter = 0;
   };

   // Return constructor function
   return Counter;
} );

while the consumer could look like this:

<script type="text/javascript">
   require( [ 'counter2' ],function( Counter ) {
      var c1 = new Counter( );
      var c2 = new Counter( );
      alert( c1.getAndIncrement( ) );
      alert( c2.getAndIncrement( ) );
      alert( c1.getAndIncrement( ) );
      alert( c2.getAndIncrement( ) );
      c1.reset( );
      alert( c1.getAndIncrement( ) );
   } );
</script>

This time the series of alerts shows the values 0, 0, 1, 1, 2. Oops, what just happened? Why didn't the reset function work? Because the code is missing the this keyword in front of counter. And because there's no var keyword either, it actually sets a global variable called counter to 0. To fix it we just add the this keyword, and the code of the reset function now looks like this:

Counter.prototype.reset = function( ) {
   this.counter = 0;
};

The series of alerts now shows the values 0, 0, 1, 1, 0. That's what we expected.

In conclusion, even if you are writing nothing but random shit, I would suggest you do it right. It doesn't mean a huge overhead in either performance or productivity, and you are on the right track from the beginning. Read Global domination and JavaScript module-pattern in depth for more information.

At least one more episode is still coming, and it's about optimizations.