Friday, December 7, 2012

Intel x86 opcodes: a few samples

08.12.2012: updated a couple of bad misspellings.

This is actually just an addendum to my previous blog entry and doesn't make any sense if you are not familiar with Intel assembly language. Very, very low-level stuff.

Why is CALL EAX encoded as FFD0?


That's a very good question, indeed. Since this was one of the hardest things for me to understand within the context of hotpatching, I decided to write an additional note about instruction encoding. Let's start with a picture from the Intel Architecture Software Developer’s Manual:












The mnemonic we are interested in is CALL. As can be seen from the reference, the primary opcode of CALL is FF. But wait, there are six other mnemonics with the same primary opcode. We need to tell the processor somehow that it's a specific CALL we are requesting. This is achieved via 3 bits in the MOD R/M byte that follows the opcode. Since we want to call a 32-bit address, the opcode bits in the MOD R/M byte must be set to 2 (010). We also set the first two (most significant) bits of the MOD R/M byte to ones in order to tell the processor that the R/M bits name the register which contains the actual address for our call. Now we have the bit sequence 11 010 000, that is, 11010000. And this in turn happens to be the mysterious D0 in our byte sequence FFD0. Finally, we can verify from the picture below that the value zero (000) in the R/M part indeed means EAX when the MOD bits are ones. Because EAX is encoded as 000, the register itself contributes nothing to the value of the MOD R/M byte, which is why D0 consists of the MOD and opcode bits alone. If we had CALL EDX (R/M = 010), the corresponding byte sequence would have been FFD2.
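The bit fiddling above can be sketched in a few lines of Python. This is only an illustration, not an assembler: the names `REG32`, `modrm` and `encode_call_reg` are made up for this example, and the register numbers are the standard 32-bit codes used in the R/M field when MOD is 11.

```python
# Register codes for the R/M field when MOD = 11 (32-bit registers).
# The table name and the helper names below are hypothetical.
REG32 = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
         "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}

def modrm(mod, reg, rm):
    """Pack MOD (2 bits), REG/opcode extension (3 bits) and R/M (3 bits)."""
    return (mod << 6) | (reg << 3) | rm

def encode_call_reg(register):
    """Encode CALL r32: primary opcode FF, opcode extension 2, MOD = 11."""
    return bytes([0xFF, modrm(0b11, 0b010, REG32[register])])

print(encode_call_reg("EAX").hex().upper())  # FFD0
print(encode_call_reg("EDX").hex().upper())  # FFD2
```

Changing only the register name changes only the lowest three bits of the second byte, which is exactly the difference between D0 and D2.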



























Why is MOV EAX,<ADDRESS> encoded as B8?


In the Intel Architecture Software Developer’s Manual, page 3-402, the 32-bit MOV operation is described as follows (opcode B8+ rd, instruction MOV r32,imm32, description "Move imm32 to r32"):

But what does the + rd mean in this context? It means that the register code is added to the opcode byte itself. Again, from the Intel Architecture Software Developer’s Manual we can see that the encoding of register EAX in the rd nomenclature is zero, so B8 + 0 = B8, and that's exactly what we are telling the processor. Had it been MOV ECX,<imm32>, the opcode would have been B9, as shown in the table below.
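The same idea as a small Python sketch. The function name `encode_mov_r32_imm32` and the `REG32` table are hypothetical helpers for this post only. One detail not discussed above is assumed here: the 32-bit immediate follows the opcode in little-endian byte order, as x86 stores it.

```python
import struct

# Register codes in the "rd" nomenclature (hypothetical table name).
REG32 = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
         "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}

def encode_mov_r32_imm32(register, value):
    """Encode MOV r32,imm32: opcode B8 + rd, then the immediate."""
    opcode = 0xB8 + REG32[register]
    # "<I" packs an unsigned 32-bit integer in little-endian order.
    return bytes([opcode]) + struct.pack("<I", value)

print(encode_mov_r32_imm32("EAX", 0x12345678).hex().upper())  # B878563412
print(encode_mov_r32_imm32("ECX", 0x12345678).hex().upper())  # B978563412
```

Note that no MOD R/M byte is involved here at all: the register lives in the low three bits of the opcode byte.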














Does this answer your question, Abu? From now on, you are ready to throw all languages (especially functional ones) into the dumpster and do all your coding directly in machine language ;-)
