Once again I found myself going deeper in the world of low-level programming. This all began due to hang process caused systematically by the "collaboration" of two products. Let the offender be anonymous, while the other party was was BMC's Control-M, running as a Windows service.
Finding a correct location
How did I find what's wrong? It all started by using Process Explorer from Sysinternals. From the threads view I could see that certain DLL was always stuck pretty much in the same place. Actually it wasn't completely frozen but anyhow for some reason it couldn't proceed. This was a good start, but still it didn't lead me deeper into the problem at hand. I only knew that a thread was stuck and in the middle of the call stack I found FindWindow call, which was from now on my primary suspect. I googled FindWindow within a Windows service, but I couldn't find any clues of hang process behavior. I decided to take a look into DLL I found in the call stack. I searched for FindWindow and found this:
L30003277:
push 00000000h
call [KERNEL32.dll!Sleep]
push edi
push 00000000h
call [USER32.dll!FindWindowW]
mov esi,eax
test esi,esi
jz L30003277
Wow! Although I'd say I found it by accident, I instantly knew this is it (or shit); a code snippet looping until certain window handle is found. And since this code is run under noninteractive window station, it will never find the window. However, I wanted more evidence and I downloaded API Monitor from Rohitab.com. Some people say it has it's deficiencies, but at least in this case this tool was extremely valuable. I ran the code with it and vĂ³ila: I caught the looping FindWindow code red-handed. From the call stack I saw the very same address as I found with the disassembler. The case seemed to be closed. We actually got a patch during the same day, but it wasn't because of my discoveries. The case was already reported by someone else.
Anyway, few days later I was discussing with my colleague about the case, who asked if I know how the API monitor does it's magic. I had to say I have no idea, but immediately I knew this won't be a long lasting answer. I had to find out how they do it! Enter hotpatches.
Hotpatches
Hotpatches allows you to patch a running process without requiring that the process be stopped and thus the patch can be applied without even stopping the process. This mechanism can be used to implement API spying, software cracking and malware, just to name few uses. What makes it possible is the sequence of five bytes just before the function and two bytes (doing basically nothing) in the beginning of the function. The instructions before function are NOPs while the very first instruction in the function is mov edi,edi. Here are some examples from USER32.DLL:
Disassembly of MessageBoxW looks like this:
7DCBFECA 90 db 90h; '?'
7DCBFECB 90 db 90h; '?'
7DCBFECC 90 db 90h; '?'
7DCBFECD 90 db 90h; '?'
7DCBFECE 90 db 90h; '?'
7DCBFECF MessageBoxW:
7DCBFECF 8BFF mov edi,edi
7DCBFED1 55 push ebp
7DCBFED2 8BEC mov ebp,esp
while MessageBoxA looks like this:
7DCBFEA9 9090909090 Align 2
7DCBFEAE MessageBoxA:
7DCBFEAE 8BFF mov edi,edi
7DCBFEB0 55 push ebp
7DCBFEB1 8BEC mov ebp,esp
However, the thing is that either way there are five NOPs (0x90) before actual function entry. And these bytes can be rewritten to do far jump. But how to install our hot patch into given process?
DLL injection
There's a really neat way to inject code for a given process by using CreateRemoteThread. With this function we can start a thread in the process and execute our code there. The mechanism, called DLL injection, is actually quite simple and I was able to do it even after many years of C++ hibernation. Although it's definitely possible to hotpatch a running process, I will describe how to start a process so that hotpatching is active right from the beginning. This means I am creating the process. The basic principle goes like this:
1. Start a process so that dwCreationFlags are OR'ed with CREATE_SUSPENDED (CreateProcess).
2. Grab a module handle to KERNEL32.DLL (GetModuleHandle).
3. Retrieve the address of LoadLibraryA from kernel32 (GetProcAddress).
4. Allocate memory for DLL path in the remote process (VirtualAllocEx) and write DLL path to it (WriteProcessMemory)
5. Create a remote thread into remote process (CreateRemoteThread) and set it's starting address as previously taken address of LoadLibraryA.
- Injected DLL is now loaded and it's DllMain is run. This is where hotpatching is done.
6. Wait for the thread to terminate
7. Resume the primary thread of the remote process.
8. Clean up things in the calling process.
If the process is already running, one might try to pause the process by calling SuspendThread to calm down activities, but I think it's not mandatory.
The source code of injected DLL can be found here and the initiator of injection is here. Beware that the level of error handling is next to nothing.