Malwarebytes 2017 CrackMe Stage 1
Hello. This post marks the beginning of my blog where I will be posting writeups for reverse engineering challenges, as well as analyzing malware found in the wild. For this first post, I will be detailing my thought process and discoveries while analyzing the Malwarebytes 2017 CrackMe. Since this is my first blog post, I am not sure how much detail to cover and how exactly to go about it. That being said, this challenge was made for the beginner malware analyst, so I thought it would be a good place to start. I spent my free time the last few days working on this challenge. It has taken me longer than I initially expected, but I have learned a lot in the process.
For this challenge, I used a Kali Linux VM for static analysis and a Windows VM for dynamic analysis. The tools I used for static analysis were Ghidra and DIE (Detect It Easy). For dynamic analysis I used x32dbg. Additionally, I wrote some basic Python scripts to aid in the analysis.
First, I opened the provided Windows executable in DIE to get a basic overview of the binary. DIE did not tell me much, but I did find some interesting strings: “Nope :(”, “Better luck next time!”, and “HARDWARE\ACPI\DSDT\VBOX__”. The VBOX string is of particular interest, since it looks like a Windows registry key. I also looked through the imports, which included IsDebuggerPresent and CheckRemoteDebuggerPresent. Together, these told me to expect some antidebugging trickery, as well as anti-VM tricks based on the VBOX registry key.
I went ahead and ran the executable in my Windows sandbox. The program prints out ASCII art of the Malwarebytes logo, as well as some basic information about the CrackMe. According to the output, the flag will be in the format flag{…}.
data:image/s3,"s3://crabby-images/5f9fb/5f9fb4d587cb8b80ad074b1c5792d87e41939c37" alt=""
Now the fun begins. I opened the PE in Ghidra and let it perform its autoanalysis. Following from the entry point, the main function can be easily found due to the presence of the ASCII art.
data:image/s3,"s3://crabby-images/ce3df/ce3dfa020073728fd42627fe914663558710658d" alt=""
The main function prints out that banner, then calls another function and checks that function’s return value. If the function returns 0, the program outputs “I am so sorry, you failed! :(” and exits. Otherwise, it calls another function. The code for the first function looked a bit overwhelming at first glance. It calls a few functions and loads some hardcoded values onto the stack. At that point I was not sure what the code in this function was doing, so I decided to take a slight detour. Remembering the imports I saw in DIE, I searched for “IsDebuggerPresent” in the symbol tree and followed the XREF. This led me to a function which calls IsDebuggerPresent and CheckRemoteDebuggerPresent. I have seen this before in other reverse engineering CTFs, like the HackTheBox challenges, as a simple antidebugging trick.
data:image/s3,"s3://crabby-images/76c37/76c37658de3b8cca0a1784c43cd4fa4214413a01" alt=""
If there is no debugger present, the function modifies some global data. This pattern, performing some antidebugging checks then modifying global data when the checks are satisfied, occurs again and again in stage 1 of the CrackMe, as will be apparent later.
To better understand the program, I began performing dynamic analysis alongside the static analysis I was already doing. To do this, I loaded the program in x32dbg. I found IsDebuggerPresent in the intermodular calls window and used it to navigate to the “CheckTheDebugger” function I had inspected in Ghidra. The disassembly shows the calls to the imported functions as well as the if statement.
data:image/s3,"s3://crabby-images/496b5/496b55146ab22ae38bfe9212a798b3605b623207" alt=""
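I placed breakpoints on both of those function calls and executed the program. I stepped over both calls, and when CheckRemoteDebuggerPresent was called, it returned 1. I set eax to 0 and continued the program. Despite this, the program still gave me the same failure message. Note that CheckRemoteDebuggerPresent reports its result through its second (output) parameter, while eax only indicates whether the call succeeded, so clearing eax may not have been enough by itself.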
I then moved back to Ghidra to find out what else needed to be done. After calling the debugger check function, the program sleeps for 1 second and then calls another interesting function. This function calls RaiseException with an exception code of 0x40010006. After some research, I found that this code is DBG_PRINTEXCEPTION_C. When a debugger is attached, it handles this exception itself, so the program can infer that a debugger is present whenever its own handler does not get to run. That makes this another antidebugging trick (Sources: ntquery blog and domin568 on GitHub).
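To see why 0x40010006 reads as a debugger-related code, it helps to split it into the standard NTSTATUS bit fields. The snippet below is a minimal sketch; the function name `decode_ntstatus` and the dictionary layout are my own choices for illustration, not something taken from the CrackMe.

```python
def decode_ntstatus(code: int) -> dict:
    """Split a 32-bit NTSTATUS-style value into its bit fields."""
    severity_names = ["success", "informational", "warning", "error"]
    return {
        "severity": severity_names[(code >> 30) & 0x3],  # bits 31-30
        "customer": bool((code >> 29) & 0x1),            # bit 29
        "facility": (code >> 16) & 0xFFF,                # bits 27-16
        "code": code & 0xFFFF,                           # bits 15-0
    }


if __name__ == "__main__":
    # Exception code passed to RaiseException in the CrackMe.
    print(decode_ntstatus(0x40010006))
```

The value decodes to an informational status with facility 1 (the debugger facility) and code 6, which is consistent with it being a debug-print exception rather than a real error.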
data:image/s3,"s3://crabby-images/89179/89179c901bb93eafefdc70775e8fc007e4604c5c" alt=""
This function again will modify some global data if the code determines that there is no debugger present. I made a note of this so that I would again force the debugger check to pass, or in other words ensure that the code modifying the global data runs, when running the program in x32dbg. I renamed this function in Ghidra and labeled the function in x32dbg for ease of understanding.
The next function calls GetThreadContext to get the thread’s context structure and perform another antidebugging check. I had not seen this technique before either, so I did some research and discovered that the DR0-DR3 debug registers, which can be read from the context structure, store the linear addresses of hardware breakpoints (Source). The code verifies that none of them are set, which would indicate that the process is not being debugged. Just as with the previous two functions, the program can easily be coerced in x32dbg into thinking the check passed.
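The logic of this check can be modeled offline. The sketch below assumes the 32-bit x86 CONTEXT layout, where ContextFlags is followed by Dr0, Dr1, Dr2, Dr3, Dr6 and Dr7 as consecutive DWORDs; the function name `hardware_breakpoints_set` is hypothetical, and the code parses a raw byte buffer instead of calling GetThreadContext, so it runs anywhere.

```python
import struct

# CONTEXT_i386 | 0x10: asks GetThreadContext to fill in the debug registers.
CONTEXT_DEBUG_REGISTERS = 0x00010010


def hardware_breakpoints_set(context_bytes: bytes) -> bool:
    """Return True if any of Dr0-Dr3 is non-zero in a raw x86 CONTEXT."""
    # The first seven DWORDs: ContextFlags, Dr0, Dr1, Dr2, Dr3, Dr6, Dr7.
    _flags, dr0, dr1, dr2, dr3, _dr6, _dr7 = struct.unpack_from("<7I", context_bytes, 0)
    return any((dr0, dr1, dr2, dr3))


if __name__ == "__main__":
    clean = struct.pack("<7I", CONTEXT_DEBUG_REGISTERS, 0, 0, 0, 0, 0, 0)
    # Dr0 holds a made-up breakpoint address.
    dirty = struct.pack("<7I", CONTEXT_DEBUG_REGISTERS, 0x00401000, 0, 0, 0, 0, 0)
    print(hardware_breakpoints_set(clean), hardware_breakpoints_set(dirty))
```

One practical consequence: software breakpoints (int3) do not touch these registers, so only hardware breakpoints would trip a check like this.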
I continued analyzing the next few functions in the same fashion, renaming functions in Ghidra and labelling them in x32dbg as I went. These functions were similar to the previous ones: each performed an antidebugging check and modified global data when satisfied. The function after the DR register check reads the Process Environment Block (PEB) to see whether NtGlobalFlag is set. Next, the program queries active devices. The one after that checks for the “HARDWARE\ACPI\DSDT\VBOX__” registry key to see if the program is running in a VirtualBox VM. The next function calls CreateToolhelp32Snapshot and Module32First, and then loops while calling Module32Next. I am still not sure what the program is looking for here. My current hypothesis is that it is looking for a DLL that would indicate it is being run under a debugger, although I do not know which one specifically. The penultimate check again uses CreateToolhelp32Snapshot, but this time enumerates processes with Process32First and Process32Next. Again, I am not sure which specific process it is looking for. The final check compares the time elapsed between a point near the start of the program and the current time.
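As an illustration of the NtGlobalFlag check, the sketch below tests the three heap-debugging flags that Windows normally sets in the PEB when a process is started under a debugger. The mask 0x70 is an assumption on my part about what such checks usually test; I have not confirmed that the CrackMe uses exactly this mask, and `looks_debugged` is a hypothetical name.

```python
# Flags Windows normally sets in PEB.NtGlobalFlag for a process created by a debugger.
FLG_HEAP_ENABLE_TAIL_CHECK = 0x10
FLG_HEAP_ENABLE_FREE_CHECK = 0x20
FLG_HEAP_VALIDATE_PARAMETERS = 0x40
DEBUG_HEAP_FLAGS = (
    FLG_HEAP_ENABLE_TAIL_CHECK
    | FLG_HEAP_ENABLE_FREE_CHECK
    | FLG_HEAP_VALIDATE_PARAMETERS
)  # 0x70


def looks_debugged(nt_global_flag: int) -> bool:
    """Return True if any of the debug-heap flags is present."""
    return bool(nt_global_flag & DEBUG_HEAP_FLAGS)


if __name__ == "__main__":
    print(hex(DEBUG_HEAP_FLAGS), looks_debugged(0x70), looks_debugged(0x0))
```

In a debugger, zeroing the NtGlobalFlag field in the PEB before the check runs is one way to defeat it without patching the code.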
While looking at all these antidebugging tricks, I wondered what exactly the global data being modified was. What was the program hiding there? To figure this out, I used Ghidra to see where the data was used. After the program completes all these checks, it loads some data onto the stack, then passes that data, along with pointers to the previously mentioned global data, to a function.
data:image/s3,"s3://crabby-images/8b138/8b1387a3a131cdfde0c4cbd7d6a7d39c0fcbf518" alt=""
This function, which I named “crypto_stuff,” uses wincrypt functions to either decrypt or encrypt data, depending on the input. It reveals that one of the two global variables is hashed, and the hash is then used to derive the key for encryption and decryption. The algorithm is 128-bit AES, as indicated by the ALG_ID value 0x660e, which is CALG_AES_128 (Source).
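The ALG_ID value can also be read directly, since wincrypt packs an algorithm class, type and sub-identifier into one integer. Here is a small sketch of that decoding; the helper name `decode_alg_id` is mine, while the bit positions follow the ALG_CLASS, ALG_TYPE and ALG_SID macros in wincrypt.h.

```python
ALG_CLASS_DATA_ENCRYPT = 3  # stored in bits 15-13
ALG_TYPE_BLOCK = 3          # stored in bits 12-9
ALG_SID_AES_128 = 14        # stored in bits 8-0


def decode_alg_id(alg_id: int) -> tuple:
    """Return (class, type, sid) for a wincrypt ALG_ID."""
    alg_class = (alg_id >> 13) & 0x7
    alg_type = (alg_id >> 9) & 0xF
    alg_sid = alg_id & 0x1FF
    return alg_class, alg_type, alg_sid


if __name__ == "__main__":
    print(decode_alg_id(0x660E))
```

For 0x660e this gives class 3 (data encryption), type 3 (block cipher) and sub-identifier 14 (AES-128), which matches CALG_AES_128.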
I was not yet certain what was being decrypted, but going back to the main function, if all these checks pass, a final function is called (I renamed it “finale”). Looking at finale, it prints strings such as “I need internet!” and “You are on the right track!” To get there, I forced the checks to pass in x32dbg by setting EIP to point inside each of the if-statement bodies which modify the global data. Each successful check also increments a global counter, which should total 9 after all the checks (the total number of antidebugging functions). Success! The encrypted data gets decrypted, resulting in a URL: Pastebin link. Following the URL leads to a Pastebin dump with encoded or encrypted data. Furthermore, a new message is shown in the console:
data:image/s3,"s3://crabby-images/f3b1c/f3b1c94df1f2bf84f9eaf3de69fc5c3ae3ca4f64" alt=""
Before continuing further, I went back and patched the program to always pass the debugging checks, and saved it to a patch file. Continuing the program, it says “You are on the right track” and gives an uncompressed size of some data, as well as opening a window reminding me that I am not done yet.
data:image/s3,"s3://crabby-images/011e7/011e7fb3455271270d4941454f2830c1ce780198" alt=""
Going back to Ghidra, the purpose of this Pastebin can be better understood. First, the program verifies that it can reach the Internet using InternetGetConnectedState. It then proceeds to download the file, again using WinINet functions. Interestingly, it sets the user-agent to “Mal-zilla.” After it downloads the file, it decompresses it. This is done with another anti-reversing technique which I had not seen before: GetProcAddress. The “malware” hides the import of the RtlDecompressBuffer function by getting a handle to ntdll, looking up the address of the procedure (a function pointer), and calling it. I thought this was pretty cool.
data:image/s3,"s3://crabby-images/1cb91/1cb9177c27fedec82e9f72c3ce645ba2ca191d14" alt=""
This uses the LZNT1 compression algorithm. I tried to write a Python script for decompressing the data, but was unsuccessful. So, I resolved to dump the data from x32dbg after it was decompressed, and saved the dump to a file. Returning to Ghidra, the program can be seen checking that the first two bytes of the data equal “MZ”. So it has to be an executable or a DLL! Looking further, the code expands an environment string “…\rundll.exe secret.dll,#1”. But, looking at the data, it certainly doesn’t look like an executable. The first two bytes are 0x20 0x3b, not MZ! After some more investigation, I found a function which xors the file after it is decompressed. The catch is that the key is taken from the clipboard by another function which calls GetClipboardData. This means that the key has to be guessed or uncovered in some other fashion.
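For reference, here is a minimal LZNT1 decompressor sketch in Python. It is based on my understanding of the published format (2-byte chunk headers, flag bytes, and 16-bit tokens whose offset/length split depends on the position within the chunk). I have only checked it against small hand-built inputs, not against the actual Pastebin data, so treat it as a starting point rather than a verified tool. The function name `lznt1_decompress` is my own.

```python
import struct


def lznt1_decompress(data: bytes) -> bytes:
    """Decompress an LZNT1 buffer (sketch; no validation of malformed input)."""
    out = bytearray()
    pos = 0
    while pos + 2 <= len(data):
        (header,) = struct.unpack_from("<H", data, pos)
        pos += 2
        if header == 0:  # a zero header ends the stream
            break
        chunk_end = pos + (header & 0x0FFF) + 1
        if not header & 0x8000:  # chunk stored without compression
            out += data[pos:chunk_end]
            pos = chunk_end
            continue
        chunk_start = len(out)
        while pos < chunk_end:
            flags = data[pos]
            pos += 1
            for bit in range(8):
                if pos >= chunk_end:
                    break
                if not flags & (1 << bit):  # literal byte
                    out.append(data[pos])
                    pos += 1
                    continue
                (token,) = struct.unpack_from("<H", data, pos)
                pos += 2
                # The further into the chunk, the more bits the offset gets.
                length_bits = 12
                span = len(out) - chunk_start - 1
                while span >= 0x10:
                    length_bits -= 1
                    span >>= 1
                length = (token & ((1 << length_bits) - 1)) + 3
                offset = (token >> length_bits) + 1
                for _ in range(length):  # byte-wise copy so overlaps work
                    out.append(out[-offset])
    return bytes(out)


if __name__ == "__main__":
    # Hand-built chunk: literal "A" followed by a match (offset 1, length 7).
    print(lznt1_decompress(bytes.fromhex("03b002410400")))
```

The byte-wise copy matters: a match may overlap the bytes it is producing, which is how a single literal followed by one token expands into a run of identical bytes.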
I spent a significant amount of time stuck here. I tried various ways of cracking the key. I knew the first two bytes of the result had to be MZ, so I already had the first two bytes of the key. Additionally, I researched the PE file format further to work out the most likely values for the next few bytes. I found some Apple documentation on the format and tried recovering the key by xoring the expected values with the dumped data. None of this worked. After writing several different scripts to try to solve this, I admittedly gave up and looked at the official writeup. This is the only time I consulted any writeup, but the solution to my problem was much easier than I expected.
First, I had accidentally dumped the wrong data. I dumped the compressed data because I chose the wrong variable when viewing the stack right after the call to RtlDecompressBuffer. Second, a quick glance at the actual decompressed data shows a repeating phrase: “malwarebytes.” A PE file contains long runs of zero bytes, and 0 is the identity for xor, so 0 xor key = key. Wherever the original file holds zeros, the xored data shows the key itself. In this way, the key can easily be read off as malwarebytes.
data:image/s3,"s3://crabby-images/c7045/c70455e7036974d328978ac4afb2186a856ec3ff" alt=""
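The zero-byte observation is easy to reproduce on synthetic data. The sketch below uses a made-up 32-byte buffer (an “MZ” marker followed by zeros) in place of the real payload, and a hypothetical helper `xor_with_key`; it is not the script I used on the actual dump.

```python
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (the same call encodes and decodes)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


if __name__ == "__main__":
    key = b"malwarebytes"
    plain = b"MZ" + bytes(30)            # synthetic stand-in for a PE header
    encoded = xor_with_key(plain, key)   # what the downloaded data looks like

    # Wherever the plaintext is zero, the encoded bytes are the key itself.
    print(encoded[12:24])
    # Applying the key again restores the original bytes.
    print(xor_with_key(encoded, key)[:2])
```

Because xor is its own inverse, the same function that hides the payload also recovers it once the key is known.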
Even though I was disappointed that I did not figure this part out, and disappointed that I cheated, I returned to cracking the challenge with renewed vigor. I wrote a simple Python script to xor the dump with the key to get the executable.
data:image/s3,"s3://crabby-images/16c7a/16c7ae5737997a0876edf8db68b57494eb7f8056" alt=""
Back in Ghidra, after the data is successfully decoded, a new rundll.exe process is created in a suspended state. The program then copies the xored data (the executable payload) into the process’s memory using WriteProcessMemory and runs it by calling ResumeThread. This is the classic RunPE process hollowing technique. Now, a new message box is shown on the screen.
data:image/s3,"s3://crabby-images/a7e70/a7e705dcb09970041582570e80d290eb297ba4f4" alt=""
data:image/s3,"s3://crabby-images/7b78f/7b78fceb1965db55e2885c5cdd05babcd0418e5d" alt=""
Now that the payload has been loaded, this is the end of stage 1. Stage 1 used some basic antidebugging techniques, some of which I learned about for the first time. It then downloaded a payload from Pastebin and xored it with a key retrieved from the clipboard. Finally, it used process hollowing to execute the stage 2 payload. I will end this post here, and in the next one I will continue by looking into the stage 2 executable.