High quality red team operations are research-led. Being able to simulate current and emerging threats at an accurate level is of paramount importance if the engagement is going to provide value to clients.
One common use case for offensive operations is the requirement to run native executable files or compiled code on the target and in memory. Loading and running these files in memory is not a new technique, but running executables as secondary modules within a Command & Control (C2) framework is rarer, particularly those that support arguments from the host process.
This blog post introduces a new technique and a tool that implements it, intended as a useful addition to the red team arsenal. RunPE is a .NET assembly that uses a technique called Process Hiving to manually load an unmanaged executable into memory along with all its dependencies, run that executable with arguments passed at runtime while capturing any output, and then clean up and restore memory to hide any trace that it was run.
What is it?
The aim of this project is to develop a .NET assembly that provides a mechanism for running arbitrary unmanaged executables in memory. It should allow arguments to be provided, load any libraries that are required by the code, obtain any STDOUT and STDERR from the process execution, and not terminate the host process once the execution of the loaded PE finishes.
This .NET assembly must be able to be run in the normal way in C2 frameworks, such as by execute-assembly in Cobalt Strike or run-exe in PoshC2, in order to extend the functionality of those frameworks.
Finally, as this all takes place in an implant process, any artefacts in memory should then be cleaned up, either by zeroing out the memory and removing them or by restoring the original values, in order to better hide the activity.
We’re calling this technique of running multiple PEs from within the same process ‘Process Hiving’, and the result of this work is the .NET assembly RunPE. In essence this technique:
- Receives a file path or base64 blob of a PE to run
- Manually maps that file into memory without using the Windows Loader in the host process
- Loads any dependencies required by the target PE
- Patches memory to provide arguments to the target PE when it is run
- Patches various API calls to allow the target PE to run correctly
- Replaces the file descriptors in use to capture output
- Patches various API calls to prevent the host process from exiting when the PE finishes executing
- Runs the target PE from within the host process, while maintaining host process functionality
- Restores memory, unloads dependencies, removes patches and cleans up artefacts in memory after executing
Loading the PE
The starting point for the work was @subtee‘s .NET PE Loader utilised in GhostPack’s SafetyKatz. This .NET PE Loader already mapped a PE into memory manually and invoked the entry point; however, a few issues remained that prevented its use in an implant process. SafetyKatz uses a ‘slightly modified’ version of Mimikatz as the target PE, modified critically so that it neither requires arguments nor exits the process upon completion.
The first step then was to re-use as much of this work as possible and rewrite it to suit our needs – no need to reinvent the wheel when a lot of great work was already done. The modified loader manually maps the target PE into memory, performs any fixups and then loads any dependency DLLs that are not already loaded. The Import Address Table for the PE is patched with the locations of all the libraries once they are loaded, mimicking the real Windows loader.
In a Windows process a pointer to the command line arguments is located in the Process Environment Block (PEB) and can be retrieved directly or, more commonly, using the Windows API call GetCommandLine. Similarly, the current image name is also stored in the PEB. With RunPE, the command line and image name are backed up so that they can be reset during the clean-up phase, and are then replaced with the new values for the target PE.
Preventing Process Exit
Another issue with running vanilla PEs in this way is that when they finish executing the PE inevitably tries to exit the process, such as by calling TerminateProcess. Similarly, as the RunPE process is .NET, the CLR also tries to shut down once process termination is initiated, so even if TerminateProcess is prevented, CorExitProcess will cause any .NET implant to exit.
To circumvent this, a number of these API calls are patched to instead call ExitThread. As the entry point of the target PE is run in a new thread, this means that once it has finished it gracefully exits only that thread, leaving the process and CLR intact.
These API calls are patched with bytes that use Return Oriented Programming (ROP) to instead call ExitThread, passing an exit code of 0.
An example of this patch, if the ExitThread function was located at 0x1337133713371337, is below:
0: 48 c7 c1 00 00 00 00 mov rcx, 0x0 // Move 0 into rcx for exit code argument
7: 48 b8 37 13 37 13 37 movabs rax, 0x1337133713371337 // Move address of ExitThread into rax
e: 13 37 13
11: 50 push rax // Push rax onto stack and ret, so this value will be the 'return address'
12: c3 ret
We can see this in x64dbg while RunPE is running, viewing the NtTerminateProcess function and noting it has been patched to exit the thread instead.
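As the address of ExitThread is only known at runtime, the patch bytes have to be assembled dynamically. The following is a minimal Python sketch of that assembly step only, reproducing the byte sequence listed above. The function name build_exit_thread_patch is hypothetical and not taken from the RunPE source, which is written in C#.

```python
import struct


def build_exit_thread_patch(exit_thread_address: int, exit_code: int = 0) -> bytes:
    """Assemble the 19-byte stub shown in the listing above (hypothetical helper)."""
    # 48 c7 c1 imm32 : mov rcx, imm32 (first argument, the thread exit code)
    mov_rcx = b"\x48\xc7\xc1" + struct.pack("<i", exit_code)
    # 48 b8 imm64    : movabs rax, imm64 (runtime address of ExitThread, little-endian)
    movabs_rax = b"\x48\xb8" + struct.pack("<Q", exit_thread_address)
    # 50 c3          : push rax; ret (ret pops the pushed address and jumps to it)
    push_rax_ret = b"\x50\xc3"
    return mov_rcx + movabs_rax + push_rax_ret


patch = build_exit_thread_patch(0x1337133713371337)
print(patch.hex(" "))
```

Only the eight address bytes change between runs, so the same template can be reused for each patched function.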
Several other API calls also required patching with new values in order for PEs to work. One example is GetModuleHandle which, if called with a NULL parameter, returns a handle to the base of the main module of the calling process. When a PE calls this function it is expecting to receive its own base address; however, in this scenario the API call will in fact return the base address of the host process’ binary, which could cause the whole process to crash, depending on how that address is then used. GetModuleHandle could also be called with a non-NULL value, in which case the base address of a different module will be returned.
GetModuleHandle is therefore hooked so that execution jumps to a newly allocated area of memory that performs some simple logic: returning the base address of the mapped PE if the argument is NULL, and rerouting back to the original GetModuleHandle function if not. As the first few bytes of GetModuleHandle are overwritten with a jump to our hook, these instructions must be executed in the hook before jumping back to the GetModuleHandle function, returning execution to just after the hook jump.
As with the previous API patches, these bytes must be dynamically built in order to provide the runtime addresses of the hook location, the GetModuleHandle function and the base address of the target PE.
As an additional change the PEB is also updated, replacing the base address with that of the target PE so that if any programs retrieve this address from the PEB directly then they get the expected value.
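The decision made by the GetModuleHandle hook described above can be modelled in a few lines. This Python sketch only illustrates the logic; the real hook is a block of machine code built at runtime, and the names make_get_module_handle_hook, original and mapped_pe_base are hypothetical.

```python
from typing import Callable, Optional


def make_get_module_handle_hook(
    original: Callable[[Optional[str]], int],
    mapped_pe_base: int,
) -> Callable[[Optional[str]], int]:
    """Return a replacement for GetModuleHandle (a model of the hook logic only)."""

    def hooked(module_name: Optional[str]) -> int:
        if module_name is None:
            # NULL argument: the caller wants "its own" base address,
            # which for the target PE is where it was manually mapped.
            return mapped_pe_base
        # Any other argument is passed through to the original function.
        return original(module_name)

    return hooked
```

A usage example with a stand-in for the original function:

```python
modules = {"kernel32.dll": 0x7FFA00000000}
hook = make_get_module_handle_hook(lambda name: modules.get(name, 0), 0x140000000)
print(hex(hook(None)), hex(hook("kernel32.dll")))
```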
At this point, the target PE should be in a position to run from within the host process by calling the entry point of the PE directly. However, as the intended use case is to use RunPE to execute PEs in memory from within an implant, it is a requirement to be able to capture output from the program.
Output is captured from the target process by replacing the handles to STDOUT and STDERR with handles to anonymous pipes using SetStdHandle. Just before the target PE entry point is invoked on a new thread, an additional thread is first created that will read from these pipes until they are closed. In this way, the output is captured and can be returned from RunPE. The pipes are closed by RunPE after the target PE has finished executing, ensuring that all output is captured.
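The same pattern of swapping the output handle for a pipe, draining it on a separate thread and then restoring the original can be sketched in Python. This is an analogue under stated assumptions rather than RunPE's implementation: it redirects the C runtime file descriptor 1 using os.dup2 instead of calling SetStdHandle, and the function name run_with_captured_stdout is hypothetical.

```python
import os
import threading


def run_with_captured_stdout(target) -> bytes:
    """Run target() with file descriptor 1 redirected into an anonymous pipe."""
    read_fd, write_fd = os.pipe()
    saved_stdout = os.dup(1)  # back up the original handle for clean-up
    chunks = []

    def drain():
        # Read until every write end of the pipe has been closed (EOF).
        while True:
            data = os.read(read_fd, 4096)
            if not data:
                break
            chunks.append(data)

    reader = threading.Thread(target=drain)
    reader.start()
    os.dup2(write_fd, 1)  # replace stdout with the pipe's write end
    try:
        target()
    finally:
        os.dup2(saved_stdout, 1)  # restore the original stdout
        os.close(saved_stdout)
        os.close(write_fd)  # last write end closed, so the reader sees EOF
        reader.join()
        os.close(read_fd)
    return b"".join(chunks)


output = run_with_captured_stdout(lambda: os.write(1, b"hello from target"))
```

Closing the write end only after the target has finished is what guarantees that the reader thread sees all of the output before it stops.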
As Process Hiving includes running multiple processes from within one, long-running host process it is important that any execution of these ‘sub’ processes includes full and proper clean up. This serves two purposes:
- To restore any changed state and functionality in order to ensure that the host process can continue to operate normally.
- To remove any artefacts from memory that may raise an alert if detected through techniques such as in-memory scanning, or aid an investigator in the event of a manual triage.
To achieve this, any code change made by RunPE is stored during execution and restored once execution is complete. This includes API hooks, changed values in memory, file descriptors, loaded modules and of course the mapped PE itself. In the case of any particularly sensitive values, such as the command line arguments and mapped PE, the memory region is first zeroed out before it is freed.
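The record-then-restore approach described above can be illustrated with a small journal over a byte buffer. This is a simplified Python model, assuming process memory is represented by a bytearray. The PatchJournal class and zero_buffer helper are hypothetical names, not taken from RunPE.

```python
class PatchJournal:
    """Record the original bytes of every patch so they can be restored later."""

    def __init__(self, memory: bytearray):
        self.memory = memory
        self._saved = []  # list of (offset, original_bytes), in patch order

    def patch(self, offset: int, new_bytes: bytes) -> None:
        original = bytes(self.memory[offset:offset + len(new_bytes)])
        self._saved.append((offset, original))
        self.memory[offset:offset + len(new_bytes)] = new_bytes

    def restore_all(self) -> None:
        # Undo in reverse order so overlapping patches unwind correctly.
        while self._saved:
            offset, original = self._saved.pop()
            self.memory[offset:offset + len(original)] = original


def zero_buffer(buffer: bytearray) -> None:
    """Overwrite sensitive contents in place before the buffer is released."""
    buffer[:] = bytes(len(buffer))
```

Restoring in reverse order matters when two patches touch the same bytes, as the earliest saved copy is the only one that holds the true original values.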
An example of RunPE running unchanged and up-to-date Mimikatz is below, alongside Procmon process activity events for the process.
Note that there are no sub-processes created, and Mimikatz runs successfully with the provided arguments.
Running a debug build provides more output and allows us to verify that the artefacts are being removed from memory and hooks removed, etc. We can see below that after the clean-up has occurred the ‘new’ DLLs loaded for Mimikatz have either already been cleaned up by Mimikatz itself (the error code 126) or are freed by RunPE and are now no longer visible in the Modules tab of Process Hacker.
Similarly, the original code on the hooks such as NtTerminateProcess has been restored, which we can verify using a debugger such as x64dbg as below.
As Mimikatz.exe is unlikely to exist in the target environment during Red Team operations, RunPE also supports loading binaries from base64 blobs so that they can be passed with arguments down C2 channels. Long, triple-dash switches are used in order to avoid conflicts with any arguments to the target PE.
An example of this from a PoshC2 implant below demonstrates the original use case. The implant host process of netsh.exe loads and invokes the RunPE .NET assembly which in turn loads and runs net.exe in the host process with arguments. In this case net.exe is passed as a base64 blob down C2.
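The reason for the triple-dash convention can be shown with a short parser sketch in Python. The switch names ---b64 and ---args are hypothetical placeholders chosen for illustration and are not necessarily the switches RunPE itself accepts. Everything after the arguments switch is handed to the target PE untouched, so the target's own single- or double-dash options never collide with the loader's.

```python
import base64
from typing import List, Tuple


def parse_loader_args(argv: List[str]) -> Tuple[bytes, List[str]]:
    """Split loader switches from target arguments (switch names are hypothetical)."""
    pe_bytes = b""
    target_args: List[str] = []
    i = 0
    while i < len(argv):
        if argv[i] == "---b64" and i + 1 < len(argv):
            # The PE arrives as a base64 blob passed down the C2 channel.
            pe_bytes = base64.b64decode(argv[i + 1], validate=True)
            i += 2
        elif argv[i] == "---args":
            # Everything that follows belongs to the target PE, verbatim.
            target_args = argv[i + 1:]
            break
        else:
            raise ValueError(f"unexpected loader argument: {argv[i]}")
    if pe_bytes[:2] != b"MZ":
        raise ValueError("decoded blob does not start with the MZ signature")
    return pe_bytes, target_args
```

For example, passing `["---b64", blob, "---args", "--help", "-k", "netsvcs"]` yields the decoded PE bytes and the list `["--help", "-k", "netsvcs"]` for the target.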
Known Issues & Further Work
There are a number of known issues and caveats with this work in its current state which are detailed below.
- RunPE only supports 64-bit (x64) native Windows PE files.
- During testing any modern PE compiled by the testers has worked without issues, however issues remain with a number of older Windows binaries such as ipconfig.exe and icacls.exe. Further research is presently ongoing into what specific characteristics of these files cause issues.
- If the target PE spawns sub-processes itself then those are not subject to Process Hiving and will be created in the normal fashion. It is up to the operator to understand what the behaviour of the target PE is and any other considerations that should be made.
- RunPE presently calls the entry point of the target PE on a new thread and waits for that thread to finish, with a timeout. If the timeout is reached or if the target PE manipulates that thread, the behaviour is undefined.
- PEs compiled without ASLR support, such as those produced by mingw, do not currently work.
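Two of the limitations above, the x64 requirement and the need for ASLR support, can be checked from the PE headers before a binary is sent down a C2 channel. The following Python sketch is a hypothetical pre-flight check and not part of RunPE. It assumes that "ASLR support" corresponds to the DYNAMIC_BASE flag in the optional header's DllCharacteristics field, which may not capture every case that fails in practice.

```python
import struct

IMAGE_FILE_MACHINE_AMD64 = 0x8664
IMAGE_NT_OPTIONAL_HDR64_MAGIC = 0x20B
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040


def check_pe_compatibility(pe: bytes) -> list:
    """Return a list of reasons the image may not load; empty if none were found."""
    if len(pe) < 0x40 or pe[:2] != b"MZ":
        return ["not a PE file (missing MZ signature)"]
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)  # offset of the PE signature
    if len(pe) < e_lfanew + 24 + 72 or pe[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        return ["not a PE file (missing PE signature)"]
    (machine,) = struct.unpack_from("<H", pe, e_lfanew + 4)
    optional = e_lfanew + 24  # 4-byte signature plus 20-byte COFF file header
    (magic,) = struct.unpack_from("<H", pe, optional)
    (dll_characteristics,) = struct.unpack_from("<H", pe, optional + 70)

    problems = []
    if machine != IMAGE_FILE_MACHINE_AMD64 or magic != IMAGE_NT_OPTIONAL_HDR64_MAGIC:
        problems.append("not a 64-bit (x64) image")
    if not dll_characteristics & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE:
        problems.append("image is not marked as ASLR-compatible (DYNAMIC_BASE unset)")
    return problems
```

An empty list only means that neither of these two known issues was detected; it does not guarantee that the PE will run correctly.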
Additionally, further work can be done on RunPE to improve the stealth of the Process Hiving technique:
- Dependencies of the target PE can be mapped into memory using the same PE loader as the target PE itself and not using the standard Windows Loader. This would bypass detections on API calls such as GetProcAddress as well as any hooks placed in those modules by defensive software.
- For any native API calls that remain, the use of syscalls directly can be explored to achieve the same ends for the same reasons as described above.
For Blue Team members, the best way to prevent this technique is to prevent the attacker from reaching this stage in the kill chain. Delivery and initial execution for example likely provide more options for detecting an attack than process self-manipulation. However, a number of the actions taken by RunPE can be explored as detections.
- SetStdHandle is called six times per RunPE call: three times to set STDOUT, STDERR and STDIN to handles to anonymous pipes, and three more to reset them. A cursory monitor of a number and range of processes on the author’s own machine did not show any invocations of this API call as part of standard use, so this activity could potentially be used to detect RunPE.
- A number of APIs are hooked or modified and then restored as part of every RunPE run, such as TerminateProcess. Continued modification of these Windows API calls in memory is not likely to be common behaviour and is a potential avenue to detection.
- Similarly, the PEB is also continually modified as the command line string and image name are updated with every invocation of RunPE.
- While the source code can be obfuscated, any attempt to load the default RunPE assembly into a .NET process provides a strong opportunity for detection.
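As a rough illustration of the first detection idea above, the following Python sketch counts SetStdHandle calls per process in a stream of API call events. The event format, a sequence of (pid, api_name) tuples, is an assumption made for illustration, as is the function name flag_set_std_handle_callers. Real telemetry would come from whatever API monitoring the defender has available.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def flag_set_std_handle_callers(
    events: Iterable[Tuple[int, str]],
    threshold: int = 6,
) -> List[int]:
    """Return the PIDs that called SetStdHandle at least `threshold` times."""
    counts = Counter(pid for pid, api in events if api == "SetStdHandle")
    return sorted(pid for pid, calls in counts.items() if calls >= threshold)


events = [(4242, "SetStdHandle")] * 6 + [(1000, "SetStdHandle"), (1000, "CreateFileW")]
print(flag_set_std_handle_callers(events))
```

The default threshold of six mirrors the number of calls made per RunPE invocation and would need tuning against a baseline of normal activity in a given environment.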
At its core, Process Hiving is a fairly simple process. A PE is manually mapped into memory using existing techniques and a number of changes are made to API calls and the environment so that when the entry point of that PE is invoked it runs in the expected way.
We hope that this technique and the tool that implements it will allow Red Teams to quickly and easily run native binaries from their implant processes without having to deal with many of the pain points that plague similar existing techniques.
The source code for RunPE is available at https://github.com/nettitude/RunPE and any further work on the tool can be found there. Contributions and collaboration are also welcome.