Following my previous blog post (Fusion level 05 solution), I continued my nightly quest for fun and solved the next challenge as well. As I try to keep my content unique and original, this blog post is mostly created due to the fact that as for today, there was no solution for this challenge on the internet.
This time we will examine the danger of unsafe shared data structures between multiple threads, as well as function arguments type mismatch causing an integer overflow leading to stack overrun exploit. In this blog post i will cover the concepts of controllable threads race-conditions, stack overflowing while making sure the flow of the program remains safe to the end of the function, where the return address is overwritten.
So let’s get to it.

The Problem

The code in this exercise Which can be found here:
https://exploit-exercises.com/fusion/level06/ 
has several compiled protections mechanisms and we expect the vulnerability type to be stack based.

Compiled protection

Compiled protection

As this level is protected with ASLR, and Non-Executable Stack & Heap, to properly exploit the server we will have to find some primitives (Basic abilities) to bypass these protections:

  • (Arbitrary) Memory read – Mostly in order to bypass ASLR, an information leak must be at hand, reading some address can give us a clue of where exactly some modules are located at. Once we know a module’s absolute address, we can find functions relative to that address if we can determine for which module it belongs to. As arbitrary read is usually preferred, sometimes (spoiler) we just have to deal with what we have and every scenario is unique.
  • Memory write – Either for writing a shellcode, or for a string command to execute, writing to a memory and knowing the address of that memory.
  • Execution – Changing a previously allocated page permissions which has a shellcode or ROPing to a differen with-already-execute-permissions location (such as different functions).

Intro

The code begins with a simple SSL server listening for connection on port 20006. For every new connection established, a new thread function is created: keyval_thread(), which receives the session as argument and will handle the rest of the thread’s communication from here on forth. The python library I used to establish the ssl connections is “python-gnutls 3.1.1” which you can get here:
https://pypi.python.org/pypi/python-gnutls/

The next obvious thing to do, is understand what this new threaded function keyval_thread() is doing with the session. So first it seems like it verifies the gnutls_handshake() worked as expected, then sending us a welcome message explaining we can send 'h' on the socket to receive help. After this happens the real logic begins: the server receives data on the socket, splits it with a delimiter of space and puts each part of the data in the args[6] variable. Further in the code the first letter of the first argument is checked against a predefined exposed server functionalities nested in a switch statement.

So what this server does exactly? Mainly using libHX for some sort of `dictionary` based structure for insertion of a key and a data type defined as:

struct data {
    void *data; 
    size_t length; 
};

The documentation for the library can be found here:
http://libhx.sourceforge.net/libHX_Documentation.pdf
We can either delete an existing node by its key, set a new key:val node, update an existing node’s data, check if an existing key exists, get an existing node value, clean up the dictionary structure and finally exit the program using 'd', 's', 'u', 'f', 'g', 'X' and 'e' respectively.

Focus on what’s important

One thing that really caught my eyes was the global variable HXmap *dict, which can be used between all functions with no actual locking mechanism. Since we can connect to the web server from multiple sessions and a new thread for each session, every thread can access that structure and manipulate its data simultaneously. Two different threads changing the same data type in a not atomic way, can (and will) cause disastrous results. Let’s see which functions use this global variable: update_data()  used by 'u' command, new_dict() used by 'X' command, HXmap_del() used by 'd' command, gather_data() used by 's' command, HXmap_find() used by 'f' command and finally HXmap_get() used by 'g' command.

HXmap_find() – gives us very little value. It basically sends back whether a key was found on the dictionary structure or not with no data modifications at all.
new_dict() – Good for a clean start. Deletes the data structure and re initializing it, but nothing else.
HXmap_del() – Deletes a specific key and its value from the dictionary structure.
gather_data() – Followed by HXmap_add(), is in charge of creating a new key:value in the global structure.
update_data() – Alter’s an existing key, creating a new value for it
For this exploit purposes I will only focus on the commands: get, update and set.

Set New Data

Here is a pseudo python code to use this function

socket = ssl_connect()
key = "test"
data = "A" * 100

socket.send("s %s %s" % (key, len(data)))
socket.send(data)

The first buffer is parsed by the main thread function keyval_thread() using HX_split5(), then a call to gather_data() is made, sending the data_length as a parameter. Receiving the rest of the data from the socket, allocating memory for the buffer and the data structure as well, eventually returning the data. An interesting fact about this function is that it allocated the same buffer twice (malloc(), HX_memdup()) and it never really even checked if the key exists already. This is just a fun-memoryleak-possiblyspray-fact, lets continue. Finally the key:data are added to the dictionary using HXmap_add()

Get Key Value

To receive data, we just need to send a simple buffer on the socket g <key>, then it triggers the function HXmap_get() which will return on the receiving the node’s data as defined above. The length of the data returned depends on the variable length in the structure.

Update Key Value

Similar to the set function in terms of what should be sent to the server.

The rest of the content here is quite important for the rest of the exploit, so read and understand carefully.
The update_data() function is looking for an existing node using HXmap_get(), and exit function if failed finding a matching node.
proceeds to check if the new data length is longer than the current length.
if longer, proceed with realloc() to find a new memory location matching the new length.
Receive new data on socket. if fails, exit function.
Finally, update the node’s length variable.

The Info Leak

the first part of the full exploit begins with a shwifty bug happening when two threads try to update the same data structure simultaneously. I will now explain exactly how it is done.

We create a new node on the memory using the 's' command. When the new data is stored, i want it to be as small as possible, smaller than 16 bytes. that is because the heap is devided into several different parts, each for different sizes of data such as for malloc(16), malloc(32) and a few more. I am creating a small data buffer because i want to ensure that later when a realloc() will occur, i will arrive into a new different heap location where surely the address will change, rather than stay on the same spot on the heap, later you will see the reason why.

# Connect 
s = ssl_connect()

# Add new key:data node
s.send("s test 1")
s.send("A")

After we have a new data in the memory. Its time to manipulate it simultaneously with two threads and weird magic things will happen. Watch closely on the update_data() function. Here is a truncated version of it, where i left only the interesting parts

int update_data(gnutls_session_t session, char *key, size_t length)
{
	struct data *data;
	size_t offset;
	int ret;

	data = HXmap_get(dict, key);
	...

	if(length > data->length) {
			tmp = realloc(data->data, length);
			...
			data->data = tmp;
	}		

	for(offset = 0; offset < length; ) {
			ret = gnutls_record_recv(session, data->data + offset,
					(length - offset) > 65535 ? 65535 : (length - offset));
			if(ret <= 0) return 0;
			offset += ret;
	}

	...

	data->length = length;
	return 0;
}

Lets say i have an allocated node with the values (Imagine the pseudo code used malloc())

data node = {0};
node.data = "A";
node.length = 1;

and now i want to update the data with a new larger in size data. for example 0x500 times the letter "B". the update_data() function already knows i’m going to send it soon 0x500 bytes, as size_t length is one of the function’s parameters. But i didn’t send them just yet. so the function will realloc() the data->data into a new heap region (line 11) with the required size and before it will proceed with changing the data->length, a blocking function is called: gnutls_record_recv() (line 17), waiting now to receive the amount of data specified by the length parameter (0x500). I will send 0x4FF bytes to keep the function blocking and wait for the last byte.

So the context so far for everything:

/* original */
data.length = 1;
data.data = "BBBB...B" (0x4FF) /* (PVOID) Of size 0x500 */

/* The thread is now stuck on gnutls_record_recv() 
untill i will send it all the data it expects - 0x500 */

Now i will connect again and update the same node with new values while the previous thread remains on a blocked state. The idea is to send a new smaller in size data (still bigger than the original) for example 0x100. the node data->data will point to a heap memory region of only 0x100 in size, then i will continue and send 0x100 bytes of the letter "C" to complete the function, resulting in change of the original data->length parameter to 0x100 while the data->data points to a heap region of 0x100 bytes as well. Finally i finish the execution of the first thread by sending the last byte, resulting in the original data to look like that:

/* original */
data.length = 0x500
data.data = "BBB...C" (0x100) /* Memory region of 0x100 */

Now we have a data structure where two different threads wrote to it simultaneously, thus we could change the length of the data type to have a much larger value than the data size it self.

Finally, we can use the 'g' - HXmap_get() function to read 0x500 bytes from a 0x100 bytes region, resulting the desired memory leak.
After i played for a bit with this multi threaded memory leak concept, i noticed i could find an address of libpthread-2.13.so quite reliably based on a few simple rules, thus also finding the address of system(), as well as a pointer to the memory which i just allocated. Basically i have everything i need to run an exploit, the address of system() and the address of the currently allocated buffer. All i am missing now is a stack overflow to change the flow of the program.

The Stack overflow

Finding the stack vulnerability was actually a lot of fun. The problem occurs due to type misuse of variable causing an integer overflow allowing me to overwrite the stack. This time, the vulnerability is with the gather_data() function.

We will have to dive a bit deeper into the code to understand exactly what is going on in this scenario. It all begins with the very first types of data that we send on the protocol: s <key> <key_length>. where key is args[1] and key_length is args[2].

case 's': // set
    data = gather_data(session, args[1], atoi(args[2]));
    if(data != NULL) {

a quick observation of the atoi() documentation:

The function first discards as many whitespace characters (as in isspace) as necessary until the first non-whitespace character is found. Then, starting from this character, takes an optional initial plus or minus sign followed by as many base-10 digits as possible, and interprets them as a numerical value.

atoi() is sign sensitive, so in order to send a really big number, such as 0xFFFFFF, i will have to send to the key_length, the value "-1"gather_data() function receives the length parameter as type size_t which is unsigned – not minus one, but a really big unsigned number instead – 4294967295.  Lets have a look at the prologue of the function.

struct data *gather_data(gnutls_session_t session, char *key, size_t length)
{
    unsigned char buffer[length];
    int offset, ret;
    struct data *data;
    ...
    return data;
}

we see the local stack parameters are being allocated dynamically depends on the value of length (unsigned char buffer[length]). Here is how it happens on the assembly:

Prologue

0x00001F99 – put the value of size_t length in the register EDI.
0x00001FAD – Substract ESP with the length (This is how local variables are allocated on the stack)

Lets focus on the Stack variables allocation which is allocated depending on length. Since registers can only hold up to 32 bits of data, the number 0xFFFFFFFF is the largest unsigned number that a register can hold. If we add 0x00000001 to that value, a register cant hold 0x100000000, so the actual value will become 0x00000000!  Basically on add \ sub operations the value 0xFFFFFFFF will mean very little although its actual value is huge. With that in mind, when the function allocate variables on the stack, ESP will experience an integer overflow causing it to remain almost the same value, which is not even close to the actual length the function thinks it allocated.

Moving on to the data receiving function

  for(offset = 0; offset < length; ) {
      ret = gnutls_record_recv(session, buffer + offset, (length -
          offset) > 65535 ? 65535 : (length - offset));
      if(ret <= 0) return NULL;
      offset += ret;
  }

As the length parameter holds the value of 0xFFFFFFFF, the gnutls_record_recv() function is going to receive data up to length 0f 65535, and write it on the stack variable buffer.
Great because since i sent 0xFFFFFFFF as the length parameter, not nearly as much is allocated on the stack (The Integer overflow i explained earlier). This will now allow me to overwrite the stack. I dont even need to write 65535 bytes. if gnutls_record_recv() fails for some reason (hint hint) it will return an error code, causing the function’s flow to return NULL; which is exactly what i want – easy overflow.

The Mini Failure

Well as i first tested the vulnerability by spamming “A”s on the socket \ stack and i received a different crash, from inside the gnutls_record_recv() function:

Mini failure

Notice line #2: it seems like my overflow was also overflowing local stack variables that were passed to gnutls_record_recv(), causing it to receive bad parameters and crashing unexpectedly.  To overcome this issue, i overwrote these parameters with \x00s, and hoped that like every good function it will have a sanity case to see if the parameters make sense \ valid, and if they do not to return an error code.
Great success.

I managed to overflow the local variables with values that will cause the gnutls_record_recv() function to return immediately, while still overflowing the RetAddress with the system() function, adding an argument of a reverse shell, which i created earlier in the memory leak function. Running the exploit and waiting locally for the reverse shell:

Exploit & Reverse shell

As a final note, although i solved this exercise in one way, i figured out there is possibly another way to solve it which is a bit harder. With the memory leak bug, i could possibly also cause a heap corruption, and then solve this exercise using a completely different vulnerability.

This post is already so long i decided to omit some of the things happened in this level. To understand the solution further you are welcome to see the solution code on my github.

Hope you enjoyed reading this blog post, if you have any ideas or thoughts, feel free to share in the comments below.