…and how to ptrace the entry point and m3ss w1th da stack.
In this article, you will learn what happens inside the Linux Kernel when a process calls execve(), how the Kernel prepares the stack and how control is then passed to the userland process for execution.
I had to learn this for the development of Zapper – a Linux tool to delete all command line options from any process (without needing root).
Overview
- The Kernel receives the execve() system call (SYS_execve) from a userland program.
- The Kernel reads the executable file (specific sections) into specific memory locations.
- The Kernel prepares the stack, heap, signals, …
- The Kernel passes execution to the userland program.
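To see the first step from the userland side, here is a minimal sketch of a process handing itself over to execve(). The helper name run_with_execve() is made up for illustration; it forks first so that the caller survives and can collect the exit status of the new program.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child, replace the child's image with `path`
 * via execve(), and return the child's exit status (-1 on error). */
int run_with_execve(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid;
    int status = 0;

    assert(path != NULL && argv != NULL && envp != NULL);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: on success execve() never returns. The Kernel throws away
         * the old image and builds a fresh stack from argv and envp. */
        execve(path, argv, envp);
        _exit(127); /* only reached if execve() failed */
    }

    /* Parent: wait for the child and report how it exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with argv = { "sh", "-c", "exit 7", NULL } and path "/bin/sh" should return 7, assuming /bin/sh exists on the system.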
Examining a binary
Let us start with a simple Linux C program:
int main(int argc, char *argv[]) {
    return 0;
}
Compile it with gcc -static -o none none.c and find out some details:
$ readelf -h none
ELF Header:
Magic: 7f 45 4c 46 02 01 01 03 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - GNU
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x4014f0
Start of program headers: 64 (bytes into file)
Start of section headers: 760112 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 10
Size of section headers: 64 (bytes)
Number of section headers: 30
Section header string table index: 29
The first instructions start at the ‘Entry Point’ at 0x4014f0. These instructions were not written by us; they are startup code supplied by the toolchain (gcc, go, etc.) and differ from one compiler to the next.
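The entry point that readelf prints can also be read straight from the file header. The following is a small sketch, not how readelf itself is implemented: the function name elf64_entry_point() is hypothetical, it relies on the Elf64_Ehdr definition from the glibc header elf.h, and it assumes the file has the same byte order as the host.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the entry point stored in an ELF64 header,
 * or 0 if `buf` does not start with a valid 64-bit ELF header.
 * Assumption: file and host share the same endianness. */
uint64_t elf64_entry_point(const void *buf, size_t len)
{
    Elf64_Ehdr hdr;

    assert(buf != NULL);
    if (len < sizeof(hdr))
        return 0;

    /* Copy into a local struct so an unaligned buffer is not a problem. */
    memcpy(&hdr, buf, sizeof(hdr));

    /* Magic bytes: 0x7f 'E' 'L' 'F' (the first four bytes readelf shows). */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return 0;
    /* Class byte must say ELF64, otherwise the struct layout differs. */
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return 0;

    return hdr.e_entry;
}
```

Fed the first 64 bytes of the none binary from above, this should return the same address readelf reports as the entry point.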
Let’s load the binary into gdb and disassemble the instructions at the entry point (disass 0x4014f0). The instructions perform a bit of housekeeping but eventually call main() (or the GoLang equivalent).
Let’s set a breakpoint at the Entry Point (0x4014f0) and run the app with two command line options (firstarg and secondarg):
gdb ./none
pwndbg> disass 0x4014f0
pwndbg> br *0x4014f0
pwndbg> r firstarg secondarg
► 0x4014f0 <_start> xor ebp, ebp
0x4014f2 <_start+2> mov r9, rdx
0x4014f5 <_start+5> pop rsi
[...]
──────────────────────[ STACK ]──────────────────────
00:0000│ rsp 0x7ffca4229540 ◂— 0x3
01:0008│ 0x7ffca4229548 —▸ 0x7ffca422a4b3 ◂— '/sec/root/none'
02:0010│ 0x7ffca4229550 —▸ 0x7ffca422a4c2 ◂— 'firstarg'
03:0018│ 0x7ffca4229558 —▸ 0x7ffca422a4cb ◂— 'secondarg'
04:0020│ 0x7ffca4229560 ◂— 0x0
05:0028│ 0x7ffca4229568 —▸ 0x7ffca422a4d5 ◂— 'BASH_ENV=/etc/shellrc'
[...]
(If you are using gdb without pwndbg then you may need to run x/64a $rsp to list the first 64 entries from the stack.)
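The Stack Pointer rsp is at 0x7ffd4f48bd10 in this run (ASLR places the stack at a different address on every execution, which is why it differs from the gdb listing above). Let’s find the end of the stack with grep -F '[stack]' /proc/$(pidof none)/maps: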
7ffd4f46c000-7ffd4f48d000 rw-p 00000000 00:00 0 [stack]
The Kernel has allocated the stack memory from 0x7ffd4f46c000 to 0x7ffd4f48d000, a total of 132 KB. It will grow dynamically up to the limit reported by ulimit -s (in kilobytes; 8 MB by default on most systems). So far our program only uses the stack between the rsp address (0x7ffd4f48bd10) and the high end of the mapping (0x7ffd4f48d000), a total of 4,848 bytes (echo $((0x7ffd4f48d000 - 0x7ffd4f48bd10)) prints 4848).
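The same arithmetic can be done in code. This is a minimal sketch that parses the address range from a /proc/<pid>/maps line and computes both numbers quoted above; the function names parse_maps_range() and stack_bytes_in_use() are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: parse the leading "start-end" hex range of a
 * /proc/<pid>/maps line. Returns 0 on success, -1 if the line does not parse. */
int parse_maps_range(const char *line, uint64_t *start, uint64_t *end)
{
    unsigned long long lo, hi;

    assert(line != NULL && start != NULL && end != NULL);
    if (sscanf(line, "%llx-%llx", &lo, &hi) != 2 || hi < lo)
        return -1;
    *start = lo;
    *end = hi;
    return 0;
}

/* The stack grows towards lower addresses, so everything between rsp and
 * the high end of the mapping is what has been used so far. */
uint64_t stack_bytes_in_use(uint64_t rsp, uint64_t stack_end)
{
    assert(rsp <= stack_end);
    return stack_end - rsp;
}
```

For the maps line shown above, end minus start is 0x21000 (135168 bytes, i.e. 132 KB), and stack_bytes_in_use(0x7ffd4f48bd10, 0x7ffd4f48d000) is 0x12f0, which is 4848.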
This is the ‘birth’ of the execution: the Kernel, in all its bravery, has passed control to our program. Our program is about to execute its very first instruction, to take its very first step (so to say).
What is on the stack right now is all the information the program gets from the Kernel to run. It contains the argument list, the environment variables and a lot of other interesting information.
For Zapper we had to manipulate the argument list, move stack values around, adjust the pointers and then pass control back to the program – without it falling over. It was prudent to understand a bit better what the Kernel had put on the stack.
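As a rough map for what follows, here is a sketch of how the words at rsp can be walked at the entry point. It is not Zapper’s code. It assumes the x86-64 System V layout: argc, then the argv pointers and a NULL word, then the envp pointers and a NULL word, then the auxiliary vector as (type, value) pairs terminated by AT_NULL. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <elf.h>    /* AT_NULL and friends */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the initial stack, as left by the Kernel at rsp. */
struct initial_stack {
    uint64_t argc;
    const uint64_t *argv; /* argc pointers, followed by a NULL word */
    const uint64_t *envp; /* envc pointers, followed by a NULL word */
    const uint64_t *auxv; /* auxc (type, value) pairs, then AT_NULL */
    size_t envc;
    size_t auxc;
};

/* Walk the 8-byte words starting at `sp` (the value of rsp at the entry
 * point) and record where each region begins and how long it is. */
void walk_initial_stack(const uint64_t *sp, struct initial_stack *out)
{
    assert(sp != NULL && out != NULL);

    out->argc = sp[0];
    out->argv = sp + 1;

    /* envp starts right after argv's terminating NULL word. */
    out->envp = out->argv + out->argc + 1;
    out->envc = 0;
    while (out->envp[out->envc] != 0)
        out->envc++;

    /* auxv starts right after envp's terminating NULL word. */
    out->auxv = out->envp + out->envc + 1;
    out->auxc = 0;
    while (out->auxv[2 * out->auxc] != AT_NULL)
        out->auxc++;
}
```

In the gdb listing above, sp[0] is 0x3, sp[1] to sp[3] point at '/sec/root/none', 'firstarg' and 'secondarg', sp[4] is the NULL word, and sp[5] is the first environment pointer.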
Let’s dump the stack:
pwndbg> dump binary memory stack.dat $rsp 0x7ffd4f48d000
and load it into hd or xxd:
03 00 00 00 00 00 00 00 b3 a4 22 a4 fc 7f 00 00 |..........".....|
c2 a4 22 a4 fc 7f 00 00 cb a4 22 a4 fc 7f 00 00 |..".......".....|
00 00 00 00 00 00 00 00 d5 a4 22 a4 fc 7f 00 00 |..........".....|
eb a4 22 a4 fc 7f 00 00 11 a5 22 a4 fc 7f 00 00 |..".......".....|
25 a5 22 a4 fc 7f 00 00 30 a5 22 a4 fc 7f 00 00 |%.".....0.".....|
[...]
b4 5c 18 e0 ed f9 fb 0d 30 78 38 36 5f 36 34 00 |.\......0x86_64.|
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00 00 00 2f 73 65 63 2f 72 6f 6f 74 2f 6e 6f 6e |.../sec/root/non|
65 00 66 69 72 73 74 61 72 67 00 73 65 63 6f 6e |e.firstarg.secon|
64 61 72 67 00 42 41 53 48 5f 45 4e 56 3d 2f 65 |darg.BASH_ENV=/e|
74 63 2f 73 68 65 6c 6c 72 63 00 43 48 45 41 54 |tc/shellrc.CHEAT|
5f 43 4f 4e 46 49 47 5f 50 41 54 48 3d 2f 65 74 |_CONFIG_PATH=/et|
[...]
55 4d 4e 53 3d 31 31 38 00 2f 73 65 63 2f 72 6f |UMNS=118./sec/ro|
6f 74 2f 6e 6f 6e 65 00 00 00 00 00 00 00 00 00 |ot/none.........|
Lots of pointers. Lots of strings. Lots of unknowns.
Let's follow the call from execve() to the new program's entry point.
execve() enters the Kernel via a syscall, which then calls do_execve().
Eventually, this ends up in do_execveat_common(). The bprm structure is created and assigned all kinds of information about the program (see binfmts.h).
Important to us, the program's filename, e