Guide: Disassembling a C program
Navigate to guides/disassembling-c/support/.
As mentioned, everything ultimately ends up as assembly language (strictly speaking, everything ends up as machine code, which corresponds closely to assembly code). We often have access only to the object code of a program and want to inspect what it contains.
To observe this, let’s compile a C program to its object code and then disassemble it. We’ll use the test.c program from the lab archive.
NOTE: To compile a C/C++ source file from the command line, follow these steps:
- Open a terminal (shortcut: Ctrl+Alt+T).
- Navigate to the directory containing your source code.
- Use the command:
gcc -fno-PIC -o <exec> <sourcefile>
where <sourcefile> is the name of the source file (test.c) and <exec> is the name of the resulting executable.
If you only want to compile the file, without linking it, use:
gcc -fno-PIC -c -o <objfile> <sourcefile>
where <sourcefile> is the name of the source file and <objfile> is the name of the desired output object file.
Since we want to transform test.c into an object file, we’ll run:
gcc -fno-PIC -c -o test.o test.c
After running the above command, we should see a file named test.o.
We can also use gcc to translate the C code into assembly code:
gcc -fno-PIC -masm=intel -S -o test.asm test.c
After running the above command, we'll have a file called test.asm, which we can inspect using any text editor or viewer, such as cat:
cat test.asm
To disassemble the code of an object file, we'll use objdump as follows:
objdump -M intel -d <path-to-obj-file>
where <path-to-obj-file> is the path to the object file test.o.
Afterwards, you’ll see an output similar to the following:
$ objdump -M intel -d test.o
test.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <second_func>:
0: f3 0f 1e fa endbr64
4: 55 push rbp
5: 48 89 e5 mov rbp,rsp
8: 48 89 7d f8 mov QWORD PTR [rbp-0x8],rdi
c: 89 75 f4 mov DWORD PTR [rbp-0xc],esi
f: 48 8b 45 f8 mov rax,QWORD PTR [rbp-0x8]
13: 8b 10 mov edx,DWORD PTR [rax]
15: 8b 45 f4 mov eax,DWORD PTR [rbp-0xc]
18: 01 c2 add edx,eax
1a: 48 8b 45 f8 mov rax,QWORD PTR [rbp-0x8]
1e: 89 10 mov DWORD PTR [rax],edx
20: 90 nop
21: 5d pop rbp
22: c3 ret
0000000000000023 <first_func>:
23: f3 0f 1e fa endbr64
27: 55 push rbp
28: 48 89 e5 mov rbp,rsp
2b: 48 83 ec 20 sub rsp,0x20
2f: 89 7d ec mov DWORD PTR [rbp-0x14],edi
32: c7 45 fc 03 00 00 00 mov DWORD PTR [rbp-0x4],0x3
39: bf 00 00 00 00 mov edi,0x0
3e: e8 00 00 00 00 call 43 <first_func+0x20>
43: 8b 55 fc mov edx,DWORD PTR [rbp-0x4]
46: 48 8d 45 ec lea rax,[rbp-0x14]
4a: 89 d6 mov esi,edx
4c: 48 89 c7 mov rdi,rax
4f: e8 ac ff ff ff call 0 <second_func>
54: 8b 45 ec mov eax,DWORD PTR [rbp-0x14]
57: c9 leave
58: c3 ret
0000000000000059 <main>:
59: f3 0f 1e fa endbr64
5d: 55 push rbp
5e: 48 89 e5 mov rbp,rsp
61: bf 0f 00 00 00 mov edi,0xf
66: e8 b8 ff ff ff call 23 <first_func>
6b: 89 c6 mov esi,eax
6d: bf 00 00 00 00 mov edi,0x0
72: b8 00 00 00 00 mov eax,0x0
77: e8 00 00 00 00 call 7c <main+0x23>
7c: b8 00 00 00 00 mov eax,0x0
81: 5d pop rbp
82: c3 ret
You may notice the repeated occurrences of the endbr64 instruction. It is part of Intel's Control-Flow Enforcement Technology (CET): it marks a location as a valid target for indirect calls and jumps, which helps prevent attacks that hijack the normal execution flow of the program (for example, by overflowing a buffer to overwrite a code pointer). Detailed explanations about this instruction can be found in the Buffer Management lab.
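The call lines in the listing can also be checked by hand. On x86-64, the e8 opcode is followed by a 32-bit little-endian displacement, which is added to the address of the next instruction to obtain the call target. The following is a minimal sketch of that arithmetic in C; the helper names rel32_from_bytes and call_target are illustrative and are not part of the lab archive.

```c
#include <stdint.h>

/* Assemble the 4 bytes that follow the 0xe8 opcode (little-endian)
 * into a signed 32-bit displacement. The final cast relies on the
 * usual two's-complement wraparound, as implemented by gcc. */
int32_t rel32_from_bytes(const uint8_t disp[4])
{
    uint32_t raw = (uint32_t)disp[0]
                 | ((uint32_t)disp[1] << 8)
                 | ((uint32_t)disp[2] << 16)
                 | ((uint32_t)disp[3] << 24);
    return (int32_t)raw;
}

/* insn_addr is the offset of the call instruction itself (the e8 byte).
 * A near call is 5 bytes long (1 opcode byte + 4 displacement bytes),
 * so the displacement is relative to insn_addr + 5. */
uint64_t call_target(uint64_t insn_addr, const uint8_t disp[4])
{
    uint64_t next_insn = insn_addr + 5;
    return next_insn + (uint64_t)(int64_t)rel32_from_bytes(disp);
}
```

With the bytes from the listing, the call at 4f (ac ff ff ff) resolves to 0, which is second_func, and the call at 66 (b8 ff ff ff) resolves to 23, which is first_func. The calls at 3e and 77 carry an all-zero displacement, so they appear to target the very next instruction (43 and 7c). In an object file that has not been linked yet, these zeros are placeholders for external functions, which the linker fills in later. Adding the -r option to the objdump command shows the relocation entries attached to them.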
There are many other utilities that can disassemble object modules, most of them with a graphical interface and debugging support. objdump is a simple utility that can be used quickly from the command line.
It is worth observing, in both the test.asm file and the disassembly, how a function call is made. We'll discuss this further.
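To make the listing easier to follow, here is a hypothetical reconstruction of the two helper functions, inferred only from the disassembly above. It is a sketch and not the actual test.c from the lab archive: the parameter names, the local variable name, and the message string are assumptions, and the call at offset 3e is shown as puts only because it takes a single argument in edi.

```c
#include <stdio.h>

/* First argument (a pointer) arrives in rdi, second (an int) in esi.
 * The body loads *a, adds b, and stores the sum back through the pointer. */
void second_func(int *a, int b)
{
    *a += b;
}

/* The int argument arrives in edi and is spilled to [rbp-0x14];
 * the local initialised to 3 lives at [rbp-0x4]. */
int first_func(int x)
{
    int local = 3;

    /* Assumed: the listing shows one call with a single argument,
     * whose target and string are unknown until link time. */
    puts("in first_func");

    /* lea rax,[rbp-0x14] takes the address of x; it goes in rdi,
     * and the value of local goes in esi. */
    second_func(&x, local);

    /* The return value is placed in eax. */
    return x;
}
```

In the listing, main loads 0xf into edi before calling first_func, which corresponds to first_func(15) in this sketch; that call returns 18. main then moves the result from eax into esi, as the second argument of another external call.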