STAGE 1: Preprocessing
Command: gcc -E hello.c -o hello.i
What happens:
#include <stdio.h> is replaced by the entire contents of stdio.h and the headers it pulls in: hundreds of lines of function declarations, type definitions, and macros. printf is only declared at this point, not defined. The preprocessor works on text alone and has no idea what printf does; the compiler will later learn only that it exists and what its signature is.
Output: hello.i, plain C with every directive expanded. The only # lines left are linemarkers such as # 1 "hello.c", which record where each chunk of text came from. Roughly 800 lines for this tiny program.
// ... hundreds of lines from stdio.h ...
extern int printf(const char *__restrict __fmt, ...);
// ...
int main() {
    printf("hello world");
    return 0;
}
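To make the declared-versus-defined point concrete, here is a minimal sketch that skips stdio.h entirely and writes the one declaration by hand. The helper name say_hello is hypothetical and not part of hello.c; the declaration is the standard C99 signature of printf.

```c
/* Hand-written declaration standing in for #include <stdio.h>.
   It tells the compiler that printf exists and what its signature is.
   The definition still comes from libc at link time. */
int printf(const char *restrict format, ...);

/* Hypothetical helper for illustration; not part of the original program. */
int say_hello(void) {
    /* printf returns the number of characters it produced. */
    return printf("hello world");
}
```

This compiles without any header because the compiler only needs the declaration. The missing definition becomes a problem for the linker, not the compiler.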
STAGE 2: Compilation (C → Assembly)
Command: gcc -S hello.i -o hello.s
What happens inside the compiler:
Step 2a: Lexing (Tokenization)
Source text gets broken into tokens:
int  main   (  )  {  printf  (  "hello world"  )  ;  return  0    ;  }
KW   IDENT  (  )  {  IDENT   (  STRING         )  ;  KW      NUM  ;  }
Step 2b: Parsing (Tokens → AST)
Tokens become an Abstract Syntax Tree:
FunctionDecl: main
└── CompoundStmt
    ├── CallExpr: printf
    │   └── StringLiteral: "hello world"
    └── ReturnStmt
        └── IntLiteral: 0
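One way to picture the tree as data is a node struct with a kind tag and child pointers. The sketch below is illustrative only: the type and field names (Node, NodeKind, child) are assumptions for this example and do not match clang's or gcc's internal representations.

```c
#include <stddef.h>

typedef enum {
    NODE_FUNCTION_DECL, NODE_COMPOUND_STMT, NODE_CALL_EXPR,
    NODE_STRING_LITERAL, NODE_RETURN_STMT, NODE_INT_LITERAL
} NodeKind;

typedef struct Node {
    NodeKind kind;
    const char *text;             /* name or literal text, NULL if unused */
    const struct Node *child[2];  /* at most two children in this sketch */
} Node;

/* The tree for hello.c, built bottom-up as static data. */
static const Node str_lit = { NODE_STRING_LITERAL, "hello world", { NULL, NULL } };
static const Node call    = { NODE_CALL_EXPR, "printf", { &str_lit, NULL } };
static const Node int_lit = { NODE_INT_LITERAL, "0", { NULL, NULL } };
static const Node ret     = { NODE_RETURN_STMT, NULL, { &int_lit, NULL } };
static const Node body    = { NODE_COMPOUND_STMT, NULL, { &call, &ret } };
const Node hello_ast      = { NODE_FUNCTION_DECL, "main", { &body, NULL } };

/* Counts the nodes in the subtree rooted at n. */
int count_nodes(const Node *n) {
    if (n == NULL) return 0;
    return 1 + count_nodes(n->child[0]) + count_nodes(n->child[1]);
}

/* Length of the longest root-to-leaf path, counted in nodes. */
int tree_depth(const Node *n) {
    if (n == NULL) return 0;
    int left = tree_depth(n->child[0]);
    int right = tree_depth(n->child[1]);
    return 1 + (left > right ? left : right);
}
```

Later compiler stages are essentially recursive walks like count_nodes, except that they check types or emit IR at each node instead of counting.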
Step 2c: Semantic Analysis
- Is printf declared? Yes, from stdio.h
- Are argument types correct? Yes, const char* matches the string literal
- Is the return type correct? Yes, main is declared int and returns 0
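Step 2d: IR Generation (AST → LLVM IR)
LLVM IR is what clang produces at this step (clang -S -emit-llvm hello.c prints it); gcc uses its own intermediate forms, GIMPLE and RTL, in the same role.
@str = private constant [12 x i8] c"hello world\00"
define i32 @main() {
entry:
  %call = call i32 (ptr, ...) @printf(ptr @str)
  ret i32 0
}
declare i32 @printf(ptr, ...)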
Notice:
- "hello world" becomes a global constant in .rodata
- printf is declared, not defined; the linker resolves it later
- main returns i32 0
Step 2e: Optimization Passes
On the IR, LLVM runs passes:
- Dead code elimination
- Constant propagation
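- Inlining (not of printf itself: its body lives in libc, outside this translation unit. What compilers commonly do instead is rewrite simple calls, for example turning printf("hello world\n") into puts("hello world"))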
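As a rough picture of what the first two passes achieve, the sketch below shows a function before optimization and a hand-written equivalent of what remains afterwards. Both function names are hypothetical, and the "after" version is what one would expect from these passes rather than actual compiler output.

```c
/* What the programmer wrote: a constant flows into a branch
   whose outcome is known at compile time. */
int before_passes(void) {
    int debug = 0;          /* constant */
    int result = 40 + 2;    /* foldable arithmetic */
    if (debug) {            /* always false once debug is propagated */
        result = -1;        /* dead code */
    }
    return result;
}

/* Roughly what is left after constant propagation, constant
   folding, and dead code elimination. */
int after_passes(void) {
    return 42;
}
```

The two functions are observably identical, which is the only property an optimization pass is required to preserve.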
Step 2f: Backend — IR → Assembly
    .section .rodata
.LC0:
    .string "hello world"
    .text
    .globl main
main:
    pushq %rbp                # save frame pointer
    movq %rsp, %rbp           # set up stack frame
    leaq .LC0(%rip), %rdi     # arg1 = address of "hello world"
    movl $0, %eax             # variadic call: AL = number of vector registers used
    call printf               # call printf (printf@PLT in position-independent builds)
    movl $0, %eax             # return value = 0
    popq %rbp                 # restore frame pointer
    ret                       # return
Output: hello.s, human-readable assembly text
STAGE 3: Assembly (Assembly → Object File)
Command: as hello.s -o hello.o
What happens:
Each assembly instruction is translated into its binary encoding. x86-64 instructions are variable length, from 1 to 15 bytes, not fixed 32- or 64-bit words. For call printf, though, the assembler doesn’t know where printf lives. So it:
- Emits placeholder bytes for the address
- Creates a relocation entry: “at this offset, patch in the address of printf”
Output: hello.o, ELF object file
readelf -S hello.o
Sections inside:
.text       ← your compiled main() machine code
.rodata     ← "hello world\0" string
.rela.text  ← relocation: "patch printf address here"
.symtab     ← symbols: main defined, printf undefined
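The "patch printf address here" entry boils down to a small piece of arithmetic once the addresses are known. The sketch below applies the PC-relative rule used for x86-64 call instructions, S + A - P with an addend of -4, to a 5-byte call whose displacement was left as zeros. The function name patch_call and all addresses are hypothetical, and a little-endian host is assumed.

```c
#include <stdint.h>
#include <string.h>

/* Patches the 4-byte displacement of a call instruction.
   code      : buffer holding the section's bytes
   disp_off  : offset of the displacement field inside code
   code_addr : address the buffer will be loaded at (hypothetical)
   target    : address of the symbol being called (hypothetical)
   Applies the PC-relative rule S + A - P with addend A = -4. */
int32_t patch_call(uint8_t *code, size_t disp_off,
                   uint64_t code_addr, uint64_t target) {
    uint64_t place = code_addr + disp_off;            /* P */
    int32_t disp = (int32_t)(target - 4 - place);     /* S + A - P */
    memcpy(code + disp_off, &disp, sizeof disp);      /* little-endian host assumed */
    return disp;
}
```

For a call at 0x401000 targeting 0x401030, the displacement is 0x2b: the CPU adds it to the address of the next instruction (0x401005) to reach the target. The -4 addend accounts for the four displacement bytes that sit between the patched location and that next instruction.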
STAGE 4: Linking
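Command: ld hello.o -lc -o hello (simplified; in practice run gcc hello.o -o hello, which invokes ld with the startup objects, library paths, and dynamic linker filled in)
What happens:
Linker takes:
- hello.o, your code
- libc.so or libc.a, the C standard library (contains the printf definition)
- crt1.o (crt0.o on some systems), crti.o, crtbegin.o and friends, the C runtime startup code
Resolves symbols:
- printf found in libc. With a static libc.a, the linker patches the relocation with printf's final address. With the shared libc.so, it routes the call through a PLT stub and leaves the real address for the dynamic linker to fill in at run time.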
Produces final executable layout:
Program Headers (segments)
.text     ← all code merged
.rodata   ← all read-only data
.data     ← initialized globals
.bss      ← uninitialized globals
.dynamic  ← dynamic linking info (if shared)
.got      ← Global Offset Table
.plt      ← Procedure Linkage Table
Output: hello executable ELF binary
STAGE 5: OS Loading, You Type ./hello
Step 5a: Shell forks
shell calls fork() → creates child process
child calls execve("./hello", ...) → OS takes over
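The fork-then-exec pattern is short enough to write out. The following sketch is a simplified stand-in for what a shell does, not any real shell's code: run_program is a hypothetical name, and it uses execvp (which searches PATH) rather than execve so that it can be tried with a bare command name.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs a program the way a shell does: fork, exec in the child,
   wait in the parent. Returns the child's exit status, or -1 on error. */
int run_program(const char *file, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) return -1;            /* fork failed */

    if (pid == 0) {                    /* child */
        execvp(file, argv);            /* only returns on failure */
        _exit(127);                    /* conventional "command not found" status */
    }

    int status;                        /* parent */
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

After a successful exec the child keeps its process ID, but its memory image is replaced by the new program, which is why the code after execvp runs only when the exec fails.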
Step 5b: OS reads ELF header
Magic:        7f 45 4c 46   ← 0x7f followed by "ELF"
Class:        64-bit
Entry point:  0x401050      ← address of _start (example value)
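The first two header checks are easy to reproduce by reading five bytes from a file. This is a minimal sketch, assuming a hypothetical helper named elf_class; a real loader validates far more of the header than this.

```c
#include <stdio.h>
#include <string.h>

/* Reads the first bytes of a file and reports its ELF class.
   Returns 32 or 64 for an ELF file, 0 if the file is too short,
   the magic does not match, or the class byte is unknown,
   and -1 if the file cannot be opened. */
int elf_class(const char *path) {
    unsigned char ident[5];
    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;
    size_t got = fread(ident, 1, sizeof ident, f);
    fclose(f);
    if (got != sizeof ident) return 0;
    /* "\x7f" and "ELF" are split so the E is not read as a hex digit. */
    if (memcmp(ident, "\x7f" "ELF", 4) != 0) return 0;
    if (ident[4] == 1) return 32;      /* ELFCLASS32 */
    if (ident[4] == 2) return 64;      /* ELFCLASS64 */
    return 0;
}
```

On Linux, calling elf_class("/proc/self/exe") inspects the running program's own binary.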
Step 5c: OS maps segments into virtual memory
mmap .text   → virtual address 0x401000, read+execute
mmap .rodata → virtual address 0x402000, read only
mmap .data   → virtual address 0x403000, read+write
stack        → near the top of the user address space, read+write, grows down
heap         → above .data and .bss, read+write, grows up (empty now)
These addresses are examples for a non-PIE build; a position-independent executable is loaded at a randomized base.
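User code can create mappings with the same kinds of permissions through mmap and mprotect. The sketch below maps an anonymous read+write region, fills it, and then removes write permission, loosely mirroring how read-only data ends up protected. The function name map_and_protect is hypothetical, and the 4096-byte length is an assumed page size.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Maps one anonymous read+write region, copies a string in, then
   drops write permission. Returns the first byte read back,
   or -1 on failure. */
int map_and_protect(void) {
    size_t len = 4096;   /* assumed page size; sysconf(_SC_PAGESIZE) gives the real one */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    memcpy(p, "hello world", 12);      /* 11 characters plus the terminating NUL */

    if (mprotect(p, len, PROT_READ) != 0) {   /* now read-only */
        munmap(p, len);
        return -1;
    }
    int first = p[0];                  /* reads still work; a write here would fault */
    munmap(p, len);
    return first;
}
```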
Step 5d: Dynamic linker runs (ld.so)
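Since printf is in libc.so (shared library):
- The kernel loads ld.so first (its path is recorded in the executable's PT_INTERP header)
- ld.so reads the .dynamic section
- Finds dependency: libc.so.6
- Maps libc into the process address space
- Resolves the printf symbol → patches the GOT entry, either up front or lazily on the first call, depending on how the binary was linked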
Step 5e: _start runs (CRT — C Runtime)
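This is code you never wrote. It comes from crt1.o (crt0.o on some systems). Simplified:
_start:
    xorl %ebp, %ebp              # mark outermost frame
    popq %rsi                    # argc
    movq %rsp, %rdx              # argv
    andq $-16, %rsp              # align the stack
    leaq main(%rip), %rdi        # address of main
    call __libc_start_main       # sets up environment,
                                 # calls main(),
                                 # calls exit() when main returns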
Step 5f: main() runs
Stack frame created:
│ return addr  │ ← address inside libc's startup code (__libc_start_main)
├──────────────┤
│ saved RBP    │ ← RBP (frame pointer)
├──────────────┤
│ (locals)     │ ← none in this program, so RSP == RBP
└──────────────┘
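leaq .LC0(%rip), %rdi loads the address of “hello world” into the first argument register (RDI).
call printf:
- CPU pushes the return address onto the stack
- Jumps to printf in libc
- printf copies the 11 bytes into stdout's buffer and returns
- When the buffer is flushed, libc calls write(1, "hello world", 11), a syscall
- CPU switches to kernel mode
- Kernel writes the bytes to the stdout file descriptor
- Returns to userspace
The string has no newline, so when stdout is a line-buffered terminal the flush typically happens only at exit (Step 5g), after main has already returned.
ret pops the return address and jumps back into libc's startup code, which called main.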
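The write call at the bottom of this chain can also be made directly, which skips stdio's buffering altogether. A minimal sketch, with write_hello as a hypothetical helper name:

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <unistd.h>

/* Sends the message straight to the kernel with write(2),
   bypassing stdio's buffer. Returns the number of bytes written,
   or -1 on error. */
ssize_t write_hello(int fd) {
    const char msg[] = "hello world";
    return write(fd, msg, strlen(msg));   /* 11 bytes, no terminating NUL */
}
```

Calling write_hello(STDOUT_FILENO) produces the same bytes on the terminal as the printf version, but they appear immediately instead of waiting for a flush.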
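Step 5g: exit(0) called by __libc_start_main
- main's return value (0) is passed to exit()
- Calls atexit handlers
- Flushes stdio buffers (this is usually where "hello world" actually reaches the terminal)
- exit_group(0) syscall
- OS reclaims all memory, closes file descriptors
- Process is gone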
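The atexit step is the one part of this teardown that a program can hook into. The sketch below registers a handler; both function names are hypothetical. Anything the handler prints through stdio is still buffered and goes out when exit() flushes the streams.

```c
#include <stdio.h>
#include <stdlib.h>

/* Runs during exit(), before the stdio streams are flushed and closed. */
static void cleanup_handler(void) {
    fputs(" (cleanup ran)", stdout);   /* still buffered; flushed by exit() */
}

/* Registers the handler. Returns 0 on success, as atexit does. */
int register_cleanup(void) {
    return atexit(cleanup_handler);
}
```

Handlers run in reverse order of registration, and only on a normal exit: a process that calls _exit() or is killed by a signal skips them.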