Lab 8 - The Stack

Task: Max

Calculate the maximum between two numbers in two registers (eax and ebx) using a comparison instruction, a jump instruction, and push/pop instructions.

TIP: Consider how you can swap two registers using the stack.

If you're having difficulties solving this exercise, go through this reading material

Task: Reverse Array

Building on the reverse-array.asm exercise, implement the TODOs without using the mov instruction when working with arrays, so that at the end of the program, the output array contains the input array in reverse order.

NOTE: After a correct solution, the program should print:
Reversed array:
911
845
263
242
199
184
122

If you're having difficulties solving this exercise, go through this reading material

Task: Stack Addressing

The stack-addressing.asm program in the lab's archive allocates and initializes two local variables on the stack:

an array of natural numbers from 1 to NUM
a string "Bob has corn".

Replace each push instruction with an equivalent sequence of instructions.
Print the addresses and values on the stack in the interval [esp, ebp] (from low addresses to high addresses) byte by byte.
Print the string allocated on the stack byte by byte and explain how it looks in memory. Think about where you should start displaying and when you should stop.
Print the vector allocated on the stack element by element. Think about where you should start displaying and what size each element has.

NOTE: After a successful implementation, the program should display something similar to the following output (it won't be exactly the same, stack memory addresses may differ):
0xffcf071b: 65
0xffcf071c: 110
0xffcf071d: 97
0xffcf071e: 32
0xffcf071f: 97
...
0xffcf0734: 4
0xffcf0735: 0
0xffcf0736: 0
0xffcf0737: 0
0xffcf0738: 5
0xffcf0739: 0
0xffcf073a: 0
0xffcf073b: 0
Bob has corn
1 2 3 4 5
Explain the significance of each byte. Why are they arranged in that particular order? Why are some bytes 0?
TIP: Remember that ASCII character codes are represented as decimal values. Remember the order in which the bytes of a larger number are stored: review the section Order of representation of numbers larger than one byte from Lab 01.

If you're having difficulties solving this exercise, go through this reading material

Task: Local Var

The program local-var.asm in the laboratory archive combines two sorted arrays (array_1 and array_2) by placing the resulting array in array_output defined in the .data section.

Modify the program so that array_1, array_2, and array_output are allocated on the stack. The array allocation is done using the sub instruction. For the copies of arrays array_1 and array_2, you will need to copy their elements from the .data section to the stack before using them.

If you're having difficulties solving this exercise, go through this reading material

Task: GCD - Greatest Common Divisor

Open gcd.asm and run the program. The code calculates the greatest common divisor (GCD) of two numbers given as parameters using the eax and edx registers, and then stores the calculated value back in the eax register.

Make the necessary modifications so that the error message - Segmentation fault (core dumped) - no longer appears.
Within the print label, display the result in the following format:

gcd(49,28)=7

If you're having difficulties solving this exercise, go through this reading material

Introduction to the Stack

In this lab, we will learn about how the stack is represented in assembly language, its utility, and how to it could be useful to us.

Reminder: Stack Data Structure

NOTE: This is a quick reminder on how the abstract data structure works. If you feel like you already understand this, you can skip this part.

In the world of algorithms and data structure, a "stack" is a data structure used to hold data, mirroring a real-life stack of objects (for example, a stack of books, or a stack of boxes). This data structure's usefulness comes from optimizing the ease and speed at which elements can be added or removed from the top of the stack. It forces us to think about how our data is organized relative to the stack's base and top. The usual operations with the stack are:

push - add an element to the top of the stack
pop - get the element from the top of the stack and remove it
peek or top - get the element from the top of the stack without removing it

An image showing a stack data structure at different stages. The empty stack gains the element labeled "1" after the instruction "push 1" is applied to it. After the instruction "push 2", the stack has two items, element 1 being below element 2. Finally, the "pop" instruction makes the stack lose element 2

As the above image suggests, the order in which items are inserted and removed from a stack is represented by the phrase "first in, last out".

So, Why is it Useful?

In the previous chapters we learned how to work with the basics of assembly. A pretty big limitation we have imposed on ourselves by using such a low-level language is the small number of values we can work with at a time. For anything but small programs, having just the 6 registers (eax, ebx, ecx, edx, esi, edi) is usually not enough, and creating global variables for temporary values is not memory efficient and, at some point, we'll struggle to even name them something reasonable!

You might have also felt the absence of functions. The stack will help us out as it provides a nice place to store:

the arguments,
the values of registers before entering a function so they can be restored on exit,
and some metadata useful for when we want to exit out of a function.

Stack Operations

The stack can be modified in two ways:

By using special instructions designed for stack operations, the most common of which are push and pop:

%include "io.asm"

section .text
global CMAIN
CMAIN:

    mov eax, 7
    mov ebx, 8
    add eax, ebx
    push eax                 ; push the value of the eax register onto the stack
    mov eax, 10              ; we can now use the eax register, as its value is saved on the stack
    PRINTF32 `%d \n\x0`, eax  ; 10

    pop eax                  ; retrieve the value of the eax register from the stack
    PRINTF32 `%d \n\x0`, eax  ; 15

By directly accessing the memory with the help of a special register in which the top of the stack is held - esp also known as the "stack pointer register".

%include "io.asm"

section .text
global CMAIN
CMAIN:
    mov eax, 7
    mov ebx, 8
    add eax, ebx
    sub esp, 4           ; reserve 4 bytes on the stack
    mov [esp], eax       ; move the contents of the eax register to the new address pointed to by esp
    mov eax, 10
    PRINTF32 `%d \n\x0`, eax

    mov eax, [esp]       ; retrieve the value from the stack
    add esp, 4           ; restore the value of the esp register
    PRINTF32 `%d \n\x0`, eax

IMPORTANT: Comment out the instructions sub esp, 4 and add esp, 4. What happens? Why?
NOTE: The stack is used to remember the return address when a function is called. Note that the stack grows from higher addresses to lower addresses. This is why memory allocation on the stack is done using the sub instruction, and deallocation is done using the add instruction.

stack_image

Some processors do not have support for stack operations: for example, MIPS processors do not have push and pop instructions and do not have a special register for the stack pointer. Thus, if we want to implement stack operations on a MIPS processor, we would do it exactly as in the example above, but we can choose any register to keep track of the stack pointer.

Therefore, the push eax instruction on an x86 processor is equivalent to:

sub esp, 4
mov [esp], eax

And the pop eax is equivalent to:

mov eax, [esp]
add esp, 4

IMPORTANT: We need to be careful with the amount of data allocated on the stack because the size of the stack is limited. Overfilling the stack will lead to the well-known error of stack overflow (more in the security lab).
NOTE: The default stack size on Linux for a 64-bit architecture is 8MiB.

Stack in the Context of a Process's Address Space

A process's address space, or more precisely, a process's virtual address space, represents the virtual memory area usable by a process. Each process has its own address space. Even in situations where two processes share a memory region, the virtual space is distinct, but it maps to the same physical memory region.

stack_image

In the figure above, a typical process's address space is presented.

The four important zones in a process's address space are the data zone, the code zone, the stack, and the heap. As can be observed from the figure, the stack and the heap are the zones that can grow. In fact, these two zones are dynamic and only make sense in the context of a process. On the other hand, the information in the data and code zones is described in the executable.

Tricks and Tips

The golden rule of stack usage is: the number of push-es should equal the number of pop-s in a function. Given that the stack is used for function calls, it is very important that when a function finishes its execution, the stack pointer should be updated so that it points to the same memory location (of the stack) as it did at the time of entering the function.
In situations where we perform N push-es and reach the end of the function without doing a pop for any of the values, we can restore the stack pointer using the add instruction.

section .text
global CMAIN
CMAIN:
    mov eax, 5
    mov ebx, 6
    mov ecx, 7

    push eax
    push ebx
    push ecx

    add esp, 12     ; equivalent to using 3 consecutive pop-s
    ret

An alternative method is to save the current stack pointer value in a separate register, such as ebp, before performing any push operations. This allows us to easily restore the stack pointer value at the end of the function, without having to keep track of the number of push operations performed.

section .text
global CMAIN
CMAIN:

    mov ebp, esp       ; save current stack pointer value in ebp

    mov eax, 5
    mov ebx, 6
    mov ecx, 7

    push eax
    push ebx
    push ecx

    mov esp, ebp       ; restore stack pointer value
    ret

IMPORTANT: What is the primary use of the ebp register?

As we can observe, the ebp register defines the stack frame for each function. Similarly to how we can address local variables using the esp register, we can do the same with ebp. Additionally, we will see that function parameters are addressed using ebp.

Guide: Stack Operations

The stack_operations.asm file demonstrates various stack operations. The main focus is to show how to manipulate the stack by pushing and popping values, and how to "allocate" and "deallocate" memory on the stack.

Note: Notice how push and pop are just syntactic sugar for the simpler sub, add, and mov instructions.

For convenience, here's the contents of the file. To play around with it, download the lab locally.

%include "printf32.asm"

section .data
    var: dd ?

section .text

; esp -> stack pointer
; ebp -> base pointer

extern printf
global main
main:
    push ebp
    mov ebp, esp

    push dword 10 ; sub esp, 4;  mov [esp], 10;
    push dword 11 ; sub esp, 4;  mov [esp], 11;
    push dword 12 ; sub esp, 4;  mov [esp], 12;
    push dword 13 ; sub esp, 4;  mov [esp], 13;
    push dword 14 ; sub esp, 4;  mov [esp], 13;


    pusha  ; push all registers on the stack
    popa  ; pop all registers from the stack

    ; Version 1
    pop eax; ; mov eax, [esp]; add esp, 4
    pop eax; ; mov eax, [esp]; add esp, 4
    pop eax; ; mov eax, [esp]; add esp, 4
    pop eax; ; mov eax, [esp]; add esp, 4
    pop eax; ; mov eax, [esp]; add esp, 4

    ; Version 2
    ; add esp, 20 ; 4 * number_of_push

    ; Version 3
    ; mov esp, ebp

    ; sub esp <-> add esp -> use to allocate/deallocate memory

    ; Aloc 8 bytes <-> 2 int
    ; sub esp, 8
    ; mov [esp], 10
    ; mov [esp + 4], 12

    ; Push/Pop from global variable

    mov dword [var], 1337

    push dword [var]
    pop dword [var]

    mov eax, [var]
    PRINTF32 `VAR: %d\n\x0`, eax


    leave
    ret

Guide: Stack Addressing

The stack_addressing.asm file demonstrates how data is stored on the stack, and especially in what order.

Here's what an usual output for the compiled program would be:

0xff99fba8: 0xf7f46020
0xff99fba4: 0xa
0xff99fba0: 0xb
0xff99fb9c: 0xc
0xff99fb98: 0xd

Note: The last 4 values are the ones we pushed on stack. What is the first one?
Answer: It is the old EBP we push at the start of the function.

For convenience, here's the contents of the file. To play around with it, download the lab locally.

%include "printf32.asm"

section .text

extern printf
global main
main:
    push ebp
    mov ebp, esp

    push dword 10
    push dword 11
    push dword 12
    push dword 13

    mov eax, ebp
print_stack:
    PRINTF32 `0x\x0`
    PRINTF32 `%x\x0`, eax
    PRINTF32 `: 0x\x0`
    PRINTF32 `%x\n\x0`, [eax]

    sub eax, 4
    cmp eax, esp
    jge print_stack

    xor eax, eax
    leave
    ret

Lab 8 - The Stack

Task: Max​

Task: Reverse Array​

Task: Stack Addressing​

Task: Local Var​

Task: GCD - Greatest Common Divisor​

Introduction to the Stack​

Reminder: Stack Data Structure​

So, Why is it Useful?​

Stack Operations​

Stack in the Context of a Process's Address Space​

Tricks and Tips​

Guide: Stack Operations​

Guide: Stack Addressing​

Task: Max

Task: Reverse Array

Task: Stack Addressing

Task: Local Var

Task: GCD - Greatest Common Divisor

Introduction to the Stack

Reminder: Stack Data Structure

So, Why is it Useful?

Stack Operations

Stack in the Context of a Process's Address Space

Tricks and Tips

Guide: Stack Operations

Guide: Stack Addressing