Lab 10 - The C - Assembly Interaction

Task: Maximum Calculation in Assembly with Call from C

Navigate to drills/tasks/max-c-calls/support and open main.c

In this subdirectory you can find an implementation that computes the maximum of an array: the main() function is defined in C and calls the get_max() function, which is defined in assembly language.

Trace the code in the two files and how the function arguments and return value are passed.

Compile and run the program. To compile it run the command:

make

Then run the resulting executable:

./mainmax

IMPORTANT: Pay attention to understanding the code before proceeding to the next exercise.

IMPORTANT: The return value of a function is placed in the eax register.

Maximum Computation Extension in Assembly with Call from C

Extend the program from the previous exercise (in assembly language and C) so that the get_max() function now has the signature unsigned int get_max(unsigned int *arr, unsigned int len, unsigned int *pos). The third argument to the function is the address where the position in the vector on which the maximum is found will be held.

The position of the maximum in the vector must also be printed.

TIP: To hold the position, it is best to define a local variable pos in the main() function in the C file (main.c) in the form

unsigned int pos;

and call the get_max() function in the form:

max = get_max(arr, 10, &pos);
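As a point of comparison for the assembly version, the expected behavior of the extended function can be sketched in C. This is only a reference model, not the lab's solution: the name get_max_ref is hypothetical, and the choices that len is at least 1 and that the first occurrence wins on ties are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Reference model of the extended get_max() behavior.
 * Assumptions: len >= 1; on ties the first occurrence is reported.
 */
unsigned int get_max_ref(unsigned int *arr, unsigned int len, unsigned int *pos)
{
    assert(arr != NULL && pos != NULL && len > 0);

    unsigned int max = arr[0];
    *pos = 0;

    for (unsigned int i = 1; i < len; i++) {
        if (arr[i] > max) {
            max = arr[i];
            *pos = i;   /* write the index through the pointer argument */
        }
    }

    return max;         /* the 32-bit assembly version leaves this value in eax */
}
```

Running the assembly get_max() and this model on the same array is a quick way to check both the returned maximum and the value written to pos.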

If you're having difficulties solving this exercise, go through the relevant section of the reading material.

Task: Corrupt Stack Frame Debugging

Navigate to drills/tasks/stack-frame/support and open main.c

In this subdirectory of the lab's task archive you can find a C program that displays the string Hello world! through a call to the print_hello() function, defined in assembly, for the first part of the message, followed by two calls to printf() made directly from the C code.

Compile and run the program. What do you notice? The printed message is not as expected because the assembly code is missing an instruction.

Use GDB to inspect the address at the top of the stack before executing the ret instruction in the print_hello() function. What does it point to? Track the values of the ebp and esp registers during the execution of this function. What should be at the top of the stack after the leave instruction executes?

Find the missing instruction and rerun the executable.

TIP: To restore the stack to its state at the start of the current function, the leave instruction relies on the function's frame pointer (ebp) having been set.
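The effect described in the tip can be explored with a toy model written in C. The sketch below is not the lab's code: it simulates a small stack as an array of slots, all addresses and values are made up, and simulate_call is a hypothetical helper. It returns the value that ret would pop, depending on whether the callee's prologue set ebp.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SLOTS  16
#define RETURN_ADDR  0x08041234u  /* made-up address of the instruction after the call */
#define OUTER_RETURN 0xDEAD0000u  /* made-up return address stored in the caller's own frame */

/* Toy CPU: esp and ebp are slot indices; the stack grows toward index 0. */
struct cpu {
    uint32_t stack[STACK_SLOTS];
    uint32_t esp;
    uint32_t ebp;
};

static void push(struct cpu *c, uint32_t v)
{
    assert(c->esp > 0);
    c->stack[--c->esp] = v;
}

static uint32_t pop(struct cpu *c)
{
    assert(c->esp < STACK_SLOTS);
    return c->stack[c->esp++];
}

/* Returns the value that ret pops as the return address. */
uint32_t simulate_call(int set_frame_pointer)
{
    struct cpu c = { .esp = STACK_SLOTS, .ebp = 0 };

    /* Caller's frame: its own return address, saved ebp and one local. */
    push(&c, OUTER_RETURN);
    push(&c, c.ebp);          /* push ebp */
    c.ebp = c.esp;            /* mov ebp, esp */
    push(&c, 0x1111);         /* a local variable of the caller */

    push(&c, RETURN_ADDR);    /* call: pushes the return address */

    /* Callee prologue. */
    push(&c, c.ebp);          /* push ebp */
    if (set_frame_pointer)
        c.ebp = c.esp;        /* mov ebp, esp */

    push(&c, 0xAAAA);         /* the callee uses the stack for its own data */
    push(&c, 0xBBBB);

    /* leave is equivalent to: mov esp, ebp ; pop ebp */
    c.esp = c.ebp;
    c.ebp = pop(&c);

    return pop(&c);           /* ret: pop the return address */
}
```

In this model simulate_call(1) yields RETURN_ADDR, while simulate_call(0) yields OUTER_RETURN: leave rewinds esp to the caller's frame, so ret picks up the wrong address.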

If you're having difficulties solving this exercise, go through the relevant section of the reading material.

Task: Maximum Calculation in C with Call from Assembly

Navigate to drills/tasks/max-assembly-calls/support/ and open main.asm

In this directory you can find an implementation that computes the maximum of an array: the main() function is defined in assembly language and calls the get_max() function, which is defined in C.

Trace the code in the two files and how the function arguments and return value are passed.

Compile and run the program.

IMPORTANT: Pay attention to understanding the code before proceeding to the next exercise.

Extending Maximum Computation in C with Call from Assembly

Extend the program from the previous exercise (in assembly language and C) so that the get_max() function now has the signature unsigned int get_max(unsigned int *arr, unsigned int len, unsigned int *pos). The third argument to the function is the address where the position in the vector on which the maximum is found will be held.

The position of the maximum in the vector must also be printed.

TIP: To hold the position, it is best to define a global variable in the assembly file (main.asm) in the .data section, of the form

pos: dd 0

Pass this variable by address to the get_max() call and by value to the printf() call that displays it.

To display both values, modify the print_format string and the printf() call in the assembly file (main.asm) so that two values are printed: the maximum and its position.

If you're having difficulties solving this exercise, go through the relevant section of the reading material.

Task: Preserving Registers

Navigate to drills/tasks/regs-preserve/support and open main.asm

In this subdirectory of the lab's task repository you will find the print_reverse_array() function implemented by a simple loop that makes repeated calls of the printf() function.

Follow the code in the main.asm file, compile and run the program. What happened? The program runs indefinitely. This is because the printf() function does not preserve the value in the ecx register, used here as a counter.

Uncomment the lines marked TODO1 and rerun the program.

Troubleshooting SEGFAULT

Uncomment the lines marked TODO2 in the assembly file from the previous exercise. This code sequence calls the double_array() function, implemented in C, just before displaying the vector using the function seen earlier.

Compile and run the program. To debug the segfault you can use the objdump utility to trace the assembly language code corresponding to the double_array() function. Notice which of the registers used before and after the call are modified by this function.

Add the instructions for preserving and restoring the required registers to the assembly file.

If you're having difficulties solving this exercise, go through the relevant section of the reading material.

Task: Warning (not an error)

Access the directory drills/tasks/include-fix/support/. Run the make command. A warning appears, but it is from the preprocessing/compilation process. Resolve this warning by editing the hello.c file.

Bonus: Fix the warning without using the #include directive.

If you're having difficulties solving this exercise, go through this reading material.

Task: Fixing Export Issues

Access the directory drills/tasks/export-fix/support/. Each subdirectory (a-func/, b-var/, c-var-2/) contains a problem related to the export of symbols (functions or variables). In each subdirectory, run the make command, identify the issue, and edit the necessary files to resolve it.

If you're having difficulties solving this exercise, go through this reading material.

Task: Maximum Computation in Assembly with 64-bit C Call

Navigate to drills/tasks/max-c-calls-x64/support and open main.c

In this subdirectory you must implement the maximum calculation in assembly language for a 64-bit system. Start from the program used in exercises 4 and 5 and adapt it so that it runs on a 64-bit system.

TIP: https://en.wikipedia.org/wiki/X86_calling_conventions.

The first thing to note is that on the x64 architecture the registers are 8 bytes in size and have different names than the 32-bit ones. Besides extending the traditional ones (eax becomes rax, ebx becomes rbx, etc.), there are new ones, r8 through r15; for more information see here.

Also, on the x64 architecture the first parameters are no longer pushed on the stack but placed in registers. Under the System V AMD64 ABI, the first six integer or pointer parameters go, in order, into rdi, rsi, rdx, rcx, r8 and r9; the three parameters of get_max() therefore arrive in rdi, rsi and rdx. This convention is not universal: it is used on Linux and other Unix-like systems, while Windows passes function parameters in different registers.

The calling convention also requires that, for functions with a variable number of arguments, al (the low byte of rax) be set to an upper bound on the number of vector registers used to pass arguments. printf() is such a function, so unless you pass floating-point arguments in vector registers (rather than in the general-purpose registers mentioned in the previous paragraph), you must set rax = 0 before calling it. Read more here.
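The register assignment can be made concrete with a small C sketch annotated with where a System V AMD64 compiler places each argument. The functions sum7, print_result and abi_demo, as well as the format string, are hypothetical and serve only as illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * System V AMD64 (Linux): integer arguments are assigned in order:
 *   a -> rdi, b -> rsi, c -> rdx, d -> rcx, e -> r8, f -> r9,
 *   g -> pushed on the stack (the seventh integer argument)
 * The result is returned in rax.
 */
long sum7(long a, long b, long c, long d, long e, long f, long g)
{
    return a + b + c + d + e + f + g;
}

/*
 * printf() is variadic: the format pointer goes in rdi, max in esi,
 * pos in edx, and al is 0 because no vector register carries an argument.
 */
int print_result(unsigned int max, unsigned int pos)
{
    return printf("max: %u, pos: %u\n", max, pos);
}

/* Small self-check that can be called from any main(). */
void abi_demo(void)
{
    assert(sum7(1, 2, 3, 4, 5, 6, 7) == 28);
    assert(print_result(9, 1) > 0);
}
```

Compiling such a file with gcc -S and reading the generated assembly is one way to see the same register usage that the hand-written 64-bit code has to follow.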

Task: Bonus: Maximum Calculation in C with Call from Assembly - 64 Bits

Enter the directory drills/tasks/max-assembly-calls-x64/support and implement the maximum calculation in C with a call from Assembly language on a 64-bit system. Start from the program used in drills/tasks/max-assembly-calls, ensuring it runs on a 64-bit system. Follow the instructions from the previous exercise and pay attention to the order of parameters.

C - Assembly Interaction: Memory Perspective

Considering that assembly language poses challenges both in reading and in developing code, the general trend is to migrate towards high-level languages (which are much easier to read and provide a more user-friendly API). However, there are still situations where, for optimization reasons, small assembly routines are used and integrated into the high-level language module.

In this laboratory, we will explore how assembly modules can be integrated into C programs and vice versa.

Using Assembly Procedures in C Functions

For a C program to be executed, it must be translated into the machine code of the processor; this is the task of the compiler. Since the compiled code is not always optimal, in some cases it is preferable to replace portions of C code with assembly code that does the same thing with better performance.

Declaration of the Procedure

In order to ensure that the assembly procedure and the C module are properly combined and compatible, the following steps must be followed:

  • in the assembly file, declare the procedure label as global, using the global directive. Any data defined there that must be visible from C must also be declared as global.

  • in the C file, use the extern keyword to declare the procedures and global data defined in assembly as external.

Calling C Functions from Assembly Procedures

In most cases, calling routines or functions from the standard C library in an assembly language program is a much more complex operation than vice versa. Take the example of calling the printf() function from an assembly language program:

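global main

extern printf ; printf is provided by the C library

section .data

text db "291 is the best!", 10, 0
strformat db "%s", 0

section .text

main:
push dword text ; second argument: address of the string
push dword strformat ; first argument: address of the format string
call printf
add esp, 8 ; remove the two 4-byte arguments from the stack
ret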

Note that the procedure is declared as global and is called main, the starting point of any C program. Since in C the parameters are pushed on the stack in reverse order, the address of the string is pushed first, followed by the address of the format string. The C function can then be called, but the caller must restore the stack after the function returns (here, add esp, 8 removes the two pushed arguments).
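For comparison, the assembly sequence above corresponds roughly to the following C sketch. The function name print_message is hypothetical; the string and the format are the ones from the assembly listing.

```c
#include <assert.h>
#include <stdio.h>

/* Same data as in the .data section: the text ends with a newline (10) and a NUL (0). */
static const char text[] = "291 is the best!\n";

/* Equivalent of: push text / push strformat / call printf / add esp, 8 */
int print_message(void)
{
    int written = printf("%s", text);

    assert(written >= 0);   /* printf returns a negative value on an output error */
    return written;         /* number of characters printed */
}
```

In the C version the compiler emits the argument pushes and the stack cleanup itself; in assembly both are the programmer's responsibility.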

When linking the assembly code, the standard C library (or the library containing the functions you use) must be included.

C - Assembly Interaction: Stack

Setting the Stack

When entering a procedure, it is necessary to set up a stack frame through which the parameters can be accessed. Of course, if the procedure does not receive parameters, this step is not necessary. To set up the stack frame, the following code must be included:

push ebp
mov ebp, esp

The ebp register then serves as a fixed base for indexing into the stack and should not be altered during the procedure.

Passing Parameters from C to the Assembly Procedure

C programs send parameters to assembly procedures using the stack. Consider the following C program sequence:

#include <stdio.h>

extern int sum(int a, int b); // declare the assembly procedure as external

int main() {
    int a = 5, b = 7;
    int res = sum(a, b); // call the assembly procedure

    printf("%d\n", res); // display the result returned in eax

    return 0;
}

When C executes the call to sum(), it first pushes the arguments on the stack in reverse order, then actually calls the procedure. Thus, upon entering the procedure body, the top of the stack holds the return address, followed by a and then b.

Since the variables a and b are declared as int values, they each occupy 4 bytes (one doubleword) on the stack. This method of passing parameters is called passing by value. The code of the sum procedure might look like this:

section .text
global sum ; declare the procedure label as global

sum:
push ebp
mov ebp, esp

mov eax, [ebp+8] ; retrieve the first argument
mov ecx, [ebp+12] ; retrieve the second argument
add eax, ecx ; calculate the sum

pop ebp
ret

It is interesting to note several things. First, by convention the assembly code leaves the return value of the procedure in the eax register. Second, the ret instruction is sufficient to exit the procedure, because the code generated by the C compiler for the caller takes care of the rest, such as removing the parameters from the stack.
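The offsets used by sum ([ebp+8] and [ebp+12]) can be traced with a toy model of the 32-bit stack written in C. This is only a sketch: the memory buffer, the helpers store32 and load32, the function simulate_sum and the return address value are all made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 64

/* Toy 32-bit stack memory; esp and ebp are byte offsets into it. */
static uint32_t mem[STACK_BYTES / 4];

static void store32(uint32_t addr, uint32_t value)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    mem[addr / 4] = value;
}

static uint32_t load32(uint32_t addr)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    return mem[addr / 4];
}

int simulate_sum(int a, int b)
{
    uint32_t esp = STACK_BYTES;   /* empty stack, grows toward lower addresses */
    uint32_t ebp = 0;             /* caller's frame pointer (value irrelevant here) */

    /* Caller: push the arguments right to left, then call. */
    esp -= 4; store32(esp, (uint32_t)b);      /* push b */
    esp -= 4; store32(esp, (uint32_t)a);      /* push a */
    esp -= 4; store32(esp, 0x08041234u);      /* call sum: pushes a (made-up) return address */

    /* Callee prologue. */
    esp -= 4; store32(esp, ebp);              /* push ebp */
    ebp = esp;                                /* mov ebp, esp */

    /* Layout: [ebp] saved ebp, [ebp+4] return address, [ebp+8] a, [ebp+12] b */
    uint32_t eax = load32(ebp + 8);           /* mov eax, [ebp+8]  */
    uint32_t ecx = load32(ebp + 12);          /* mov ecx, [ebp+12] */
    eax += ecx;                               /* add eax, ecx */

    /* Callee epilogue, then the caller's cleanup. */
    ebp = load32(esp); esp += 4;              /* pop ebp */
    esp += 4;                                 /* ret pops the return address */
    esp += 8;                                 /* caller: add esp, 8 removes a and b */
    assert(esp == STACK_BYTES);               /* the stack is balanced again */

    return (int)eax;                          /* the result travels back in eax */
}
```

For the values from the C program above, simulate_sum(5, 7) follows the same steps as the real call and produces 12.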