Lab 10 - The C - Assembly Interaction
Task: Maximum Calculation in Assembly with Call from C
Navigate to drills/tasks/max-c-calls/support and open main.c.
In this subdirectory you can find a program that computes the maximum of an array: the main() function is defined in C and calls the get_max() function, which is defined in assembly language.
Trace the code in the two files and observe how the function arguments and the return value are passed.
Compile and run the program. To compile it run the command:
make
Then run the resulting executable:
./mainmax
IMPORTANT: Make sure you understand the code before proceeding to the next exercise.
IMPORTANT: The return value of a function is placed in the eax register.
Maximum Computation Extension in Assembly with Call from C
Extend the program from the previous exercise (in assembly language and C) so that the get_max() function now has the signature unsigned int get_max(unsigned int *arr, unsigned int len, unsigned int *pos).
The third argument is the address at which the function stores the position (index) of the maximum in the array.
The program must also print the position of the maximum.
TIP: To hold the position, it is best to define a local variable pos in the main() function in the C file (main.c), of the form unsigned int pos; and to call the get_max() function in the form: max = get_max(arr, 10, &pos);
If you're having difficulties solving this exercise, go through the relevant section of the reading material.
Task: Corrupt Stack Frame Debugging
Navigate to drills/tasks/stack-frame/support and open main.c.
In this subdirectory of the lab's task archive you can find a C program that displays the string Hello world!: the first part of the message is printed by a call to the print_hello() function, defined in assembly, followed by two calls to the printf() function directly from the C code.
Compile and run the program. What do you notice? The printed message is not as expected because the assembly code is missing an instruction.
Use GDB to inspect the address at the top of the stack before executing the ret instruction in the print_hello() function.
What does it point to?
Track the values of the ebp and esp registers during the execution of this function.
What should be at the top of the stack after the leave instruction executes?
Find the missing instruction and rerun the executable.
TIP: In order to restore the stack to its state at the start of the current function, the leave instruction relies on the function's frame pointer having been set.
If you're having difficulties solving this exercise, go through the relevant section of the reading material.
Task: Maximum Calculation in C with Call from Assembly
Navigate to drills/tasks/max-assembly-calls/support/ and open main.asm.
In this directory you can find a program that computes the maximum of an array: the main() function is defined in assembly language and calls the get_max() function, which is defined in C.
Trace the code in the two files and observe how the function arguments and the return value are passed.
Compile and run the program.
IMPORTANT: Make sure you understand the code before proceeding to the next exercise.
Extending Maximum Computation in C with Call from Assembly
Extend the program from the previous exercise (in assembly language and C) so that the get_max() function now has the signature unsigned int get_max(unsigned int *arr, unsigned int len, unsigned int *pos).
The third argument is the address at which the function stores the position (index) of the maximum in the array.
The program must also print the position of the maximum.
TIP: To hold the position, it is best to define a global variable in the .data section of the assembly file (main.asm), of the form pos: dd 0
You will pass this variable by address to the get_max() call and by value to the printf() call for display.
For display, modify the print_format string and the printf() call in the assembly file (main.asm) so that two values are printed: the maximum and its position.
If you're having difficulties solving this exercise, go through the relevant section of the reading material.
Task: Preserving Registers
Navigate to drills/tasks/regs-preserve/support and open main.asm.
In this subdirectory of the lab's task repository you will find the print_reverse_array() function, implemented as a simple loop that repeatedly calls the printf() function.
Follow the code in the main.asm file, then compile and run the program.
What happened?
The program runs indefinitely.
This is because the printf() function does not preserve the value in the ecx register, used here as a counter.
Uncomment the lines marked TODO1 and rerun the program.
Troubleshooting SEGFAULT
Uncomment the lines marked TODO2 in the assembly file from the previous exercise.
The code sequence calls the double_array() function, implemented in C, just before displaying the array using the function seen earlier.
Compile and run the program.
To debug the segfault you can use the objdump utility to inspect the assembly code generated for the double_array() function.
Notice which of the registers used both before and after the call are modified by this function.
Add the instructions for preserving and restoring the required registers to the assembly file.
If you're having difficulties solving this exercise, go through the relevant section of the reading material.
Task: Warning (not an error)
Access the directory drills/tasks/include-fix/support/.
Run the make command.
A warning (not an error) appears, issued by the preprocessing/compilation stage.
Resolve this warning by editing the hello.c file.
Bonus: Fix the warning without using the #include directive.
If you're having difficulties solving this exercise, go through this reading material.
Task: Fixing Export Issues
Access the directory drills/tasks/export-fix/support/.
Each subdirectory (a-func/, b-var/, c-var-2/) contains a problem related to the export of symbols (functions or variables).
In each subdirectory, run the make command, identify the issue, and edit the necessary files to resolve it.
If you're having difficulties solving this exercise, go through this reading material.
Task: Maximum Computation in Assembly with 64-bit C Call
Navigate to drills/tasks/max-c-calls-x64/support and open main.c.
In this subdirectory you have to implement the maximum calculation in assembly language for a 64-bit system. Start from the program from exercises 4 and 5 and adapt it so that it builds and runs as a 64-bit program.
TIP: See https://en.wikipedia.org/wiki/X86_calling_conventions.
The first thing to note is that on the x64 architecture the registers are 8 bytes in size and have different names than the 32-bit ones. The traditional registers are extended (eax becomes rax, ebx becomes rbx, etc.) and there are new ones: r8-r15.
Also, on the x64 architecture the first parameters are no longer passed on the stack, but in registers. The first 3 parameters are placed, in order, in the rdi, rsi and rdx registers.
This is not a uniformly adopted convention. It is valid on Linux (the System V AMD64 convention); on Windows other registers are used to pass the parameters of a function.
The calling convention requires that, for functions with a variable number of arguments, the rax register be set to the number of vector registers used to pass arguments. printf() is a function with a variable number of arguments, and unless you pass arguments in vector registers (for example, floating-point values) rather than in the general-purpose registers mentioned in the previous paragraph, you must set rax = 0 before calling it.
Task: Bonus: Maximum Calculation in C with Call from Assembly - 64 Bits
Enter the directory drills/tasks/max-assembly-calls-x64/support and implement the maximum calculation in C with a call from assembly language on a 64-bit system.
Start from the program used in drills/tasks/max-assembly-calls, ensuring it runs on a 64-bit system.
Follow the instructions from the previous exercise and pay attention to the order of parameters.
C - Assembly Interaction: Memory Perspective
Considering that assembly language poses challenges both in reading and in developing code, the general trend is to migrate towards high-level languages (which are much easier to read and provide a more user-friendly API). However, there are still situations where, for optimization reasons, small assembly routines are used and integrated into the high-level language module.
In this laboratory, we will explore how assembly modules can be integrated into C programs and vice versa.
Using Assembly Procedures in C Functions
For a C program to be executed, it must be translated into the machine code of the processor; this is the task of a compiler. Since this compiled code is not always optimal, in some cases it is preferable to replace portions of code written in C with portions of assembly code that do the same thing, but with better performance.
Declaration of the Procedure
In order to ensure that the assembly procedure and the C module are properly combined and compatible, the following steps must be followed:
declare the procedure label as global, using the global directive. In addition, any data that will be used by the procedure must be declared as global.
use the extern directive to declare as external the procedures and global data that are defined in another module.
Calling C Functions from Assembly Procedures
In most cases, calling routines or functions from the standard C library in an assembly language program is a much more complex operation than vice versa.
Take the example of calling the printf() function from an assembly language program:
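global main
extern printf

section .data
    text db "291 is the best!", 10, 0
    strformat db "%s", 0

section .text
main:
    push dword text         ; second argument: the string to print
    push dword strformat    ; first argument: the format string
    call printf
    add esp, 8              ; the caller removes the two 4-byte arguments
    xor eax, eax            ; return 0 from main
    ret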
Note that the procedure is declared as global and is called main, the starting point of any C program.
Since in C the parameters are pushed on the stack in reverse order, the address of the string is pushed first, followed by the address of the format string.
The C function can be called afterwards, but the caller must restore the stack after the function returns.
When linking the assembly code, the standard C library (or the library containing the functions you use) must be included.
C - Assembly Interaction: Stack
Setting the Stack
When entering a procedure, it is necessary to set up a stack frame through which the parameters are accessed. Of course, if the procedure does not receive parameters, this step is not necessary. To set up the stack frame, the following code must be included:
push ebp
mov ebp, esp
The ebp register can then be used as a base for indexing into the stack and should not be altered during the procedure.
Passing Parameters from C to the Assembly Procedure
C programs send parameters to assembly procedures using the stack. Consider the following C program sequence:
#include <stdio.h>

extern int sum(int a, int b); // declare the assembly procedure as external

int main() {
    int a = 5, b = 7;
    int res = sum(a, b); // call the assembly procedure
    printf("%d\n", res); // print the value returned in eax
    return 0;
}
When C executes the call to sum(), it first pushes the arguments on the stack in reverse order, then actually calls the procedure.
Thus, upon entering the procedure body, the top of the stack holds the return address, followed by a and then b.
Since the variables a and b are declared as int values, they will each use one 32-bit word (4 bytes) on the stack.
This method of passing parameters is called passing by value.
The code of the sum() procedure might look like this:
section .text
global sum                  ; declare the procedure label as global

sum:
    push ebp
    mov ebp, esp
    mov eax, [ebp+8]        ; retrieve the first argument
    mov ecx, [ebp+12]       ; retrieve the second argument
    add eax, ecx            ; calculate the sum
    pop ebp
    ret
It is interesting to note several things.
First, the assembly code places the return value of the procedure in the eax register, as the calling convention requires.
Second, the ret instruction is sufficient to exit the procedure, because the C compiler generates the caller code that takes care of the rest, such as removing the parameters from the stack.