Lab 2 - Library Perspective
Task: Common Functions
Enter the chapters/software-stack/libc/drills/tasks/common-functions/ folder, run make skels, then enter support/.
Go through the practice items below.
- Update os_string.c and os_string.h to make available the os_strcat() function that performs the same string concatenation as strcat() from libc. Check your implementation by running make check in support/tests/. If some of the tests fail, start debugging from the file that calls os_strcat(): test.c.
- Update the main_printf.c file to use the implementation of sprintf() to collect information to be printed inside a buffer. Call the write() function to print the information. The printf() function will no longer be called. This results in a single write() system call.
  Using previously implemented functions allows us to write new programs more efficiently. These functions provide us with extensive features that we use in our programs.
- Update the putchar() function in main_printf.c to implement a buffered functionality of printf(). Characters passed via the putchar() call will be stored in a predefined static global buffer. The write() call will be invoked when a newline is encountered or when the buffer is full. This results in a reduced number of write system calls. Use strace to confirm the reduction of the number of write system calls.
- Update the main_printf.c file to also feature a flush() function that forces the flushing of the static global buffer and a write system call. Make calls to printf() and flush() to validate the implementation. Use strace to inspect the write() system calls invoked by printf() and flush().
If you're having difficulties solving this exercise, go through this reading material.
Task: Libraries and libc
Enter the chapters/software-stack/libc/libc/drills/tasks/ folder, run make skels, then enter support/.
Now go through the practice items below.
- Use the malloc() and free() functions in the memory.c program. Make your own use of the allocated memory.
  It's very easy to use memory management functions with the libc. The alternative (without the libc) would be more cumbersome.
- Use different values for malloc(), i.e. the allocation size. Use strace to check the system calls invoked by malloc() and free(). You'll see that, depending on the size, the brk() or mmap() / munmap() system calls are invoked. And for certain calls to malloc() / free(), no system call happens. You'll find more about them in the Data chapter.
- Create your own C program with calls to the standard C library in vendetta.c. Be as creative as you can about the types of functions being called.
- Inside the vendetta.c file make a call open("a.txt", O_RDWR | O_CREAT, 0644) to open / create the a.txt file. Make sure you include all required headers. Check the system call being made.
- Make an fopen() call with the proper arguments that gets as close as possible to the open() call, i.e. the system call arguments are as close as possible.
- Inside the vendetta.c file make a call to the sin() function (for sine). Compute sin(0) and sin(PI/2).
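As a starting point for the allocation-size item above, here is a minimal sketch in C. The helper name alloc_touch_free() is hypothetical and not part of the lab skeleton. The sketch assumes glibc, where small requests are typically served from the heap (grown with brk()) and large ones through mmap(); the switch-over point is a tunable threshold, commonly around 128 KiB by default, so treat that value as an assumption and confirm it with strace on your system.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate `size` bytes, write to them, free them.
 * Returns 0 on success and -1 if the allocation fails.
 */
int alloc_touch_free(size_t size)
{
	char *p = malloc(size);

	if (p == NULL)
		return -1;

	/* Make real use of the memory so the pages are actually touched. */
	memset(p, 0xAB, size);

	free(p);
	return 0;
}
```

Calling the helper from main() with a small size (for example 64) and a large one (for example 1 << 20) and running the program under strace should show different system calls for the two cases, and possibly none at all for repeated small allocations.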
If you're having difficulties solving this exercise, go through this reading material.
Task: High-Level Languages
Enter the chapters/software-stack/high-level-languages/drills/tasks/high-level-lang/ folder, run make skels, then enter support/.
Then go through the practice items below.
- Use make to create the hello executable from the hello.go file (a Go "Hello, World!"-printing program). Use ltrace and strace to compute the number of library calls and system calls. Use perf to measure the running time.
  Compare the values with those from the "Hello, World!"-printing programs in C and Python.
- Create a "Hello, World!"-printing program in a programming language of your choice (other than C, Python and Go). Find the values above (library calls, system calls and running time).
- Create programs in C, Python and Go that compute the N-th Fibonacci number. N is passed as a command-line argument. Run the checker (make check in the high-level-lang/solution/tests/ folder) to check your results.
  Use ltrace and strace to compute the number of library calls and system calls. Use perf to measure the running time.
  Compare the values of the three programs.
- Create programs in C, Python and Go that copy a source file into a destination file. Both files are passed as the two command-line arguments for the program. Run the checker (make check in the high-level-lang/support/tests/ folder) to check your results.
  Sample run:
student@so:~/.../solution/tests/$ make check
make -C ../src
make[1]: Entering directory '/media/teo/1TB/Poli/Asistent/SO/operating-systems/chapters/software-stack/high-level-languages/drills/tasks/high-level-lang/solution/src'
go build -ldflags '-linkmode external -extldflags "-dynamic"' hello.go
cc -z lazy fibo.c -o fibo
go build -o fibo_go -ldflags '-linkmode external -extldflags "-dynamic"' fibo.go
cc -z lazy copy.c -o copy
go build -o copy_go -ldflags '-linkmode external -extldflags "-dynamic"' copy.go
make[1]: Leaving directory '/media/teo/1TB/Poli/Asistent/SO/operating-systems/chapters/software-stack/high-level-languages/drills/tasks/high-level-lang/solution/src'
Fibonacci [C] -- fibo(10) == 55 -- PASSED
Fibonacci [C] -- fibo( 5) == 5 -- PASSED
Fibonacci [C] -- fibo(20) == 6765 -- PASSED
Fibonacci [Python] -- fibo(10) == 55 -- PASSED
Fibonacci [Python] -- fibo( 5) == 5 -- PASSED
Fibonacci [Python] -- fibo(20) == 6765 -- PASSED
Fibonacci [Go] -- fibo(10) == 55 -- PASSED
Fibonacci [Go] -- fibo( 5) == 5 -- PASSED
Fibonacci [Go] -- fibo(20) == 6765 -- PASSED
Copy [C] -- PASSED
Copy [Python] -- PASSED
Copy [Go] -- PASSED
Use ltrace and strace to compute the number of library calls and system calls. Use perf to measure the running time. Use source files of different sizes. Compare the outputs of these commands on the three programs.
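For the Fibonacci item above, the core computation in C could look like the sketch below. It assumes the indexing suggested by the sample checker output (fibo(5) == 5, fibo(10) == 55, fibo(20) == 6765), which corresponds to fibo(0) == 0 and fibo(1) == 1. The function name fibo() and the use of uint64_t are choices made for this illustration, not something the lab prescribes.

```c
#include <stdint.h>

/*
 * Iterative Fibonacci, with fibo(0) == 0 and fibo(1) == 1.
 * A 64-bit unsigned result overflows for n > 93, so larger inputs
 * would need a wider type or an explicit range check.
 */
uint64_t fibo(unsigned int n)
{
	uint64_t prev = 0;
	uint64_t curr = 1;

	while (n-- > 0) {
		uint64_t next = prev + curr;

		prev = curr;
		curr = next;
	}

	return prev;
}
```

A complete program would parse N from argv[1] (for instance with strtoul()) and print the result. The iterative form keeps the number of library calls small, which makes the ltrace and strace comparison with Python and Go easier to read.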
If you're having difficulties solving this exercise, go through this reading material.
Task: App Investigation
Enter the chapters/software-stack/applications/drills/tasks/app-investigation/support/
folder and go through the practice items below.
- Select a binary executable application and a scripted application.
- Use ldd on the two applications. Notice the resulting messages and explain the results.
- Use ltrace and strace on the two applications. Follow the library calls and the system calls done by each application.
- Check to see whether there are statically-linked application executables in the system. The file command tells if the file passed as argument is a statically-linked executable. If you can't find one, install the busybox-static package.
- Look into what busybox is and explain why it's customary to have it as a statically-linked executable.
- Run ldd, nm, strace, ltrace on a statically-linked application executable. Explain the results.
If you're having difficulties solving this exercise, go through this reading material.
Guide: Statically-linked and Dynamically-linked Libraries
Libraries can be statically-linked or dynamically-linked, creating statically-linked executables and dynamically-linked executables. Typically, the executables found in modern operating systems are dynamically-linked, given their reduced size and ability to share libraries at runtime.
The chapters/software-stack/libraries/guides/static-dynamic/support/
folder stores the implementation of a simple "Hello, World!"-printing program that uses both static and dynamic linking of libraries.
Let's build and run the two executables:
student@os:~/.../static-dynamic/support$ ls
hello.c Makefile
student@os:~/.../static-dynamic/support$ make
cc -Wall -c -o hello.o hello.c
cc hello.o -o hello
cc -static -o hello_static hello.o
student@os:~/.../static-dynamic/support$ ls -lh
total 852K
-rwxrwxr-x 1 razvan razvan 8.2K Aug 2 15:53 hello
-rw-rw-r-- 1 razvan razvan 76 Aug 2 15:51 hello.c
-rw-rw-r-- 1 razvan razvan 1.6K Aug 2 15:53 hello.o
-rwxrwxr-x 1 razvan razvan 827K Aug 2 15:53 hello_static
-rw-rw-r-- 1 razvan razvan 237 Aug 2 15:53 Makefile
student@os:~/.../static-dynamic/support$ ./hello
Hello, World!
student@os:~/.../static-dynamic/support$ ./hello_static
Hello, World!
The two executables (hello and hello_static) behave similarly, despite having vastly different sizes (8.2K vs. 827K, about 100 times larger).
We use nm and ldd to catch differences between the two types of resulting executables:
student@os:~/.../static-dynamic/support$ ldd hello
linux-vdso.so.1 (0x00007ffc8d9b2000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f10d1d88000)
/lib64/ld-linux-x86-64.so.2 (0x00007f10d237b000)
student@os:~/.../static-dynamic/support$ ldd hello_static
not a dynamic executable
student@os:~/.../static-dynamic/support$ nm hello | wc -l
33
student@os:~/.../static-dynamic/support$ nm hello_static | wc -l
1674
The dynamic executable references the dynamically-linked libc library (/lib/x86_64-linux-gnu/libc.so.6), while the statically-linked executable has no such references.
Also, given that the statically-linked executable integrates entire parts of statically-linked libraries, it has many more symbols than the dynamically-linked executable (1674 vs. 33).
We can use strace to see that there are differences in the preparatory system calls for each type of executable.
For the dynamically-linked executable, the dynamically-linked library (/lib/x86_64-linux-gnu/libc.so.6) is opened at runtime:
student@os:~/.../static-dynamic/support$ strace ./hello
execve("./hello", ["./hello"], 0x7ffc409c6640 /* 66 vars */) = 0
brk(NULL) = 0x55a72eda6000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=198014, ...}) = 0
mmap(NULL, 198014, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f3136a41000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\35\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2030928, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3136a3f000
mmap(NULL, 4131552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f3136458000
mprotect(0x7f313663f000, 2097152, PROT_NONE) = 0
mmap(0x7f313683f000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e7000) = 0x7f313683f000
mmap(0x7f3136845000, 15072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3136845000
close(3) = 0
arch_prctl(ARCH_SET_FS, 0x7f3136a404c0) = 0
mprotect(0x7f313683f000, 16384, PROT_READ) = 0
mprotect(0x55a72d1bb000, 4096, PROT_READ) = 0
mprotect(0x7f3136a72000, 4096, PROT_READ) = 0
munmap(0x7f3136a41000, 198014) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 18), ...}) = 0
brk(NULL) = 0x55a72eda6000
brk(0x55a72edc7000) = 0x55a72edc7000
write(1, "Hello, World!\n", 14Hello, World!
) = 14
exit_group(0) = ?
+++ exited with 0 +++
student@os:~/.../static-dynamic/support$ strace ./hello_static
execve("./hello_static", ["./hello_static"], 0x7ffc9fd45400 /* 66 vars */) = 0
brk(NULL) = 0xff8000
brk(0xff91c0) = 0xff91c0
arch_prctl(ARCH_SET_FS, 0xff8880) = 0
uname({sysname="Linux", nodename="yggdrasil", ...}) = 0
readlink("/proc/self/exe", "/home/razvan/school/so/operating"..., 4096) = 116
brk(0x101a1c0) = 0x101a1c0
brk(0x101b000) = 0x101b000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 18), ...}) = 0
write(1, "Hello, World!\n", 14Hello, World!
) = 14
exit_group(0) = ?
+++ exited with 0 +++
Similarly, we can investigate a system executable (/bin/ls) to see that indeed all referenced dynamically-linked libraries are opened (via the openat system call) at runtime:
student@os:~/.../static-dynamic/support$ ldd $(which ls)
linux-vdso.so.1 (0x00007ffc3bdf3000)
libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f092bd88000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f092b997000)
libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f092b726000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f092b522000)
/lib64/ld-linux-x86-64.so.2 (0x00007f092c1d2000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f092b303000)
student@os:~/.../static-dynamic/support$ strace -e openat ls
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libpcre.so.3", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/proc/filesystems", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, ".", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
community docs _index.html search.md
+++ exited with 0 +++
Common Functions
By using wrapper calls, we are able to write our programs in C. However, we still need to implement common functions for string management, working with I/O, working with memory.
The simple approach is to implement these functions (printf() or strcpy() or malloc()) once in a C source code file and then reuse them when needed.
This saves us time (we don't have to reimplement them) and allows us to constantly improve a single implementation;
there will only be one implementation that we update to increase its safety, efficiency or performance.
Go to chapters/software-stack/libc/drills/tasks/common-functions/ and run make skels.
The support/ folder stores the implementation of string management functions, in os_string.c and os_string.h, and of printing functions, in printf.c and printf.h.
The printf() implementation is this one.
There are two programs: main_string.c showcases string management functions, main_printf.c showcases the printf() function.
main_string.c depends on the os_string.h and os_string.c files that implement the os_strlen() and os_strcpy() functions.
We print messages using the write() system call wrapper implemented in syscall.s.
Let's build and run the program:
student@os:~/.../common-functions/support/src$ make main_string
gcc -fno-PIC -fno-stack-protector -c -o main_string.o main_string.c
gcc -fno-PIC -fno-stack-protector -c -o os_string.o os_string.c
nasm -f elf64 -o syscall.o syscall.s
gcc -nostdlib -no-pie -Wl,--entry=main -Wl,--build-id=none main_string.o os_string.o syscall.o -o main_string
student@os:~/.../common-functions/support/src$ ./main_string
Destination string is: warhammer40k
student@os:~/.../common-functions/support/src$ strace ./main_string
execve("./main_string", ["./main_string"], 0x7ffd544d0a70 /* 63 vars */) = 0
write(1, "Destination string is: ", 23Destination string is: ) = 23
write(1, "warhammer40k\n", 13warhammer40k
) = 13
exit(0) = ?
+++ exited with 0 +++
When using strace we see that only the write() system call wrapper triggers a system call.
There are no system calls triggered by os_strlen() and os_strcpy(), as can be seen in their implementation.
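To illustrate why string functions of this kind need no system calls, here is a minimal sketch of how such functions can be written. The names sketch_strlen() and sketch_strcpy() are hypothetical; the lab's actual code lives in os_string.c and may differ. Both functions only read and write memory that the program already owns, so the kernel is never involved.

```c
#include <stddef.h>

/* Hypothetical stand-in for os_strlen(): count bytes up to the NUL. */
size_t sketch_strlen(const char *s)
{
	size_t n = 0;

	/* Walk memory until the terminating NUL byte is found. */
	while (s[n] != '\0')
		n++;

	return n;
}

/* Hypothetical stand-in for os_strcpy(): copy src, including the NUL. */
char *sketch_strcpy(char *dest, const char *src)
{
	size_t i = 0;

	/* The caller must make sure dest is large enough for src. */
	do {
		dest[i] = src[i];
	} while (src[i++] != '\0');

	return dest;
}
```

Like strcpy() from libc, the copy sketch performs no bounds checking, so the destination buffer size is the caller's responsibility.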
In addition, main_printf.c depends on the printf.h and printf.c files that implement the printf() function.
There is a requirement to implement the _putchar() function;
we implement it in the main_printf.c file using the write() system call wrapper.
The main() function in the main_printf.c file contains all the string and printing calls.
printf() offers a more powerful printing interface, allowing us to print addresses and integers.
Let's build and run the program:
student@os:~/.../common-functions/support$ make main_printf
gcc -fno-PIC -fno-stack-protector -c -o main_printf.o main_printf.c
gcc -fno-PIC -fno-stack-protector -c -o printf.o printf.c
gcc -no-pie main_printf.o printf.o syscall.o -o main_printf
student@os:~/.../common-functions/support$ ./main_printf
[before] src is at 00000000004026A0, len is 12, content: "warhammer40k"
[before] dest is at 0000000000603000, len is 0, content: ""
copying src to dest
[after] src is at 00000000004026A0, len is 12, content: "warhammer40k"
[after] dest is at 0000000000603000, len is 12, content: "warhammer40k"
student@os:~/.../common-functions/support$ strace ./main_printf
[...]
write(1, "[", 1[) = 1
write(1, "b", 1b) = 1
write(1, "e", 1e) = 1
write(1, "f", 1f) = 1
write(1, "o", 1o) = 1
write(1, "r", 1r) = 1
write(1, "e", 1e) = 1
write(1, "]", 1]) = 1
[...]
We see that we have greater printing flexibility with the printf() function.
However, one downside of the current implementation is that it makes a system call for each character.
This is inefficient and could be improved by printing a whole string.
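One way to reduce the number of system calls is to collect characters in a buffer and hand them to write() in bulk. The sketch below is an illustration under stated assumptions, not the lab's reference solution: the names buf_putchar(), buf_flush() and buf_pending(), as well as the 128-byte buffer size, are hypothetical, and it uses the libc write() wrapper from unistd.h rather than the wrapper in syscall.s.

```c
#include <stddef.h>
#include <unistd.h>

#define OUT_BUF_SIZE 128	/* assumed size, chosen for illustration */

static char out_buf[OUT_BUF_SIZE];
static size_t out_len;

/*
 * Send all pending bytes to standard output, usually with a single
 * write() call. Returns the number of bytes flushed, or -1 on error.
 */
long buf_flush(void)
{
	size_t done = 0;

	while (done < out_len) {
		ssize_t n = write(STDOUT_FILENO, out_buf + done, out_len - done);

		if (n < 0) {
			/* Drop pending data on error to keep the sketch simple. */
			out_len = 0;
			return -1;
		}
		done += (size_t)n;
	}

	out_len = 0;
	return (long)done;
}

/* Store one character; flush on newline or when the buffer is full. */
void buf_putchar(char c)
{
	out_buf[out_len++] = c;
	if (c == '\n' || out_len == OUT_BUF_SIZE)
		buf_flush();
}

/* Number of characters waiting in the buffer. */
size_t buf_pending(void)
{
	return out_len;
}
```

With this approach a full line such as "copying src to dest\n" results in one write() system call instead of one call per character, which can be confirmed with strace.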
Libraries and libc
Once we have common functions implemented, we can reuse them at any time.
The main unit for software reusability is the library.
In short, a library is shared machine code that can be linked against different software components.
Each time we want to use the printf() function or the strlen() function, we don't need to reimplement them.
We also don't need to use existing source code files, rebuild them and reuse them.
We (re)use existing machine code in libraries.
A library is a collection of object files that export given data structures and functions to be used by other programs. We create a program, we compile and then we link it against the library for all the features it provides.
The most important library in modern operating systems is the standard C library, also called libc.
This is the library providing system call wrappers and basic functionality for input-output, string management, memory management.
By default, a program is always linked with the standard C library.
In the examples above, we've explicitly disabled the use of the standard C library with the help of the -nostdlib linker option.
By using the standard C library, it's much easier to create new programs. You call existing functionality in the library and implement only features particular to your program.
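To make the idea of a system call wrapper more concrete, the sketch below issues the same write request in two ways: through the dedicated libc wrapper write(), and through the generic syscall() function with the SYS_write number. The function names wrapped_write() and raw_write() are hypothetical. The sketch assumes Linux with glibc; note that syscall() is itself provided by libc, so this is only a comparison of two libc entry points, not a libc-free program.

```c
#include <sys/syscall.h>
#include <unistd.h>

/* Write to standard output through the dedicated libc wrapper. */
ssize_t wrapped_write(const char *msg, size_t len)
{
	return write(STDOUT_FILENO, msg, len);
}

/* Issue the same request by system call number (Linux-specific). */
ssize_t raw_write(const char *msg, size_t len)
{
	return syscall(SYS_write, STDOUT_FILENO, msg, len);
}
```

Under strace, both functions should appear as the same write system call, which shows that the wrapper mainly packages arguments and error handling around the kernel interface.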
The chapters/software-stack/libc/drills/tasks/libc/support/ folder stores the implementation of programs using the standard C library: hello.c, main_string.c and main_printf.c.
These programs are almost identical to those used in the past sections:
- hello.c is similar to the programs in chapters/software-stack/system-calls/drills/tasks/basic-syscall/solution/ and chapters/software-stack/system-calls/drills/tasks/syscall-wrapper/solution/
- main_string.c and main_printf.c are similar to the programs in chapters/software-stack/libc/drills/tasks/common-functions/solution/
Let's build and run them:
student@os:~/.../libc/support$ ls
hello hello.c hello.o main_printf main_printf.c main_printf.o main_string main_string.c main_string.o Makefile
student@os:~/.../libc/support$ make clean
rm -f hello hello.o
rm -f main_printf main_printf.o
rm -f main_string main_string.o
student@os:~/.../libc/support$ ls
hello.c main_printf.c main_string.c Makefile
student@os:~/.../libc/support$ make
cc -Wall -c -o hello.o hello.c
cc -static hello.o -o hello
cc -Wall -c -o main_printf.o main_printf.c
cc -static main_printf.o -o main_printf
cc -Wall -c -o main_string.o main_string.c
cc -static main_string.o -o main_string
student@os:~/.../libc/support$ ls
hello hello.c hello.o main_printf main_printf.c main_printf.o main_string main_string.c main_string.o Makefile
student@os:~/.../libc/support$ ./hello
Hello, world!
Bye, world!
aaa
aaa
^C
student@os:~/.../libc/support$ ./main_string
Destination string is: warhammer40k
student@os:~/.../libc/support$ ./main_printf
[before] src is at 0x492308, len is 12, content: "warhammer40k"
[before] dest is at 0x6bb340, len is 0, content: ""
copying src to dest
[after] src is at 0x492308, len is 12, content: "warhammer40k"
[after] dest is at 0x6bb340, len is 12, content: "warhammer40k"
abc
The behavior / output is similar to the ones in the previous sections:
student@os:~/.../libc/support$ ../../solution/basic-syscall/hello-nasm
Hello, world!
Bye, world!
aaa
aaa
^C
student@os:~/.../libc/support$ ../../solution/common-functions/main_string
Destination string is: warhammer40k
student@os:~/.../libc/support$ ../../solution/common-functions/main_printf
[before] src is at 0000000000402680, len is 12, content: "warhammer40k"
[before] dest is at 0000000000604000, len is 0, content: ""
copying src to dest
[after] src is at 0000000000402680, len is 12, content: "warhammer40k"
[after] dest is at 0000000000604000, len is 12, content: "warhammer40k"
abc
We can inspect the system calls made to check the similarities.
For example, for the main_printf program we get the outputs:
student@os:~/.../libc/support$ strace ./main_printf
execve("./main_printf", ["./main_printf"], 0x7fff7b38c240 /* 66 vars */) = 0
brk(NULL) = 0x15af000
brk(0x15b01c0) = 0x15b01c0
arch_prctl(ARCH_SET_FS, 0x15af880) = 0
uname({sysname="Linux", nodename="[...]", ...}) = 0
readlink("/proc/self/exe", "[...]/operating"..., 4096) = 105
brk(0x15d11c0) = 0x15d11c0
brk(0x15d2000) = 0x15d2000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 18), ...}) = 0
write(1, "[before] src is at 0x492308, len"..., 64[before] src is at 0x492308, len is 12, content: "warhammer40k"
) = 64
write(1, "[before] dest is at 0x6bb340, le"..., 52[before] dest is at 0x6bb340, len is 0, content: ""
) = 52
write(1, "copying src to dest\n", 20copying src to dest
) = 20
write(1, "[after] src is at 0x492308, len "..., 63[after] src is at 0x492308, len is 12, content: "warhammer40k"
) = 63
write(1, "[after] dest is at 0x6bb340, len"..., 64[after] dest is at 0x6bb340, len is 12, content: "warhammer40k"
) = 64
write(1, "ab", 2ab) = 2
write(1, "c\n", 2c
) = 2
exit_group(0) = ?
+++ exited with 0 +++
student@os:~/.../libc/support$ strace ../../solution/common-functions/main_printf
execve("../../solution/common-functions/main_printf", ["../../solution/common-functions/"...], 0x7ffe204eec00 /* 66 vars */) = 0
write(1, "[before] src is at 0000000000402"..., 72[before] src is at 0000000000402680, len is 12, content: "warhammer40k"
) = 72
write(1, "[before] dest is at 000000000060"..., 60[before] dest is at 0000000000604000, len is 0, content: ""
) = 60
write(1, "copying src to dest\n", 20copying src to dest
) = 20
write(1, "[after] src is at 00000000004026"..., 71[after] src is at 0000000000402680, len is 12, content: "warhammer40k"
) = 71
write(1, "[after] dest is at 0000000000604"..., 72[after] dest is at 0000000000604000, len is 12, content: "warhammer40k"
) = 72
write(1, "ab", 2ab) = 2
write(1, "c\n", 2c
) = 2
exit(0) = ?
+++ exited with 0 +++
The output is similar, with differences at the beginning and the end of the system call trace.
In the case of the libc-built program, a series of additional system calls (brk, arch_prctl, uname etc.) are made.
Also, there is an implicit call to exit_group instead of an explicit one to exit in the non-libc case.
These are initialization and cleanup routines that are implicitly added when using the standard C library.
They are generally used for setting and cleaning up the stack, environment variables and other pieces of information required by the program or the standard C library itself.
We could argue that the initialization steps incur overhead, and that's a downside of using the standard C library. However, these initialization steps are required for almost all programs. And, given that almost all programs make use of the basic features of the standard C library, libc is almost always used. We can say the above were exceptions to the rule, where we didn't make use of the standard C library.
Summarizing, the advantages and disadvantages of using the standard C library are:
- (+) easier development: do calls to existing functions already implemented in the standard C library; default build and link flags
- (+) portability: if the system provides a standard C library, one calls the library functions that will then interact with the lower-layer API
- (+) implicit initialization and cleanup: no need for you to explicitly create them
- (-) usually larger (static) executables
- (-) a level of overhead as the standard C library wraps system calls
- (-) potential security issues: a larger set of (potentially vulnerable) functions are presented by the standard C library
High-Level Languages
Using the standard C library (libc) frees the programmer from the cumbersome steps of invoking system calls and reimplementing common features. Still, for improved development time and safety, other programming languages can be used, such as Rust, Python, JavaScript. Most (if not all) of these high-level programming languages still make use of the standard C library, so a call to a function in Python ends up making a call to a function in the standard C library.
The chapters/software-stack/high-level-languages/drills/tasks/high-level-lang/support/ folder stores the implementation of a simple "Hello, World!"-printing program in Python.
We simply invoke the python interpreter to run the program:
student@os:~/.../high-level-lang/support$ python hello.py
Hello, world!
We count the number of functions called from the standard C library and the number of system calls:
student@os:~/.../high-level-lang/support$ ltrace -l 'libc*' python hello.py 2> libc.out
Hello, world!
student@os:~/.../high-level-lang/support$ wc -l libc.out
50469 libc.out
student@os:~/.../high-level-lang/support$ strace python hello.py 2> syscall.out
Hello, world!
student@os:~/.../high-level-lang/support$ wc -l syscall.out
948 syscall.out
The dynamic standard C library (libc.so.6) is a dependency of the Python interpreter (/usr/bin/python3):
student@os:~/.../high-level-lang/support$ ldd /usr/bin/python3
[...]
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa6fd6d0000)
[...]
We can see the complexity of invoking the Python interpreter, resulting in more than 50,000 library calls being made. This means added overhead versus a simple C program. However, this also means faster development in the Python programming language. Each new layer in the software stack simplifies development but adds overhead.
We can use perf to compare the running time of the Python and C "Hello, World!"-printing programs:
student@os:~/.../high-level-lang/support$ sudo perf stat ../static-dynamic/hello
Hello, World!
Performance counter stats for '../static-dynamic/hello':
0.46 msec task-clock # 0.559 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
52 page-faults # 0.114 M/sec
859,341 cycles # 1.882 GHz
713,395 instructions # 0.83 insn per cycle
141,710 branches # 310.393 M/sec
6,208 branch-misses # 4.38% of all branches
0.000816974 seconds time elapsed
0.000872000 seconds user
0.000000000 seconds sys
student@os:~/.../high-level-lang/support$ sudo perf stat python hello.py
Hello, world!
Performance counter stats for 'python hello.py':
69.39 msec task-clock # 0.992 CPUs utilized
2 context-switches # 0.029 K/sec
0 cpu-migrations # 0.000 K/sec
1,115 page-faults # 0.016 M/sec
74,405,125 cycles # 1.072 GHz
84,957,056 instructions # 1.14 insn per cycle
18,574,724 branches # 267.689 M/sec
759,104 branch-misses # 4.09% of all branches
0.069981351 seconds time elapsed
0.054376000 seconds user
0.015536000 seconds sys
We can see that on the raw cost metrics (task clock, cycles, instructions, page faults), the Python program is less efficient than the C program.
The Python code takes about 69 milliseconds, whereas the C code runs in less than 1 millisecond.
When deciding what programming language and what libraries and software components to use, you have to balance requirements for fast development and increased safety (inherent to higher-level programming languages) with requirements for speed or efficiency (common to lower-level programming languages such as C). Newer programming languages such as Go, Rust, D aim to offer the benefits of high-level programming languages while keeping efficiency close to that of the C programming language. Generally, additional software layers (libraries, language environments, interpreters) simplify development but decrease speed and efficiency.
App Investigation
Let's spend some time investigating actual applications residing on the local system. For now, we know that applications are developed using high-level languages and then compiled or interpreted to use the lower-layer interfaces of the software stack, such as the system call API.
Let's enter the chapters/software-stack/applications/drills/tasks/app-investigation/support/ folder and run the get_app_types.sh script:
student@os:~/.../app-investigation/support/$ ./get_app_types.sh
binary apps: 2223
Perl apps: 256
Shell apps: 454
Python apps: 123
Other apps: 27
The script prints the types of the application executables in the system. The output will differ between systems, given each has particular types of applications installed.
We list them by running the command inside the get_app_types.sh script:
student@os:~/.../app-investigation/support/$ find /usr/bin /bin /usr/sbin /sbin -type f -exec file {} \;
[...]
/usr/bin/qpdldecode: ELF 64-bit LSB shared object, x86-64 [...]
/usr/bin/mimeopen: Perl script text executable
[...]
As above, the output will differ between systems.
So, depending on the developers' choice, applications may be:
- compiled into executables, from compiled languages such as C, C++, Go, Rust, D
- developed as scripts, from interpreted languages such as Python, Perl, JavaScript
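The distinction between the two kinds of applications can often be made from the first bytes of the file. The sketch below is a strong simplification of what the file command does, and it is not the logic used by get_app_types.sh: it only checks for the ELF magic number (the byte 0x7f followed by "ELF") and for a "#!" interpreter line. The enum values and the function name classify_stream() are hypothetical.

```c
#include <stdio.h>
#include <string.h>

enum app_kind { APP_ELF, APP_SCRIPT, APP_OTHER };

/*
 * Classify an already opened file by looking at its first bytes.
 * ELF files start with 0x7f 'E' 'L' 'F'; scripts usually start
 * with "#!" followed by the path of their interpreter.
 */
enum app_kind classify_stream(FILE *f)
{
	unsigned char head[4];
	size_t n = fread(head, 1, sizeof(head), f);

	if (n == 4 && memcmp(head, "\177ELF", 4) == 0)
		return APP_ELF;
	if (n >= 2 && head[0] == '#' && head[1] == '!')
		return APP_SCRIPT;

	return APP_OTHER;
}
```

A caller would open a path such as /bin/ls with fopen() in "rb" mode, pass the stream to classify_stream() and close it afterwards. The same ELF magic bytes are visible in the read() call that follows the openat() of libc.so.6 in the strace output of a dynamically-linked executable.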