Laboratory: Adding a system call to Linux
CS 273 (OS), Fall 2020
Note the following instructions are for a 64-bit x86 kernel, kernel version 5.0.21.
User-level invocation and implementation of system calls
Recall that each process carries out user-level code (written by the programmer) and kernel-level code. Here are the steps that take place when user-level code makes a system call.
User program calls a system-call library function, e.g., fork() or open().
Each system-call library function is implemented as an assembly language call to the operating system, in terms of a system call number __NR_call. For example, __NR_fork has the value 57; __NR_getuid has the value 102.
These system call numbers must be defined for compiling both user-level code (e.g., when defining the C library functions such as fork() and getuid()) and for kernel source code (for implementing those system calls in the kernel).
For user-level code, the numbers are defined in the header file /usr/include/x86_64-linux-gnu/asm/unistd_64.h on your virtual machine's file system. This file is ordinarily accessed by including
#include <sys/syscall.h>
when compiling user-level C code on your virtual machine.
For kernel code in your Linux source tree in the /usr/src/linux-5.0.21/ directory on your virtual OS's file system, the numbers are defined for kernel computations in the kernel source file arch/x86/entry/syscalls/syscall_64.tbl. This file generates C source files such as arch/x86/include/generated/asm/syscalls_64.h (which only exists in the source tree after compilation).
In spite of the different formats of these three files, note that they all associate the same system call numbers to specific system calls.
Note: As illustrated above, we will use these fonts to distinguish between file path locations:
This font indicates a file in the distributed Linux source code.
This font indicates a "generated" file in the Linux source tree (only exists after using make to build the kernel).
This font indicates a file in a user file system (and not in a kernel tree).
See lib.c for an example of user-level code that creates a new library function dub that invokes the system call dup2 (system call number 33). This uses the library function syscall() to invoke the system call by number, within a new function dub(). This file lib.c is a system-call library module for user-level programmers to access your new system call.
The file lib.h is a header file for using the functions defined in your library lib.c. Also, the file trylib.c shows how to use lib.h in a C program.
To compile a user-level program trylib.c that uses your system-call library function defined in lib.c, log into your virtual system, then carry out these steps.
Note: Carry out these steps as an ordinary user on your virtual machine.
Use scp or another program to copy the files ~rab/os/lib.c, ~rab/os/lib.h, and ~rab/os/trylib.c from a link machine to a directory ~/testing under your (unprivileged) account's home directory on your virtual machine.
Spring 2018 notes:
Your virtual OS may not recognize a link-computer name such as rns202-5.cs.stolaf.edu, so you may need to use a numerical IP address such as 162.210.91.22 instead. To determine the numerical IP address of a link machine, you can log in to a link machine (not on your virtual machine) and enter
% arp rns202-5.cs.stolaf.edu
rns202-5.cs.stolaf.edu (162.210.91.22) -- no entry
(Here % represents the shell prompt on a link machine, and the second line indicates the output from the arp program, which includes the numerical IP address for that computer rns202-5. arp is the Address Resolution Protocol program, which attempts to show how domain-style names such as rns202-5.cs.stolaf.edu are translated into numerical IP addresses such as 162.210.91.22.)
Now, on your virtual machine, log into (or switch users to) your unprivileged user account and enter
$ mkdir ~/testing
$ cd ~/testing
$ scp username@ipaddress:~rab/os/lib.h .
$ scp username@ipaddress:~rab/os/lib.c .
$ scp username@ipaddress:~rab/os/trylib.c .
where username is your St. Olaf username (e.g., rab) and ipaddress is the numerical IP address of a link machine (e.g., 162.210.91.22).
Compile your library code.
$ gcc -c lib.c
Here, the character $ represents whatever prompt your (non-root) user receives. (These steps could also be carried out by the root user, of course, but carrying them out with an ordinary user may become important for your system-call project later.)
Compile your test program.
$ gcc -c trylib.c
Link these modules to produce an executable.
$ gcc -o trylib trylib.o lib.o
Run your executable.
$ ./trylib
Notes:
The command ./trylib is shown for running your program, because the directory . may not appear in your ordinary user's path by default, due to security considerations.
Expected behavior: The code trylib.c calls the library function dub2() defined in lib.c, which performs system call number 33 (otherwise known as dup2()). The call dub2(1, 5) thus should return the specified alternate file descriptor 5 for standard output. trylib.c then performs a write() call with that alternate file descriptor, which should print Hello, world! on standard output. The program should print 3 lines of output that reflect these steps.
Kernel-level implementation of system calls
In the kernel sources for our architecture (64-bit x86 processors, kernel version 5.0.21), system call numbers are defined in the file arch/x86/entry/syscalls/syscall_64.tbl. For example, this table specifies that system call number 1 is for write, and number 57 is for fork.
The data file arch/x86/entry/syscalls/syscall_64.tbl is not C source code, but it is used to produce "generated" C source-code files such as arch/x86/include/generated/asm/syscalls_64.h during the process of recompiling the kernel. As noted above, files such as arch/x86/include/generated/asm/syscalls_64.h do not appear in the original source-file tree (e.g., you won't find them in the online kernel source reference).
The user-level macros for system-call numbers such as __NR_fork are not used internally to compile the kernel. However, they do appear in the kernel source code, in paths that typically contain uapi, such as arch/alpha/include/uapi/asm/unistd.h (for the DEC Alpha hardware architecture). UAPI stands for User API, and refers to a system for keeping the system-call numbers in user-level source files (such as /usr/include/x86_64-linux-gnu/asm/unistd_64.h) consistent with the kernel source file arch/x86/entry/syscalls/syscall_64.tbl.
Different architectures may use different system-call numbers for the same system call. For example,
the system-call number for fork is 2 in 32-bit Intel x86 architectures (see arch/x86/entry/syscalls/syscall_32.tbl), whereas
the system-call number for fork is 57 in 64-bit Intel x86 architectures (see arch/x86/entry/syscalls/syscall_64.tbl).
When a system call is performed in a running kernel, the kernel looks up the handler function for that system call using an array data structure called the system call table. (Don't confuse this runtime data structure in main memory with the source file on disk whose name ends in tbl!) The system call number is used as an index into this array to find that handler function.
In our setup, the source code for the system call table is the array named sys_call_table, defined in the source file arch/x86/entry/syscall_64.c as an array (sequence) of function pointers (addresses of functions, with type sys_call_ptr_t defined earlier in that file). That source file uses an #include directive to initialize that array from the file arch/x86/include/generated/asm/syscalls_64.h, which is automatically generated from arch/x86/entry/syscalls/syscall_64.tbl when recompiling the kernel, as mentioned above. (Reminder: you can see arch/x86/include/generated/asm/syscalls_64.h on your virtual machine's file system after compiling the kernel, but that file does not appear in the uncompiled Linux sources.)
We can now see why arch/x86/include/generated/asm/syscalls_64.h expresses the relationship between system call numbers and their system calls using a preprocessor macro __SYSCALL_64.
The source file syscall_64.c defines that macro __SYSCALL_64 twice!
The first definition produces declarations of all the system-call handler functions (e.g., sys_fork()) that appear in arch/x86/include/generated/asm/syscalls_64.h.
The second definition of the macro __SYSCALL_64 defines a C array initializer for each system-call handler function, assigning that handler's name (a function pointer) to the array element indexed by that system call. For example, one result is the assignment sys_call_table[57] = sys_fork.
The special handler function sys_ni_syscall() is a default system-call handler for any unimplemented system calls, such as 335 (which doesn't appear in arch/x86/include/generated/asm/syscalls_64.h). Observe that sys_ni_syscall is first assigned as the handler for all elements of the array sys_call_table[], before the #include directive that reassigns the correct handler for implemented system calls.
Adding a system call to the kernel
1. Write a spec for your new system call. This forces you to make decisions about the system call name, arguments, etc., and can be used to describe your system call in your project report. The return value should be an integer, with the value -1 indicating an error condition, as with other system calls. We will use the name rab_mycall for this example; you can use your own initials instead of rab. (Including your initials or username as part of the system call name will make it easier to avoid naming conflicts and easier to identify which calls are new.)
2. Determine a system call number for your new call. Computationally, you can use any number that doesn't appear in arch/x86/entry/syscalls/syscall_64.tbl and is less than the generated value of the macro __NR_syscall_max. For this class you should choose the first unused system-call number in arch/x86/entry/syscalls/syscall_64.tbl, which is 335 for our setup in the case of your first new system call.
3. Choose a name for your handler function, which will carry out the steps of your system call. In this example, we will choose the name __x64_sys_rab_mycall.
The prefix __x64_sys is required for system-call handlers in Linux version 5.0.21.
For this class, follow this __x64_sys_uname_ pattern for your handler names, where uname is your username, to make it easier to identify your handlers.
4. Add a new line near the end of arch/x86/entry/syscalls/syscall_64.tbl to specify the information for your new system call. In our example, we add a line
335 common rab_mycall __x64_sys_rab_mycall
The second column indicates machine-level calling conventions for your system-call handler, i.e., how parameters are passed. Technically, this is called the ABI, or Application Binary Interface choice. For our assumed 64-bit setup, common is the usual ABI; for a 32-bit setup, the only ABI choice available is i386.
5. Add the definition of your handler function (e.g., __x64_sys_rab_mycall) to the kernel sources.
For this example, we will define rab_mycall to be a clone of Linux's getpid system call.
The system-call index page provides a list of all system calls in the (unmodified) Linux 5.0.21 source. This indicates that getpid is defined in the source file kernel/sys.c, in the following lines at line 891:
SYSCALL_DEFINE0(getpid)
{
	return task_tgid_vnr(current);
}
Here, SYSCALL_DEFINE0(getpid) is a call of a preprocessor macro that produces the function header for the function sys_getpid. The source code uses that macro SYSCALL_DEFINE0 because the system call getpid has no arguments.
Note: The source code uses macros SYSCALL_DEFINE1, SYSCALL_DEFINE2, etc., for system calls that require arguments. For example, using the system-call index we see that the 2-argument system call getpriority is defined on line 266 of that same file kernel/sys.c, using the preprocessor macro call
SYSCALL_DEFINE2(getpriority, int, which, int, who)
(together with a much larger function body than getpid() above). Here,
getpriority is the name of the system call whose handler (sys_getpriority()) is being defined;
the second and third arguments of SYSCALL_DEFINE2 specify that sys_getpriority's first argument is named which and has type int; and
the fourth and fifth arguments of SYSCALL_DEFINE2 specify that sys_getpriority's second argument is named who and has type int.
Insert the following lines just before or just after the definition of getpid:
SYSCALL_DEFINE0(rab_mycall)
{
	return task_tgid_vnr(current);
}
Note: We are adding this handler function definition to an existing source file. It is also possible to define your system calls in a separate new file of source-code, but making and testing minimal modifications is best when trying something new, because it's easier to isolate sources of error if something goes wrong (incremental development).
6. Recompile the kernel (in the top-level directory /usr/src/linux-5.0.21), which creates a new kernel linux that also implements your new system call.
Note: If you don't need to make a configuration change, you can skip configuration and potentially reduce recompile time to a fraction of a complete recompile.
However, modifying files that are #included in a lot of other files, such as syscall_64.tbl and files generated by it, will also lengthen recompiles. (But modifying a source file involving a system call that has already been entered in syscall_64.tbl typically leads to a much shorter recompile.)
7. Don't forget to install your new kernel if necessary. (This appears to happen automatically in Spring 2018.)
8. Boot your new kernel, and log in to a user account. The following three steps are performed within this login session.
9. In your user-level library source (e.g., lib.c) described above, make a new library function for the new system call (e.g., rab_mycall()), using the same system call number (e.g., __NR_rab_mycall) as you determined for the kernel. Also, add a declaration of your library function to the header file for that library, e.g., lib.h.
10. Write a test program (named, e.g., trylib.c) that calls your new system call, so you can determine whether it is working properly.
For the example, the new system call rab_mycall should behave exactly like the existing system call getpid, so a good test would be to print the results of both system calls.
11. Compile your library and test program, and link them to create an executable, as described above. Run your test program in a terminal window on your virtual machine, and check for the desired behavior.
For the example, verify whether the system calls getpid and rab_mycall return the same number (PID for the trylib process), as we expect. For an additional, more thorough test, you could add a call to sleep(15) (check the manual page...) to trylib and check that the ps program reports the same PID value, by running the following command in another terminal window within your running VM during the sleep() delay in trylib:
$ ps auxww | grep trylib
The steps above fall into four stages:
I. Decide names and specs
II. Update header files and the system call table
You only need to specify the first four columns in your new line; the fifth column (for 64-bit handler function names) is copied from the fourth column by default.
Note: Use sudo to edit source files, e.g.,
$ sudo emacs
III. Add source code for your new system call
IV. Build and test
Note on debugging: If anything goes wrong, study the procedure above carefully to determine the stage when the error must have taken place. For instance, if something goes wrong with the example rab_mycall:
If you encounter a linking error when compiling trylib.c, such as the function rab_mycall() not being found, then look for an error involving your user-level library lib.c. You may have forgotten to link in your library lib.o (during step 10), or there may be an error when defining the library function rab_mycall() within lib.c (step 9), etc.
If you encounter a system message that your new system call doesn't exist, an examination of the steps above leads to several possible causes.
The system call number chosen in step 2 and used in steps 4 and 9 enables your user program to access your new system-call handler. Thus, there could be a problem involving the system call number __NR_rab_mycall: did you use the same number (e.g., 335) both within the kernel (in arch/x86/entry/syscalls/syscall_64.tbl) and in your user-level library lib.c? This error could also arise from failing to set up that system-call number in one of these locations.
The line you add to arch/x86/entry/syscalls/syscall_64.tbl within the kernel in step 4 is supposed to connect the system call number (e.g., 335) to your new handler function __x64_sys_rab_mycall, which is defined in step 5 using a macro SYSCALL_DEFINEn(). Check both of these steps to ensure that the handler function was in fact defined and entered in syscall_64.tbl. Don't forget to look for a potential misspelling of "rab_mycall," etc.
It may well be that you did not boot with a new kernel that includes your system call. This could happen if you didn't recompile the kernel (step 6), or forgot to install your new kernel (step 7), or installed it but booted the wrong kernel (step 8). To determine which kernel you are using, try
$ uname -a
which will typically display which version of kernel sources were used (5.0.21) and the time when that kernel's compile finished.
Also, ensure that you are running your test program in a terminal window on your running virtual machine (step 11), not on a link computer or your laptop.
TO UPDATE: Adding files to the kernel
TO UPDATE: You can optionally define new system calls and other relevant code in new source files, instead of modifying source files from the Linux 5.0.21 distribution. See Patrick's wiki page for more information.