Laboratory: Adding a system call to Linux
CS 273 (OS), Fall 2020
Note: the following instructions are for a 64-bit x86 kernel, kernel version 5.0.21.
User-level invocation and implementation of system calls
Recall that each process carries out user-level code (written by the programmer) and kernel-level code. Here are the steps that take place when user-level code makes a system call.
User program calls a system-call library function, e.g., fork() or open().
Each system-call library function is implemented as an assembly language call to the operating system, in terms of a system call number __NR_call. For example, __NR_fork has the value 57; __NR_getuid has the value 102.
These system call numbers must be defined for compiling both user-level code (e.g., when defining the C library functions such as fork() and getuid()) and for kernel source code (for implementing those system calls in the kernel).
For user-level code, the numbers are defined in the header file /usr/include/x86_64-linux-gnu/asm/unistd_64.h on your virtual machine's file system. This file is ordinarily accessed by including #include <sys/syscall.h> when compiling user-level C code on your virtual machine.
For kernel code in your Linux source tree in the /usr/src/linux-5.0.21/ directory on your virtual OS's file system, the numbers are defined for kernel computations in the kernel source file arch/x86/entry/syscalls/syscall_64.tbl. This file generates C source files such as arch/x86/include/generated/asm/syscalls_64.h (which only exists in the source tree after compilation).
In spite of the different formats of these three files, note that they all associate the same system call numbers to specific system calls.
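As a quick check of the user-level numbering, the following minimal sketch invokes getuid by number through the C library's syscall() function and compares the result with the ordinary getuid() wrapper. The helper names getuid_by_number and show_getuid_number are made up for this illustration; __NR_getuid comes from <sys/syscall.h>.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>   /* defines __NR_getuid, __NR_fork, ... */
#include <sys/types.h>
#include <unistd.h>        /* syscall(), getuid() */

/* Hypothetical helper (not part of the lab files): invoke getuid
 * by its system call number instead of through the usual wrapper. */
long getuid_by_number(void)
{
    return syscall(__NR_getuid);
}

/* Print the number and both results. On 64-bit x86 the number
 * should match the value given in the text above. */
void show_getuid_number(void)
{
    printf("__NR_getuid = %d\n", __NR_getuid);
    printf("syscall(__NR_getuid) = %ld, getuid() = %ld\n",
           getuid_by_number(), (long) getuid());
}
```

Calling show_getuid_number() from a small main() on the virtual machine should print the same user ID twice.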
Note: As illustrated above, we will use these fonts to distinguish between file path locations:
This font indicates a file in the distributed Linux source code.
This font indicates a "generated" file in the Linux source tree (only exists after using make to build the kernel).
This font indicates a file in a user file system (and not in a kernel tree).
See lib.c for an example of user-level code that creates a new library function dub that invokes the system call dup2 (system call number 33). This uses the C library function syscall() to invoke the system call, within a new function dub(). This file lib.c is a system-call library module for user-level programmers to access your new system call.
The file lib.h is a header file for using the functions defined in your library lib.c. Also, the file trylib.c shows how to use lib.h in a C program.
To compile a user-level program trylib.c that uses your system-call library function defined in lib.c, log into your virtual system, then carry out these steps.
Note: Carry out these steps as an ordinary user on your virtual machine.
Use scp or another program to copy the files ~rab/os/lib.c, ~rab/os/lib.h, and ~rab/os/trylib.c from a link machine to a directory ~/testing under your (unprivileged) account's home directory on your virtual machine.
Spring 2018 notes:
Your virtual OS may not recognize a link-computer name such as rns202-5.cs.stolaf.edu, so you may need to use a numerical IP address such as 162.210.91.22 instead. To determine the numerical IP address of a link machine, you can log in to a link machine (not on your virtual machine) and enter
% arp rns202-5.cs.stolaf.edu
rns202-5.cs.stolaf.edu (162.210.91.22) -- no entry
Here % represents the shell prompt on a link machine, and the second line indicates the output from the arp program, which includes the numerical IP address for that computer rns202-5. (arp is the Address Resolution Protocol program, which attempts to show how domain-style names such as rns202-5.cs.stolaf.edu are translated into numerical IP addresses such as 162.210.91.22.)
Now, on your virtual machine, log into (or switch users to) your unprivileged user account and enter
$ mkdir ~/testing
$ cd ~/testing
$ scp username@ipaddress:~rab/os/lib.h .
$ scp username@ipaddress:~rab/os/lib.c .
$ scp username@ipaddress:~rab/os/trylib.c .
where username is your St. Olaf username (e.g., rab) and ipaddress is the numerical IP address of a link machine (e.g., 162.210.91.22).
Compile your library code.
$ gcc -c lib.c
Here, the character $ represents whatever prompt your (non-root) user receives. (These steps could also be carried out by the root user, of course, but carrying them out with an ordinary user may become important for your system-call project later.)
Compile your test program.
$ gcc -c trylib.c
Link these modules to produce an executable.
$ gcc -o trylib trylib.o lib.o
Run your executable.
$ ./trylib
Notes:
The command ./trylib is shown for running your program, because the directory . may not appear in your ordinary user's path by default, due to security considerations.
Expected behavior: The code trylib.c calls the library function dub2() defined in lib.c, which performs system call number 33 (otherwise known as dup2()). The call dub2(1, 5) thus should return the specified alternate file descriptor 5 for standard output. trylib.c then performs a write() call with that alternate file descriptor, which should print Hello, world! on standard output. The program should print 3 lines of output that reflect these steps.
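The contents of lib.c and trylib.c are not reproduced in this handout. As a rough sketch of the idea only, a wrapper like dub2() could be written with the C library's syscall() function as follows. The function demo_dub2 and its output format are assumptions made for illustration, not the actual course files, and SYS_dup2 is assumed to be available, as it is on 64-bit x86.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>   /* SYS_dup2 (number 33 on 64-bit x86) */
#include <unistd.h>        /* syscall(), write() */

/* Sketch of a library wrapper: performs the dup2 system call by number.
 * Returns the new descriptor on success, or -1 with errno set. */
int dub2(int oldfd, int newfd)
{
    return (int) syscall(SYS_dup2, oldfd, newfd);
}

/* Hypothetical stand-in for a test program: duplicate standard
 * output onto descriptor 5, then write through the duplicate. */
int demo_dub2(void)
{
    const char *msg = "Hello, world!\n";
    int fd = dub2(1, 5);

    printf("dub2(1, 5) returned %d\n", fd);
    fflush(stdout);                      /* keep the output in order */
    if (fd < 0)
        return -1;
    if (write(fd, msg, strlen(msg)) < 0)
        return -1;
    return fd;
}
```

In a layout like the one described above, dub2() would live in lib.c with its declaration in lib.h, and a main() in trylib.c would do the work shown in demo_dub2().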
Kernel-level implementation of system calls
In the kernel sources for our architecture (64-bit x86 processors, kernel version 5.0.21), system call numbers are defined in the file arch/x86/entry/syscalls/syscall_64.tbl. For example, this table specifies that system call number 1 is for write, and number 57 is for fork.
The data file arch/x86/entry/syscalls/syscall_64.tbl is not C source code, but it is used to produce "generated" C source-code files such as arch/x86/include/generated/asm/syscalls_64.h during the process of recompiling the kernel. As noted above, files such as arch/x86/include/generated/asm/syscalls_64.h do not appear in the original source-file tree (e.g., you won't find them in the online kernel source reference).
The user-level macros for system-call numbers such as __NR_fork are not used internally to compile the kernel. However, they do appear in the kernel source code, in paths that typically contain uapi, such as arch/alpha/include/uapi/asm/unistd.h (for the DEC Alpha hardware architecture). UAPI stands for User API, and refers to a system for keeping the system-call numbers in user-level source files (such as /usr/include/x86_64-linux-gnu/asm/unistd_64.h) consistent with the kernel source file arch/x86/entry/syscalls/syscall_64.tbl.
Different architectures may use different system-call numbers for the same system call. For example,
the system-call number for fork is 2 in 32-bit Intel x86 architectures (see arch/x86/entry/syscalls/syscall_32.tbl), whereas
the system-call number for fork is 57 in 64-bit Intel x86 architectures (see arch/x86/entry/syscalls/syscall_64.tbl).
When a system call is performed in a running kernel, the kernel looks up the handler function for that system call using an array data structure called the system call table. (Don't confuse this runtime data structure in main memory with the source file ending in tbl on disk!) The system call number is used as an index into this array to find that handler function.
In our setup, the source code for the system call table is the array named sys_call_table, defined in the source file arch/x86/entry/syscall_64.c as an array (sequence) of function pointers (addresses of functions, with type sys_call_ptr_t defined earlier in that file). That source file uses an #include directive to initialize that array from the file arch/x86/include/generated/asm/syscalls_64.h, which is automatically generated using arch/x86/entry/syscalls/syscall_64.tbl when recompiling the kernel as mentioned above. (Reminder: you can see arch/x86/include/generated/asm/syscalls_64.h on your virtual machine's file system after compiling the kernel, but that file does not appear in the uncompiled Linux sources.)
We can now see why arch/x86/include/generated/asm/syscalls_64.h expresses the relationship between system call numbers and their system calls using a preprocessor macro __SYSCALL_64.
The source file syscall_64.c defines that macro __SYSCALL_64 twice! The first definition produces declarations of all the system-call handler functions (e.g., sys_fork()) that appear in arch/x86/include/generated/asm/syscalls_64.h.
The second definition of the macro __SYSCALL_64 defines a C array initializer for each system-call handler function, assigning that handler's name (a function pointer) to the array element indexed by that system call. For example, one result is the assignment
sys_call_table[57] = sys_fork
The special handler function sys_ni_syscall() is a default system-call handler for any unimplemented system calls, such as 335 (which doesn't appear in arch/x86/include/generated/asm/syscalls_64.h). Observe that sys_ni_syscall is first assigned as the handler for all elements of the array sys_call_table[], before the #include directive that reassigns the correct handler for implemented system calls.
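The two-definition trick can be tried out in ordinary user-level C. The following is a simplified model, not kernel code: the list macro DEMO_SYSCALLS stands in for the generated file syscalls_64.h, and the handler names, the table size, and the lookup function demo_dispatch are all invented for this sketch. It relies on the GCC/Clang range-designator extension ([0 ... N]) to fill in the default handler first; some compilers may warn that later initializers override earlier ones, which is exactly the intended effect here.

```c
#define DEMO_SYSCALL_MAX 5

/* Stand-in for the generated header: one entry per implemented call. */
#define DEMO_SYSCALLS \
    DEMO_SYSCALL(1, demo_write) \
    DEMO_SYSCALL(3, demo_getpid)

typedef long (*demo_call_ptr_t)(void);

/* Default handler for every unimplemented number. */
static long demo_ni_syscall(void) { return -1; }

/* First definition: expand each entry into a handler declaration. */
#define DEMO_SYSCALL(nr, sym) static long sym(void);
DEMO_SYSCALLS
#undef DEMO_SYSCALL

/* Second definition: expand each entry into an array initializer. */
#define DEMO_SYSCALL(nr, sym) [nr] = sym,
static const demo_call_ptr_t demo_call_table[DEMO_SYSCALL_MAX + 1] = {
    [0 ... DEMO_SYSCALL_MAX] = demo_ni_syscall,  /* default first */
    DEMO_SYSCALLS                                /* then overrides */
};
#undef DEMO_SYSCALL

/* Toy handlers with recognizable return values. */
static long demo_write(void)  { return 100; }
static long demo_getpid(void) { return 300; }

/* Look up a handler by number and call it; out-of-range numbers
 * are rejected before indexing the table. */
long demo_dispatch(int nr)
{
    if (nr < 0 || nr > DEMO_SYSCALL_MAX)
        return -1;
    return demo_call_table[nr]();
}
```

In this model, demo_dispatch(1) and demo_dispatch(3) reach their own handlers, while any other number in range falls through to demo_ni_syscall.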
Adding a system call to the kernel
Write a spec for your new system call. This forces you to make decisions about the system call name, arguments, etc., and can be used to describe your system call in your project report. The return value should be an integer, with the value -1 indicating an error condition, as with other system calls. We will use the name rab_mycall for this example; you can use your own initials instead of rab. (Including your initials or username as part of the system call name will make it easier to avoid naming conflicts and easier to identify which calls are new.)
Determine a system call number for your new call. Computationally, you can use any number that doesn't appear in arch/x86/entry/syscalls/syscall_64.tbl and is less than the generated value of the macro __NR_syscall_max. For this class you should choose the first unused system-call number in arch/x86/entry/syscalls/syscall_64.tbl, which is 335 for our setup in the case of your first new system call.
Choose a name for your handler function, which will carry out the steps of your system call. In this example, we will choose the name __x64_sys_rab_mycall.
The prefix __x64_sys is required for system-call handlers in Linux version 5.0.21.
For this class, follow this __x64_sys_uname_ pattern for your handler names, where uname is your username, to make it easier to identify your handlers.
Add a new line near the end of arch/x86/entry/syscalls/syscall_64.tbl to specify the information for your new system call. In our example, we add a line
335 common rab_mycall __x64_sys_rab_mycall
The second column indicates machine-level calling conventions for your system-call handler, i.e., how parameters are passed. Technically, this is called the ABI, or Application Binary Interface choice. For our assumed 64-bit setup, common is the usual ABI; for a 32-bit setup, the only ABI choice available is i386.
Add the definition of your handler function (e.g., __x64_sys_rab_mycall) to the kernel sources.
For this example, we will define rab_mycall to be a clone of Linux's getpid system call.
The system-call index page provides a list of all system calls in the (unmodified) Linux 5.0.21 source. This indicates that
getpid is defined in the source file kernel/sys.c at line 891, in the following lines:
SYSCALL_DEFINE0(getpid)
{
	return task_tgid_vnr(current);
}
Here, SYSCALL_DEFINE0(getpid) is a call of a preprocessor macro that produces the function header for the function sys_getpid. The source code uses that macro SYSCALL_DEFINE0 because the system call getpid has no arguments.
Note: The source code uses macros SYSCALL_DEFINE1, SYSCALL_DEFINE2, etc., for system calls that require arguments. For example, using the system-call index we see the 2-argument system call getpriority is defined on line 266 of that same file kernel/sys.c, using the preprocessor macro call SYSCALL_DEFINE2(getpriority, int, which, int, who) (together with a much larger function body than getpid() above). Here,
getpriority is the name of the system call whose handler (sys_getpriority()) is being defined;
the second and third arguments of SYSCALL_DEFINE2 specify that sys_getpriority's first argument is named which and has type int; and
the fourth and fifth arguments of SYSCALL_DEFINE2 specify that sys_getpriority's second argument is named who and has type int.
Insert the following lines just before or just after the definition of getpid:
SYSCALL_DEFINE0(rab_mycall)
{
	return task_tgid_vnr(current);
}
Note: We are adding this handler function definition to an existing source file. It is also possible to define your system calls in a separate new file of source-code, but making and testing minimal modifications is best when trying something new, because it's easier to isolate sources of error if something goes wrong (incremental development).
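To get a feel for how the SYSCALL_DEFINEn macros build a function header from their arguments, here is a deliberately simplified user-level model. The DEMO_ macros and the handler names below are invented for this sketch; the real kernel macros generate additional wrapper functions and metadata that are omitted here, so treat this only as an illustration of the argument pattern (call name first, then type and name pairs).

```c
/* Simplified stand-ins for the kernel's SYSCALL_DEFINEn macros:
 * each one pastes the call name onto a prefix and lists the
 * (type, name) pairs as ordinary parameters. */
#define DEMO_DEFINE0(name) \
    long demo_sys_##name(void)
#define DEMO_DEFINE2(name, t1, a1, t2, a2) \
    long demo_sys_##name(t1 a1, t2 a2)

/* Expands to: long demo_sys_answer(void) { ... } */
DEMO_DEFINE0(answer)
{
    return 42;
}

/* Expands to: long demo_sys_add(int which, int who) { ... } */
DEMO_DEFINE2(add, int, which, int, who)
{
    return which + who;
}
```

After preprocessing, demo_sys_answer and demo_sys_add are ordinary C functions that can be called directly, which is the same relationship the text describes between SYSCALL_DEFINE2(getpriority, ...) and its handler.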
Recompile the kernel (in the top-level directory /usr/src/5.0.21), which creates a new kernel linux that also implements your new system call.
Note: If you don't need to make a configuration change, you can skip configuration and potentially reduce recompile time to a fraction of a complete recompile.
However, modifying files that are #included in a lot of other files, such as syscall_64.tbl and files generated by it, will also lengthen recompiles. (But modifying a source file involving a system call that has already been entered in syscall_64.tbl typically leads to a much shorter recompile.)
Don't forget to install your new kernel if necessary. (This appears to happen automatically in Spring 2018.)
Boot your new kernel, and log in to a user account. The following three steps are performed within this login session.
In your user-level library source (e.g., lib.c) described above, make a new library function for the new system call (e.g., rab_mycall()), using the same system call number (e.g., __NR_rab_mycall) as you determined for the kernel. Also, add a declaration of your library function to the header file for that library, e.g., lib.h.
Write a test program (named, e.g., trylib.c) that calls your new system call, so you can determine whether it is working properly.
For the example, the new system call rab_mycall should behave exactly like the existing system call getpid, so a good test would be to print the results of both system calls.
Compile your library and test program, and link them to create an executable, as described above. Run your test program in a terminal window on your virtual machine, and check for the desired behavior.
For the example, verify whether the system calls getpid and rab_mycall return the same number (PID for the trylib process), as we expect. For an additional, more thorough test, you could add a call to sleep(15) (check the manual page...) to trylib and check that the ps program reports the same PID value, by running the following command in another terminal window within your running VM during the sleep() delay in trylib:
$ ps auxww | grep trylib
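A possible shape for the library function and the comparison test is sketched below. It assumes the number 335 chosen earlier; the helpers matches_getpid and run_rab_mycall_test are made-up names. Run the part that actually invokes rab_mycall() only on the virtual machine booted with your modified kernel: on other kernels, number 335 may be unassigned or may belong to a different system call.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>        /* syscall(), getpid() */

/* Must match the number entered in syscall_64.tbl (335 in the example). */
#define __NR_rab_mycall 335

/* Library function for lib.c; its declaration would go in lib.h. */
long rab_mycall(void)
{
    return syscall(__NR_rab_mycall);
}

/* Hypothetical helper: print a value next to this process's PID and
 * return 1 if they are equal, 0 otherwise. */
int matches_getpid(long result)
{
    long pid = (long) getpid();

    printf("getpid() = %ld, value under test = %ld\n", pid, result);
    return result == pid;
}

/* Hypothetical body for a test program on the modified kernel. */
int run_rab_mycall_test(void)
{
    long result = rab_mycall();
    int saved_errno = errno;   /* save before printf can change it */

    if (result == -1) {
        printf("rab_mycall failed: %s\n", strerror(saved_errno));
        return 0;
    }
    return matches_getpid(result);
}
```

A main() in trylib.c could simply call run_rab_mycall_test() and report whether it returned 1.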
I. Decide names and specs
II. Update header files and the system call table
You only need to specify the first four columns in your new line of arch/x86/entry/syscalls/syscall_64.tbl; the fifth column (for 64-bit handler function names) is copied from the fourth column by default.
Note: Use sudo to edit source files, e.g.,
$ sudo emacs
III. Add source code for your new system call
IV. Build and test
Note on debugging: If anything goes wrong, study the procedure above carefully to determine the stage when the error must have taken place. For instance, if something goes wrong with the example rab_mycall:
If you encounter a linking error when compiling trylib.c, such as the function rab_mycall() not being found, then look for an error involving your user-level library lib.c. You may have forgotten to link in your library lib.o (during step 10), or there may be an error when defining the library function rab_mycall() within lib.c (step 9), etc.
If you encounter a system message that your new system call doesn't exist, an examination of the steps above leads to several possible causes.
The system call number chosen in step 2 and used in steps 4 and 9 enables your user program to access your new system-call handler. Thus, there could be a problem involving the system call number __NR_rab_mycall: did you use the same number (e.g., 335) both within the kernel (in arch/x86/entry/syscalls/syscall_64.tbl) and in your user-level library lib.c? This error could also arise from failing to set up that system-call number in one of these locations.
The line you add to arch/x86/entry/syscalls/syscall_64.tbl within the kernel in step 4 is supposed to connect the system call number (e.g., 335) to your new handler function __x64_sys_rab_mycall, which is defined in step 5 using a macro SYSCALL_DEFINEn(). Check both of these steps to ensure that the handler function was in fact defined and entered in syscall_64.tbl. Don't forget to look for a potential misspelling of "rab_mycall," etc.
It may well be that you did not boot with a new kernel that includes your system call. This could happen if you didn't recompile the kernel (step 6), or forgot to install your new kernel (step 7), or installed it but booted the wrong kernel (step 8). To determine which kernel you are using, try
$ uname -a
which will typically display which version of kernel sources were used (5.0.21) and the time when that kernel's compile finished.
Also, ensure that you are running your test program in a terminal window on your running virtual machine (step 11), not on a link computer or your laptop.
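When a test program reports that the new call failed, it can help to print why. The following sketch (all helper names are invented here) separates the "no such system call" case, which the kernel's default handler reports to user code as -1 with errno set to ENOSYS, from other failures. It also collects the running kernel's release and build string with the uname() library call, which is the same information that uname -a displays.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>   /* uname() */

/* Fill buf with the release and build string of the running kernel.
 * Returns 0 on success, -1 if uname() fails. */
int kernel_description(char *buf, size_t len)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    snprintf(buf, len, "%s %s", u.release, u.version);
    return 0;
}

/* Classify the outcome of a system call: 1 if the kernel reported
 * that the number has no handler (ENOSYS), 0 for anything else.
 * Pass the errno value saved immediately after the call. */
int is_unimplemented(long result, int saved_errno)
{
    return result == -1 && saved_errno == ENOSYS;
}
```

If is_unimplemented() returns 1 for your call, the likely causes are the ones listed above: a mismatched number, a missing table entry, or a kernel that was not rebuilt, installed, or booted.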
TO UPDATE: Adding files to the kernel
TO UPDATE: You can optionally define new system calls and other relevant code in new source files, instead of modifying source files from the Linux 5.0.21 distribution. See Patrick's wiki page for more information.