Operating Systems (CS 273 (OS), Fall 2020)




Operating Systems

CS 273 (OS), Fall 2020

Directory structure of source code

The following are subdirectories of the root directory linux, with examples of their content (last updated for kernel version 5.0.21).

  • Documentation/ -- technical documentation related to implementation and configuration of Linux

  • arch/ -- architecture-dependent code

    • Subdirectories for each supported architecture, including arch/x86 (Intel/AMD x86, covering both 32-bit IA32 and its 64-bit superset; older kernels kept these in separate arch/i386 and arch/x86_64 directories), arch/ia64 (Intel Itanium, a 64-bit architecture), arch/alpha (DEC/Compaq Alpha RISC machine), arch/m68k (Motorola 68000 series), arch/powerpc (PowerPC), arch/sparc (Sun SPARC), etc.
    • Each such subdirectory contains a Makefile, code for creating bootable images for that architecture, assembly code needed by various parts of the kernel, etc.

  • include/ -- header files

    • include/asm-generic -- architecture-level header files that can be shared among architectures. (See arch/x86/include for header files specific to x86 architectures.)
    • include/linux subdirectory contains (over 1000) header files used throughout the Linux kernel source code
    • Each entry in Linux's process table is a process descriptor of type struct task_struct, which is (partially) defined in the crucial header file include/linux/sched.h.

  • kernel/ -- source code for fundamental process management

    See the system-call index page for specific locations of each system call.

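  • init/ -- source code for kernel initialization, which ends by starting the init process (pid 1).

    • As the ancestor of all user-initiated processes, process 1 cannot be started by any other process, so the code that creates it (in init/main.c) must be part of the kernel. That code then runs a user-space init program such as /sbin/init.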

  • ipc/ -- source code for IPC support

    • ipc/sem.c, implementing semaphores
    • ipc/shm.c, implementing System V shared memory segments (system calls shmget(), shmat(), shmdt(), shmctl())
    • ipc/msg.c, implementing the message-passing system calls msgget(), msgsnd(), msgrcv(), msgctl()

  • mm/ -- source code for memory management

    • swapping, paging (virtual memory), shared memory, memory protection

  • drivers/ -- source code for device drivers

  • fs/ -- source code for file systems

    • Code used by all file systems, e.g., the main implementation module for the ioctl() system call in fs/ioctl.c, and shared code for character and block device files in fs/char_dev.c and fs/block_dev.c
    • Implementation of each file system in subdirectories
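      • fs/ext4, the standard Linux file system (it also mounts ext3 volumes; the separate fs/ext3 directory was removed in kernel 4.3)
      • fs/bfs, the Boot File System used by SCO UnixWare (the Berkeley file system used by BSD systems is in fs/ufs)
      • fs/fat, core code for the MS-DOS FAT file system
      • fs/ntfs, the Windows NT file system
      • fs/minix, the Minix file system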
      • fs/nfs, Network File System used for remote mounting of disk volumes over a network
      • fs/proc, a pseudo file system that provides a convenient interface to runtime system information (see "/proc" on a Linux machine)
      Note: Linux is remarkable for supporting so many file systems, making sharing of information convenient.

  • net/ -- source code for network protocols

  • lib/ -- miscellaneous source code

    • In older kernels, lib/errno.c defined the integer variable errno (and did nothing else!); recent kernels no longer include this file.

  • scripts/ -- shell scripts, Perl scripts, and other programs for managing the kernel and related documents
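The message-passing calls listed under ipc/ above can be exercised from user space. The following is a minimal sketch, assuming a Linux system with System V IPC enabled; the struct demo_msg layout and the helper name msg_round_trip are illustrative choices, not taken from the kernel source.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Message layout: a long type field followed by the payload.
 * The name demo_msg and the 64-byte payload are assumptions of this sketch. */
struct demo_msg {
    long mtype;
    char mtext[64];
};

/* Send one message through a private queue and read it back.
 * Returns 0 and fills `out` on success, -1 if any IPC call fails. */
int msg_round_trip(const char *text, char *out, size_t out_size)
{
    /* msgget(): create a new queue visible only to this process. */
    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (qid == -1)
        return -1;

    struct demo_msg m = { .mtype = 1 };          /* payload starts zeroed */
    strncpy(m.mtext, text, sizeof m.mtext - 1);  /* keep the final '\0' */

    int rc = -1;
    /* msgsnd(): the size argument counts the payload only, not mtype. */
    if (msgsnd(qid, &m, sizeof m.mtext, 0) == 0) {
        struct demo_msg r;
        /* msgrcv(): fetch the first message whose type is 1. */
        if (msgrcv(qid, &r, sizeof r.mtext, 1, 0) != -1) {
            strncpy(out, r.mtext, out_size - 1);
            out[out_size - 1] = '\0';
            rc = 0;
        }
    }

    /* msgctl(): remove the queue so it does not outlive the process. */
    msgctl(qid, IPC_RMID, NULL);
    return rc;
}
```

Calling msg_round_trip("hello", buf, sizeof buf) should leave "hello" in buf when the queue can be created; each of the four library calls traps into the corresponding kernel code in ipc/msg.c.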

Examining the source code

The Elixir Cross-Referencer project provides a searchable hypertext version of many releases of the Linux kernel source.

  • Source navigation allows you to browse the source code according to directory structure

  • Search box on the source navigation main page returns links to variable definitions and every appearance of each of the 50,000-60,000 identifiers used in Linux (type names, variable names, and preprocessor macro names). Although this service is not precise (e.g., declarations of a struct are counted as definitions, and Linux's heavy use of the C preprocessor is an indexing "nightmare"), it is quite handy. As a convenience, each identifier in a source-code page is hyperlinked to its index entry. Regular expression searches are supported.

Some miscellaneous features in the source code

  • Note that many parts of the source code use a standard doubly linked list implementation found in . This implementation includes the following definitions.

    • Type struct list_head, which consists of two pointers to struct list_head. This type is used for nodes within a list as well as for the head of a list.
    • Macro LIST_HEAD with one argument, an identifier name, that defines a variable of type struct list_head named name and initializes both fields to point to that same struct variable name.
    • Inline functions
      • list_add, for adding an element at the beginning of a list (like a stack).
      • list_add_tail, for adding an element at the end of a list (like a queue).
      • list_del, for deleting an element of a list.
    • More macros
      • list_empty, predicate.
      • list_for_each, which iterates over a list, given the head of that list.

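  • Occasionally, a seemingly gratuitous C do loop of the form do { ... } while (0) appears in the source code, usually inside a macro definition. The loop body runs exactly once; the wrapper makes a multi-statement macro expand to a single statement, so that it compiles correctly wherever one statement is expected (for example, as the unbraced body of an if). See discussion of context switching for examples.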
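A small sketch may make the do-loop idiom concrete. The macro SWAP_INT and the function sort_two below are hypothetical examples written for this page, not taken from the kernel.

```c
/* Multi-statement macro wrapped in do { ... } while (0).
 * Together with the semicolon the caller writes, the expansion
 * is exactly one C statement. */
#define SWAP_INT(a, b)      \
    do {                    \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)

/* Put two integers in ascending order.
 * Returns 1 if they were swapped, 0 if they were already ordered. */
static int sort_two(int *x, int *y)
{
    if (*x > *y)
        SWAP_INT(*x, *y);   /* one statement, so the else below still pairs with this if */
    else
        return 0;
    return 1;
}
```

If the macro body were wrapped in plain braces instead, the semicolon after SWAP_INT(*x, *y) would end the if statement and the following else would be a syntax error.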

  • Within the kernel, system memory is allocated dynamically using the function kmalloc. Such memory is deallocated using kfree.

    • The filenames slab.* refer to the allocation strategy used for portions of kernel memory (each portion is called a "slab"), which are stockpiled according to known sizes of data structures used in the kernel.

    • Note that the function kmalloc() is implemented in a header file include/linux/slab.h, because it is an inline function (i.e., the compiler may substitute the code of the function body at each call of kmalloc(), for efficiency). But the function kfree() is implemented in a code file mm/slab.c (or mm/slub.c or mm/slob.c, depending on which allocator the kernel is configured to use), since that function is not inline.
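The idea of stockpiling memory by object size can be illustrated with a toy user-space cache. This is only a sketch of the general principle under simplifying assumptions (one cache per object size, storage obtained from malloc, no locking); it is not how the kernel's slab allocator is implemented, and all names below are hypothetical.

```c
#include <stdlib.h>

/* A cache of equally sized objects. Freed objects are kept on a
 * free list for reuse instead of being returned to the system. */
struct obj_cache {
    size_t obj_size;
    void *free_list;    /* each free object stores a pointer to the next */
};

static void cache_init(struct obj_cache *c, size_t obj_size)
{
    /* An object must be large enough to hold the free-list link. */
    c->obj_size = obj_size < sizeof(void *) ? sizeof(void *) : obj_size;
    c->free_list = NULL;
}

/* Hand out a stockpiled object if there is one, else allocate a new one. */
static void *cache_alloc(struct obj_cache *c)
{
    if (c->free_list) {
        void *obj = c->free_list;
        c->free_list = *(void **)obj;
        return obj;
    }
    return malloc(c->obj_size);
}

/* Return an object to the stockpile. */
static void cache_free(struct obj_cache *c, void *obj)
{
    *(void **)obj = c->free_list;
    c->free_list = obj;
}

/* Release every stockpiled object back to the system. */
static void cache_destroy(struct obj_cache *c)
{
    while (c->free_list) {
        void *obj = c->free_list;
        c->free_list = *(void **)obj;
        free(obj);
    }
}
```

Because every object in a cache has the same size, a freed object can satisfy the next request immediately, which is the benefit the slab approach aims for with frequently used kernel data structures.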

  • Kernel functions that implement system calls are defined using the macros SYSCALL_DEFINE0() through SYSCALL_DEFINE6(), where the digit is the number of arguments the system call takes. For example, see the macro call that defines the kernel implementation of the open system call.
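In the kernel source, the open system call is introduced by SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode) in fs/open.c: the macro takes the call's name followed by alternating argument types and argument names. The following user-space sketch imitates only that surface shape. The macro SKETCH_SYSCALL_DEFINE2 and the "system call" add are hypothetical, and the real macros do considerably more (for example, generating per-architecture wrappers and tracing metadata).

```c
/* Simplified stand-in for the kernel's two-argument macro:
 * builds a function header named sketch_sys_<name> from the call name
 * and alternating (type, argument name) pairs. */
#define SKETCH_SYSCALL_DEFINE2(name, t1, a1, t2, a2) \
    long sketch_sys_##name(t1 a1, t2 a2)

/* The macro call supplies the header; the body follows as usual. */
SKETCH_SYSCALL_DEFINE2(add, int, x, int, y)
{
    return (long)x + y;
}
```

After preprocessing, the definition above reads long sketch_sys_add(int x, int y) { ... }, so sketch_sys_add(2, 3) returns 5. Searching the source for the macro name plus the call name is therefore the practical way to locate a system call's implementation.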