Recipe ID: hsts-r56
This article gives an overview of a Linux system. First, the major services provided by the operating system are described. Then, the programs that implement these services are described with a considerable lack of detail. The purpose of this article is to give an understanding of the system as a whole; each part is described in detail elsewhere.
1. Various parts of an operating system
UNIX and 'UNIX-like' operating systems (such as Linux) consist of a kernel and some system programs. There are also some application programs for doing work. The kernel is the heart of the operating system. In fact, it is often mistakenly considered to be the operating system itself, but it is not. An operating system provides many more services than a plain kernel.
The kernel keeps track of files on the disk, starts programs and runs them concurrently, assigns memory and other resources to various processes, receives packets from and sends packets to the network, and so on. The kernel does very little by itself, but it provides tools with which all services can be built. It also prevents anyone from accessing the hardware directly, forcing everyone to use the tools it provides. This way the kernel provides some protection for users from each other.
The system programs use the tools provided by the kernel to implement the various services required from an operating system. System programs, and all other programs, run "on top of the kernel", in what is called the user mode. The difference between system and application programs is one of intent: applications are intended for getting useful things done (or for playing, if it happens to be a game), whereas system programs are needed to get the system working. A word processor is an application; mount is a system program. The difference is often somewhat blurry, however, and is important only to compulsive categorizers.
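As a minimal sketch of what using the kernel's tools looks like from user mode, the Python snippet below does file I/O through the os module's thin wrappers around the open, write, read, and close system calls. The helper name and the scratch file name are illustrative assumptions, not part of any system program.

```python
import os
import tempfile


def roundtrip_via_syscalls(data: bytes) -> bytes:
    """Write data to a scratch file and read it back using only os-level calls."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.txt")  # hypothetical file name

    # os.open and os.write are thin wrappers around kernel system calls;
    # the program never touches the disk hardware itself.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        result = os.read(fd, len(data))
    finally:
        os.close(fd)

    # Clean up the scratch file and directory.
    os.remove(path)
    os.rmdir(directory)
    return result
```

Every step here is a request to the kernel, which decides whether the calling process is allowed to perform it.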
An operating system can also contain compilers and their corresponding libraries (GCC and the C library in particular under Linux), although not all programming languages need be part of the operating system. Documentation, and sometimes even games, can also be part of it. Traditionally, the operating system has been defined by the contents of the installation tape or disks; with Linux it is not as clear since it is spread all over the FTP sites of the world.
2. Important parts of the kernel
The Linux kernel consists of several important parts: process management, memory management, hardware device drivers, filesystem drivers, network management, and various other bits and pieces. Figure 1 shows some of them.
Figure 1. Some of the more important parts of the Linux kernel
Probably the most important parts of the kernel (nothing else works without them) are memory management and process management (see How Linux Operating System Memory Management works for details). Memory management takes care of assigning memory areas and swap space areas to processes, to parts of the kernel, and to the buffer cache. Process management creates processes, and implements multitasking by switching the active process on the processor.
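Process creation can be observed from user space. The sketch below, which assumes a UNIX-like system where os.fork is available, asks the kernel to create a child process and then waits for it to terminate; the function name run_child is made up for this illustration.

```python
import os


def run_child(exit_code: int) -> int:
    """Fork a child that exits with exit_code; return the code the parent observes."""
    pid = os.fork()  # the kernel creates a new process here
    if pid == 0:
        # Child process: exit immediately, skipping the parent's cleanup handlers.
        os._exit(exit_code)
    # Parent process: block until the kernel reports that the child has terminated.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

While the parent waits, the kernel's process management is free to run the child or any other process on the processor.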
At the lowest level, the kernel contains a hardware device driver for each kind of hardware it supports. Since the world is full of different kinds of hardware, the number of hardware device drivers is large. There are often many otherwise similar pieces of hardware that differ in how they are controlled by software. The similarities make it possible to have general classes of drivers that support similar operations; each member of the class has the same interface to the rest of the kernel but differs in what it needs to do to implement them. For example, all disk drivers look alike to the rest of the kernel, i.e., they all have operations like "initialize the drive", "read sector N", and "write sector N".
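The idea of a driver class with one shared interface can be sketched in a few lines of Python. This is only an analogy: real Linux drivers are written in C inside the kernel, and the names DiskDriver, RamDisk, copy_sector, and the 512-byte sector size are assumptions chosen for the illustration.

```python
from abc import ABC, abstractmethod

SECTOR_SIZE = 512  # assumed sector size for this illustration


class DiskDriver(ABC):
    """The interface every disk driver presents to the rest of the 'kernel'."""

    @abstractmethod
    def initialize(self) -> None: ...

    @abstractmethod
    def read_sector(self, n: int) -> bytes: ...

    @abstractmethod
    def write_sector(self, n: int, data: bytes) -> None: ...


class RamDisk(DiskDriver):
    """One member of the class: a 'disk' that lives entirely in memory."""

    def __init__(self, sectors: int) -> None:
        self._sectors = sectors
        self._storage = bytearray()

    def initialize(self) -> None:
        self._storage = bytearray(self._sectors * SECTOR_SIZE)

    def _offset(self, n: int) -> int:
        if not 0 <= n < self._sectors:
            raise IndexError(f"sector {n} out of range")
        return n * SECTOR_SIZE

    def read_sector(self, n: int) -> bytes:
        start = self._offset(n)
        return bytes(self._storage[start:start + SECTOR_SIZE])

    def write_sector(self, n: int, data: bytes) -> None:
        if len(data) != SECTOR_SIZE:
            raise ValueError("data must be exactly one sector long")
        start = self._offset(n)
        self._storage[start:start + SECTOR_SIZE] = data


def copy_sector(src: DiskDriver, dst: DiskDriver, n: int) -> None:
    """Caller code depends only on the shared interface, not on the driver type."""
    dst.write_sector(n, src.read_sector(n))
```

Code such as copy_sector works unchanged with any other driver that implements the same three operations.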
Some software services provided by the kernel itself have similar properties, and can therefore be abstracted into classes. For example, the various network protocols have been abstracted into one programming interface, the BSD socket library. Another example is the virtual filesystem (VFS) layer that abstracts the filesystem operations away from their implementation. Each filesystem type provides an implementation of each filesystem operation. When some entity tries to use a filesystem, the request goes via the VFS, which routes the request to the proper filesystem driver. A more in-depth discussion of the Linux file system can be found at Comprehensive Review of Linux File System Architecture and Management and Comprehensive Review of How Linux File and Directory System Works.
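The routing done by the VFS can be pictured with a toy model. The sketch below is not how the kernel implements it; it only shows a request being dispatched to whichever driver is mounted at the longest matching mount point. The class names VFS and MemoryFS and the sample paths are hypothetical.

```python
class MemoryFS:
    """Toy filesystem driver that keeps file contents in a dictionary."""

    def __init__(self, files):
        self.files = dict(files)

    def read(self, relative_path: str) -> str:
        return self.files[relative_path]


class VFS:
    """Routes each request to the driver mounted closest to the path."""

    def __init__(self) -> None:
        self._mounts = {}

    def mount(self, mount_point: str, driver) -> None:
        # Normalize "/home/" to "/home", but keep the root as "/".
        self._mounts[mount_point.rstrip("/") or "/"] = driver

    def _resolve(self, path: str):
        best = None
        for mount_point in self._mounts:
            prefix = mount_point if mount_point.endswith("/") else mount_point + "/"
            if path == mount_point or path.startswith(prefix):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise FileNotFoundError(path)
        # Strip the mount point so the driver sees a path relative to its own root.
        return self._mounts[best], path[len(best):].lstrip("/")

    def read(self, path: str) -> str:
        driver, relative_path = self._resolve(path)
        return driver.read(relative_path)
```

The caller uses one read operation for every path and never needs to know which filesystem type sits behind it.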
3. Major services in a UNIX system
This section describes some of the more important UNIX services, but without much detail. They are described more thoroughly in later articles.
3.1. init
The single most important service in a UNIX system is provided by init. init is started as the first process of every UNIX system, as the last thing the kernel does when it boots. When init starts, it continues the boot process by doing various startup chores (checking and mounting filesystems, starting daemons, etc.).
The exact list of things that init does depends on which flavor it is; there are several to choose from. init usually provides the concept of single user mode, in which no one can log in and root uses a shell at the console; the usual mode is called multiuser mode. Some flavors generalize this as run levels; single and multiuser modes are considered to be two run levels, and there can be additional ones as well, for example, to run X on the console.
Linux allows for up to 10 runlevels, 0-9, but usually only some of these are defined by default. Runlevel 0 is defined as "system halt". Runlevel 1 is defined as "single user mode". Runlevel 3 is defined as "multi user" because it is the runlevel that the system boots into under normal day-to-day conditions. Runlevel 5 is typically the same as 3 except that a GUI is also started. Runlevel 6 is defined as "system reboot". Other runlevels depend on how your particular distribution has defined them, and they vary significantly between distributions. Looking at the contents of /etc/inittab will usually give some hint as to what the predefined runlevels are and how they have been defined.
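On systems that use a SysV-style init, each /etc/inittab entry has four colon-separated fields: id, runlevels, action, and process. The sketch below parses such text and looks up the default runlevel. The sample entries in the test are illustrative only; the actual contents differ between distributions, and some systems have no /etc/inittab at all.

```python
from typing import List, NamedTuple, Optional


class InittabEntry(NamedTuple):
    ident: str
    runlevels: str
    action: str
    process: str


def parse_inittab(text: str) -> List[InittabEntry]:
    """Parse inittab-style text, skipping blank lines and comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first three colons only; the process field may contain more.
        ident, runlevels, action, process = line.split(":", 3)
        entries.append(InittabEntry(ident, runlevels, action, process))
    return entries


def default_runlevel(entries: List[InittabEntry]) -> Optional[str]:
    """Return the runlevel named by the initdefault entry, if there is one."""
    for entry in entries:
        if entry.action == "initdefault":
            return entry.runlevels
    return None
```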
In normal operation, init makes sure getty is working (to allow users to log in) and adopts orphan processes (processes whose parent has died; in UNIX all processes must be in a single tree, so orphans must be adopted).
When the system is shut down, it is init that is in charge of killing all other processes, unmounting all filesystems and stopping the processor, along with anything else it has been configured to do.
3.2. Logins from terminals
Logins from terminals (via serial lines) and the console (when not running X) are provided by the getty program. init starts a separate instance of getty for each terminal upon which logins are to be allowed. getty reads the username and runs the login program, which reads the password. If the username and password are correct, login runs the shell. When the shell terminates, i.e., the user logs out, or when login terminates because the username and password didn't match, init notices this and starts a new instance of getty. The kernel has no notion of logins; this is all handled by the system programs.
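The way init restarts getty whenever the previous instance exits can be sketched as a bounded respawn loop. This is a simplification under stated assumptions: a real init loops for as long as the system runs and supervises many programs at once, and DEMO_COMMAND is a stand-in child process rather than a real getty.

```python
import subprocess
import sys

# Stand-in for getty: a child process that exits at once with status 3.
DEMO_COMMAND = [sys.executable, "-c", "raise SystemExit(3)"]


def respawn(command, times):
    """Run command to completion `times` times, restarting it after each exit."""
    exit_codes = []
    for _ in range(times):
        # subprocess.run blocks until the child exits, like init noticing
        # that a getty (or the login or shell it turned into) has terminated.
        completed = subprocess.run(command)
        exit_codes.append(completed.returncode)
    return exit_codes
```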
3.3. Syslog
The kernel and many system programs produce error, warning, and other messages. It is often important that these messages can be viewed later, even much later, so they should be written to a file. The program doing this is syslog. It can be configured to sort the messages to different files according to writer or degree of importance. For example, kernel messages are often directed to a separate file from the others, since kernel messages are often more important and need to be read regularly to spot problems.
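Sorting by writer and degree of importance can be sketched as a small rule table. The facility and severity names below are the standard syslog ones, but the rules and log file paths are hypothetical examples, not any distribution's defaults, and the matching logic is a simplification of what a real syslog daemon does.

```python
# Standard syslog severities, most important first.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# (facility, least important severity still logged, destination); all hypothetical.
RULES = [
    ("kern", "debug", "/var/log/kernel.log"),
    ("mail", "info", "/var/log/mail.log"),
    ("*", "warning", "/var/log/messages"),
]


def destinations(facility, severity, rules=RULES):
    """Return every file a message with this facility and severity is written to."""
    level = SEVERITIES.index(severity)
    matched = []
    for rule_facility, min_severity, path in rules:
        facility_ok = rule_facility in ("*", facility)
        # A lower index means a more important message.
        important_enough = level <= SEVERITIES.index(min_severity)
        if facility_ok and important_enough:
            matched.append(path)
    return matched
```

With these rules, every kernel message lands in its own file, while only warnings and worse from any writer reach the general log.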
3.4. Periodic command execution: cron and at
Both users and system administrators often need to run commands periodically. For example, the system administrator might want to run a command to clean the directories with temporary files (/tmp and /var/tmp) from old files, to keep the disks from filling up, since not all programs clean up after themselves correctly.
The cron service is set up to do this. Each user can have a crontab file, where she lists the commands she wishes to execute and the times they should be executed. The cron daemon takes care of starting the commands when specified.
The at service is similar to cron, but it is once only: the command is executed at the given time, but it is not repeated.
See the manual pages cron(8), crontab(1), crontab(5), at(1) and atd(8) for more in-depth information.
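A crontab entry consists of five time fields (minute, hour, day of month, month, day of week) followed by the command. The sketch below checks whether an entry is due at a given time. It is deliberately simplified: it supports only `*`, single numbers, comma lists, and ranges, and it requires all five fields to match, whereas real cron implementations support more syntax and treat the two day fields specially. The commands used in the test are hypothetical.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one crontab time field: '*', 'N', 'A-B', or a comma list of those."""
    for part in field.split(","):
        if part == "*":
            return True
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def is_due(entry: str, when: datetime) -> bool:
    """Return True if the crontab entry should run at the minute given by `when`."""
    minute, hour, day, month, weekday, _command = entry.split(None, 5)
    # cron counts Sunday as 0; Python's weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        and field_matches(weekday, cron_weekday)
    )
```

The cron daemon performs a check of this kind once a minute for every entry in every user's crontab.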
3.5. Graphical user interface
UNIX and Linux don't incorporate the user interface into the kernel; instead, they let it be implemented by user level programs. This applies for both text mode and graphical environments.
This arrangement makes the system more flexible, but has the disadvantage that it is simple to implement a different user interface for each program, making the system harder to learn.
The graphical environment primarily used with Linux is called the X Window System (X for short). X also does not implement a user interface; it only implements a window system, i.e., tools with which a graphical user interface can be implemented. Some popular window managers are fvwm, icewm, blackbox, and windowmaker. There are also two popular desktop managers, KDE and GNOME.
3.6. Networking
Networking is the act of connecting two or more computers so that they can communicate with each other. The actual methods of connecting and communicating are slightly complicated, but the end result is very useful.
UNIX operating systems have many networking features. Most basic services (filesystems, printing, backups, etc.) can be done over the network. This can make system administration easier, since it allows centralized administration, while still reaping the benefits of microcomputing and distributed computing, such as lower costs and better fault tolerance.
3.7. Network logins
Network logins work a little differently than normal logins. A terminal is attached through its own serial line, so there is a fixed set of places where a login can happen. For each person logging in via the network, by contrast, there is a separate virtual network connection, and there can be any number of these depending on the available bandwidth. It is therefore not possible to run a separate getty for each possible virtual connection. There are also several different ways to log in via a network, telnet and ssh being the major ones in TCP/IP networks.
These days many Linux system administrators consider telnet and rlogin to be insecure and prefer ssh, the "secure shell", which encrypts traffic going over the network, thereby making it far less likely that the malicious can "sniff" your connection and gain sensitive data like usernames and passwords. It is highly recommended you use ssh rather than telnet or rlogin.
Network logins have, instead of a herd of gettys, a single daemon per way of logging in (telnet and ssh have separate daemons) that listens for all incoming login attempts. When it notices one, it starts a new instance of itself to handle that single attempt; the original instance continues to listen for other attempts. The new instance works similarly to getty.
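The listen-and-fork pattern described above can be sketched with a plain TCP socket. The example assumes a UNIX-like system with os.fork; the function names and the `login: ` prompt are illustrative, and a real telnet or ssh daemon does far more work (protocol negotiation, authentication, terminal setup).

```python
import os
import socket


def handle_login_attempt(conn: socket.socket) -> None:
    """Stand-in for the per-connection work; here it only sends a prompt."""
    conn.sendall(b"login: ")


def serve_once(listener: socket.socket) -> int:
    """Accept one connection and hand it to a forked child; return the child's pid."""
    conn, _address = listener.accept()
    pid = os.fork()
    if pid == 0:
        # Child: the new instance handles this single attempt and then exits.
        listener.close()
        try:
            handle_login_attempt(conn)
        finally:
            conn.close()
            os._exit(0)
    # Parent: drop its copy of the connection and go back to listening.
    conn.close()
    return pid
```

A daemon would call serve_once in a loop and reap finished children with os.waitpid; the original instance never talks to a client itself.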
3.8. Network file systems
One of the more useful things that can be done with networking services is sharing files via a network file system. Depending on your network this could be done over the Network File System (NFS), or over the Common Internet File System (CIFS). NFS is typically a UNIX-based service. In Linux, NFS is supported by the kernel. CIFS serving, however, is traditionally handled outside the kernel: in Linux, it is provided by Samba.
With a network file system any file operations done by a program on one machine are sent over the network to another computer. This fools the program into thinking that all the files on the other computer are actually on the computer the program is running on. This makes information sharing extremely simple, since it requires no modifications to programs.
3.9. Mail
Electronic mail is the most popular method for communicating via computer. An electronic letter is stored in a file using a special format, and special mail programs are used to send and read the letters.
Each user has an incoming mailbox (a file in the special format), where all new mail is stored. When someone sends mail, the mail program locates the receiver's mailbox and appends the letter to the mailbox file. If the receiver's mailbox is on another machine, the letter is sent to the other machine, which delivers it to the mailbox as it best sees fit.
The mail system consists of many programs. The delivery of mail to local or remote mailboxes is done by one program (the mail transfer agent (MTA), e.g., sendmail or postfix), while the programs users use are many and varied (mail user agents (MUA), e.g., pine or evolution). The mailboxes are usually stored in /var/spool/mail until the user's MUA retrieves them.
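Local delivery, appending a letter to the receiver's mailbox file, can be sketched with Python's standard mailbox module, which reads and writes the traditional mbox format. The function names and the addresses are made up for the illustration, and the demo writes to a temporary directory rather than /var/spool/mail.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def deliver(mbox_path: str, sender: str, recipient: str, subject: str, body: str) -> None:
    """Append one letter to the mbox file at mbox_path, creating it if needed."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)

    box = mailbox.mbox(mbox_path)
    box.lock()  # keep other programs from writing to the mailbox at the same time
    try:
        box.add(message)
        box.flush()
    finally:
        box.unlock()
        box.close()


def subjects(mbox_path: str) -> list:
    """List the subject lines in the mailbox, roughly as a mail reader would."""
    box = mailbox.mbox(mbox_path)
    try:
        return [message["Subject"] for message in box]
    finally:
        box.close()


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "alice")  # hypothetical mailbox
    deliver(demo_path, "bob@example.org", "alice@example.org", "Lunch", "Noon?")
    print(subjects(demo_path))
```

In this picture, deliver plays the role of the MTA's final delivery step and subjects plays the role of a very small MUA.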
3.10. Printing
Only one person can use a printer at one time, but it is uneconomical not to share printers between users. The printer is therefore managed by software that implements a print queue: all print jobs are put into a queue and whenever the printer is done with one job, the next one is sent to it automatically. This relieves the users from organizing the print queue and fighting over control of the printer. Instead, they form a new queue at the printer, waiting for their printouts, since no one ever seems to be able to get the queue software to know exactly when anyone's printout is really finished. This is a great boost to intra-office social relations.
The print queue software also spools the printouts on disk, i.e., the text is kept in a file while the job is in the queue. This allows an application program to spit out the print jobs quickly to the print queue software; the application does not have to wait until the job is actually printed to continue. This is really convenient, since it allows one to print out one version, and not have to wait for it to be printed before one can make a completely revised new version.
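Queueing and spooling can be sketched together in a small class. Everything here is a toy model under stated assumptions: the class name PrintSpooler, the job file naming, and the use of a callable as the "printer" are all invented for the illustration and do not correspond to any real print system.

```python
import os
import tempfile
from collections import deque
from typing import Callable, Optional


class PrintSpooler:
    """Keeps submitted jobs in spool files and prints them in arrival order."""

    def __init__(self, spool_dir: Optional[str] = None) -> None:
        # Use a fresh temporary directory unless the caller supplies one.
        self.spool_dir = spool_dir or tempfile.mkdtemp()
        self._queue = deque()
        self._next_id = 1

    def submit(self, text: str) -> int:
        """Spool the text to disk and return at once with a job id."""
        job_id = self._next_id
        self._next_id += 1
        path = os.path.join(self.spool_dir, f"job-{job_id}.txt")
        with open(path, "w", encoding="utf-8") as spool_file:
            spool_file.write(text)
        self._queue.append((job_id, path))
        return job_id

    def print_next(self, printer: Callable[[str], None]) -> Optional[int]:
        """Send the oldest queued job to the printer; return its id, or None if idle."""
        if not self._queue:
            return None
        job_id, path = self._queue.popleft()
        with open(path, encoding="utf-8") as spool_file:
            printer(spool_file.read())
        os.remove(path)  # the spool file is no longer needed once printed
        return job_id
```

Because submit only writes a file, the application can carry on immediately, and later edits to the document do not affect the copy already in the queue.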
3.11. The filesystem layout
The filesystem is divided into many parts; usually along the lines of a root filesystem with /bin, /lib, /etc, /dev, and a few others; a /usr filesystem with programs and unchanging data; a /var filesystem with changing data (such as log files); and a /home for everyone's personal files. Depending on the hardware configuration and the decisions of the system administrator, the division can be different; it can even be all in one filesystem.