This article describes the important parts of a standard Linux directory tree, based on the Filesystem Hierarchy Standard . It outlines the normal way of breaking the directory tree into separate filesystems with different purposes and gives the motivation behind this particular split. Not all Linux distributions follow this standard slavishly, but it is generic enough to give you an overview.
This article is loosely based on the Filesystems Hierarchy Standard (FHS). version 2.1, which attempts to set a standard for how the directory tree in a Linux system is organized. Such a standard has the advantage that it will be easier to write or port software for Linux, and to administer Linux machines, since everything should be in standardized places. There is no authority behind the standard that forces anyone to comply with it, but it has gained the support of many Linux distributions. It is not a good idea to break with the FHS without very compelling reasons. The FHS attempts to follow Unix tradition and current trends, making Linux systems familiar to those with experience with other Unix systems, and vice versa.
This article is not as detailed as the FHS. A system administrator should also read the full FHS for a complete understanding.
This article does not explain all files in detail. The intention is not to describe every file, but to give an overview of the system from a filesystem point of view. Further information on each file is available elsewhere in this manual or in the Linux manual pages.
The full directory tree is intended to be breakable into smaller parts, each capable of being on its own disk or partition, to accommodate to disk size limits and to ease backup and other system administration tasks. The major parts are the root (/ ), /usr , /var , and /home filesystems (see Figure 1). Each part has a different purpose. The directory tree has been designed so that it works well in a network of Linux machines which may share some parts of the filesystems over a read-only device (e.g., a CD-ROM), or over the network with NFS.
Figure 1. Parts of a Unix directory tree. Dashed lines indicate partition limits.
The roles of the different parts of the directory tree are described below.
- The root filesystem is specific for each machine (it is generally stored on a local disk, although it could be a ramdisk or network drive as well) and contains the files that are necessary for booting the system up, and to bring it up to such a state that the other filesystems may be mounted. The contents of the root filesystem will therefore be sufficient for the single user state. It will also contain tools for fixing a broken system, and for recovering lost files from backups.
- The /usr filesystem contains all commands, libraries, manual pages, and other unchanging files needed during normal operation. No files in /usr should be specific for any given machine, nor should they be modified during normal use. This allows the files to be shared over the network, which can be cost-effective since it saves disk space (there can easily be hundreds of megabytes, increasingly multiple gigabytes in /usr). It can make administration easier (only the master /usr needs to be changed when updating an application, not each machine separately) to have /usr network mounted. Even if the filesystem is on a local disk, it could be mounted read-only, to lessen the chance of filesystem corruption during a crash.
- The /var filesystem contains files that change, such as spool directories (for mail, news, printers, etc), log files, formatted manual pages, and temporary files. Traditionally everything in /var has been somewhere below /usr , but that made it impossible to mount /usr read-only.
- The /home filesystem contains the users' home directories, i.e., all the real data on the system. Separating home directories to their own directory tree or filesystem makes backups easier; the other parts often do not have to be backed up, or at least not as often as they seldom change. A big /home might have to be broken across several filesystems, which requires adding an extra naming level below /home, for example /home/students and /home/staff.
Although the different parts have been called filesystems above, there is no requirement that they actually be on separate filesystems. They could easily be kept in a single one if the system is a small single-user system and the user wants to keep things simple. The directory tree might also be divided into filesystems differently, depending on how large the disks are, and how space is allocated for various purposes. The important part, though, is that all the standard names work; even if, say, /var and /usr are actually on the same partition, the names /usr/lib/libc.a and /var/log/messages must work, for example by moving files below /var into /usr/var, and making /var a symlink to /usr/var.
The Unix filesystem structure groups files according to purpose, i.e., all commands are in one place, all data files in another, documentation in a third, and so on. An alternative would be to group files files according to the program they belong to, i.e., all Emacs files would be in one directory, all TeX in another, and so on. The problem with the latter approach is that it makes it difficult to share files (the program directory often contains both static and sharable and changing and non-sharable files), and sometimes to even find the files (e.g., manual pages in a huge number of places, and making the manual page programs find all of them is a maintenance nightmare).
2. The root filesystem
The root filesystem should generally be small, since it contains very critical files and a small, infrequently modified filesystem has a better chance of not getting corrupted. A corrupted root filesystem will generally mean that the system becomes unbootable except with special measures (e.g., from a floppy), so you don't want to risk it.
The root directory generally doesn't contain any files, except perhaps on older systems where the standard boot image for the system, usually called /vmlinuz was kept there. (Most distributions have moved those files the the /boot directory. Otherwise, all files are kept in subdirectories under the root filesystem:
- Commands needed during bootup that might be used by normal users (probably after bootup).
- Like /bin, but the commands are not intended for normal users, although they may use them if necessary and allowed. /sbin is not usually in the default path of normal users, but will be in root's default path.
- Configuration files specific to the machine.
- The home directory for user root. This is usually not accessible to other users on the system
- Shared libraries needed by the programs on the root filesystem.
- Loadable kernel modules, especially those that are needed to boot the system when recovering from disasters (e.g., network and filesystem drivers).
- Device files. These are special files that help the user interface with the various devices on the system.
- Temporary files. As the name suggests, programs running often store temporary files in here.
- Files used by the bootstrap loader, e.g., LILO or GRUB. Kernel images are often kept here instead of in the root directory. If there are many kernel images, the directory can easily grow rather big, and it might be better to keep it in a separate filesystem. Another reason would be to make sure the kernel images are within the first 1024 cylinders of an IDE disk. This 1024 cylinder limit is no longer true in most cases. With modern BIOSes and later versions of LILO (the LInux LOader) the 1024 cylinder limit can be passed with logical block addressing (LBA). See the lilo manual page for more details.
- Mount point for temporary mounts by the system administrator. Programs aren't supposed to mount on /mnt automatically. /mnt might be divided into subdirectories (e.g., /mnt/dosa might be the floppy drive using an MS-DOS filesystem, and /mnt/exta might be the same with an ext2 filesystem).
- /proc, /usr, /var, /home
- Mount points for the other filesystems. Although /proc does not reside on any disk in reality it is still mentioned here. See the section about /proc later in the article.
The /etc directory
The /etc maintains a lot of files. Some of them are described below. For others, you should determine which program they belong to and read the manual page for that program. Many networking configuration files are in /etc as well, and are described in the Networking Administrators' Guide.
- /etc/rc or /etc/rc.d or /etc/rc?.d
- Scripts or directories of scripts to run at startup or when changing the run level.
- The user database, with fields giving the username, real name, home directory, and other information about each user. The format is documented in the passwd manual page.
- /etc/shadow is an encrypted file the holds user passwords.
- Floppy disk parameter table. Describes what different floppy disk formats look like. Used by setfdprm . See the setfdprm manual page for more information.
- Lists the filesystems mounted automatically at startup by the mount -a command (in /etc/rc or equivalent startup file). Under Linux, also contains information about swap areas used automatically by swapon -a . See the mount manual page for more information. Also fstab usually has its own manual page in section 5.
- Similar to /etc/passwd, but describes groups instead of users. See the group manual page in section 5 for more information.
- Configuration file for init.
- Output by getty before the login prompt. Usually contains a short description or welcoming message to the system. The contents are up to the system administrator.
- The configuration file for file. Contains the descriptions of various file formats based on which file guesses the type of the file. See the magic and file manual pages for more information.
- The message of the day, automatically output after a successful login. Contents are up to the system administrator. Often used for getting information to every user, such as warnings about planned downtimes.
- List of currently mounted filesystems. Initially set up by the bootup scripts, and updated automatically by the mount command. Used when a list of mounted filesystems is needed, e.g., by the df command.
- Configuration file for the login command. The login.defs file usually has a manual page in section 5.
- Like /etc/termcap /etc/printcap , but intended for printers. However it uses different syntax. The printcap has a manual page in section 5.
- /etc/profile, /etc/bash.rc, /etc/csh.cshrc
- Files executed at login or startup time by the Bourne, BASH , or C shells. These allow the system administrator to set global defaults for all users. Users can also create individual copies of these in their home directory to personalize their environment. See the manual pages for the respective shells.
- Identifies secure terminals, i.e., the terminals from which root is allowed to log in. Typically only the virtual consoles are listed, so that it becomes impossible (or at least harder) to gain superuser privileges by breaking into a system over a modem or a network. Do not allow root logins over a network. Prefer to log in as an unprivileged user and use su or sudo to gain root privileges.
- Lists trusted shells. The chsh command allows users to change their login shell only to shells listed in this file. ftpd, is the server process that provides FTP services for a machine, will check that the user's shell is listed in /etc/shells and will not let people log in unless the shell is listed there.
- The terminal capability database. Describes by what ``escape sequences'' various terminals can be controlled. Programs are written so that instead of directly outputting an escape sequence that only works on a particular brand of terminal, they look up the correct sequence to do whatever it is they want to do in /etc/termcap. As a result most programs work with most kinds of terminals. See the termcap, curs_termcap, and terminfo manual pages for more information.
4. The /dev directory
The /dev directory contains the special device files for all the devices. The device files are created during installation, and later with the /dev/MAKEDEV script. The /dev/MAKEDEV.local is a script written by the system administrator that creates local-only device files or links (i.e. those that are not part of the standard MAKEDEV, such as device files for some non-standard device driver).
This list which follows is by no means exhaustive or as detailed as it could be. Many of these device files will need support compiled into your kernel for the hardware. Read the kernel documentation to find details of any particular device.
If you think there are other devices which should be included here but aren't then let me know. I will try to include them in the next revision.
- Digital Signal Processor. Basically this forms the interface between software which produces sound and your soundcard. It is a character device on major node 14 and minor
- The first floppy drive. If you are lucky enough to have several drives then they will be numbered sequentially. It is a character device on major node 2 and minor 0.
- The first framebuffer device. A framebuffer is an abstraction layer between software and graphics hardware. This means that applications do not need to know about what kind of hardware you have but merely how to communicate with the framebuffer driver's API (Application Programming Interface) which is well defined and standardized. The framebuffer is a character device and is on major node 29 and minor 0.
- /dev/hda is the master IDE drive on the primary IDE controller. /dev/hdb the slave drive on the primary controller. /dev/hdc , and /dev/hdd are the master and slave devices on the secondary controller respectively. Each disk is divided into partitions. Partitions 1-4 are primary partitions and partitions 5 and above are logical partitions inside extended partitions. Therefore the device file which references each partition is made up of several parts. For example /dev/hdc9 references partition 9 (a logical partition inside an extended partition type) on the master IDE drive on the secondary IDE controller. The major and minor node numbers are somewhat complex. For the first IDE controller all partitions are block devices on major node The master drive hda is at minor 0 and the slave drive hdb is at minor 64. For each partition inside the drive add the partition number to the minor minor node number for the drive. For example /dev/hdb5 is major 3, minor 69 (64 + 5 = 69). Drives on the secondary interface are handled the same way, but with major node 22.
- The first IDE tape drive. Subsequent drives are numbered ht1 etc. They are character devices on major node 37 and start at minor node 0 for ht0 1 for ht1 etc.
- The first analogue joystick. Subsequent joysticks are numbered js1, js2 etc. Digital joysticks are called djs0, djs1 and so on. They are character devices on major node 15. The analogue joysticks start at minor node 0 and go up to 127 (more than enough for even the most fanatic gamer). Digital joysticks start at minor node 128.
- The first parallel printer device. Subsequent printers are numbered lp1, lp2 etc. They are character devices on major mode 6 and minor nodes starting at 0 and numbered sequentially.
- The first loopback device. Loopback devices are used for mounting filesystems which are not located on other block devices such as disks. For example if you wish to mount an iso9660 CD ROM image without burning it to CD then you need to use a loopback device to do so. This is usually transparent to the user and is handled by the mount command. Refer to the manual pages for mount and losetup. The loopback devices are block devices on major node 7 and with minor nodes starting at 0 and numbered sequentially.
- First metadisk group. Metadisks are related to RAID (Redundant Array of Independent Disks) devices. Metadisk devices are block devices on major node 9 with minor nodes starting at 0 and numbered sequentially.
- This is part of the OSS (Open Sound System) driver. It is a character device on major node 14, minor node 0.
- The bit bucket. A black hole where you can send data for it never to be seen again. Anything sent to /dev/null will disappear. This can be useful if, for example, you wish to run a command but not have any feedback appear on the terminal. It is a character device on major node 1 and minor node
- The PS/2 mouse port. This is a character device on major node 10, minor node 1.
- Parallel port IDE disks. These are named similarly to disks on the internal IDE controllers (/dev/hd*). They are block devices on major node 45. Minor nodes need slightly more explanation here. The first device is /dev/pda and it is on minor node 0. Partitions on this device are found by adding the partition number to the minor number for the device. Each device is limited to 15 partitions each rather than 63 (the limit for internal IDE disks). /dev/pdb minor nodes start at 16, /dev/pdc at 32 and /dev/pdd at 48. So for example the minor node number for /dev/pdc6 would be 38 (32 + 6 = 38). This scheme limits you to 4 parallel disks of 15 partitions each.
- Parallel port CD ROM drives. These are numbered from 0 onwards. All are block devices on major node 46. /dev/pcd0 is on minor node 0 with subsequent drives being on minor nodes 1, 2, 3 etc.
- Parallel port tape devices. Tapes do not have partitions so these are just numbered sequentially. They are character devices on major node 96. The minor node numbers start from 0 for /dev/pt0, 1 for /dev/pt1, and so on.
- The raw parallel ports. Most devices which are attached to parallel ports have their own drivers. This is a device to access the port directly. It is a character device on major node 99 with minor node 0. Subsequent devices after the first are numbered sequentially incrementing the minor node.
- /dev/random or /dev/urandom
- These are kernel random number generators. /dev/random is a non-deterministic generator which means that the value of the next number cannot be guessed from the preceding ones. It uses the entropy of the system hardware to generate numbers. When it has no more entropy to use then it must wait until it has collected more before it will allow any more numbers to be read from it. /dev/urandom works similarly. Initially it also uses the entropy of the system hardware, but when there is no more entropy to use it will continue to return numbers using a pseudo random number generating formula. This is considered to be less secure for vital purposes such as cryptographic key pair generation. If security is your overriding concern then use /dev/random, if speed is more important then /dev/urandom works fine. They are character devices on major node 1 with minor nodes 8 for /dev/random and 9 for /dev/urandom.
- The first SCSI drive on the first SCSI bus. The following drives are named similar to IDE drives. /dev/sdb is the second SCSI drive, /dev/sdc is the third SCSI drive, and so forth.
- The first serial port. Many times this it the port used to connect an external modem to your system.
- This is a simple way of getting many 0s. Every time you read from this device it will return 0. This can be useful sometimes, for example when you want a file of fixed length but don't really care what it contains. It is a character device on major node 1 and minor node 5.
5. The /usr filesystem.
The /usr filesystem is often large, since all programs are installed there. All files in /usr usually come from a Linux distribution; locally installed programs and other stuff goes below /usr/local. This makes it possible to update the system from a new version of the distribution, or even a completely new distribution, without having to install all programs again. Some of the subdirectories of /usr are listed below (some of the less important directories have been dropped; see the FSSTND for more information).
- The X Window System, all files. To simplify the development and installation of X, the X files have not been integrated into the rest of the system. There is a directory tree below /usr/X11R6 similar to that below /usr itself.
- Almost all user commands. Some commands are in /bin or in /usr/local/bin.
- System administration commands that are not needed on the root filesystem, e.g., most server programs.
- /usr/share/man, /usr/share/info, /usr/share/doc
- Manual pages, GNU Info documents, and miscellaneous other documentation files, respectively.
- Header files for the C programming language. This should actually be below /usr/lib for consistency, but the tradition is overwhelmingly in support for this name.
- Unchanging data files for programs and subsystems, including some site-wide configuration files. The name lib comes from library; originally libraries of programming subroutines were stored in /usr/lib.
- The place for locally installed software and other files. Distributions may not install anything in here. It is reserved solely for the use of the local administrator. This way he can be absolutely certain that no updates or upgrades to his distribution will overwrite any extra software he has installed locally.
6. The /var filesystem
The /var contains data that is changed when the system is running normally. It is specific for each system, i.e., not shared over the network with other computers.
- A cache for man pages that are formatted on demand. The source for manual pages is usually stored in /usr/share/man/man?/ (where ? is the manual section. See the manual page for man in section 7); some manual pages might come with a pre-formatted version, which might be stored in /usr/share/man/cat* . Other manual pages need to be formatted when they are first viewed; the formatted version is then stored in /var/cache/man so that the next person to view the same page won't have to wait for it to be formatted.
- Any variable data belonging to games in /usr should be placed here. This is in case /usr is mounted read only.
- Files that change while the system is running normally.
- Variable data for programs that are installed in /usr/local (i.e., programs that have been installed by the system administrator). Note that even locally installed programs should use the other /var directories if they are appropriate, e.g., /var/lock.
- Lock files. Many programs follow a convention to create a lock file in /var/lock to indicate that they are using a particular device or file. Other programs will notice the lock file and won't attempt to use the device or file.
- Log files from various programs, especially login(/var/log/wtmp, which logs all logins and logouts into the system) and syslog(/var/log/messages, where all kernel and system program message are usually stored). Files in /var/log can often grow indefinitely, and may require cleaning at regular intervals.
- This is the FHS approved location for user mailbox files. Depending on how far your distribution has gone towards FHS compliance, these files may still be held in /var/spool/mail.
- Files that contain information about the system that is valid until the system is next booted. For example, /var/run/utmp contains information about people currently logged in.
- Directories for news, printer queues, and other queued work. Each different spool has its own subdirectory below /var/spool, e.g., the news spool is in /var/spool/news . Note that some installations which are not fully compliant with the latest version of the FHS may have user mailboxes under /var/spool/mail.
- Temporary files that are large or that need to exist for a longer time than what is allowed for /tmp . (Although the system administrator might not allow very old files in /var/tmp either.)
7. The /proc filesystem
The /proc filesystem contains a illusionary filesystem. It does not exist on a disk. Instead, the kernel creates it in memory. It is used to provide information about the system (originally about processes, hence the name). Some of the more important files and directories are explained below. The /proc filesystem is described in more detail in the proc manual page.
- A directory with information about process number 1. Each process has a directory below /proc with the name being its process identification number.
- Information about the processor, such as its type, make, model, and performance.
- List of device drivers configured into the currently running kernel.
- Shows which DMA channels are being used at the moment.
- Filesystems configured into the kernel.
- Shows which interrupts are in use, and how many of each there have been.
- Which I/O ports are in use at the moment.
- An image of the physical memory of the system. This is exactly the same size as your physical memory, but does not really take up that much memory; it is generated on the fly as programs access it. (Remember: unless you copy it elsewhere, nothing under /proc takes up any disk space at all.)
- Messages output by the kernel. These are also routed to syslog.
- Symbol table for the kernel.
- The `load average' of the system; three meaningless indicators of how much work the system has to do at the moment.
- Information about memory usage, both physical and swap.
- Which kernel modules are loaded at the moment.
- Status information about network protocols.
- A symbolic link to the process directory of the program that is looking at /proc. When two processes look at /proc, they get different links. This is mainly a convenience to make it easier for programs to get at their process directory.
- Various statistics about the system, such as the number of page faults since the system was booted.
- The time the system has been up.
Note that while the above files tend to be easily readable text files, they can sometimes be formatted in a way that is not easily digestible. There are many commands that do little more than read the above files and format them for easier understanding. For example, the freeprogram reads /proc/meminfo converts the amounts given in bytes to kilobytes (and adds a little more information, as well).
Additional Linux Resources
Here is a list of resources for learning Linux:
Resources for System Administrators
Resources for Linux Kernel Programmers
Linux File System Dictionary
Comprehensive Review of How Linux File and Directory System Works
Hands-on Linux classes
Linux Operating System Distributions