
The Linux/Unix File Model

In the document What You Will Learn (Page 25-29)

Part I: Files and Users

1.1. The Linux/Unix File Model

One of the driving goals in the original Unix design was simplicity. Simple concepts are easy to learn and use.

When the concepts are translated into simple APIs, simple programs are then easy to design, write, and get correct. In addition, simple code is often smaller and more efficient than more complicated designs.

The quest for simplicity was driven by two factors. From a technical point of view, the original PDP-11 minicomputers on which Unix was developed had a small address space: 64 kilobytes total on the smaller systems, or 64K of code and 64K of data on the larger ones. These restrictions applied not just to regular programs (so-called user-level code), but to the operating system itself (kernel-level code). Thus, not only was "Small Is Beautiful" aesthetically pleasing, but "Small Is Beautiful" because there was no other choice!

The second factor was a negative reaction to contemporary commercial operating systems, which were needlessly complicated, with obtuse command languages, multiple kinds of file I/O, and little generality or symmetry. (Steve Johnson once remarked that "Using TSO is like trying to kick a dead whale down a beach." TSO is one of the obtuse mainframe time-sharing systems just described.)

1.1.1. Files and Permissions

The Unix file model is as simple as it gets: A file is a linear stream of bytes. Period. The operating system imposes no preordained structure on files: no fixed or varying record sizes, no indexed files, nothing. The interpretation of file contents is entirely up to the application. (This isn't quite true, as we'll see shortly, but it's close enough for a start.)

Once you have a file, you can do three things with the file's data: read them, write them, or execute them.

Unix was designed for time-sharing minicomputers; this implies a multiuser environment from the get-go. Once there are multiple users, it must be possible to specify a file's permissions: Perhaps user jane is user fred's boss, and jane doesn't want fred to read the latest performance evaluations.

For file permission purposes, users are classified into three distinct categories: user, the owner of a file; group, the group of users associated with the file (discussed shortly); and other, anybody else. For each of these categories, every file has separate read, write, and execute permission bits associated with it, yielding a total of nine permission bits. These show up in the first field of the output of 'ls -l':

$ ls -l progex.texi

-rw-r--r-- 1 arnold devel 5614 Feb 24 18:02 progex.texi

Here, arnold and devel are the owner and group of progex.texi, and -rw-r--r-- are the file type and permissions. The first character is a dash for a regular file, a d for a directory, or one of a small set of other characters for other kinds of files that aren't important at the moment. Each subsequent group of three characters represents read, write, and execute permission for the owner, group, and "other," respectively.

In this example, progex.texi is readable and writable by the owner, and readable by the group and other. The dashes indicate absent permissions, thus the file is not executable by anyone, nor is it writable by the group or other.

The owner and group of a file are stored as numeric values known as the user ID (UID) and group ID (GID); standard library functions that we present later in the book make it possible to print the values as human-readable names.
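From the shell, the same numeric-ID/name mapping can be glimpsed with the id and 'ls -n' commands. This is just an illustration, not the library interface presented later in the book:

```shell
# Viewing the UID/name mapping from the shell (illustration only).
id -u              # the current process's numeric UID
id -un             # the same identity, printed as a name
ls -ln /etc/passwd # 'ls -n' shows raw UID/GID numbers instead of names
```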

A file's owner can change the permissions by using the chmod (change mode) command. (As such, file permissions are sometimes referred to as the "file mode.") A file's group can be changed with the chgrp (change group) command, and its owner with the chown (change owner) command.[1]

[1] Some systems allow regular users to change the ownership on their files to someone else, thus "giving them away." The details are standardized by POSIX but are a bit messy. Typical GNU/Linux configurations do not allow it.
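A quick sketch of chmod in action, in both its octal and symbolic forms; the file name here is a throwaway chosen for illustration:

```shell
# Octal and symbolic forms of chmod; sample.txt is a throwaway file.
touch sample.txt
chmod 644 sample.txt       # rw-r--r--: owner read/write, group/other read
stat -c %a sample.txt      # prints 644
chmod u+x,g+w sample.txt   # symbolic form: add owner execute and group write
stat -c %a sample.txt      # prints 764
rm sample.txt
```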

Group permissions were intended to support cooperative work: Although one person in a group or department may own a particular file, perhaps everyone in that group needs to be able to modify it. (Consider a collaborative marketing paper or data from a survey.)

When the system goes to check a file access (usually upon opening a file), if the UID of the process matches that of the file, the owner permissions apply. If those permissions deny the operation (say, a write to a file with -r--rw-rw- permissions), the operation fails; Unix and Linux do not proceed to test the group and other permissions.[2] The same is true if the UID is different but the GID matches; if the group permissions deny the operation, it fails.

[2] The owner can always change the permission, of course. Most users don't disable write permission for themselves.
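The check order can be sketched as a small shell function. This is a deliberate simplification of what the kernel does (it ignores root and supplementary groups), and the octal modes and numeric IDs below are hypothetical:

```shell
# may_read MODE FUID FGID UID GID: apply the first matching permission
# triple (owner, then group, then other) and test only its read bit.
may_read() {
  mode=$1 fuid=$2 fgid=$3 uid=$4 gid=$5
  if [ "$uid" -eq "$fuid" ]; then
    bits=$(( 0$mode / 64 ))        # owner triple: checked first, and final
  elif [ "$gid" -eq "$fgid" ]; then
    bits=$(( (0$mode / 8) % 8 ))   # group triple
  else
    bits=$(( 0$mode % 8 ))         # other triple
  fi
  if [ $(( bits & 4 )) -ne 0 ]; then echo yes; else echo no; fi
}

may_read 466 1000 100 1000 100   # owner of a -r--rw-rw- file: yes
may_read 066 1000 100 1000 100   # owner denied, though group/other may read: no
may_read 066 1000 100 2000 100   # non-owner group member: yes
```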

Unix and Linux support the notion of a superuser: a user with special privileges. This user is known as root and has the UID of 0. root is allowed to do anything; all bets are off, all doors are open, all drawers unlocked.[3]

(This can have significant security implications, which we touch on throughout the book but do not cover exhaustively.) Thus, even if a file is mode ---------- (no permissions at all), root can still read and write the file. (One exception is that the file can't be executed. But as root can add execute permission, the restriction doesn't prevent anything.)

[3] There are some rare exceptions to this rule, all of which are beyond the scope of this book.

The user/group/other, read/write/execute permissions model is simple, yet flexible enough to cover most situations. Other, more powerful but more complicated, models exist and are implemented on different systems, but none of them are well enough standardized and broadly enough implemented to be worth discussing in a general-purpose text like this one.

1.1.2. Directories and Filenames

Once you have a file, you need someplace to keep it. This is the purpose of the directory (known as a "folder" on Windows and Apple Macintosh systems). A directory is a special kind of file, which associates filenames with particular collections of file metadata, known as inodes. Directories are special because they can only be updated by the operating system, by the system calls described in Chapter 4, "Files and File I/O," page 83. They are also special in that the operating system dictates the format of directory entries.

Filenames may contain any valid 8-bit byte except the / (forward slash) character and ASCII NUL, the character whose bits are all zero. Early Unix systems limited filenames to 14 bytes; modern systems allow individual filenames to be up to 255 bytes.

The inode contains all the information about a file except its name: the type, owner, group, permissions, size, modification and access times. It also stores the locations on disk of the blocks containing the file's data. All of these are data about the file, not the file's data itself, thus the term metadata.
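Most of this metadata can be inspected with the stat(1) command (GNU coreutils); note that stat reads only the inode, never the file's data. The file name below is a throwaway:

```shell
# stat prints inode metadata; it does not touch the file's contents.
printf 'example\n' > meta.txt
stat -c 'inode=%i mode=%a uid=%u gid=%g size=%s' meta.txt
rm meta.txt
```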

Directory permissions have a slightly different meaning from file permissions. Read permission is the ability to list the directory; that is, to see what files it contains. Write permission is the ability to create and remove files in the directory. Execute permission (sometimes called search permission) is the ability to pass through the directory when opening or otherwise accessing a contained file or subdirectory.
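A short demonstration of the search/list distinction, using throwaway names (note that root bypasses these checks, so a denied listing cannot be shown for the superuser):

```shell
# With execute (search) permission, a known name inside a directory can
# still be opened even when read (list) permission is restricted.
mkdir sandbox
printf 'hi\n' > sandbox/note.txt
cat sandbox/note.txt   # search permission on sandbox lets us reach a known name
chmod 755 sandbox      # rwxr-xr-x: owner may list and modify, others may list/search
```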

Note

If you have write permission on a directory, you can remove files in that directory, even if they don't belong to you! When used interactively, the rm command notices this and asks you for confirmation in such a case.

The /tmp directory has write permission for everyone, but your files in /tmp are quite safe because /tmp usually has the so-called sticky bit set on it:

$ ls -ld /tmp

drwxrwxrwt 11 root root 4096 May 15 17:11 /tmp

Note the t in the last position of the first field. On most directories, this position has an x in it. With the sticky bit set, only you, as the file's owner, or root may remove your files. (We discuss this in more detail in Section 11.5.2, "Directories and the Sticky Bit," page 414.)
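The sticky bit can be set on any directory you own; here is a sketch using a hypothetical directory name:

```shell
mkdir shared
chmod 1777 shared           # world-writable, like /tmp, plus the sticky bit
ls -ld shared | cut -c1-10  # prints drwxrwxrwt
rm -r shared
```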

1.1.3. Executable Files

Remember we said that the operating system doesn't impose a structure on files? Well, we've already seen that that was a white lie when it comes to directories. It's also the case for binary executable files. To run a program, the kernel has to know what part of a file represents instructions (code) and what part represents data. This leads to the notion of an object file format, which is the definition for how these things are laid out within a file on disk.

Although the kernel will only run a file laid out in the proper format, it is up to user-level utilities to create these files. The compiler for a programming language (such as Ada, Fortran, C, or C++) creates object files, and then a linker or loader (usually named ld) binds the object files with library routines to create the final executable. Note that even if a file has all the right bits in all the right places, the kernel won't run it if the appropriate execute permission bit isn't turned on (or at least one execute bit for root).

Because the compiler, assembler, and loader are user-level tools, it's (relatively) easy to change object file formats as needs develop over time; it's only necessary to "teach" the kernel about the new format and then it can be used. The part that loads executables is relatively small, so this isn't an impossible task. Thus, Unix file formats have evolved over time. The original format was known as a.out (Assembler OUTput). The next format, still used on some commercial systems, is known as COFF (Common Object File Format), and the current, most widely used format is ELF (Executable and Linking Format). Modern GNU/Linux systems use ELF.

The kernel recognizes that an executable file contains binary object code by looking at the first few bytes of the file for special magic numbers. These are sequences of two or four bytes that the kernel recognizes as being special. For backwards compatibility, modern Unix systems recognize multiple formats. ELF files begin with the four characters "\177ELF".
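You can see the magic number with od(1); /bin/ls is assumed to be an ELF binary here, as it is on modern GNU/Linux systems:

```shell
# Dump the first four bytes of an executable; \177 is octal for byte 0x7F,
# so an ELF file shows 177 E L F.
head -c 4 /bin/ls | od -An -c
```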

Besides binary executables, the kernel also supports executable scripts. Such a file also begins with a magic number: in this case, the two regular characters #!. A script is a program executed by an interpreter, such as the shell, awk, Perl, Python, or Tcl. The #! line provides the full path to the interpreter and, optionally, one single argument:

#! /bin/awk -f

BEGIN { print "hello, world" }

Let's assume the above contents are in a file named hello.awk and that the file is executable. When you type 'hello.awk', the kernel runs the program as if you had typed '/bin/awk -f hello.awk'. Any additional command-line arguments are also passed on to the program. In this case, awk runs the program and prints the universally known hello, world message.

The #! mechanism is an elegant way of hiding the distinction between binary executables and script executables.

If hello.awk is renamed to just hello, the user typing 'hello' can't tell (and indeed shouldn't have to know) that hello isn't a binary executable program.
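The same experiment can be tried with /bin/sh as the interpreter, chosen here because its path is universal; the script name is a throwaway:

```shell
# Create an executable script and run it; the kernel sees the #! magic
# number and runs /bin/sh ./hello with any further arguments appended.
printf '#! /bin/sh\necho "hello, world"\n' > hello
chmod +x hello
./hello
rm hello
```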

1.1.4. Devices

One of Unix's most notable innovations was the unification of file I/O and device I/O.[4] Devices appear as files in the filesystem, regular permissions apply to their access, and the same I/O system calls are used for opening, reading, writing, and closing them. All of the "magic" to make devices look like files is hidden in the kernel. This is just another aspect of the driving simplicity principle in action: We might phrase it as no special cases for user code.

[4] This feature first appeared in Multics, but Multics was never widely used.

Two devices appear frequently in everyday use, particularly at the shell level: /dev/null and /dev/tty.

/dev/null is the "bit bucket." All data sent to /dev/null is discarded by the operating system, and attempts to read from it always return end-of-file (EOF) immediately.
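Both behaviors are easy to verify from the shell:

```shell
echo "this vanishes" > /dev/null  # writes succeed; the data is discarded
wc -c < /dev/null                 # reads return EOF at once: prints 0
```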

/dev/tty is the process's current controlling terminal—the one to which it listens when a user types the interrupt character (typically CTRL-C) or performs job control (CTRL-Z).

GNU/Linux systems, and many modern Unix systems, supply /dev/stdin, /dev/stdout, and /dev/stderr devices, which provide a way to name the open files each process inherits upon startup.

Other devices represent real hardware, such as tape and disk drives, CD-ROM drives, and serial ports. There are also software devices, such as pseudo-ttys, that are used for networking logins and windowing systems.

/dev/console represents the system console, a particular hardware device on minicomputers. On modern computers, /dev/console is the screen and keyboard, but it could be a serial port.

Unfortunately, device-naming conventions are not standardized, and each operating system has different names for tapes, disks, and so on. (Fortunately, that's not an issue for what we cover in this book.) Devices have either a b or c in the first character of 'ls -l' output:

$ ls -l /dev/tty /dev/hda

brw-rw----   1 root   disk    3,  0 Aug 31 02:31 /dev/hda
crw-rw-rw-   1 root   root    5,  0 Feb 26 08:44 /dev/tty

The initial b represents block devices, and a c represents character devices. Device files are discussed further in Section 5.4, "Obtaining Information about Files," page 139.
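The device type can also be queried directly; /dev/null is a character device on every Linux system:

```shell
[ -c /dev/null ] && echo "character device"  # test -c checks for a character device
stat -c %F /dev/null                         # prints: character special file
```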
