Chapter 3. The hardware interface, I/O and communications

3.1 Overview

3.2 Device interfacing

3.3 Exceptions

3.4 Multiprocessors

3.5 User-level input and output

3.6 Communications management

3.7 Communications networks, interfaces and drivers

3.8 Communications software

3.9 Communications handling within and above an operating system

3.10 Summary

3.1 Overview

The kinds of hardware on which a concurrent system might be built were outlined in Chapter 1. In most application areas we do not have to program devices directly but are likely to use an operating system in each component of the system. An operating system provides a high-level interface to the hardware which abstracts away from the specific details of how each device is programmed. This makes life easier for the programmer, but as designers of concurrent systems we have to take care that nothing crucial is lost in the process: the interface we use must give us the performance and functionality we require. For example, if a hardware event must have a response within a specified time, we need to understand everything that can happen in the software that responds to the event. Some operating system designs cannot guarantee to meet such timing requirements. For this reason, a concurrent system designer needs to know the basics of how devices are controlled.

Some devices are dedicated to a particular task, such as the terminals allocated to individual users, or sensors and actuators for monitoring or controlling industrial processes. Others, such as disks, printers and network interfaces, are shared among users. Both types are managed by the operating system. We consider the low-level interface between devices and the software which controls them.

This study forms the basis on which software design decisions can be taken, in particular, the allocation of processes to modules concerned with device handling. Hardware events are one source of concurrent activity in a system; we need to study the precise mechanisms involved.

When a program is loaded and runs it may contain errors which are detected by the hardware. For example, the arithmetic logic unit (ALU) may detect a division by zero or an illegal address; a data operand may lie at an odd byte address when an even one is expected. Whenever such a program runs it will cause the same errors at the same point in its execution. These can be classified as synchronous hardware events.
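
Such events can be observed at user level: on a Unix-like system the operating system converts the hardware exception into a signal delivered to the offending process. A minimal sketch in C, assuming a POSIX system where integer division by zero is delivered as SIGFPE (delivery details vary by architecture):

    /* Minimal sketch: a synchronous hardware event surfacing as a
       signal.  Assumes a Unix-like system where integer division by
       zero is detected by the ALU and delivered as SIGFPE. */
    #include <signal.h>
    #include <unistd.h>

    static void on_fpe(int sig)
    {
        (void)sig;
        /* Only async-signal-safe calls are permitted here. */
        write(2, "SIGFPE: divide by zero\n", 23);
        _exit(1);              /* returning would re-execute the fault */
    }

    int main(void)
    {
        signal(SIGFPE, on_fpe);
        volatile int zero = 0;
        return 1 / zero;       /* the hardware detects the error here */
    }

Because the fault is caused by the program's own instruction stream, this example fails at the same instruction on every run, which is exactly what makes the event synchronous.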

In Chapter 5 we shall see that a program may not all be in main memory. When a 'page' that is not in memory is referenced, the hardware signals a 'page fault'. Page faults are synchronous events because they are caused by running programs and must be handled before the program can continue to run.

When a program runs, events may occur in the system that are nothing to do with that program. The disk may signal that it has finished transferring some data and is ready for more work; the network may have delivered a packet of data. Such events are asynchronous with program execution and occur at unpredictable times.

These aspects of the interaction of a process with the hardware are considered in detail, as is the system call mechanism through which a user-level process may request use of a device.

The network connection of a computer may be considered at a low level as just another device. However, it is a shared device and computer–computer communication can generate a large volume of data associated with multiple simultaneous process interactions.

The communications-handling subsystem is therefore a large concurrent system. The design of communications-handling software is introduced in the later sections of the chapter, although a complete study would require a book in its own right (Comer, 1991; Halsall, 1996; Tanenbaum, 1988).

3.2 Device interfacing

In this section the basics of how devices are controlled by program are given. Figure 3.1 gives an operating system context for all the levels associated with I/O handling.

Figure 3.1. Device-handling subsystem overview.

In Section 2.5 it was pointed out that programs running above the operating system must be prevented from programming devices directly.

This section will show how this restriction can be enforced and how users may request input or output by making a system call, since they are not allowed to program it for themselves (see Section 3.5). Figure 3.1 indicates this difference in privilege between the operating system and the user level. It is clear that when a system call is made a mechanism is needed to change the privilege from user (unprivileged) to system (privileged).
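
As an illustration of the mechanism (not of any particular operating system's internals), the following sketch assumes Linux with the GNU C library, whose syscall() wrapper makes the trap explicit:

    /* Sketch: the change of privilege made by a system call.
       Assumes Linux with glibc, which provides the syscall() wrapper. */
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void)
    {
        /* syscall() places the call number and arguments in the
           registers the kernel expects, then executes the processor's
           trap instruction (e.g. syscall on x86-64).  The trap raises
           the privilege level and enters the operating system, which
           performs the output on the caller's behalf and returns to
           user mode. */
        const char msg[] = "hello via a trap\n";
        syscall(SYS_write, 1, msg, sizeof msg - 1);
        return 0;
    }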

Three interfaces are indicated. The lowest-level interface is with the hardware itself. Only the operating system module concerned with handling the device needs detailed information on its characteristics, for example, the amount of data that is transferred on input or output.

The operating system creates a higher-level interface, presenting virtual devices that are easier to use than the real hardware. This interface is language independent. Finally, the language libraries offer an I/O interface in the form of a number of procedures that are called to perform I/O. These may differ from language to language and from the operating system's interface. Each language system must invoke the operating system's system call interface on behalf of its programs.
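
The layering is visible in an ordinary C program, as in this sketch (assuming a POSIX system with the standard C library): printf belongs to the language library, which formats and buffers, and itself ends up calling the operating system's write interface.

    /* Sketch of the two software interfaces above the device.
       Assumes a POSIX system with the standard C library. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Language-library interface: formatted and buffered. */
        printf("via the language library\n");
        fflush(stdout);   /* the library drains its buffer using write() */

        /* Operating-system interface: unformatted bytes, one call. */
        write(STDOUT_FILENO, "via the system call\n", 20);
        return 0;
    }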

Processes are required to execute these modules. This is discussed in detail in Chapter 4; in particular, Figure 4.1 shows processes allocated to the modules of Figure 3.1.

In Chapter 27, on Windows 2000, we see a more complex (and flexible) I/O architecture in which multiple, layered device drivers may be used.

In Figure 3.1 an interface between the device hardware and the operating system software is shown. Figure 3.2 focuses on this interface and shows a hardware component and a software component. Section 3.2.3 describes one form of hardware interface to a device and Sections 3.2.4–3.2.8 introduce the lowest level of software interface and the mechanism by which it is invoked by hardware.

Figure 3.2. Hardware–software interface.

3.2.1 Processor and device speeds

It is important to realize that devices are, in general, very slow compared with processors. This has always been the case, and the gap has widened: processor speeds have increased faster than device speeds and are likely to continue to do so. Consider a user typing ten characters a second to an editor, and suppose the processor executing the editor executes an instruction in a microsecond. The disparity in these speeds becomes more obvious if we scale them up. If we scale one microsecond up to one second, then on this scale the user types a character once a day while the processor executes an instruction every second. Current processors would be more likely to execute a thousand instructions per second on this scale, whereas users keep typing at the same speed.

Another way of appreciating the speed of processors is to quantify the number of instructions executed per second. In 1980 a processor would execute about 400 000 instructions per second and in 1990 about ten million. In 2002 a PC may have a processor which operates at a clock frequency of 1.5 GHz and which, with carefully written code, can execute several instructions in each clock cycle; the number of instructions per second is therefore greater than 1500 million. In addition, processors that support simultaneous multi-threading (SMT) are becoming commodity items. A single SMT processor can execute instructions from several threads, switching between them in hardware.
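
The disparity is easy to quantify, as in this back-of-the-envelope sketch using the illustrative figures above (they are examples from the text, not measurements):

    /* Back-of-the-envelope version of the scaling argument above.
       The figures are the chapter's illustrative values. */
    #include <stdio.h>

    int main(void)
    {
        double chars_per_sec  = 10.0;    /* fast typist */
        double instrs_per_sec = 1.5e9;   /* 1.5 GHz, ~1 instruction per cycle */

        /* How many instructions can run between two keystrokes? */
        printf("instructions per keystroke: %.0f\n",
               instrs_per_sec / chars_per_sec);   /* about 150 million */
        return 0;
    }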

Current workstations and terminals typically have graphical user interfaces which have been made possible by these increases in processor speeds and memory sizes. As you type to an editor which is displaying a page of your text in a window, your screen changes significantly as you type. There is a good deal of processing to be done as a result of the input of a character.

In many concurrent systems disk storage is needed. Although disk density has doubled every three years for the past twenty years, disk access time is limited by the electromechanical nature of the device and improved by only a third between 1980 and 1990. Some examples illustrate the increase in capacity and performance and the decrease in cost:

In 1963 an 80 megabyte storage system on 32 cartridges cost £100,000. The data rate was 50–100 kilobits per second and it took 50–200 milliseconds to position the heads.

In 1986 a 765 megabyte storage system on 8 cartridges cost £1000. The data rate was 2 megabits per second and it took 2–35 milliseconds to position the heads.

In 1992, the HP C3010 had a formatted capacity of 2000 megabytes on 19 surfaces, an average seek time of 11.5 milliseconds and a transfer rate of 10 megabytes per second from the disk buffer to a SCSI-2 bus. The cost was US$3.75 per megabyte.

In 1997 the Seagate Elite 23 disk drive had an unformatted capacity of 29.6 gigabytes and a formatted (512 byte sectors) capacity of 23.4 gigabytes. It had 6880 cylinders, an average seek time of 13.2 milliseconds (read) and 14.2 milliseconds (write). It rotated at 5400 revolutions per minute and had an average latency of 5.55 milliseconds (a figure that follows directly from the rotation speed; see the sketch after these examples). The controller had a 2048 kilobyte cache, the internal transfer rate varied from 86 to 123 megabits per second and the transfer rate to/from memory varied from 20 to 40 megabytes per second. The cost was about 16 pence (25 US cents) per megabyte.

In 2002, an example of a modern Seagate drive is the Barracuda ATA IV, which is 80 gigabytes in size, has a maximum burst transfer rate of 100 megabytes per second and costs around £80: £1 (US$1.50) per gigabyte.
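
The average latency quoted for the 1997 drive can be checked from its rotation speed, as this small sketch shows:

    /* Sketch: deriving average rotational latency from rotation speed,
       using the 1997 drive's figure of 5400 revolutions per minute. */
    #include <stdio.h>

    int main(void)
    {
        double rpm        = 5400.0;
        double ms_per_rev = 60.0 * 1000.0 / rpm;  /* about 11.1 ms */

        /* On average the wanted sector is half a revolution away. */
        printf("average latency: %.2f ms\n", ms_per_rev / 2.0);
        return 0;   /* prints 5.56 ms, matching the quoted 5.55 ms */
    }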

Section 3.2.8 shows how disks are programmed.

Many computer systems are network based. Current networks in widespread use, such as Ethernet, typically operate at 100 megabits per second, but gigabit networks are becoming available. We shall consider communications handling later in this chapter, but it is clear that an operating system has less time to handle communications devices than slower peripherals.

For a comprehensive coverage of the characteristics and performance of processors, memory and I/O see Hennessy and Patterson (2003).

3.2.2 CISC and RISC computers

Throughout the 1980s there was a move towards simpler processors. If the instruction set is simple and instructions are of equal length, techniques such as instruction pipelining can more easily be used to achieve high performance. If the processor is simple, chip fabrication time, and therefore time to market, is shorter. There may also be space on the chip for cache memory or address translation (see Chapter 5).

When real programs were measured it was found that the complex instructions and addressing modes, designed to support specific data structures, are rarely used by compiler writers. Their complexity implies inflexibility and they are never quite right for the job in hand. The virtual disappearance of assembly language programming means that the machine instruction set need not be high level and attempt to 'bridge the semantic gap' between the programmer and the machine. Compact code is no longer an aim, since current machines have large address spaces, typically 32 bits for a 4-gigabyte address space, and physical memory is cheap.

These arguments have led computer design from Complex Instruction Set Computers (CISC), which were in the mainstream of architectural development until the early 1980s, to the current generation of Reduced Instruction Set Computers (RISC), mentioned in Section 1.3.2. An excellent treatment is given in Hennessy and Patterson (2003). Some aspects of RISC designs are similar in their simplicity to the minicomputers of the 1970s and the early generations of microprocessors. Other aspects, such as the address range, have outstripped early mainframes.

3.2.3 A simple device interface

Figure 3.3 shows a simple device interface which has space to buffer a single character both on input and on output. An interface of this kind would be used for a user's terminal, for some kinds of network connection and for process control systems. In the case of a user terminal, when the user types, the character is transferred into the input buffer and a bit is set in the status register to tell the processor that a character is ready in the buffer for input. On output, the processor needs to know that the output buffer is free to receive a character, and another status bit is used for this purpose. The processor can test these status bits and, by this means, output data at a speed the device can cope with and input data when it becomes available.
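
A sketch of the polling (programmed I/O) style this implies is given below, in C. The register layout, addresses and bit assignments are invented for illustration; a real device's documentation defines its own.

    /* Sketch of polled I/O for an interface like that of Figure 3.3.
       The base address and status bits are hypothetical. */
    #include <stdint.h>

    #define DEV_BASE   0xFF001000u           /* invented base address */
    #define STATUS     (*(volatile uint8_t *)(DEV_BASE + 0))
    #define RX_BUFFER  (*(volatile uint8_t *)(DEV_BASE + 1))
    #define TX_BUFFER  (*(volatile uint8_t *)(DEV_BASE + 2))

    #define RX_READY   0x01  /* a character waits in the input buffer */
    #define TX_FREE    0x02  /* the output buffer can accept a character */

    /* Busy-wait until the device has a character, then take it. */
    uint8_t poll_read(void)
    {
        while ((STATUS & RX_READY) == 0)
            ;                    /* test the status bit repeatedly */
        return RX_BUFFER;        /* reading typically clears RX_READY */
    }

    /* Busy-wait until the output buffer is empty, then write. */
    void poll_write(uint8_t c)
    {
        while ((STATUS & TX_FREE) == 0)
            ;
        TX_BUFFER = c;
    }

Busy-waiting wastes the processor time quantified in Section 3.2.1, which is why the interrupt mechanism introduced in the following sections is normally preferred.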
