4.15.2 Concurrency from a sequential language

When a program written in a sequential programming language is executed, there is only one thread of control, one stack and one heap, although many procedures (methods) may be active at any time because of nested or recursive invocations (Figure 4.24).

Figure 4.24. Sequential language, separate activities in user-level code.

It is therefore difficult, and unnatural, to write independent subsystems (collections of related objects or modules) within a single program or to manage simultaneous activation of such code on behalf of a number of clients. There is no assistance for implementing a subsystem which may get to a point where it must wait for an event, remember where it got to and resume later (as discussed in Section 4.4). In particular, it is not possible to freeze the runtime stack at some point and go back to it later. There is a single runtime stack which is used on behalf of whatever computation succeeds the computation that can no longer run.

It would be extremely tedious to attempt to program A, B or C as a single sequential program and D would revert to a sequential algorithm running on a single processor. At any point where a wait was required, state would have to be saved by the programmer and resumed correctly at the appropriate time. There would be no chance of achieving a fast response to events (which C must have) as there is a single thread of control and the operating system sees only one process. If an interrupt occurs, the state of the process is saved, a wake-up waiting is recorded and the process is resumed. Whatever the desired effect of the interrupt might be within the concurrent program, there is no immediate transfer of control. The program could be written to test from time to time to see which events had happened, using a polling system with interrupts disabled. Alternatively, it could execute a system-provided WAIT (set of events) primitive and transfer control accordingly.
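A minimal sketch of this "test from time to time" structure, in C, might look as follows. The flag array, handler and loop body are illustrative names only; the flags stand for state that would be set by interrupt service routines or read from device status registers.

/* Sketch only: a sequential main loop that polls for events.               */
/* An event is handled when the loop next comes round, not when it actually */
/* occurs, so response time depends on the loop period.                     */
#include <stdbool.h>

#define N_EVENTS 4
volatile bool event_pending[N_EVENTS];   /* set externally when an event occurs */

void handle_event(int e);                /* application-specific processing */

void main_loop(void)
{
    for (;;) {
        /* ... do a slice of the main computation ... */

        for (int e = 0; e < N_EVENTS; e++) {
            if (event_pending[e]) {
                event_pending[e] = false;
                handle_event(e);
            }
        }
    }
}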

It would therefore be out of the question to use such a scheme for C (because timing requirements could not be guaranteed to be met) and highly undesirable for A and B, even if they are to run on a uniprocessor.

It would be impossible to exploit a multiprocessor with such a computational model. Since any internal structuring is invisible outside the program and there is only one thread of control, the program is only suitable for execution as one process on one processor.

In summary, if separate activities are implemented within a single program written in a sequential programming language:

There is no assistance from the runtime system or the operating system for managing them as separate activities.

If one activity must be suspended (because it cannot proceed until some event occurs) and another resumed, the user-level code must manage state saving and (later) restoring.

There is no possibility of making an immediate response to an event by a transfer of control within the user-level code. After an interrupt, control returns to exactly where it was within the process.

4.15.3 Coroutines

Some programming languages, for example Modula-2 and BCPL, provide coroutines for expressing independent subprograms within a single program. The motivation for having a single program with internal coroutines is that data can be shared where appropriate but private data is also supported: each coroutine has its own stack. The language provides instructions to create and delete a coroutine and to allow a coroutine to suspend its execution temporarily but retain its state. Control may be passed explicitly from the suspending coroutine to another. Alternatively, control may be returned to the caller of a coroutine when it suspends. Figure 4.25 gives an overview of how coroutines are supported.

Figure 4.25. Language support for coroutines.

When a coroutine activation is created, the name of the associated code module, the start address and the space required for the stack are specified.

co-id = coroutine-create(name, start address, stack size);

A stack is initialized and a control block is set up for the coroutine activation which holds the stack pointer and the start address. At this stage, the coroutine is in the suspended state at the main procedure entry point. Later, when the coroutine is suspended, the control block will hold the address at which execution will resume. An identifier co-id is returned for use in subsequent references to this coroutine activation.

kill (co-id)

frees the space occupied by the activation. The coroutine scheme must specify who can execute kill. It is likely that it is illegal for a coroutine to terminate itself, and that one cannot kill a dynamic ancestor. We assume that the coroutine management system maintains an active list recording the dynamic call sequence of coroutines; see below.

At any time at most one of the coroutine activations can be running. Two types of control flow can be used:

call (co-id) // pass control to the activation co-id at the address specified in its control block. co-id is deemed to be a child of the caller and is added to the active list.

suspend // pass control back to the parent of this child on the active list. Remove the executor of suspend from the active list.

resume (co-id) // remove the executor of resume from the active list, pass control to the activation co-id and add it to the active list.

Figure 4.26 shows these alternatives. Note that using call (co-id) repeatedly would cause the active list to grow indefinitely. We assume that suspend or resume is executed when a coroutine cannot proceed until some condition becomes true. Note that this suspension is voluntary; that is, the coroutine executes until it performs suspend, resume or another call.

Figure 4.26. Alternatives for management of transfer of control in a coroutine system. (a) A parent calls coroutines in turn, which pass control back to it by suspend. (b) A parent calls a coroutine. This resumes a coroutine at the same level which replaces it on the active list.
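To make these primitives concrete, here is a minimal sketch in C of coroutine-create, call and suspend built on the POSIX ucontext facility. It is an illustration of the mechanism only: the structure and function names are invented for this sketch, the parent context stored in each control block plays the role of the active list, and kill, resume and error handling are omitted.

/* Sketch only: coroutine control blocks and transfer of control.         */
#include <stdlib.h>
#include <ucontext.h>

typedef struct coroutine {
    ucontext_t ctx;      /* saved stack pointer and resumption address    */
    ucontext_t caller;   /* the parent: where suspend passes control to   */
} coroutine;

static coroutine *current;   /* the activation that is running, if any    */

/* coroutine-create: initialize a stack and a control block; the          */
/* coroutine is left suspended at its entry point.                        */
coroutine *coroutine_create(void (*entry)(void), size_t stack_size)
{
    coroutine *co = malloc(sizeof *co);
    getcontext(&co->ctx);
    co->ctx.uc_stack.ss_sp   = malloc(stack_size);
    co->ctx.uc_stack.ss_size = stack_size;
    co->ctx.uc_link          = &co->caller;   /* where to go if entry returns */
    makecontext(&co->ctx, entry, 0);
    return co;
}

/* call (co-id): pass control to the activation at the address held in    */
/* its control block; it is deemed a child of the caller.                 */
void call(coroutine *co)
{
    coroutine *parent = current;
    current = co;
    swapcontext(&co->caller, &co->ctx);       /* runs until it suspends   */
    current = parent;
}

/* suspend: remember where to resume, then pass control back to the       */
/* parent of the running coroutine.                                       */
void suspend(void)
{
    swapcontext(&current->ctx, &current->caller);
}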

Programming the examples using coroutines

We can program example A by creating a coroutine activation for each client of the file server. The same code may be executed for each client and they may share system data but each will have a separate stack for private data. The arrangement of a main loop which decides which coroutine to call, followed by suspend in the coroutine (Figure 4.26(a)), is appropriate for this application provided that immediate response to events is not required and that the operating system offers asynchronous (non-blocking) system calls.
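Using the coroutine_create, call and suspend sketch above, this main-loop arrangement might be expressed roughly as follows; the number of clients, the stack size and file_service_body are illustrative.

/* Sketch only: one coroutine activation per client of the file server.   */
/* All activations execute the same service code but each has its own     */
/* stack for private data; shared system data is visible to all of them.  */
#define N_CLIENTS 4
#define CO_STACK  (64 * 1024)

void file_service_body(void);             /* the shared file service code  */

void server_main_loop(void)
{
    coroutine *client[N_CLIENTS];

    for (int i = 0; i < N_CLIENTS; i++)
        client[i] = coroutine_create(file_service_body, CO_STACK);

    for (;;)                              /* decide which coroutine to call */
        for (int i = 0; i < N_CLIENTS; i++)
            call(client[i]);              /* each runs until it suspends    */
}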

Example B could be programmed using a coroutine scheme. However, there is only one thread of control through the program and the sequence of control is programmed explicitly; that is, the device handlers are called in an order that does not depend on whether they have work they can do. Figure 4.27 shows a coroutine associated with each device and a polling loop in which the coroutines associated with the devices are called in turn and the devices are polled from within the coroutines.

Figure 4.27. Coroutines for a device-handling subsystem.

Such a scheme could be used with device interrupts disabled. In this case, data could be lost if a device was not polled often enough and a second data item arrived before the first had been detected and transferred from the interface.

If interrupts are enabled, the interrupt service routine for a device could transfer a small amount of data onto the stack associated with the corresponding coroutine or could set a flag to indicate that a block of data had arrived in memory. A ring of buffers can be exploited by some interfaces (Section 3.7.3). After execution of the interrupt service routine, control returns to the interrupted instruction.

It can be arranged that the devices with the shortest critical times or the highest throughput are polled more frequently than other devices.
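A rough sketch of the polling loop of Figure 4.27, again using the coroutine primitives sketched above, is given below; the device interface, the handler body and the two-to-one polling ratio are invented for illustration.

/* Sketch only: device-handler coroutines called in turn from a polling    */
/* loop, with the fast device polled on every cycle.                       */
typedef int device_id;
extern int  device_ready(device_id d);    /* hypothetical: data waiting?    */
extern void transfer_data(device_id d);   /* hypothetical: move one item    */

coroutine *fast_dev, *slow_dev[2];        /* created with coroutine_create  */

void fast_dev_handler(void)               /* body of one handler coroutine  */
{
    for (;;) {
        if (device_ready(0))
            transfer_data(0);
        suspend();                        /* voluntarily give up control    */
    }
}

void polling_loop(void)
{
    for (int i = 0; ; i = (i + 1) % 2) {
        call(fast_dev);                   /* highest throughput device      */
        call(slow_dev[i]);                /* slower devices on alternate cycles */
    }
}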

It is impossible to respond instantly to any particular device.

It is impossible to program the monitoring and control activities of example C using coroutines. An interrupt service routine would be entered automatically (Section 3.2) but after that we would return to our predefined sequence of control, no matter how high the priority of the interrupt, since the operating system is aware only of one process and not of its internal coroutine structure. The chemical plant might blow up before we tested to see whether an alarm signal came in. The only way to take effective action on an alarm would be for the interrupt service routine to do all the work. This might be appropriate in a crisis situation, but is unsuitable as a general approach for achieving timely response.

A multiprocessor cannot be exploited using coroutines since there is only a single thread of control. D cannot be programmed as a concurrent algorithm to exploit a multiprocessor.

In summary, if separate activities are implemented as coroutines within a single program:

The runtime system supports creation of coroutine activations, shared and private data and transfer of control between activations.

The scheduling of the coroutine activations must be implemented in the user-level code. Explicit transfer of control takes place at user level.

There is a single thread of control. Scheduling of coroutines and execution of coroutines take place within this single thread.

There is no possibility of making an immediate response to an event by a transfer of control within the user level code. After an interrupt, control returns to exactly where it was within the process.

Suspension is voluntary; control stays with a coroutine (except for interrupt and return) until it executes suspend or resume. It may therefore be assumed that any shared data structure that is accessed by a coroutine is left in a consistent state by it; it cannot be preempted part-way through updating a data structure.

Transfer of control between coroutine activations involves very little overhead, typically of the order of ten machine instructions.

The address for resumption is stored in the control block and the active list is updated. There is no need to copy data: each coroutine's private state remains on its own stack.

The Tripos operating system (Richards et al., 1979) was written in BCPL and has a coroutine structure; it includes a filing system (example A) and device handling (example B). Tripos was designed as a single-user system and a single shared address space is used for the operating system and applications.

4.15.4 Processes

Some programming languages, such as Java, Modula-3, Mesa, Concurrent Pascal, Pascal Plus and occam, support processes within a program. Again, as described above for coroutines, a program may be written to contain independent subprograms. Each subprogram is now executed by a separate application process, or user thread, as defined by the runtime system. Like coroutines, each user thread has its own stack, but unlike coroutines, control is managed by an outside agency, the runtime system, and is not programmed explicitly within the subprograms (Figure 4.28).

Figure 4.28. Runtime system support for processes.

A user thread may need to wait for a shared application resource (such as a shared data structure) or for another user thread to complete a related activity. A wait operation is provided and will be implemented as a call into the runtime system. Another user thread will then be selected to run.

A major question in system design is the relationship between the user threads, created by the runtime system, and the processes or kernel threads scheduled by the operating system. It might be that the operating system can support only one process for one program.

The processes within the program are then managed as user threads by the runtime system which effectively reimplements a scheduler.

The runtime system is multiplexing one operating system process among user threads known only to itself.

The scheme is similar to a coroutine scheme, but the application programmer does not have to program the transfer of control between the user threads since their scheduling is provided by the runtime system.

A problem with this scheme is that if any one of the user threads makes a system call to the operating system to do I/O and becomes blocked, then no other user thread in the program can run. Recall that an operating system might only provide system calls for I/O which are synchronous (Section 3.5.2). This is operating system dependent; for example, UNIX system calls are synchronous, whereas IBM MVS and Windows 2000 calls may be synchronous or asynchronous. Even if the user threads were to use the runtime system as an intermediary for the purpose of making a system call, a blocking system call made by this manager would still block the whole process. This is a fundamental problem if the operating system does not support multi-threaded processes.

If the user threads defined in a program are made known by the runtime system to the operating system, they become separately schedulable operating system processes or kernel threads. They are then scheduled by the operating system to run on processors and may run concurrently on the separate processors of a multiprocessor. In Figure 4.28 the runtime system's create routine would include an operating system call to create a thread (which would return a thread identifier), its wait routine would include a system call to block the thread and its signal routine would include a system call to unblock the thread. Each thread may then make operating system calls that block without affecting the others.
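As a concrete illustration (not the book's notation), the create, wait and signal routines map onto POSIX kernel threads roughly as follows; the "work ready" condition and the thread bodies are invented for the sketch. Because each thread is known to the operating system, one thread blocking does not stop the others.

/* Sketch only: kernel threads with a runtime-style wait/signal built on   */
/* a mutex and condition variable.                                          */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int work_ready = 0;                /* the shared condition waited on  */

static void *worker(void *arg)
{
    pthread_mutex_lock(&lock);
    while (!work_ready)
        pthread_cond_wait(&cond, &lock);  /* wait: blocks this thread only   */
    pthread_mutex_unlock(&lock);
    printf("thread %ld proceeding\n", (long)arg);
    return NULL;
}

int main(void)
{
    pthread_t t[3];
    for (long i = 0; i < 3; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);   /* create: OS call */

    pthread_mutex_lock(&lock);
    work_ready = 1;
    pthread_cond_broadcast(&cond);        /* signal: unblock the waiters     */
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    return 0;
}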

Programming the examples using processes

Processes not known to the operating system (user threads only)

For A each client's request is handled by a user thread which executes the file service code. In B each device handler is a user thread, as shown in Figure 4.29. The way in which the threads might be used is as discussed above for coroutines.

Figure 4.29. Language-level processes (user threads) for a device-handling subsystem.

The scheme is inadequate for example C, where instant response to an alarm is essential, and for D which requires the user threads to be scheduled to run on separate processors of a multiprocessor. The only difference between coroutines and user threads is that scheduling is provided by the runtime system, whereas coroutine scheduling must be written by the application programmer.

Processes known to the operating system (kernel threads)

Now let us suppose that the runtime system registers each user thread with the operating system as a kernel thread, and consider whether C and D can now be programmed. The parallel search activities of D can be programmed as threads which may run on separate processors. For C we have further requirements on the operating system. Suppose the data gathering process is running. An interrupt arrives to indicate that an alarm condition has developed; the interrupt service routine passes control to the thread manager to change the state of the control thread from waiting to runnable. It is a high-priority thread and therefore should run immediately, preempting the data gathering and analysis. The operating system must support this.

The scheduling policy of the operating system is therefore crucial. If the scheduling algorithm is non-preemptive, a thread may continue to run until it blocks. Even if an interrupt occurs and the corresponding interrupt service routine is executed, control passes back to the interrupted thread. The scheduling algorithm is only invoked when the current thread blocks, in spite of the fact that a high-priority thread may have been made runnable. It might be the case that the requirements of A and B could be met with non-preemptive scheduling, but it is essential to have preemptive scheduling for C, as described in Section 4.3. Even this may not be sufficient for implementing real-time systems in general; see Section 4.8.
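Where the operating system provides preemptive priority scheduling, the alarm-handling thread of example C could be created with a high fixed priority. A POSIX sketch follows; the SCHED_FIFO policy and the priority value are illustrative, and setting them normally requires privilege.

/* Sketch only: create a high-priority, preemptively scheduled thread.     */
#include <pthread.h>
#include <sched.h>

extern void *alarm_handler(void *arg);    /* hypothetical alarm-handling code */

int start_alarm_thread(pthread_t *tid)
{
    pthread_attr_t     attr;
    struct sched_param sp;

    pthread_attr_init(&attr);
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);   /* fixed priorities, preemptive */
    sp.sched_priority = sched_get_priority_max(SCHED_FIFO);
    pthread_attr_setschedparam(&attr, &sp);

    /* When an interrupt handler makes this thread runnable, it preempts    */
    /* lower-priority threads such as data gathering and analysis.          */
    return pthread_create(tid, &attr, alarm_handler, NULL);
}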
