Handling Exceptional Situations - Implicit Assumptions

7 Implicit Assumptions

About This Chapter

Algorithms rest on obvious assumptions that are tacitly presumed to hold. A program, however, must be able to handle exceptional situations that violate these tacit assumptions. Programs must also test the assumptions that are fundamental to the correct functioning of the approach. What may be quite obvious to the algorithm designer must nevertheless be verified in a program.

Implicit assumptions occur frequently in algorithm design; they are less common and far less acceptable in software. This tension between reasonable assumptions and unreasonable attention to detail is an important aspect of the difference between algorithms and software. We examine exceptions as well as more fundamental issues related to this topic.

7.1 Handling Exceptional Situations

Algorithms presume that the reader has some intelligence. Therefore, they tend to be formulated without covering every possibility or aspect. In contrast, programs must be written so that even unexpected input and results do not cause them to crash. This implies that programs do not have the luxury of concentrating on the important aspects of a problem’s solution — all aspects must be covered comprehensively. Modern programming languages recognize this need and provide facilities for handling exceptions, but these facilities must be used by the programmer or they will not improve the programs. Another aspect an algorithm may ignore but a program must not is the initialization of function calls, especially recursive ones. Algorithms may be able to get away with assuming reasonable starting conditions, but programs must test for them.

Moreover, if the assumptions are not satisfied, some specific action must be taken so that the program does not crash!¹ Finally, problems related to incorrect input tend to be ignored in algorithms; programs, however, must ensure that the input is in the format required by the program.

1 This assessment may be tempered by considerations related to the question of whether the program is a batch program or interactive. We will take up this issue later.

C6730_C007.fm Page 167 Monday, July 3, 2006 1:43 PM

7.1.1 Exception Handling

The need for exception handling can arise in various ways. A typical exception is division by zero. The occurrence of a division by zero may be the consequence of several events. It may be the result of incorrect input, the result of a sloppy algorithm², or the result of a rounding error. We will argue below that testing whether the input satisfies the requisite conditions is necessary for code, even though it may not be considered in an algorithm. We will deal with problems caused by the finite representation of numbers in programs in the next chapter. Here, we want to emphasize that exception handling is crucial for the correct functioning of programs. This implies in particular that code must be provided that specifies the action to be taken when an exception is thrown. Before one can do this, the programmer must analyze where things can go wrong, that is, where exceptions may occur.

This brings us to a serious weakness of general exception handling: Not every exception of a specific type should be treated in the same way. For example, consider division by zero. If one were to deal with this generally, the exception-handling mechanism could consist only of a generic notification of this event. It usually does not allow us to deal with it in a specific way. For example, we may want to determine the average salary of a group of employees. The algorithm may not make any provisions for the case where that group, defined in some way, contains no employees, resulting in a careless division by zero if no test was carried out for this special case.

Generic exception handling would merely notify us of a division by zero; it does not allow us to ignore the empty group and the effects this may have on the overall computation. To be able to do this requires a careful analysis of all implicit assumptions together with a careful design of code for each of the possible violations of them.³
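The average-salary situation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names average_salary and report, and the choice to signal an empty group with None, are hypothetical and do not come from the text. The point is that the empty group is tested for explicitly at the place where its meaning is known, rather than left to a generic division-by-zero handler.

```python
def average_salary(salaries):
    """Return the mean salary, or None if the group has no employees.

    Returning None for an empty group is an assumption made for this
    sketch; a real program would choose whatever fits its computation.
    """
    if len(salaries) == 0:
        # The implicit assumption "the group is nonempty" is violated.
        return None
    return sum(salaries) / len(salaries)


def report(groups):
    """Build one report line per group, skipping the division for empty groups."""
    lines = []
    for name, salaries in groups.items():
        avg = average_salary(salaries)
        if avg is None:
            lines.append(f"{name}: no employees")
        else:
            lines.append(f"{name}: {avg:.2f}")
    return lines
```

A generic handler wrapped around the whole report could only say that some division by zero occurred; the explicit test lets the remaining groups be processed normally.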

2 While we do assume that the algorithm we start out with is correct, its designer may not have paid sufficient attention to all details to be directly translatable into code. For example, we may use BinarySearch to determine an index ind of an item in a sorted array. The value of ind may then be used to access information related to that item. This works only if the item is present in the array; otherwise ind has a value that signals that the item is not present, for example 0, and use of this value to access information related to this (nonexisting) item results in an error. In other words, the algorithm formulates a solution assuming the item is present and ignores the alternative. A program must specify explicitly what to do if the alternative is encountered.

3 Not all programming languages provide facilities for general exception handling. These comments suggest that this lack is not nearly as serious as one might assume. Many exceptions must be handled in specific ways, geared to the concrete instance where they arise, and this cannot be done through generic exception handling. By and large, generic exception handling allows continued execution, that is, the program does not crash, but it does not allow actions to be taken that make sense in a specific situation. This means generic exception handling is essentially syntactic, not semantic, error handling. Semantic error handling requires an understanding of the meaning of the program and can therefore be carried out only by the programmer.
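The BinarySearch situation described in footnote 2 might look as follows in Python. This is a hedged sketch: find_index and lookup_salary are hypothetical names, the search uses the standard library's bisect_left rather than a hand-written binary search, and None is assumed as the "not present" signal. (In Python a sentinel such as -1 would be especially dangerous, because -1 is a legal index that silently selects the last element.)

```python
from bisect import bisect_left


def find_index(sorted_items, key):
    """Return the index of key in sorted_items, or None if it is absent."""
    i = bisect_left(sorted_items, key)
    if i < len(sorted_items) and sorted_items[i] == key:
        return i
    return None


def lookup_salary(names, salaries, name):
    """Look up the salary stored at the same position as name.

    names must be sorted; salaries[i] belongs to names[i].
    """
    ind = find_index(names, name)
    if ind is None:
        # The alternative the algorithm ignores: the item is not present.
        return None
    return salaries[ind]
```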

7.1.2 Initializing Function Calls

The initialization of function calls, especially recursive ones, is another trouble spot that differentiates algorithms from programs. As with exceptions, the problem arises from values that were computed or determined elsewhere but may now cause our function to misbehave. For general functions, the problem is often that the values of the actual parameters do not make sense for reasons that are too obvious to belabor in the formulation of the algorithm. Unfortunately, once we transition to a program, these nonsensical situations must be dealt with explicitly since otherwise the program may either produce incorrect results or crash. In the special case of recursive functions, we also have the problem of ensuring that every possible call will eventually end up in a basis case, thereby terminating the recursive calls.

Values that do not make sense (and are therefore not considered in the formulation of the algorithm) might be 0 for the number n of elements in a group, for example in the computation of the average salary. If this value is not positive, problems may arise (for example, division by 0). For an algorithm, this may be obvious and tacitly understood; for a program, the test n ≥ 1 should be explicitly carried out, with additional code provided when the test is not satisfied. A similar situation occurs if we want to program matrix multiplication of two [1:n,1:n] matrices. It would not occur to anyone formulating the algorithm to test explicitly for n ≥ 1, yet not doing so might result in code that crashes if this condition happens not to be satisfied. A last example is related to the range checks mentioned in the previous chapter. If a function has a parameter that accesses an array element, it is highly desirable that this index be within the proper range. If the programming language does not test for this as a matter of course, it is probably advisable that the programmer carry out this test explicitly. Again, an algorithm may not bother with this, but a program should. It is important to realize that both the test and the additional code specifying what is to be done when the test fails are not optional, but are mandatory for good software.
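An explicit range check of the kind described here could be sketched as follows. The function name element_at and the decision to raise an exception are assumptions made for illustration. Python does check upper bounds as a matter of course, but it silently accepts negative indices and counts them from the end of the list, so an explicit test is still worthwhile when a negative index cannot be meaningful.

```python
def element_at(items, index):
    """Return items[index] after verifying that index is in 0..len(items)-1."""
    if not isinstance(index, int):
        raise TypeError(f"index must be an integer, got {type(index).__name__}")
    if not 0 <= index < len(items):
        # Code for the failed test: report exactly what went wrong.
        raise IndexError(f"index {index} out of range for length {len(items)}")
    return items[index]
```

Without the test, items[-1] would quietly return the last element instead of signaling that the caller computed a nonsensical index.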

It is not sufficient merely to test for these implicit assumptions. To ensure proper functioning of our programs we must also provide code that addresses the consequences of such a test not being satisfied. By and large, it is unsafe to assume that input to functions will always be as expected. Surprises may occur because of a variety of issues, from incorrect user input to rounding errors and other exceptions.

For recursive functions, we must additionally ensure that every recursive call terminates. This requires that for any legal input combination (and these parameters must have been tested), eventually a basis case is reached, terminating the recursion. Again, there is a subtle gap between algorithm and program. This should be particularly of concern if the variable that governs the recursion is not an integer. For example, consider the following recursive skeleton:

F(x)
    if x=0.0 do { basis case }
    else { statements; F(x-0.1); statements }

The idea is that we apply recursion, reducing the argument by 1/10, so if the actual parameter is a nonnegative integer, the algorithm will terminate. As we pointed out above, this is playing with fire since at a minimum, we must test whether this assumption (that the actual parameter is a nonnegative integer) is satisfied. However, even if this assumption holds, it is unlikely that a direct implementation of this algorithm will terminate. The problem lies in the representation of floating point numbers and the test for equality, which are taken up in the next chapter. Here, we merely want to state unequivocally that proper functioning of recursion requires that for all inputs, a basis case must be reached. This assumption is violated in this example program, as it would be in similar examples based on floating point numbers, even though the original algorithm does satisfy it.
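The failure can be observed directly. The sketch below is an illustration, not the text's own code: countdown_exact is a literal translation of the skeleton with an added depth limit (an assumption introduced here so that the demonstration itself terminates), and countdown_steps is one possible repair that governs the recursion by an integer count of tenths instead of a floating point value.

```python
def countdown_exact(x, max_depth=50):
    """Literal translation of F(x): the basis case is x == 0.0 exactly.

    Returns True if the basis case is reached, False if max_depth
    recursive calls were used up without reaching it.
    """
    if max_depth == 0:
        return False  # gave up: the basis case was never reached
    if x == 0.0:
        return True   # basis case
    return countdown_exact(x - 0.1, max_depth - 1)


def countdown_steps(steps):
    """Recursion governed by an integer number of remaining tenths."""
    if not isinstance(steps, int) or steps < 0:
        raise ValueError("steps must be a nonnegative integer")
    if steps == 0:
        return True   # basis case, reached after exactly `steps` calls
    return countdown_steps(steps - 1)
```

With IEEE double precision, subtracting 0.1 ten times from 1.0 does not yield exactly 0.0, so countdown_exact(1.0) steps past the basis case and continues into negative values until the depth limit stops it. countdown_steps(10) reaches its basis case because integer arithmetic is exact.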

Finally, users may simply provide incorrect input, possibly through carelessness or because of transmission errors or an incompatibility of the input device with the receiving unit. Programs will differ depending on whether they are batch programs or interactive ones. For batch programs, an incorrect input usually means the program must terminate execution. There is no way to correct the erroneous input supplied. An interactive program would ordinarily be expected to prompt the user for input; thus, if the input does not conform to the specifications, another prompt may be in order. Even in this case the program may have to stop executing if the user insists on supplying erroneous input (for example, if there is an incompatibility).
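The difference between the two kinds of programs can be sketched with a single reading routine. Everything here is a hypothetical illustration: the name read_positive_int, the read_line callback, and the limit of three attempts are assumptions. An interactive program would pass a callback that prompts the user; a batch program would set max_attempts to 1, so that the first bad value ends the run.

```python
def read_positive_int(read_line, max_attempts=3):
    """Read an integer n >= 1 by calling read_line() up to max_attempts times.

    read_line is any zero-argument function returning a string, for
    example a wrapper around input() in an interactive program.
    """
    for _ in range(max_attempts):
        text = read_line().strip()
        try:
            value = int(text)
        except ValueError:
            continue  # not an integer at all: ask again
        if value >= 1:
            return value
        # an integer, but outside the required range: ask again
    raise ValueError(f"no valid input after {max_attempts} attempt(s)")
```

In an interactive setting the call might be read_positive_int(lambda: input("Number of employees: ")). The attempt limit covers the case in which the user, or an incompatible device, keeps supplying erroneous input.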

It is not always possible to determine that an input is wrong; this would amount to being able to predict what the input should be. Instead, a program can only test whether a certain general format is complied with. Thus, if a pair of integers specifying a date within a year is required, certain combinations are obviously incorrect, for example “50 50”. It is less obvious whether “31 12” is incorrect. It would be if the month comes first (American-style dates), but if the day comes first (European-style dates), it is correct. Finally, there is no way to tell whether “9 10” is correct (maybe it should be “10 9”), but it conforms to the expectations of a date and should therefore be accepted.⁴ Another difficulty arises if an integer input is required but the user inputs a real value or a character string. In such a case, it may be quite cumbersome to produce code that rejects input whose type is not valid. This kind of input error will probably have to be assessed case by case. For example, when being prompted for a percentage value, a user may input “15” while another user may input “0.15”. What is clear is that programs must pay far more attention to these questions than any algorithm ever would (or should).

4 Input such as “2 29” is also problematic, since (assuming American-style format) it is only valid in leap years. Thus, without knowledge of the year, it is impossible to decide whether this is a valid input.
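A format check for such date input might be sketched as follows. The function name parse_day_month, its month_first and year parameters, and the policy of accepting “2 29” when no year is known are all assumptions made for this illustration; only calendar.monthrange is a standard library call. The check can reject input that cannot be a date under the chosen convention, but it cannot tell whether “9 10” was meant as “10 9”.

```python
import calendar


def parse_day_month(text, month_first=True, year=None):
    """Parse two integers as a date within a year and return (month, day).

    month_first=True reads American-style input (month, then day);
    month_first=False reads European-style input (day, then month).
    """
    parts = text.split()
    if len(parts) != 2:
        raise ValueError(f"expected two integers, got {text!r}")
    try:
        first, second = (int(p) for p in parts)
    except ValueError:
        # Rejects real values and character strings such as "3.5" or "May".
        raise ValueError(f"expected two integers, got {text!r}") from None
    month, day = (first, second) if month_first else (second, first)
    if not 1 <= month <= 12:
        raise ValueError(f"month {month} is not in 1..12")
    # Without a year we cannot decide about February 29, so we use a
    # leap year (2000) and accept it; this policy is an assumption.
    reference_year = 2000 if year is None else year
    days_in_month = calendar.monthrange(reference_year, month)[1]
    if not 1 <= day <= days_in_month:
        raise ValueError(f"day {day} is not in 1..{days_in_month}")
    return month, day
```

Under these assumptions “31 12” is rejected with month_first=True and accepted with month_first=False, while “50 50” is rejected under either convention.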

From the document A Programmer’s Companion to Algorithm Analysis (pages 177-181)