Exam Computer Architecture (INF559) Ecole Polytechnique
2008–2009
– The exam lasts 3 hours.
– The text contains 5 pages, including a summary on the LC-2 at the end.
– All documents are authorized.
– The points attributed to each section are only there to help you weigh the questions; they will not necessarily correspond to the final total.
– You must comment your programs: almost every line/instruction must have a comment that explains its role.
As transistors get smaller, they also become more prone to errors. As a result, within any gate, a bit may be flipped: it may switch from 0 to 1 or vice versa. Consequently, processor designers must now take this reliability issue into account. In this text, we investigate such errors and how they can be handled at the hardware and software levels.
This text contains 3 sections which can be treated independently, though we recommend treating them in order.
Exercise 1 - Error detection and correction (7 points)
In this part we build circuits capable of detecting and correcting errors (bit flips).
Question 1.1
We consider a word of 4 bits b3b2b1b0. We want to build a circuit whose output is 1 if the number of bits equal to 1 is odd, and 0 if the number of bits equal to 1 is even. Present your solution first as a 4-variable Karnaugh map, where the columns correspond to b3b2 and the rows correspond to b1b0.
Using one of the logical operators seen in the course, the logical expression of this circuit can be made very simple. Indicate which operator this is and give the corresponding logical expression.
Answer
The Karnaugh map is the following (rows b1b0 and columns b3b2, both in the order 00, 01, 11, 10):
b1b0 \ b3b2   00  01  11  10
00             0   1   0   1
01             1   0   1   0
11             0   1   0   1
10             1   0   1   0
The logical operator is XOR, and the corresponding logical expression is b3⊕b2⊕b1⊕b0.
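As a quick check of the map, the following Python sketch evaluates b3⊕b2⊕b1⊕b0 for every input and prints the result with rows and columns in Gray-code order (00, 01, 11, 10). The names `parity`, `GRAY` and `karnaugh_map` are introduced here for illustration only.

```python
from functools import reduce
from operator import xor

def parity(bits):
    """Return 1 if an odd number of the given bits are 1, else 0 (XOR of all bits)."""
    return reduce(xor, bits, 0)

# Gray-code ordering of two variables, as used for Karnaugh map rows and columns.
GRAY = [(0, 0), (0, 1), (1, 1), (1, 0)]

def karnaugh_map():
    """Rows are indexed by (b1, b0), columns by (b3, b2), both in Gray-code order."""
    return [[parity((b3, b2, b1, b0)) for (b3, b2) in GRAY] for (b1, b0) in GRAY]

if __name__ == "__main__":
    for row in karnaugh_map():
        print(*row)
```

The checkerboard pattern is what makes the map impossible to simplify with AND/OR groupings, and is the usual signature of an XOR function.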
Question 1.2
We call this bit p (for parity), and now consider that such a bit is added to each 4-bit word b3b2b1b0, resulting in 5-bit words pb3b2b1b0. Explain how this bit can be used for error detection purposes. How many simultaneously occurring errors can be detected? Is it possible to correct the error(s) as well?
Answer
In order to detect errors, the parity of the number of bits equal to 1 among b3b2b1b0 is recomputed and compared to p. If they differ, then an error has occurred.
Only an odd number of simultaneous bit flips can be detected; an even number of flips leaves the comparison unchanged. In practice, this mechanism is used to detect 1 error. It is not possible to correct the error: we only know that an error occurred, not which bit it affected.
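The claim about odd and even numbers of flips can be checked exhaustively on the 5-bit words. The sketch below is only an illustration; the function names and the bit layout (p stored in bit 4) are assumptions made for the example.

```python
from itertools import combinations

def parity(word, width):
    """Parity of the number of 1s among the `width` low-order bits of `word`."""
    return bin(word & ((1 << width) - 1)).count("1") & 1

def encode(data):
    """Prepend the parity bit p to the 4-bit word: the result is p b3 b2 b1 b0."""
    return (parity(data, 4) << 4) | data

def error_detected(codeword):
    """Compare the stored p (bit 4) with the parity recomputed from b3..b0."""
    return ((codeword >> 4) & 1) != parity(codeword, 4)

def detected_for_all(n_flips):
    """True if every pattern of n_flips flipped bits is detected for every data word."""
    for data in range(16):
        for positions in combinations(range(5), n_flips):
            corrupted = encode(data)
            for pos in positions:
                corrupted ^= 1 << pos
            if not error_detected(corrupted):
                return False
    return True

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, "flip(s) always detected:", detected_for_all(n))
```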
Question 1.3
We now consider a more complex error detection scheme where we add the following three parity bits:
p0 = parity(b1b2b3)
p1 = parity(b0b2b3)
p2 = parity(b0b1b3)
Therefore, we now have a 7-bit word p2p1p0b3b2b1b0 instead of the 4-bit word b3b2b1b0. How many errors is it possible to detect with this scheme? Provide a formal justification of your reply.
Answer
Two errors can be detected. Note that each pi covers every data bit except bi, and that b3 is covered by all three parity bits.
Consider first that one of the two flipped bits is b3. If the other flipped bit is bi (i ≠ 3), then pi no longer matches, since it covers b3 but not bi. If the other flipped bit is a parity bit pi, then the two other parity bits no longer match, and the error is detected when comparing the parity bits.
Consider now that the two flipped bits are bi1 and bi2, with i1 ≠ 3 and i2 ≠ 3. Then the parity bits pj with j = i1 or j = i2 both stop matching, since each covers only one of the two flipped bits.
Finally, if at least one of the two flipped bits is a parity bit pj, then the error is directly detected when comparing the parity bits.
Question 1.4
Provide a logical expression (as a function of p2,p1,p0,b3,b2,b1,b0) which is equal to 1 (true) if there is an error, and 0 otherwise. Do not try to minimize the cost of the corresponding circuit.
Answer
Let us call p() the parity function of Question 1.1. An error (one or two flipped bits) has occurred if:
((p(b0,b1,b3) ≠ p2) + (p(b0,b2,b3) ≠ p1) + (p(b1,b2,b3) ≠ p0))
The above expression becomes a combinational logic expression by substituting ≠ with XOR (⊕), since x ≠ y is true exactly when x ⊕ y = 1.
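The detection capability can also be verified by brute force. The following sketch evaluates the error expression on every 7-bit word obtained by flipping one, two or three bits of a valid word. The list layout [b0, b1, b2, b3, p0, p1, p2] and the function names are choices made for this example only.

```python
from itertools import combinations

def p(*bits):
    """Parity function of Question 1.1: XOR of all the given bits."""
    result = 0
    for bit in bits:
        result ^= bit
    return result

def encode(b3, b2, b1, b0):
    """Return the 7-bit word as the list [b0, b1, b2, b3, p0, p1, p2]."""
    return [b0, b1, b2, b3, p(b1, b2, b3), p(b0, b2, b3), p(b0, b1, b3)]

def error(word):
    """Error expression: OR of the three parity mismatches, each written as an XOR."""
    b0, b1, b2, b3, p0, p1, p2 = word
    return (p(b0, b1, b3) ^ p2) | (p(b0, b2, b3) ^ p1) | (p(b1, b2, b3) ^ p0)

def all_detected(n_flips):
    """True if every combination of n_flips flipped bits raises the error signal."""
    for value in range(16):
        b0, b1, b2, b3 = [(value >> k) & 1 for k in range(4)]
        clean = encode(b3, b2, b1, b0)
        for positions in combinations(range(7), n_flips):
            word = list(clean)
            for pos in positions:
                word[pos] ^= 1
            if not error(word):
                return False
    return True

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "flip(s) always detected:", all_detected(n))
```

Some triple errors go unnoticed because they turn one valid word into another valid word, for example flipping b0, p1 and p2 together.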
Question 1.5
Formally prove that this error detection scheme can also be used to correct one error. Explain how. Also give a counter-example showing that it cannot always correct two simultaneously occurring errors.
Answer
We assume only one error occurred. In that case, either 1, 2 or 3 parity bits do not match the recomputed parity.
If only one pi does not match, then that parity bit itself has been flipped by the error; no correction is necessary on the data bits bj.
If exactly two parity bits pi1 and pi2 do not match, then the data bit bi3 with i3 ≠ i1, i3 ≠ i2 and i3 ≠ 3 has been flipped.
If p0, p1 and p2 all fail to match, then b3 has been flipped.
As a result, all cases where a single bit has been flipped can be corrected.
However, two errors cannot always be corrected. For instance, if b0 and b1 have been flipped, then p1 and p0 do not match; p2 still matches because two of its data bits have been flipped. As a result, this is wrongly interpreted as the case where a single error occurred on bit b2.
Question 1.6
Based on your answer to the previous question, provide the logical expression of an error correction circuit.
The inputs of this circuit are the 7 bits, and the outputs are c3c2c1c0, the corrected versions of b3b2b1b0. Do not try to minimize the cost of this circuit.
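Answer
Let ei denote the mismatch signal of parity bit pi, i.e., e0 = p0 ⊕ p(b1,b2,b3), e1 = p1 ⊕ p(b0,b2,b3) and e2 = p2 ⊕ p(b0,b1,b3), and let NOT(x) denote the complement of x. Each data bit is inverted exactly when the mismatch pattern identified in Question 1.5 is observed:
c3 = b3 × NOT(e0 × e1 × e2) + NOT(b3) × (e0 × e1 × e2)
c2 = b2 × NOT(e0 × e1 × NOT(e2)) + NOT(b2) × (e0 × e1 × NOT(e2))
c1 = b1 × NOT(e0 × NOT(e1) × e2) + NOT(b1) × (e0 × NOT(e1) × e2)
c0 = b0 × NOT(NOT(e0) × e1 × e2) + NOT(b0) × (NOT(e0) × e1 × e2)
The complemented term in c2, c1 and c0 keeps these bits unchanged when all three parity bits mismatch, which is the b3 case.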
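A correction circuit of this kind can be simulated to confirm that every single error is repaired and that the double error b0, b1 of Question 1.5 is mis-corrected. The sketch below assumes mismatch signals e0, e1, e2 (names introduced here, one per parity bit) and assumes that a data bit is inverted only when exactly its own mismatch pattern is observed; it is one possible reading, not necessarily the intended circuit.

```python
def p(*bits):
    """Parity function of Question 1.1: XOR of all the given bits."""
    result = 0
    for bit in bits:
        result ^= bit
    return result

def encode(b3, b2, b1, b0):
    """Return the 7-bit word as the list [b0, b1, b2, b3, p0, p1, p2]."""
    return [b0, b1, b2, b3, p(b1, b2, b3), p(b0, b2, b3), p(b0, b1, b3)]

def correct(word):
    """Return the corrected data bits (c3, c2, c1, c0)."""
    b0, b1, b2, b3, p0, p1, p2 = word
    e0 = p(b1, b2, b3) ^ p0          # 1 if p0 does not match
    e1 = p(b0, b2, b3) ^ p1          # 1 if p1 does not match
    e2 = p(b0, b1, b3) ^ p2          # 1 if p2 does not match
    c3 = b3 ^ (e0 & e1 & e2)         # all three mismatch: b3 was flipped
    c2 = b2 ^ (e0 & e1 & (1 - e2))   # only p0 and p1 mismatch: b2 was flipped
    c1 = b1 ^ (e0 & e2 & (1 - e1))   # only p0 and p2 mismatch: b1 was flipped
    c0 = b0 ^ (e1 & e2 & (1 - e0))   # only p1 and p2 mismatch: b0 was flipped
    return (c3, c2, c1, c0)

def single_errors_corrected():
    """True if every single flipped bit, in every word, is corrected."""
    for value in range(16):
        b0, b1, b2, b3 = [(value >> k) & 1 for k in range(4)]
        clean = encode(b3, b2, b1, b0)
        for pos in range(7):
            word = list(clean)
            word[pos] ^= 1
            if correct(word) != (b3, b2, b1, b0):
                return False
    return True

def double_error_example():
    """Flip b0 and b1 of the all-zero word and return the (wrong) correction."""
    word = encode(0, 0, 0, 0)
    word[0] ^= 1  # flip b0
    word[1] ^= 1  # flip b1
    return correct(word)
```

In the double-error example the circuit also inverts b2, so the output contains three wrong bits instead of two.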
Exercise 2 - A fault-tolerant processor (7 points)
In this part, we try to adapt the LC-2 so that it can keep functioning even when errors occur. We use the single-bit (parity) error detection scheme of Question 1.1 [1]. We assume all registers in the circuit (IR, PC, MDR, MAR, NZP, BEN, R0, ..., R7) have been augmented with a parity bit, and that they are immediately followed by the error detection circuit of Question 1.1. Therefore, there are now error signals EIR, EPC, ... which indicate if an error was detected after each of these registers.
Question 2.1
The control circuit detects that an error occurs by doing an OR of all the error signals. When an error occurs, the instruction is “squashed”, i.e., it is stopped and restarted. During the execution of an instruction, when is it no longer possible to squash it ? You may want to detail your reply per instruction, or at least per instruction category.
We will assume that errors which occur after an instruction is no longer squashable are simply not recoverable.
Answer
The instruction can be squashed as long as the state of the processor has not changed. That means:
– PC has not been modified (all instructions, and branch instructions during their execution).
– The instruction has not modified a register Ri (for the instructions which write a result).
– The instruction has not written to memory (for store instructions).
[1] You can do this exercise without having done that circuit; you simply must read the text of that question.
Question 2.2
What would you modify in the following control sequence of instruction ADD so that it remains squashable for the longest possible time?
MAR←PC, PC←PC+1
MDR←MEM(MAR)
IR←MDR
NZP, DR←SR1+SR2
Answer
The PC must now be updated at the very end of the instruction control sequence, instead of at the beginning.
Question 2.3
The two inputs SR1 and SR2 of the above ADD instruction correspond to registers which contain 16 data bits and 1 parity bit. An error signal EADD indicates if an error occurred during an addition. What check should be done right after the ALU to detect if an error has occurred during the addition? Clearly justify your reply and deduce the logical expression of signal EADD.
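Answer
Each bit of the sum is si = ai ⊕ bi ⊕ ci, where ai and bi are the bits of SR1 and SR2 and ci is the carry into position i. Taking the XOR over all 16 positions, the parity of the result is p(SR1 + SR2) = p(SR1) ⊕ p(SR2) ⊕ p(C), where C is the word formed by the carries c0, ..., c15. The XOR of the two input parities alone is not enough: 1 + 1 = 2 gives p(SR1) ⊕ p(SR2) = 0 while p(2) = 1, because of the carry.
Therefore, if EADD = (p(SR1) ⊕ p(SR2) ⊕ p(C)) ≠ p(SR1 + SR2) = p(SR1) ⊕ p(SR2) ⊕ p(C) ⊕ p(SR1 + SR2) is equal to 1 (true), then an error occurred during the computation.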
Question 2.4
Same question for instruction NOT, with signal ENOT.
Answer
When the word size is an even number of bits (as in the LC-2, i.e., 16 bits), NOT turns k ones into 16 − k ones, so the parity of the result should be the same as the parity of the input.
ENOT = p(NOT(SR1)) ≠ p(SR1) = p(NOT(SR1)) ⊕ p(SR1).
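The property used for NOT can be checked on all 65536 possible 16-bit inputs. This is a minimal sketch; `e_not` takes the ALU output as a separate argument (an assumption of the example) so that a faulty result can be simulated.

```python
WIDTH = 16
MASK = (1 << WIDTH) - 1

def parity(x):
    """Parity of the number of 1s in the 16-bit word x."""
    return bin(x & MASK).count("1") & 1

def e_not(sr1, result):
    """E_NOT: 1 if the parity of the NOT output differs from the parity of the input."""
    return parity(result) ^ parity(sr1)

def check_all():
    """True if a correct NOT never raises E_NOT, for every 16-bit input."""
    return all(e_not(x, ~x & MASK) == 0 for x in range(1 << WIDTH))

if __name__ == "__main__":
    print("correct NOT never flagged:", check_all())
    faulty = (~0x00FF & MASK) ^ 0x0010   # NOT result with one bit flipped
    print("single flipped output bit flagged:", e_not(0x00FF, faulty) == 1)
```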
Question 2.5
Same question for instruction AND, with signal EAND.
Hint: we recommend first considering the number of 1s rather than the parity bit directly.
Answer
This question is significantly more difficult than for ADD and NOT. Recall that the parity bit is related to the number of 1s. So one solution is to find how the number of 1s evolves when doing the AND of two numbers.
Let us call one(X) the number of 1s in binary number X. We are going to show that one(AND(SR1,SR2)) + one(OR(SR1,SR2)) = one(SR1) + one(SR2), by examining the contribution of each bit position to both sides:
– Positions where exactly one of the SR1 and SR2 bits is 1 are counted once in one(OR(SR1,SR2)) and not at all in one(AND(SR1,SR2)), since the AND of these bits is 0; they are also counted once in one(SR1) + one(SR2).
– Positions where both bits are 0 are counted neither in one(AND(SR1,SR2)) nor in one(OR(SR1,SR2)), and contribute nothing to one(SR1) + one(SR2).
– Positions where both bits are 1 are counted in both one(AND(SR1,SR2)) and one(OR(SR1,SR2)), i.e., twice, and they are also counted twice in one(SR1) + one(SR2).
As a result, we indeed have one(AND(SR1,SR2)) + one(OR(SR1,SR2)) = one(SR1) + one(SR2).
Hence one(AND(SR1,SR2)) = one(SR1) + one(SR2) − one(OR(SR1,SR2)). Since the parity is the number of 1s modulo 2, additions and subtractions become XORs: p(AND(SR1,SR2)) = p(SR1) ⊕ p(SR2) ⊕ p(OR(SR1,SR2)). Therefore, EAND = p(AND(SR1,SR2)) ⊕ p(SR1) ⊕ p(SR2) ⊕ p(OR(SR1,SR2)).
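The counting identity, and a parity check derived from it by reducing modulo 2, can be tested on random 16-bit operands. In this sketch the signal `e_and` is written as the XOR of the four parities, which is an assumption about how the check would be wired; the sample count and seed are arbitrary.

```python
import random

def ones(x):
    """Number of 1s in the binary representation of x."""
    return bin(x).count("1")

def parity(x):
    """Parity of the number of 1s in x."""
    return ones(x) & 1

def e_and(sr1, sr2, result):
    """1 if `result` is inconsistent with p(AND) = p(SR1) xor p(SR2) xor p(OR)."""
    return parity(result) ^ parity(sr1) ^ parity(sr2) ^ parity(sr1 | sr2)

def identity_holds(samples=10000, seed=0):
    """Check one(AND) + one(OR) == one(SR1) + one(SR2) on random 16-bit pairs."""
    rng = random.Random(seed)
    for _ in range(samples):
        a, b = rng.getrandbits(16), rng.getrandbits(16)
        if ones(a & b) + ones(a | b) != ones(a) + ones(b):
            return False
        if e_and(a, b, a & b) != 0:   # a correct AND must never be flagged
            return False
    return True
```

Note that this check needs the OR of the two operands in addition to the AND result, so it is more costly in hardware than the checks for NOT.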
Question 2.6
In some cases, the errors are not occasional, they are permanent because a given wire, or a given transistor has a defect. As a result, one of the bits (i) in a register, or one of the bits in the ALU operators, is always erroneous.
Instead of considering the processor has become useless, it is possible to deactivate bit i throughout the processor, and instead of having a 16-bit processor, we would now have a degraded, but still useful, 15-bit processor.
For the remaining questions, we assume there is no parity bit, but we are going to modify the processor architecture so that it can cope with such permanent defects. Let us assume that, somewhere in the processor (either in a register or in an operator), a single bit i, 0 ≤ i ≤ 15, is no longer usable.
We first focus on the register bank. How should the register bank be modified to accommodate such a faulty bit?
Answer
There is no modification required in the register bank itself, since there is no issue with storing and propagating a faulty bit.
Question 2.7
Same question for the NOT operator of the ALU ?
Answer
No modification is required, except for the NZP registers. The logic circuit which computes the NZP conditions should not factor in the faulty bit. We introduce 16 individual deactivation signals DNZPi (1 if bit i is deactivated), with bits numbered here from 1 to 16.
For the N circuit, if the 16th bit is the faulty one, it should be replaced by the 15th bit. So Nnew = NOT(DNZP16) × N(ALU16) + DNZP16 × N(ALU15).
For the Z circuit, we need to AND each of the 16 input bits coming from the bus with the complement of the corresponding deactivation bit, so that a deactivated bit is always seen as 0.
So Znew = NOR over i of AND(NOT(DNZPi), ALUi).
The P circuit remains unchanged.
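The modified condition-code logic can be sketched as follows. This example numbers bits from 0 to 15 (so the sign bit is bit 15 rather than the "16th bit"), and it assumes that P is derived as NOT(N) AND NOT(Z); both are assumptions of the example, as is the function name `nzp`.

```python
def nzp(value, faulty=None):
    """Return (N, Z, P) for a 16-bit value, ignoring bit `faulty` (0..15) if given."""
    d = [1 if k == faulty else 0 for k in range(16)]   # deactivation signals
    bit = [(value >> k) & 1 for k in range(16)]
    # N: use bit 15, or bit 14 when bit 15 is deactivated.
    n = ((1 - d[15]) & bit[15]) | (d[15] & bit[14])
    # Z: NOR of all bits, each one masked by the complement of its deactivation signal.
    z = 1 - max((1 - d[k]) & bit[k] for k in range(16))
    # P: assumed to be computed from the new N and Z.
    p = (1 - n) & (1 - z)
    return n, z, p

if __name__ == "__main__":
    print(nzp(0x8000))             # healthy processor: negative
    print(nzp(0x8000, faulty=15))  # bit 15 ignored: the remaining bits are all 0
```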
Question 2.8
Same question for the AND operator of the ALU ?
Answer
No modification required beyond the NZP modification.
Question 2.9
Same question for the ADD operator of the ALU ?
Answer
The carry propagation must become defect tolerant. If bit i has a defect, then adder i should be skipped and the carry out of adder i−1 should be directly propagated to the carry in of adder i+1. For that purpose, instead of having cin(i) = cout(i−1), we now have cin(i) = NOT(DADD(i−1)) × cout(i−1) + DADD(i−1) × cout(i−2), where DADD(i) is the deactivation signal for adder i.
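The carry bypass can be simulated with a bit-level ripple-carry adder. The sketch below is illustrative only: it assumes that the carry-in of adder 0 is 0, that the output bit of the faulty adder is simply forced to 0, and it uses a hypothetical helper `drop_bit` to compare the result with an ordinary 15-bit addition.

```python
def add_skipping(a, b, faulty=None, width=16):
    """Ripple-carry addition where adder `faulty` is bypassed by the carry chain."""
    cout = {-1: 0}   # cout[-1] models the carry-in of adder 0 (assumed to be 0)
    result = 0
    for i in range(width):
        # cin(i) = NOT(D(i-1)) * cout(i-1) + D(i-1) * cout(i-2)
        cin = cout[i - 2] if (i - 1) == faulty else cout[i - 1]
        ai, bi = (a >> i) & 1, (b >> i) & 1
        s = ai ^ bi ^ cin
        cout[i] = (ai & bi) | (cin & (ai ^ bi))
        if i != faulty:          # the output of the faulty adder is ignored
            result |= s << i
    return result

def drop_bit(x, i):
    """Remove bit i from x and close the gap, giving a value one bit narrower."""
    low = x & ((1 << i) - 1)
    high = x >> (i + 1)
    return (high << i) | low

if __name__ == "__main__":
    a, b, f = 0x1234, 0x0FF1, 5
    total = add_skipping(a, b, faulty=f)
    # With bit f removed everywhere, the machine behaves as a 15-bit adder.
    print(drop_bit(total, f) == (drop_bit(a, f) + drop_bit(b, f)) & 0x7FFF)
```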
Exercise 3 - A software approach to fault-tolerance (7 points)
In this section, we make no modification of the architecture, and we attempt to implement fault tolerance at the software level only.
Question 3.1
Write an LC-2 assembly program which computes the number of 1s in a word, and uses it to find the parity bit of a word, see Question 1.1 [2]. The input word is stored in register R0 and the parity bit will be returned in R1.
Answer
Question 3.2
We now assume that the NOT operator of the ALU has been found to be faulty (it sometimes provides an erroneous result). Consequently, for each invocation of the NOT operator, we want in fact to compare the parity of the input against the parity of the output to make sure they match, and that no fault occurred.
Write a function which implements this secure NOT by using the program of the previous question. The program of the previous question will itself be modified so as to be used as a function.
IMPORTANT: Do not use the stack conventions explained in the course. Use the following simplified conventions:
– Register R6 always contains the address of the last used element of the stack.
– Before a function calls another function, it must first back up on the stack all registers it wants to preserve. It must always assume that any register it needs may be modified.
– A function passes parameters using the stack only.
– At the end of a function call, the called function stores the result to be returned in the last element of the stack.
Answer
See program at the end of the text.
[2] You can do this exercise without having done that circuit; you simply must read the text of that question.
        .ORIG x3000
; Secure NOT: the number is passed in the last stack entry; the result
; (0 if both parities match, 1 otherwise) is returned in that same entry.
        LDR R0, R6, #0    ; gets number from stack
        STR R7, R6, #0    ; save return address R7 in that stack entry
        ADD R6, R6, #1    ; allocate one stack entry
        STR R0, R6, #0    ; backup number on stack
        ADD R6, R6, #1    ; allocate one stack entry
        STR R0, R6, #0    ; push number on stack for call
        JSR PARITY
        LDR R1, R6, #0    ; gets parity from stack
        ADD R6, R6, #-1   ; deallocate one stack entry
        LDR R0, R6, #0    ; gets number from stack
        ADD R6, R6, #-1   ; deallocate one stack entry
        NOT R0, R0        ; ALU computation
        ADD R6, R6, #1    ; allocate one stack entry
        STR R1, R6, #0    ; backup parity on stack
        ADD R6, R6, #1    ; allocate one stack entry
        STR R0, R6, #0    ; push NOT(number) on stack
        JSR PARITY
        LDR R2, R6, #0    ; gets parity of NOT(number) from stack
        ADD R6, R6, #-1   ; deallocate one stack entry
        LDR R1, R6, #0    ; gets parity of number from stack
        ADD R6, R6, #-1   ; deallocate one stack entry
        AND R3, R3, #0    ; initialize result (0 = parities match)
        ADD R1, R1, #0    ; set condition codes on R1
        BRz ZERO          ; R1 == 0
        ADD R2, R2, #0    ; R1 == 1: set condition codes on R2
        BRz DIFFER        ; R1 != R2
        JMP END           ; R1 == R2
ZERO    ADD R2, R2, #0    ; R1 == 0: set condition codes on R2
        BRz END           ; R1 == R2
DIFFER  ADD R3, R3, #1    ; result = 1: parities differ, fault detected
END     LDR R7, R6, #0    ; restore return address (R7 was overwritten by JSR PARITY)
        STR R3, R6, #0    ; store result where R7 was stored
        RET
; PARITY: takes a number in the last stack entry and returns its parity bit there.
PARITY  LDR R0, R6, #0    ; gets number from stack
        ADD R6, R6, #-1   ; deallocate number from stack
        AND R1, R1, #0    ; stores nb of 1s
        AND R2, R2, #0    ; clear counter
        ADD R2, R2, #15   ; counter = 15 (16 iterations, down to 0)
LOOP    ADD R0, R0, #0    ; most significant bit = 0 ?
        BRzp POS
        ADD R1, R1, #1    ; most significant bit = 1
POS     ADD R0, R0, R0    ; shift left
        ADD R2, R2, #-1   ; decrement counter
        BRzp LOOP         ; loop if not end
        AND R1, R1, #1    ; get parity of result
        ADD R6, R6, #1    ; allocate one entry for result
        STR R1, R6, #0    ; store result on stack
        RET
STOP    HALT
        .END
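To cross-check the bit-counting loop of the PARITY routine, it can be mirrored in a few lines of Python. This is only a reference model written for this purpose, not part of the LC-2 program; the 16-bit masking stands in for the register width, and the function names are introduced here.

```python
def parity_lc2_style(word):
    """Mirror of the PARITY loop: test the sign bit, then shift left, 16 times."""
    r0 = word & 0xFFFF
    r1 = 0                          # number of 1s seen so far
    for _ in range(16):             # counter goes from 15 down to 0
        if r0 & 0x8000:             # "negative" in 16-bit two's complement
            r1 += 1
        r0 = (r0 + r0) & 0xFFFF     # ADD R0, R0, R0 acts as a left shift
    return r1 & 1                   # keep only the parity of the count

def secure_not_flag(word):
    """Mirror of the secure NOT: 0 if the parities of input and output match."""
    result = ~word & 0xFFFF
    return 0 if parity_lc2_style(word) == parity_lc2_style(result) else 1

if __name__ == "__main__":
    print(parity_lc2_style(0x8001), parity_lc2_style(0x0007))
```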
LC-2
We provide a summary of the LC-2 instruction set in Figure 2 and of the LC-2 architecture in Figure 1.
[Figure: block diagram of the LC-2 datapath. Recoverable labels: PC, IR, MAR, MDR, register file R0–R7, ALU, SEXT and ZEXT units, N/Z/P and BEN registers, KBDR/KBSR and CRTDR/CRTSR I/O registers, multiplexers PCMX, MARMX, SR1MX, SR2MX, DRMX and INMUX, microprogrammed control, 16-bit bus, 16-bit addressable memory.]
FIG. 1 – LC-2 architecture. Note: PC, IR, MAR, MDR, BEN, N, Z and P are registers.
– LD.MAR/1, LD.MDR/1, LD.IR/1, LD.REG/1, LD.CC/1 and LD.PC/1 are write controls for LC-2 registers.
– LD.BEN/1 is a write control for the BEN register (branch enable); 1 if branch is taken.
– GatePC/1, GateMDR/1, GateALU/1 and GateMARMX/1 control bus access.
– MIO.EN/1 is set to 1 for memory or I/O access.
– R.W/1: 0 for read, 1 for write.
– ALUK/2: 00 for ADD, 01 for AND, 10 for NOT, 11 for passing through input 1.
– PCMX/2: from right to left: 00, 01, 10 and 11.
– MARMX/2: from left to right: 00, 01 and 10 (11 unused).
– SR1MX/2: from top to bottom: 00 and 01 (10 and 11 unused).
– DRMX/2: destination register (signal DR): IR[11:9] for 00, IR[8:6] for 01 (10 and 11 unused).
– SR2MX/1: directly set by bit 5 of the instruction, does not need to be controlled; used to differentiate between immediate and register mode for ALU operations.
Instruction               Operation                                  Encoding (bits 15 to 0)
ADD DR, SR1, SR2          DR ← SR1 + SR2                             0001 | DR | SR1 | 0 | 00 | SR2
ADD DR, SR1, imm5         DR ← SR1 + SEXT(imm5)                      0001 | DR | SR1 | 1 | imm5
AND DR, SR1, SR2          DR ← AND(SR1, SR2)                         0101 | DR | SR1 | 0 | 00 | SR2
AND DR, SR1, imm5         DR ← AND(SR1, SEXT(imm5))                  0101 | DR | SR1 | 1 | imm5
NOT DR, SR                DR ← NOT(SR)                               1001 | DR | SR | 111111
BRnzp label               PC = PC[15:9]@offset9 if n.N+z.Z+p.P       0000 | n | z | p | offset9
JMP label                 PC = PC[15:9]@offset9                      0100 | 0 | 00 | offset9
JSR label                 R7 ← PC and PC = PC[15:9]@offset9          0100 | 1 | 00 | offset9
JMPR indexed address      PC = BaseR + ZEXT(offset6)                 1100 | 0 | 00 | BaseR | index6
JSR indexed address       R7 ← PC and PC = BaseR + ZEXT(offset6)     1100 | 1 | 00 | BaseR | index6
LEA DR, label             DR ← PC[15:9]@offset9                      1110 | DR | offset9
LD DR, label              DR ← MEM(PC[15:9]@offset9)                 0010 | DR | offset9
LDR DR, indexed address   DR ← MEM(BaseR + ZEXT(offset6))            0110 | DR | BaseR | index6
ST SR, label              SR → MEM(PC[15:9]@offset9)                 0011 | SR | offset9
STR SR, indexed address   SR → MEM(BaseR + ZEXT(offset6))            0111 | SR | BaseR | index6
RET                       PC ← R7                                    1101 | 000000000000
Fields: the opcode occupies bits [15:12], DR or SR bits [11:9], SR1 or BaseR bits [8:6]. imm5 is a 5-bit signed immediate, offset9 a 9-bit unsigned offset in the current page, and index6 a 6-bit unsigned index.
FIG. 2 – LC-2 instructions