Floating-Point Unit (FPU)

From the document SPARC RISC USER'S GUIDE (pages 110-113)

RT620 hyperSPARC Central Processing Unit

3.5 Floating-Point Unit (FPU)

The hyperSPARC RT620 features a high-performance pipelined floating-point unit capable of launching one fp operation per cycle. The floating-point unit (illustrated in Figure 3-5) comprises a floating-point queue (FPQ), a floating-point register file, a floating-point status register (FSR), and two execution units.

The two execution units are a 64-bit floating-point arithmetic unit (FAU) and a 64 x 32-bit floating-point multiplier unit, which enable single-precision addition, subtraction, and multiplication operations to execute in one cycle. To further enhance performance, the floating-point unit forwards condition codes to the integer unit to allow one-cycle fp compares.

The general operation of the floating-point unit is illustrated by Figure 3-6, which represents the high-level state transitions for the floating-point unit. There are four states through which the floating-point unit transitions:

1. EXECUTE. This is the normal mode of operation of the floating-point unit.

2. EXCEPTION PENDING. The floating-point unit enters this state when an exception takes place in the floating-point unit on which a trap should be taken. It remains in this state until the integer unit acknowledges the exception.

3. EXCEPTION. The floating-point unit enters this state when a pending exception is acknowledged by the integer unit. In this state, only fp store instructions can be executed. The floating-point unit remains in this state as long as the queue not empty (qne) bit in the FSR is set, or if an instruction other than an fp store instruction is executed.

4. FLOATING-POINT UNIT FREEZE. The floating-point unit enters this state when the integer unit signals a hold. The hold could be due to an external hold (e.g., cache miss) or an internal hold. All activities in the floating-point unit are frozen in this state.
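The four states above can be sketched as a small state machine. This is only an illustrative model, not the RT620 control logic: the signal names (`hold`, `fp_exception`, `iu_ack`, `queue_empty`) are hypothetical, and the sketch assumes that releasing a hold returns the unit to the state it was in when the hold began.

```python
from enum import Enum, auto


class FpuState(Enum):
    EXECUTE = auto()
    EXCEPTION_PENDING = auto()
    EXCEPTION = auto()
    FREEZE = auto()


class FpuStateMachine:
    """Toy model of the four FPU states; not cycle-accurate."""

    def __init__(self):
        self.state = FpuState.EXECUTE
        # Assumption: the state to resume once a hold is released.
        self._resume = FpuState.EXECUTE

    def step(self, *, hold=False, fp_exception=False, iu_ack=False, queue_empty=False):
        """Advance one cycle using hypothetical input signals; return the new state."""
        if self.state is FpuState.FREEZE:
            # All activity is frozen until the integer unit drops the hold.
            if not hold:
                self.state = self._resume
            return self.state
        if hold:
            self._resume = self.state
            self.state = FpuState.FREEZE
        elif self.state is FpuState.EXECUTE and fp_exception:
            self.state = FpuState.EXCEPTION_PENDING
        elif self.state is FpuState.EXCEPTION_PENDING and iu_ack:
            self.state = FpuState.EXCEPTION
        elif self.state is FpuState.EXCEPTION and queue_empty:
            # qne bit clear: the queue has been drained by fp store instructions.
            self.state = FpuState.EXECUTE
        return self.state
```

For example, calling `step(fp_exception=True)` and then `step(iu_ack=True)` on a fresh `FpuStateMachine` moves it from EXECUTE through EXCEPTION PENDING to EXCEPTION, where it stays until `step(queue_empty=True)` is called.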


Figure 3-5. Floating-Point Unit Block Diagram

(Diagram not reproduced. Blocks shown: Floating-Point Queue and Controller; Load/Store Bus (64 bits); Floating-Point Status Register; floating-point register file, 32 x 32 bits, with 3 read ports (OPA, OPB, STORE) and 2 write ports (RESULT, LOAD); load register; store register; OPA mux; OPB mux; Floating-Point Multiply; result mux.)

Figure 3-6. Floating-Point Unit State Transition Diagram

(Diagram not reproduced. Transition labels shown: "no fp exception detected"; "queue empty or reset"; "FP exception = IEEE exception or unfinished exception or FP unimplemented exception"; "hold"; "IU detects FP instruction in execute cycle"; "no FP instruction in Execute cycle".)


3.5.1 Floating-Point Instruction Decode-SCHEDule-and-Dispatch Controller (FPSCHED)

The IFETCH and ISCHED blocks provide instruction fetch and global decoding for the RT620. All instructions recognized as floating-point instructions are forwarded to the floating-point instruction scheduler for local fp instruction decoding and fp instruction launch. The task of the FPSCHED is largely performed by the floating-point queue (FPQ) and the floating-point queue control (FPQC) blocks. The FPQ stores both instructions awaiting execution and those in the process of execution. The FPQC provides control for the FPQ, as well as local fp instruction decoding and execution scheduling. The following sections describe the FPQ and FPQC.

3.5.1.1 Floating-Point Queue (FPQ)

The floating-point queue (FPQ) is divided into two parts, a pre-queue and a post-queue, as illustrated in Figure 3-7. The post-queue consists of three queue entries corresponding to the three stages of the fp execution pipeline (Execute1, Execute2, and Round). The post-queue tracks instructions which have begun execution until an exception is detected or result generation is completed.

To support exception handling, the post-queue retains both the instruction address and a copy of the instruction as it passes through successive stages of the fp execution pipeline. Since the floating-point unit and integer unit pipelines operate somewhat independently, an exception detected by the floating-point unit is delayed with regard to the integer unit pipeline. The address of the exception-causing fp instruction is used by trap handlers to determine the point in the instruction stream where the exception occurred.

The pre-queue is a performance enhancement which largely eliminates stalls of the RT620 due to the execution of multiple-cycle floating-point unit instructions. The pre-queue contains four entries, and behaves like an auxiliary set of instruction fetch buffers. When a series of fp instructions is fetched, the fp instructions are deposited in the pre-queue until execution, thereby allowing the integer unit to continue fetching and processing additional instructions.
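The queue organization described in this section can be modeled roughly as follows. The four-entry pre-queue and the three post-queue stages come from the text; the class and method names are hypothetical. The sketch also assumes that every instruction spends exactly one cycle in each stage, so it ignores multiple-cycle operations, dependencies, and holds.

```python
from collections import deque
from dataclasses import dataclass

PRE_QUEUE_DEPTH = 4                               # from the text
POST_STAGES = ("Execute1", "Execute2", "Round")   # from the text


@dataclass(frozen=True)
class QueueEntry:
    address: int       # instruction address, retained for trap handlers
    instruction: int   # copy of the 32-bit instruction word


class FloatingPointQueue:
    """Hypothetical model of the pre-queue/post-queue split."""

    def __init__(self):
        self.pre = deque()                      # instructions awaiting execution
        self.post = [None] * len(POST_STAGES)   # instructions in execution

    def enqueue(self, entry):
        """Deposit a fetched fp instruction; False means the pre-queue is full."""
        if len(self.pre) >= PRE_QUEUE_DEPTH:
            return False
        self.pre.append(entry)
        return True

    def advance(self):
        """Shift the post-queue by one stage and launch the oldest waiting entry.

        Returns the entry leaving the Round stage, or None.
        Assumption: one cycle per stage and no stalls.
        """
        retired = self.post[-1]
        launched = self.pre.popleft() if self.pre else None
        self.post = [launched] + self.post[:-1]
        return retired

    def address_in_stage(self, stage):
        """Address a trap handler would see for the instruction in `stage`."""
        entry = self.post[POST_STAGES.index(stage)]
        return None if entry is None else entry.address
```

In this model the integer unit only has to stall when `enqueue` returns `False`, which mirrors the role of the pre-queue as an auxiliary set of fetch buffers.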


Figure 3-7. Floating-Point Queue (FPQ)

(Diagram not reproduced; one label, "qne bit of FSR", survives.)

3.5.1.2 Floating-Point Queue Control (FPQC)

The FPQC provides the following functions:

• It provides FPQ management, including queue advance, queue load, and queue store.

• It decodes and launches fp instructions.

• It selects the appropriate fp operands for fp instructions. This includes forwarding any fp operands.

• It performs dependency checking against other fp instructions before launching an fp instruction.

• It interacts with the integer unit to perform dependency checking between fp instructions and fp Load and Store instructions.

• It directs loads and stores to and from the fp register file and the floating-point status register (FSR), and stores from the FPQ.

• It maintains the state of the FSR. The FPQC also forwards the floating-point condition codes (fcc) to the integer unit.

• It performs exception handling based on the status of the fp operations reported by the fp computational units and the FPQ.
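The dependency check listed among the FPQC functions can be illustrated with a deliberately conservative sketch. The text does not say which hazards the hardware checks or how operand forwarding relaxes them, so the rules below (block on read-after-write and write-after-write, with no forwarding) are assumptions, as are the `FpOp` type and the `can_launch` function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FpOp:
    name: str
    sources: frozenset      # fp register numbers read by the instruction
    dest: Optional[int]     # fp register number written, or None


def can_launch(candidate, in_flight):
    """Return True if `candidate` conflicts with no older in-flight instruction.

    Assumption: any pending write to a source or destination register blocks
    the launch. Real hardware with operand forwarding may be less strict.
    """
    for older in in_flight:
        if older.dest is None:
            continue
        if older.dest in candidate.sources:   # read-after-write hazard
            return False
        if older.dest == candidate.dest:      # write-after-write hazard
            return False
    return True
```

With `fadds` writing register 2 still in flight, an `fmuls` that reads register 2 would be held back in this model, while an `fsubs` that touches only registers 5, 6, and 7 could launch.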

3.5.1.2.1 FPQ Management

The floating-point unit operates in a pipelined manner. One fp instruction is launched per cycle, assuming no constraints exist. There are three stages of execution for each fp instruction: Execute1, Execute2, and
