

From the document Floating-Point Components (pages 84-89)

A multi-purpose pin accommodating traps, output disable and reset

1.0 ARCHITECTURE

1.1 Look-Ahead Pipeline

Logically, the Look-Ahead pipeline is split into two halves: the first, located at the instruction and data ports; and the second, located at the address port. Each half of the pipeline (input vs. output) has a transparent latch which operates out of phase with the other; the address latch is transparent during the first half of the cycle (clock HI), while the input latches (instruction and data) are transparent during the second half of the cycle (clock LO). This complementary arrangement allows new instructions to be decoded (in preparation for the following cycle) while the program address for the current cycle is held steady.

1.2 Instruction Port

The instruction port receives 7-bit instructions from microcode, defining the next operation to perform. The ADSP-1401 has a built-in Look-Ahead pipeline latch, eliminating the need for an external microcode latch to hold instructions. This implementation has the further benefit of allowing instruction "look-ahead"; the sequencer is able to decode the next instruction during execution of the current cycle. During the "look-ahead" period, the sequencer precalculates the next address, allowing its output as early as possible in the next cycle.

External instructions are internally latched during clock HI, and passed directly to the instruction decoder during clock LO (transparent phase), thus implementing the first half of the Look-Ahead pipeline latch.

The use of the instruction hold mode (see: Instruction Set Description, 2.7; and Instruction Hold Control, appendix 4.1) allows an instruction to be held in the instruction latch for execution over several cycles (freeing microcode for use by other devices).

1.3 Address Port and Multiplexer Sources

The address port provides 16-bit program addresses with three-state drivers designed for driving large microcode memories.

Addresses come from a four-to-one microprogram address multiplexer. Between the multiplexer and output port is a latch which is transparent during clock HI and latched during clock LO, permitting addresses to be output as early as possible during phase one (clock HI) while holding the address constant during phase two (clock LO) - implementing the second half of the Look-Ahead pipeline latch.

Inputs to the microprogram address multiplexer are the:

• 16-Bit Program Counter

• 16-Bit Adder

• Interrupt Vector File and

• Internal 64-Word RAM.

Addressing Modes

The ADSP-1401 supports two addressing modes: direct and indirect. The direct addressing mode uses the internal adder to generate either absolute addresses from the data port (without modification) or relative addresses from the program counter (with or without extension: see Status Register, 1.4.4). The indirect addressing mode uses the lower order bits at the data port to access the contents of internal RAM for output.

ADSP-1401

Output Drivers

The address port output drivers are always active unless placed in the high-impedance state by the IDLE instruction or by appropriately asserting the TTR pin (see TTR Pin, 1.7). This allows other devices to supply microcode addresses, which is particularly useful in multi-tasking or context switching applications where several ADSP-1401s may be sharing common microcode memory.

1.3.1 Program Counter

The program counter (PC) consists of a 16-bit incrementing counter. For most instructions, the PC is incremented by the end of the cycle (post-increment) as follows:

PC ← output address + 1.

1.3.2 Adder and Width Control

For absolute jumps, data from the data port is passed unchanged through the adder directly to the microprogram address port.

For relative jumps, a twos complement offset is supplied from the data port and added to the 16-bit PC. Since the PC normally points to the next instruction, the jump distance is (offset + 1) from the jump instruction. See Status Register (1.4.4) for more details.

The width control block permits microcode width to be reduced in systems not requiring full 16-bit jump distances. Internal width control logic sign-extends reduced offsets of 8 and 12 bits to full 16-bit precision, accommodating jumps in either direction (positive or negative displacement).
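The offset arithmetic described above can be sketched in a few lines of Python. This is an illustrative model only, not the device's logic: the function names `sign_extend` and `relative_jump` are hypothetical, and the 16-bit wraparound of the result is an assumption.

```python
def sign_extend(offset: int, width: int) -> int:
    """Sign-extend a twos complement offset of `width` bits (8, 12 or 16)."""
    if width not in (8, 12, 16):
        raise ValueError("width must be 8, 12 or 16 bits")
    offset &= (1 << width) - 1             # keep only the low `width` bits
    sign_bit = 1 << (width - 1)
    return (offset ^ sign_bit) - sign_bit  # negative when the sign bit is set


def relative_jump(pc: int, offset: int, width: int = 16) -> int:
    """Return PC + sign-extended offset, wrapped to 16 bits (assumed)."""
    return (pc + sign_extend(offset, width)) & 0xFFFF
```

For a jump instruction at address 0x0100, the PC already holds 0x0101, so an 8-bit offset of 0xFE (that is, -2) lands one location before the jump instruction, matching the (offset + 1) rule.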

1.3.3 Interrupt Vector File

Ten prioritized interrupt vectors may be stored in the interrupt vector file. The associated interrupts are internally latched and may be individually masked or entirely disabled by the "Disable Interrupts" (DISIR) instruction. The highest priority interrupt vector displaces the usual address on the next cycle following its detection. See Interrupts (1.4.3) for more details.

1.3.4 Internal RAM

Any of the 64 words of RAM may be output on the address port. Four distinct address sources may access the RAM:

• Local Stack Pointer

• Global Stack Pointer

• Subroutine Stack Pointer and

• Lower Order Data Port Bits.

The use of internal RAM and its various address sources is described in section 1.4.2.

1.4 Bidirectional Data Port

The 16-bit bidirectional data port (D15-0) supplies direct or indirect jump addresses and permits loading or dumping of all internal registers. The input data latch freezes incoming data (for counter or register writes executed during that cycle) during the first half-cycle (clock HI) and is transparent for the remainder of the cycle. The output data driver asserts output data only during the first half-cycle of a data output instruction and is independent of the address port drivers. This complementary I/O arrangement permits data to be output from the sequencer (as in a read register instruction) during the first half-cycle while accommodating external data setups (for the next cycle) during the second half-cycle.

MICROCODED SUPPORT COMPONENTS 3-7

Direct addressing via the data port may be either relative or absolute. For indirect addressing, the six LS data bits (D5-0) are used to address internal RAM, which contains the desired jump address (see Internal RAM, 1.4.2).
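As a rough illustration of indirect addressing, the lookup can be modeled as an index into a 64-entry list. The function name `indirect_jump_address` and the list-based RAM are assumptions made for this sketch.

```python
RAM_WORDS = 64  # internal RAM size stated in the text


def indirect_jump_address(ram: list, data_port: int) -> int:
    """Use the six LS data bits (D5-0) to select the jump address in RAM."""
    if len(ram) != RAM_WORDS:
        raise ValueError("internal RAM model must hold 64 words")
    index = data_port & 0x3F      # upper data bits are ignored in this model
    return ram[index] & 0xFFFF    # 16-bit program address
```

Only the low six bits take part in the selection here, so a data word of 0xFFC5 and a data word of 0x0005 both select RAM location 5.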

1.4.1 Counters

Four independent 16-bit counters are provided for maintaining loops and event tracking. These counters hold twos complement values that may be decremented or preloaded through dedicated instructions. The sign bit associated with the most recently used counter, prior to its decrement, is always saved in the status register (SR1). Simultaneously, the sign bit is also made available to control various conditional instructions or for asserting the lowest priority interrupt, IR0, reserved for counter underflow (see: Instruction Set Description, 2.0; and Interrupts, 1.4.3).

Note that interrupt IR0 is primarily used for ending writeable control store downloads (see Instruction Set Description - WCS, 2.7). Use of IR0 in the context of a "Decrement Counter and Interrupt on Underflow" operation represents the worst case instruction and flag setup times because of the additional overhead in processing the interrupt after determining whether the counter has underflowed. These setup times are specified two ways:

1. all conditions and

2. IR0 masked.

The source of SIGN (applied to the condition test) depends upon the type of instruction used (see Instruction Set Description, 2.1). Two possibilities exist:

1. If an explicit counter is selected, then the sign applied is that of the counter, prior to the decrement.

2. If no counter is selected, then the sign applied is implicitly that of the status register, SR1.
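The two sources of SIGN can be sketched as follows. This is a simplified model under stated assumptions: the `Counter` class, the `condition_sign` helper and the dictionary standing in for the status register are all hypothetical names, and the sketch treats every explicit counter selection as a decrement.

```python
from typing import Optional


class Counter:
    """One 16-bit twos complement counter."""

    def __init__(self, value: int = 0) -> None:
        self.value = value & 0xFFFF

    def decrement(self) -> int:
        """Decrement and return the sign bit as it was before the decrement."""
        sign_before = (self.value >> 15) & 1
        self.value = (self.value - 1) & 0xFFFF
        return sign_before


def condition_sign(status: dict, counter: Optional[Counter] = None) -> int:
    """Return the SIGN applied to the condition test."""
    if counter is None:
        return status["SR1"]           # case 2: implicit sign from SR1
    status["SR1"] = counter.decrement()  # case 1: sign prior to decrement, saved in SR1
    return status["SR1"]
```

In this model a counter preloaded with 1 reports a sign of 0 on its first two decrements and a sign of 1 on the third, because the sign is sampled before each decrement.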

1.4.2 Internal RAM

The ADSP-1401's internal 64-word RAM implements two distinct stacks: a Subroutine Stack (SS) and a Register Stack (RS). The subroutine stack has a dedicated Subroutine Stack Pointer (SSP), while the register stack shares two pointers: the Local Stack Pointer (LSP) and the Global Stack Pointer (GSP). The three stack pointers are each held in 6-bit, preloadable, up/down counters.

Upon reset (TTR pin held HI for three cycles; see TTR Pin, 1.7), the SSP is initialized to 0 (top of RAM). The RS pointers (LSP and GSP) are typically configured as shown in Figure 2 using the "Write RSP" instruction (WRRSP). The SSP pushes down while the RS pointers push up. Selection of the active RS pointer (LSP or GSP) is made in the status register.

Stack overflow detection is provided via a stack limit register to protect software integrity and allow stack expansion (see Instruction Set Description - SLRIVP, 2.5).

Each RS pointer may be explicitly initialized by performing the "Write RS Pointer" (WRRSP) instruction. The LSP should be located above the GSP, allowing the local stack to grow upwards as the level of nested subroutines increases. Finally, indirect jump address space (as needed) should be reserved below the global stack.

The sequencer will generate a stack underflow interrupt whenever RAM location zero is popped. This facility may be used in support of stack paging. IV9 should be masked if not using stack paging, allowing location zero to be used as the first stack location without interrupting. When using paged stacking, location zero must be reserved as an underflow buffer to avoid a subsequent stack POP (which may otherwise occur, depending upon the next instruction) prior to the interrupt routine saving the stack.

Figure 2. Typical RAM Initialization

Register Stack Pointers (LSP and GSP)

Upon entering a routine, up to four jump addresses may be pushed onto the register stack. A Push onto the register stack first decrements the RS pointer (either LSP or GSP, depending upon the status register) and then writes the appropriate data to RAM. A Pop from the register stack first reads the RAM location and then increments the RS pointer (LSP or GSP).

Within the context of any routine, four registers are available. They are addressed relative to the stack pointer (LSP or GSP) by the two LSBs of the relevant instruction. For example, the instruction:

IF CONDITION, JMP R2

accesses the location (LSP + 2 or GSP + 2) in RAM as the conditional address source. Prior to exiting a routine, local or global registers can be effectively removed from the RS by the "ADD i TO RSP" (AIRSP) instruction (see Instruction Set Description, 2.2).
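The register stack behavior described in this subsection can be modeled with a small class. This is a sketch, not the device's implementation: the class and method names are hypothetical, a single pointer stands in for whichever of LSP or GSP is active, and wrapping the pointer modulo 64 is an assumption based on the 6-bit pointer width.

```python
class RegisterStack:
    """Model of the register stack addressed by the active RS pointer."""

    def __init__(self, ram: list, pointer: int) -> None:
        self.ram = ram                 # shared 64-word RAM model
        self.ptr = pointer & 0x3F      # LSP or GSP value (6 bits)

    def push(self, value: int) -> None:
        self.ptr = (self.ptr - 1) & 0x3F    # first decrement the pointer
        self.ram[self.ptr] = value & 0xFFFF  # then write the data

    def pop(self) -> int:
        value = self.ram[self.ptr]          # first read the location
        self.ptr = (self.ptr + 1) & 0x3F    # then increment the pointer
        return value

    def register(self, n: int) -> int:
        """Read register Rn (n = 0..3) at pointer + n, e.g. R2 for JMP R2."""
        if not 0 <= n <= 3:
            raise ValueError("only R0..R3 are addressable")
        return self.ram[(self.ptr + n) & 0x3F]

    def add_to_pointer(self, i: int) -> None:
        """Model of AIRSP: drop i registers without reading them."""
        self.ptr = (self.ptr + i) & 0x3F
```

After three pushes, the value pushed first sits at pointer + 2 and is therefore the one an `IF CONDITION, JMP R2` instruction would use; adding 3 to the pointer restores it to its value on entry.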

Often, the same set of jump addresses is used by several different routines. The GSP is available for addressing these common registers - conserving RAM space and eliminating repeated stack pushes and pops. Global registers can be pushed, popped, and used by conditional instructions in the same way that local registers are handled. In addition, the GSP can itself be pushed and popped to/from the subroutine stack, allowing different routines to access different subsets of the global stack area.

Subroutine Stack Pointer (SSP)

A Push onto the SS (jump subroutine or interrupt) first increments the SSP and then writes the return address to RAM. A Pop from the SS first reads the return location and then decrements the SSP, effectively removing the data from the stack (although the data remains in RAM). For interrupts, the return address is the one that would have been output in the cycle when the interrupt vector was output. For subroutine jumps, the return address is the instruction immediately following the subroutine call. For further information, see: Return from Interrupt with Pending Interrupt, appendix 4.2; and the Instruction Set Description, 2.0.
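The push and pop ordering of the subroutine stack can be sketched as follows. The class name `SubroutineStack` is hypothetical, and wrapping the SSP modulo 64 is an assumption based on its 6-bit width; overflow and underflow interrupts are not modeled.

```python
class SubroutineStack:
    """Model of the subroutine stack addressed by the SSP."""

    def __init__(self, ram: list, ssp: int = 0) -> None:
        self.ram = ram            # shared 64-word RAM model
        self.ssp = ssp & 0x3F     # SSP is 0 after reset

    def push(self, return_address: int) -> None:
        self.ssp = (self.ssp + 1) & 0x3F             # first increment the SSP
        self.ram[self.ssp] = return_address & 0xFFFF  # then write the address

    def pop(self) -> int:
        value = self.ram[self.ssp]                   # first read the location
        self.ssp = (self.ssp - 1) & 0x3F             # then decrement the SSP
        return value
```

Note that a pop only moves the pointer: the popped word is still present in the RAM model afterwards, as the text states.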

The subroutine stack can also be used to save key program parameters such as the status register, GSP, or counter values.

After entering a new routine, critical parameters from the calling routine are pushed onto the stack, thus freeing the associated hardware for use by the new routine. Prior to the end of the routine, the original parameters are restored with their former values for continued use by the calling routine.

The Stack Usage Example (appendix 4.3) illustrates the state of RAM after three subroutine calls.

Stack Limit Register and Stack Overflow

The preloadable Stack Limit Register (SLR) and associated circuitry warn the user of impending stack overflows, permitting stack overflow recovery. The highest priority interrupt, IR9, is assigned to stack overflow, although it may be masked. A stack overflow interrupt will occur under any of the following three circumstances:

The three location buffer between the SLR and the RS pointer allows for three extra pushes that may occur (in a worst case) prior to entering the stack overflow service routine. These pushes would be:

1. the push causing the initial overflow

2. a possible push operation while IV9 is output and

3. the IR9 return address push.

See: Interrupts, 1.4.3; and Three Stack Pushes on Stack Overflow (appendix 4.2.5) for more details.

The SLR is only 4 bits wide and is compared to the 4 MS bits of the 6-bit RAM address. Therefore, stack limits may only be set at integer multiples of 2^2, i.e., RAM locations 0, 4, 8, 12, ..., 60. The SLR is right-filled with two additional bits of zeros or ones, depending upon the direction of the push being performed ('00' for SS pushes and '11' for RS pushes, see Instruction Set Description - SLRIVP, 2.5). In the cycle following a stack overflow, the highest priority interrupt vector IV9 (also used for trapping; see TTR Pin, 1.7) is output. To determine the cause of this interrupt, both SS and RS pointers must be tested in the first several cycles of the service routine. Prior to returning from the overflow interrupt routine, the SLRIVP instruction must be executed to clear the calling IR9 from the interrupt latch.
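The way the 4-bit SLR is widened to a 6-bit RAM address can be illustrated with a short sketch. Only the right-fill step is modeled; the conditions under which the comparison raises the overflow interrupt are not reproduced here. The function name and the "SS"/"RS" string tags are assumptions made for this example.

```python
def stack_limit_address(slr: int, push_kind: str) -> int:
    """Expand the 4-bit SLR to the 6-bit RAM address used for comparison."""
    if not 0 <= slr <= 0xF:
        raise ValueError("SLR is a 4-bit value")
    fill = {"SS": 0b00, "RS": 0b11}[push_kind]  # right-fill bits per push type
    return (slr << 2) | fill
```

With an SLR value of 3, the limit seen by subroutine stack pushes is location 12 and the limit seen by register stack pushes is location 15, the two ends of the same four-word block.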

1.4.3 Interrupts

The ADSP-1401 processes eight external and two internal interrupts. All external interrupts are level sensitive (positive logic: see IR Latch, this section) and are processed by the interrupt logic block. The block elements (see Figure 4) comprise an interrupt de-multiplexer followed by an interrupt latch, masking and a priority decoder for selecting the most urgent interrupt (IR9 having the highest priority, and IR0 the lowest), and a special one-shot to override the address multiplexer with the interrupt vector (IV9-0) on the cycle following the interrupt request.

The external interrupts (IR8-1) may be used for any purpose; however, unused inputs must not be left floating (i.e., tie them to logic LO so as to preclude the associated interrupt). Two additional interrupts which are internal are reserved for stack overflow - IR9 (see Stack Limit Register and Stack Overflow, 1.4.2) and counter underflow - IR0 (see Counters, 1.4.1). See Counters (1.4.1) for implications of using IR0 for other than writable control store downloading.

Interrupt vectors are always output (assuming interrupts are enabled and the associated interrupt is not masked) on the cycle immediately following the acceptance of the interrupt request.

Contextual saves (stacking and storing) should be made immediately upon entering the interrupt service routine and restored immediately prior to its exit.

Up to four external interrupts may be connected directly to the external interrupt pins, EXIR4-1, and are treated as interrupts IR8-5, respectively. Lower priority interrupts, IR4-1, must be masked out in this case.

Up to eight external interrupts may be accommodated using time-division multiplexing. An external 2:1 multiplexer reduces the eight external interrupts to two groups of four (see Figure 3). An internal de-multiplexer automatically restores the external interrupts back to eight.

The interrupt vector file may be directly read and written via the data bus with the aid of the Interrupt Vector Pointer (see Instruction Set Description, Interrupts, 2.5).

Figure 3. Expanding External Interrupts

IR Latch

Interrupt requests IR8-5 are latched during the first half-cycle (clock HI), while IR4-1 are latched during the second half-cycle (clock LO). Once latched, external interrupt requests are held until processed, even if the external request signal goes away.

This latching technique allows removal of external interrupt sources after they have been recognized by the sequencer.

Latched user interrupt requests (IR8-1) are held until: i) the interrupt is processed and a "Return from Interrupt" (RTNIR) instruction is executed; ii) the interrupt service routine executes a "Clear Current Interrupt" instruction (allowing nested interrupts); or, iii) a "Clear All Interrupts" instruction is executed.

Reserved interrupts (IR9 and IR0) are cleared from the interrupt latch by utilizing the SLRIVP and CLRS instructions, respectively.

See Internal IR Control Logic (1.4.3) for details.

The user may bypass the interrupt latch with the "Select Transparent Interrupts" (STIR) instruction (setting status register bit SR0). In the transparent mode, the interrupting device must assert the interrupt request until the interrupt service routine resets the request source.


[Figure 4 block diagram; legible labels: EXIR4-1, TRAP, Interrupt Vector File (IV9-0), Priority Decoder, Interrupt In Progress (IRIP), To Address Port]

Figure 4. Internal Interrupt Control Logic

IR Mask

All ten interrupts may be independently masked using status register bits SR15-6 (corresponding to interrupts IR9-0). Setting a particular mask bit prevents the interrupt from being executed.

Note that the status register may be read or written via the Data port, and also pushed and popped to/from the subroutine stack, allowing nesting and servicing of interrupts in any desired order (see: Internal IR Control Logic, 1.4.3; and Status Register, 1.4.4).

Two instructions allow bitwise clearing or setting of the interrupt mask. "IR Mask Bit Clear" (IRMBC) will clear those mask bits for which the corresponding data bits (D15-6, as applied to IR9-0) are set, while "IR Mask Bit Set" (IRMBS) will set those mask bits for which the corresponding data bits are set. In both cases, zeros in the data field will preserve the corresponding mask bit. See Instruction Set Description - Status Register, 2.3.
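The two mask instructions amount to a bitwise OR and a bitwise AND-NOT on the upper ten status register bits. The sketch below models that under two assumptions: mask bit SR(n+6) corresponds to interrupt IRn, and the six low data bits are ignored by both instructions. The function names are hypothetical.

```python
MASK_FIELD = 0xFFC0  # status register bits SR15-6


def irmbs(status: int, data: int) -> int:
    """Model of IRMBS: set mask bits where the data bit is 1."""
    return (status | (data & MASK_FIELD)) & 0xFFFF


def irmbc(status: int, data: int) -> int:
    """Model of IRMBC: clear mask bits where the data bit is 1."""
    return status & ~(data & MASK_FIELD) & 0xFFFF


def is_masked(status: int, ir: int) -> bool:
    """True when interrupt IRn (n = 0..9) is masked."""
    if not 0 <= ir <= 9:
        raise ValueError("interrupt number must be 0..9")
    return bool((status >> (ir + 6)) & 1)
```

In both functions a zero data bit leaves the corresponding mask bit unchanged, so a routine can mask or unmask a single interrupt without first reading the status register.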

IR Priority Decoder

Unmasked interrupts are passed to the priority decoder, which determines the most urgent, valid interrupt and generates an internal Interrupt Request Signal (IRS). The corresponding vector is then fetched from the interrupt vector file and passed to the address port.
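The selection performed by the priority decoder can be sketched as picking the highest set bit among the latched, unmasked requests. The 10-bit layout used here (bit n stands for IRn in both arguments) and the function name are assumptions made for illustration.

```python
from typing import Optional


def highest_priority_interrupt(latched: int, mask: int) -> Optional[int]:
    """Return the number of the most urgent unmasked interrupt, or None."""
    pending = latched & ~mask & 0x3FF   # keep IR9-0, drop masked requests
    if pending == 0:
        return None                     # no valid interrupt is pending
    return pending.bit_length() - 1     # IR9 is highest priority, IR0 lowest
```

The returned number would then index the interrupt vector file to fetch the vector that is passed to the address port.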

Minimum IR Servicing Requirements

Interrupt vectors are output on the cycle following the acceptance of an interrupt request. Interrupt jumps differ from subroutine jumps in that subroutine jumps push the return address in the same cycle as the jump address is output, whereas interrupt return addresses are not pushed until the following cycle. This is because the instruction executing while the interrupt vector is output may be utilizing RAM and must complete its execution prior to pushing the interrupt return address. Thus, the PC (interrupt return address) is pushed automatically in the first cycle of the interrupt service routine, i.e., the cycle following the interrupt request acceptance.

For this reason, the first instruction of any interrupt service routine is always ignored; it must be a no-op (CONT). Note that a minimum interrupt service routine would be a CONT followed by a RTNIR.

Internal IR Control Logic

The interrupt enable bit of the status register, SR2, must be set for interrupt servicing to occur. Interrupt servicing may be inhibited by clearing this bit, although external interrupt requests will continue to be latched.

Only one interrupt is ever active at a time. Additional interrupts are "locked out" by an internal "Interrupt In Progress" signal (IRIP) during interrupt servicing (except for TRAP), although they continue to be latched. The IRIP signal is automatically reset upon the "Return from Interrupt" (RTNIR) instruction which pops the return address from the subroutine stack to the PC.

Normally, multiple interrupts are accumulated in the interrupt latch. Whenever a valid interrupt is pending, the internal signal "Interrupt Request" (IRQ) is asserted. Upon each RTNIR, the highest priority, unmasked, pending interrupt is serviced.

Nested interrupts are supported with two instructions: "Clear Current Interrupt" (CCIR) or "Clear All Interrupts" (CAIR).

The CCIR instruction clears the IRIP signal and interrupt latch bit for the interrupt in progress. This action re-enables interrupting, relegating the interrupt in progress to a subroutine status. If an external interrupt is pending, the associated IR vector will be output on the cycle following CCIR. To cancel all pending interrupt requests, the CAIR instruction clears the IRIP signal and the entire interrupt latch.

Normally, it is good practice to convert interrupts to subroutines. This can be done by executing the "Clear Current Interrupt" (CCIR) instruction (resetting IRIP) and should be done as early as possible in the interrupt service routine. There are two reasons for changing the status of an interrupt to that of a subroutine.

Firstly, if IRIP is allowed to remain active throughout the interrupt service routine, then the occurrence of either internal interrupt (stack overflow or counter underflow, IR9 or IR0, respectively) will remain undetected until the current interrupt concludes; the user will be unaware of these interrupt requests.

