• Aucun résultat trouvé

Instruction and Data Cache Operation

3.2 Data Cache Organization and Control

The data cache supplies data to the GPRs and FPRs by means of the load/store unit, and provides buffers for load and store bus operations. The data cache also provides storage for the cache tags required for memory coherency and performs the cache block replacement LRU function.

3.2.1 Data Cache Organization

The organization of the data cache is shown in Figure 3-2. Each cache block contains eight contiguous words from memory that are loaded from an eight-word boundary (that is, bits A27-A31 of the logical (effective) addresses are zero); as a result, cache blocks are aligned with page boundaries.

3-4 PowerPC 603e RiSe Microprocessor User's Manual

Note that address bits A20-A26 provide an index to select a set. Bits A27-A31 select a byte within a block. The tags consists of bits PAO-PA19. Address translation occurs in parallel, such that higher-order bits (the tag bits in the cache) are physical. Note that the replacement algorithm is strictly an LRU algorithm; that is, the least recently used block is filled with new data on a cache miss.

Figure 3-2. Data Cache Organization 3.2.2 Data Cache Fill Operations

~

r r

-~I

The 603e's data cache blocks are filled in four beats of 64 bits each, with the critical double word loaded first. The data cache is blocked to internal accesses after a cache miss until the load from memory completes. On a cache miss, the critical double word read from memory is simultaneously written to the data cache and forwarded to the requesting unit, thus minimizing stalls due to cache filllatentcy.

3.2.3 Data Cache Control

The 603e provides several means of data cache control through the use of the WIMG bits in the page tables, control bits in the HIDO register, and user- and supervisor-level cache control instructions. While memory page level cache control is provided by the WIMG bits, the on-chip data cache can be invalidated, disabled, or locked by the three control bits in the HIDO register described in this section. (Note that, user- and supervisor-level are referred to as problem and privileged state, respectively, in the architecture specification.) 3.2.3.1 Data Cache Invalidation

While the data cache is automatically invalidated when the 603e is powered up and during a hard reset, assertion of the soft reset signal does not cause data cache invalidation.

Software may invalidate the contents of the data cache using the data cache flash invalidate (DCFI) control bit in the HIDO register. Flash invalidation of the data cache is accomplished by setting and clearing the DCFI bit in two consecutive store operations.

Chapter 3. Instruction and Data Cache Operation 3-5

3.2.3.2 Data Cache Disabling

The data cache may be disabled through the use of the data cache enable (DCE) control bit in the HIDO register. When the data cache is in the disabled state, the cache tag state bits are ignored, and all accesses are propagated to the bus as single-beat transactions. The DCE bit is cleared on power-up, causing the data cache to be disabled. The setting of the DCE bit must be preceded by a syne instruction to prevent the cache from being enabled or disabled in the middle of a data access.

Note that while snooping is not performed when the data cache is disabled, cache operations (caused by the debz, debf, debst, and debi instructions) are not affected by disabling the cache, causing potential coherency errors. An example of this would be a debf instruction that hits a modified cache block in the disabled cache, causing a copyback to memory of potentially stale data.

Regardless of the state of HIDO[DCE], load and store operations are assumed to be weakly ordered. Thus the LSU can perform load operations that occur later in the program ahead of store operations, even when the data cache is disabled. However, strongly ordered load and store operations can be enforced through the setting of the I bit (of the page WIMG bits) when address translation is enabled. Note that when address translation is disabled, the default WIMG bits cause the I bit to be cleared (accesses are assumed to be cacheable), and thus the accesses are weakly ordered. Refer to Section 3.5.2, "Caching-Inhibited Attribute (I)," for a description of the operation of the I bit and Section 5.2, "Real Addressing Mode,"

for a description of the WIMG bits when address translation is disabled.

3.2.3.3 Data Cache Locking

The contents of the data cache may be locked through the use of the DLOCK control bit in the HIDO register. A locked data cache supplies data normally on a cache hit, but cache misses are treated as cache-inhibited accesses. The cache inhibited (CI) signal is asserted if a cache access misses into a locked cache. The setting of the DLOCK bit in HIDO must be preceded by a syne instruction to prevent the data cache from being locked during a data access.

3.2.4 Data Cache Touch Load Support

Touch load operations allow an instruction stream to prefetch data from memory prior to a cache miss. The 603e supports touch load operations through a temporary cache block buffer located between the BID and the data cache. The cache block buffer is essentially a floating cache block that is loaded by the BID on a touch load operation, and is then read by a load instruction that requests that data. After a touch load completes on the bus, the BIU continues to compare the touch load address with subsequent load requests from the data cache. If the load address matches the touch load address in the BIU, the data is forwarded to the data cache from the touch load buffer, the read from memory is canceled, and the touch load address buffer is invalidated.

3-6 PowerPC 603e RISC Microprocessor User's Manual

To avoid the storage of stale data in the touch load buffer, touch load requests that are mapped as write-through or caching-inhibited by the MMU are treated as no-ops by the BIU. Also, subsequent load instructions after a touch load that are mapped as write-through or caching-inhibited do not hit in the touch load buffer, and cause the touch load buffer to be invalidated on a matching address.

While the 603e provides only a single cache block buffer, other PowerPC microprocessor implementations may provide buffering for more than one cache block. Programs written for other implementations may issue several debt or debtst instructions sequentially, reducing the performance if executed on the 603e. To improve performance in these situations, the NOOPTI bit (bit 31) in the HIDO register may be set. This causes the debt and debtst instructions to be treated as no-ops, cause no bus activity, and incur only one processor clock cycle of execution latentcy. The default state of the NOOPTI bit is cleared after a power-on reset operation, enabling the use of the debt and debtst instructions.