
Write Buffers and When You Need to Worry

In the document 0.1 Style and Limits (pages 117-120)


4.13 Write Buffers and When You Need to Worry

The write-through cache common to all 32-bit MIPS CPUs demands that all CPU stores be immediately sent to main memory, which would be a big performance bottleneck if the CPU waited for each write to finish.

In an average C program compiled for MIPS, about 10% of instructions executed are stores, but these accesses tend to come in bursts, for example when a function prologue saves a few registers.


DRAM memory frequently has the characteristic that the first write of a group takes quite a long time (5-10 clock cycles is typical on these CPUs), and subsequent ones are relatively fast so long as they follow quickly.

If the CPU simply waits while a write completes, the performance hit will be huge. So it is common to provide a write buffer, a FIFO store in which each entry contains both data to be written and the address at which to write it. MIPS CPUs have used FIFOs with between one and eight entries.
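The FIFO described above can be pictured with a small software model. This is only an illustrative sketch, not a description of any real MIPS implementation: the names `write_buffer`, `wb_push`, and `wb_drain_one` are hypothetical, and the depth of four entries is an assumption chosen from the one-to-eight range mentioned in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WB_ENTRIES 4u  /* assumed depth; the text cites one to eight entries */

/* One pending store: the data plus the address it is destined for. */
struct wb_entry {
    uint32_t addr;
    uint32_t data;
};

/* Hypothetical software model of a write buffer, kept as a ring. */
struct write_buffer {
    struct wb_entry slot[WB_ENTRIES];
    size_t head;   /* index of the oldest pending store */
    size_t count;  /* number of pending stores */
};

/* Queue a store. Returns 1 on success, or 0 if the buffer is full,
 * which is the situation in which a real CPU would have to stall. */
static int wb_push(struct write_buffer *wb, uint32_t addr, uint32_t data)
{
    assert(wb != NULL);
    if (wb->count == WB_ENTRIES)
        return 0;
    size_t tail = (wb->head + wb->count) % WB_ENTRIES;
    wb->slot[tail].addr = addr;
    wb->slot[tail].data = data;
    wb->count++;
    return 1;
}

/* Retire the oldest store to "memory". Returns 1 and fills *out if a
 * store was pending, or 0 if the buffer was already empty. */
static int wb_drain_one(struct write_buffer *wb, struct wb_entry *out)
{
    assert(wb != NULL && out != NULL);
    if (wb->count == 0)
        return 0;
    *out = wb->slot[wb->head];
    wb->head = (wb->head + 1) % WB_ENTRIES;
    wb->count--;
    return 1;
}
```

In this model a burst of up to four stores is absorbed without stalling; a fifth store is refused until the memory side has drained at least one entry, and stores always retire in the order they were issued.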

The 32-bit MIPS CPUs with write-through caches depend heavily on write buffers. In these CPUs, a four-entry queue has proved efficient for well-tuned local DRAM with CPU clock rates up to 40 MHz.

Later MIPS CPUs (with write-back caches) retain the write buffer as a holding area for cache line write backs and as a time saver on uncached writes.

Most of the time the operation of the write buffer is completely transparent to software. But sometimes the programmer needs to be aware of what is happening:

• Timing relations for I/O register accesses: This affects all MIPS CPUs.

When you perform a store to an I/O register, the store reaches memory after a small, but indeterminate, delay. Other communication with the I/O system (e.g., interrupts) may happen more quickly: for example, you may see an active interrupt from a device “after” you have told it to generate no interrupts. In a different case, if an I/O device needs some time to recover after a write, you must ensure that the write buffer FIFO is empty before you start counting out that time period. Here, you must ensure that the CPU waits while the write buffer empties. It is good practice to define a subroutine that does this job; it is traditionally called wbflush(). See Section 4.13.1 below for hints on implementing it.

The above describes what can happen on any MIPS R4x00 (MIPS III ISA) or subsequent CPU implemented to date. It’s also enough for the whole IDT R3051 family, the most popular embedded component CPUs. But on some earlier 32-bit systems, even stranger things can happen:

• Reads overtaking writes: When a load instruction (uncached or missing in the cache) executes while the write buffer FIFO is not empty, the CPU has a choice: Should it finish off the write or use the memory interface to fetch data for the load? It’s more efficient to do the read first. The CPU is certainly stopped until the read data arrives, but there’s a good chance that the write can be deferred and still performed in parallel with later CPU activity. [1]

[1] You may observe that there is some danger that the overtaking read may be trying to fetch locations for which there is still a write pending, which would be disastrous; however, CPUs allowing read overtaking will compare read and write addresses and give the write priority if the addresses overlap.

The original R3000 hardware left this decision in the hands of the system hardware implementation. The most popular integrated MIPS I CPUs from IDT don’t permit reads to overtake writes; they have unconditional write priority. Most MIPS III CPUs have not permitted read overtaking, but robust software doesn’t have to assume this any more.

See the description of the sync instruction in Section 8.4.9.

If you believe that your MIPS I CPU might not have unconditional write priority, then when you are dealing with I/O registers the necessary address check may not save you; a load may misbehave because an earlier store to a different address is still pending. In this case you need to call wbflush().

• Byte gathering: Some write buffers watch for partial-word writes within the same memory word and will combine those partial writes into a single operation. This is not done by any current R3051-family CPU, but it can wreak havoc with I/O register writes.

It is a good idea to map your I/O registers so that each register is in a separate word location (i.e., 8-bit registers should be at least 4 bytes apart), although this is not always possible.
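As an illustration of the register-spacing advice in the byte-gathering item, the sketch below lays out a made-up device so that each 8-bit register sits in its own 32-bit word. The device, the `dev_regs` structure, and its register names are hypothetical, and the layout assumes each register occupies the lowest-addressed byte of its word; on real hardware the byte lane depends on endianness and on how the device is wired to the bus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: three 8-bit registers, one per 32-bit word.
 * The padding keeps any two registers at least 4 bytes apart, so a
 * byte-gathering write buffer has nothing to merge. */
struct dev_regs {
    volatile uint8_t data;     uint8_t pad0[3];
    volatile uint8_t status;   uint8_t pad1[3];
    volatile uint8_t control;  uint8_t pad2[3];
};

/* Catch an accidental layout change at compile time (C11). */
_Static_assert(sizeof(struct dev_regs) == 12,
               "each register must occupy its own 32-bit word");

/* Index of the 32-bit word that contains byte offset `off`. */
static size_t word_of(size_t off)
{
    return off / 4u;
}

/* Returns 1 if all three registers live in different 32-bit words. */
static int regs_in_separate_words(void)
{
    size_t d = word_of(offsetof(struct dev_regs, data));
    size_t s = word_of(offsetof(struct dev_regs, status));
    size_t c = word_of(offsetof(struct dev_regs, control));
    return d != s && s != c && d != c;
}

/* Single byte store to the control register; `dev` would normally
 * point at an uncached mapping of the device. */
static void dev_set_control(struct dev_regs *dev, uint8_t value)
{
    assert(dev != NULL);
    dev->control = value;
}
```

With this layout, two back-to-back byte stores to `status` and `control` target different words, so they cannot be combined into one bus operation by a gathering write buffer.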

4.13.1 Implementing wbflush

Unless your CPU is one of the peculiar types described above, you can ensure that the write buffer is empty by performing an uncached load from anywhere (which will stall the CPU until the writes have finished and the load has finished too). This is inefficient; you can minimize the overhead by loading from the fastest memory available to you.

For those who never want to think about it again, a write to memory followed by an uncached read from the same address (with a sync in between the two if you’re running on a MIPS III or later CPU) will flush out the write FIFO on any MIPS CPU built to date (and it’s difficult to see how a CPU without this behavior could be a correct implementation).
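The write-then-read recipe above might be sketched in C as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: `wbflush_init` and the scratch pointer are hypothetical helpers, the caller is assumed to supply a pointer to an uncached mapping of fast memory (on a real system, typically a kseg1 address), and the `sync` is emitted only when the compiler predefines GCC-style `__mips__` and `__mips` macros indicating a MIPS III or later target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scratch word. On real hardware this should point at an
 * uncached mapping of the fastest memory available. */
static volatile uint32_t *wbflush_scratch;

/* Record the scratch location once, early in startup. */
static void wbflush_init(volatile uint32_t *uncached_word)
{
    assert(uncached_word != NULL);
    wbflush_scratch = uncached_word;
}

/* Write to the scratch word, then read the same address back. The
 * load cannot complete until the store ahead of it in the write
 * buffer, and every store queued before that, has reached memory.
 * Returns the value read, which is always the 0 just written. */
static uint32_t wbflush(void)
{
    assert(wbflush_scratch != NULL);
    *wbflush_scratch = 0;                       /* write to memory */
#if defined(__mips__) && defined(__mips) && (__mips >= 3)
    /* Assumption: GCC-style ISA-level macro; sync for MIPS III+. */
    __asm__ __volatile__("sync" ::: "memory");
#endif
    return *wbflush_scratch;                    /* read it back */
}
```

A driver would typically call `wbflush()` straight after a store to a device register whose effect must be visible before the next step, for example after masking a device's interrupts or before starting a recovery-time delay.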

Some systems use a hardware signal that indicates whether the FIFO is empty, wired to an input that the CPU can sense directly. But this isn’t done on any MIPS CPU to date.

CAUTION!

Systems often have write buffers outside the CPU. Any bus or memory interface that boasts of having write posting as a feature is behaving similarly. Write buffers outside the CPU can give you just the same sort of trouble as those inside it. Take care with your programming.

