A Flavor of MIPS Assembly Language

MIPS Architecture

2.1 A Flavor of MIPS Assembly Language

Assembly language is the human-writable (and readable) version of the CPU’s raw binary instructions, and there’s a whole chapter devoted to it later. Readers who have never seen any assembly language will find some parts of this book mystifying.

Most MIPS assembler programs interpret a rather stark language, full of register numbers. But toolchains often make it easy to use a macro preprocessor, at least to allow the programmer to write names where the strict assembler language requires numbers. Most use the C preprocessor because of its familiarity. The C preprocessor strips out C-style comments, which therefore become usable in assembler code.

With the help of the C preprocessor, MIPS assembler code almost invariably uses names for the registers. The names reflect each register’s conventional use (which we’ll talk about in Section 2.2).
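As a rough sketch of how this can work, a source file that is passed through the C preprocessor might carry definitions like the ones below. The three macros are illustrative assumptions that follow the conventional names; real toolchains normally supply a header file with the full set.

```asm
/* Illustrative definitions only; a real header would name all 32 registers. */
#define v0 $2
#define a0 $4
#define a1 $5

        addu    v0, a0, a1      /* the assembler proper sees: addu $2, $4, $5 */
```

The comments here are written in C style on purpose: the preprocessor removes them, whereas a “#” at the start of a line may be taken for a preprocessor directive.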

For readers familiar with assembly language, but not the MIPS version, here are some examples of what you might see:

/* this is a comment */

# so is this

entrypoint: # that’s a label

addu $1, $2, $3 # (registers) $1 = $2 + $3

Like most assembler languages, it is line oriented. The end of a line delimits instructions, and the assembler’s native comment convention is that it ignores any text on a line beyond a “#” character. But it is possible to put more than one instruction on a line, separated by semicolons.

A label is a word followed by a colon “:” — word is interpreted loosely, and labels can contain all sorts of strange characters. Labels are used to define entry points in code and to name storage locations in data sections.
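The fragment below is a small sketch of both uses of labels, plus two instructions sharing one line. The label names are invented for illustration, and the .data, .text, and .word directives are assumed to behave as they do in common MIPS assemblers.

```asm
        .data
counter:                        # label naming a storage location
        .word   0               # one 32-bit word, initially zero

        .text
start:                          # label naming an entry point in code
        addu    $8, $9, $10; subu $11, $8, $9   # two instructions on one line
```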

A lot of instructions are three-operand, as shown. The destination register is on the left (watch out, that’s opposite to the AT&T-style syntax used by many x86 assemblers, which puts the destination last). In general, the register result and operands are shown in the same order you’d use to write the operation in C or any other algebraic language, so

subu $1, $2, $3

means exactly

$1 = $2 - $3

That should be enough for now.

2.2 Registers

There are 32 general-purpose registers for your program to use: $0 to $31.

Two, and only two, behave differently from the others:

$0 always returns zero, no matter what you store in it.


$31 is always used by the normal subroutine-calling instruction (jal) for the return address. Note that the call-by-register version (jalr) can use any register for the return address, though use of anything except $31 would be eccentric.

In all other respects all these registers are identical and can be used in any instruction (you can even use $0 as the destination of instructions, though the resulting data will disappear without a trace).
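A few idioms show why a hard-wired zero is worth a register. This is only a sketch using raw register numbers; the choice of $8 through $12 is arbitrary.

```asm
        addu    $8, $0, $0      # $8 = 0 + 0, so $8 is cleared
        addu    $9, $10, $0     # $9 = $10 + 0, a plain register-to-register copy
        subu    $11, $0, $12    # $11 = 0 - $12, i.e. negation
        addu    $0, $9, $10     # legal, but the sum is thrown away
```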

In the MIPS architecture the program counter is not a register, and it is probably better for you not to think of it that way; in a pipelined CPU there are multiple candidates for its value, which gets confusing. The return address of a jal is the next instruction but one in sequence:

...

jal printf

move $4, $6

xxx # return here after call

That makes sense because the instruction immediately after the call is the call’s delay slot; remember, the rules say it must be executed before the branch target. The delay slot instruction of the call is rarely wasted, because it is typically used to set up a parameter.
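The same sequence can be annotated line by line, with a register call added for comparison. This sketch assumes an assembler that accepts the .set noreorder directive (so the delay slots are filled by hand) and that $25 already holds the address of some subroutine; both are assumptions, not something fixed by the text above.

```asm
        .set    noreorder       # assumption: we fill delay slots ourselves
        jal     printf          # $31 = address of the addu two instructions on
        move    $4, $6          # delay slot: runs before printf's first instruction
        addu    $16, $2, $0     # printf returns here

        jalr    $25             # call through a register; return address goes to $31
        nop                     # delay slot with nothing useful to do
        .set    reorder         # let the assembler manage delay slots again
```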

There are no condition codes; nothing in the status register or other CPU internals is of any consequence to the user-level programmer.

There are two register-sized result ports (called hi and lo) associated with the integer multiplier. They are not general-purpose registers, nor are they useful for anything except multiply and divide operations. However, there are instructions defined that insert an arbitrary value back into these ports; after some reflection, you may be able to see that this is required when restoring the state for a program that has been interrupted.
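In outline, using the ports looks something like the two separate fragments below. The register choices are arbitrary, and in the second fragment $8 and $9 are simply assumed to hold values of hi and lo that were saved earlier.

```asm
        # Fragment 1: a 32 x 32-bit signed multiply
        mult    $4, $5          # 64-bit product of $4 and $5 goes to hi and lo
        mflo    $2              # $2 = low 32 bits of the product
        mfhi    $3              # $3 = high 32 bits of the product

        # Fragment 2: putting saved values back before resuming an
        # interrupted program
        mthi    $8              # hi = previously saved value
        mtlo    $9              # lo = previously saved value
```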

The floating-point math coprocessor (floating-point accelerator, or FPA), if available, adds 32 floating-point registers; in simple assembler language they are called $f0 to $f31.

Actually, for MIPS I and MIPS II machines only the 16 even-numbered registers are usable for math. However, they can be used for either single-precision (32-bit) or double-precision (64-bit) numbers; when you do double-precision arithmetic, register $f1 holds the remaining bits of the register identified as $f0. Only moves between integer and FPA, or FPA load/store instructions, ever refer to odd-numbered registers (and even then the assembler helps you forget).

MIPS III CPUs have 32 genuine FP registers, but even then software might not use the odd-numbered ones, preferring to maintain software compatibility with the old family.
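A short sketch of the MIPS I/II register model follows. The register numbers are arbitrary, and the fragment does not try to show which half of a pair holds the more significant bits.

```asm
        add.s   $f0, $f2, $f4   # single precision: three 32-bit registers
        add.d   $f0, $f2, $f4   # double precision: the pairs $f0/$f1, $f2/$f3, $f4/$f5

        mtc1    $4, $f12        # integer-to-FPA move into the even half of a pair
        mtc1    $5, $f13        # ...and into the odd half, named explicitly
```

Note that the arithmetic instructions only ever name even-numbered registers; the odd-numbered one appears only in the move.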

2.2.1 Conventional Names and Uses of General-Purpose Registers

We’re a couple of pages into an architecture description and here we are talking about software. But I think you need to know this now.

Table 2.1: Conventional names of registers with usage mnemonics

Register   Name      Used for
number

0          zero      Always returns 0
1          at        (assembler temporary) Reserved for use by assembler
2-3        v0, v1    Value returned by subroutine
4-7        a0-a3     (arguments) First few parameters for a subroutine
8-15       t0-t7     (temporaries) Subroutines can use without saving
24, 25     t8, t9    (temporaries) Same as t0-t7
16-23      s0-s7     Subroutine register variables; a subroutine that writes
                     one of these must save the old value and restore it
                     before it exits, so the calling routine sees the values
                     preserved
26, 27     k0, k1    Reserved for use by interrupt/trap handler; may change
                     under your feet
28         gp        Global pointer; some run-time systems maintain this to
                     give easy access to (some) “static” or “extern” variables
29         sp        Stack pointer
30         s8/fp     Ninth register variable; subroutines that need one can
                     use this as a frame pointer
31         ra        Return address for subroutine

Although the hardware makes few rules about the use of registers, their practical use is governed by a forest of conventions. The hardware cares nothing for these conventions, but if you want to be able to use other people’s subroutines, compilers, or operating systems, then you had better fit in.

With the conventional uses of the registers go a set of conventional names. Given the need to fit in with the conventions, use of the conventional names is pretty much mandatory. The common names are listed in Table 2.1.

Somewhere about 1996 Silicon Graphics began to introduce compilers that use new conventions. The new conventions can be used to build programs that use 32-bit addresses or that use 64-bit addressing, and in those two cases they are called respectively “n32” and “n64”. We’ll ignore them for now, but we describe them in detail in Chapter 10.

Conventional Assembler Names and Usages for Registers

at : This register is reserved for the synthetic instructions generated by the assembler. Where you must use it explicitly (such as when saving or restoring registers in an exception handler) there’s an assembler directive to stop the assembler from using it behind your back (but then some of the assembler’s macro instructions won’t be available).
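In many MIPS assemblers the directive in question is spelled .set noat. The sketch below assumes that spelling, and assumes that $26 (k0) has already been pointed at some save area; the offset of 4 is made up for illustration.

```asm
        .set    noat            # the assembler must not use $1 (at) implicitly
        sw      $1, 4($26)      # save the interrupted program's at register;
                                # $26 is assumed to point at a save area
        .set    at              # the assembler may use at again
```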

v0, v1 : Used when returning non-floating-point values from a subroutine.

If you need to return anything too big to fit in two registers, the compiler will arrange to do it in memory. See Section 10.1 for details.

a0-a3 : Used to pass the first four non-FP parameters to a subroutine. That’s an occasionally false oversimplification; see Section 10.1 for the grisly details.

t0-t9 : By convention, subroutines may use these registers without preserving their values. This makes them a good choice for “temporaries” when evaluating expressions, but the compiler/programmer must remember that values stored in them may be destroyed by a subroutine call.

s0-s8 : By convention, subroutines must guarantee that the values of these registers on exit are the same as they were on entry, either by not using them or by saving them on the stack and restoring them before exit.

This makes them eminently suitable for use as register variables or for storing any value that must be preserved over a subroutine call.

k0, k1 : Reserved for use by an OS’s trap/interrupt handlers, which will use them and not restore their original value; so they are of little use to anyone else.

gp : If a global pointer is present, it will point to a load-time-determined location in the midst of your static data. This means that loads and stores to data lying within 32 KB on either side of the gp value can be performed in a single instruction using gp as the base register.

Without the global pointer, loading data from a static memory area takes two instructions: one to load the most significant bits of the 32-bit constant address computed by the compiler and loader and one to do the data load.

To use gp, a compiler must know at compile time that a datum will end up linked within a 64 KB range of memory locations. In practice it can’t know; it can only guess. The usual practice is to put small global data items (8 bytes and less in size) in the gp area and to get the linker to complain if the area still gets too big.

Not all compilation systems and not all run-time systems support gp.
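The difference between the two access styles can be sketched as follows. The %hi, %lo, and %gp_rel operators are GNU-assembler syntax and are an assumption here (other assemblers spell these differently or hide them behind macros), and counter is a hypothetical small global variable.

```asm
        # Without gp: two instructions
        lui     $8, %hi(counter)        # upper 16 bits of the address, adjusted
                                        # for the sign of the lower half
        lw      $9, %lo(counter)($8)    # add the lower 16 bits and load

        # With gp: one instruction, provided counter was placed in the gp area
        lw      $9, %gp_rel(counter)($28)
```

The single-instruction form works because the load’s signed 16-bit offset can reach 32 KB on either side of the value in $28.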

sp : It takes explicit instructions to raise and lower the stack pointer, so MIPS code usually adjusts the stack only on subroutine entry and exit; it is the responsibility of the subroutine being called to do this. sp is normally adjusted, on entry, to the lowest point that the stack will need to reach at any point in the subroutine. Now the compiler can access stack variables by a constant offset from sp. Once again, see Section 10.1 for conventions about stack usage.
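A minimal sketch of that pattern is shown below. The routine and its 8-byte frame are invented for illustration; the register names are assumed to be defined as in Table 2.1, and the assembler is assumed to be in its usual mode where it looks after load and branch delay slots itself.

```asm
# Hypothetical subroutine needing 8 bytes of scratch storage on the stack.
scratch_demo:
        subu    sp, sp, 8       # entry: lower sp once, to the lowest point needed
        sw      a0, 0(sp)       # stack variables live at constant offsets from sp
        sw      a1, 4(sp)
        lw      v0, 0(sp)
        lw      t0, 4(sp)
        addu    v0, v0, t0      # v0 = a0 + a1, by way of the stack
        addu    sp, sp, 8       # exit: put sp back where the caller left it
        jr      ra
```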

fp : Also known as s8, a frame pointer will be used by a subroutine to keep track of the stack if it wants to do things that involve extending the stack by an amount that is determined at run time. Some languages may do this explicitly; assembler programmers are always welcome to experiment; and C programs that use the alloca() library routine will find themselves doing so.

If the stack bottom can’t be computed at compile time, you can’t access stack variables from sp, so fp is initialized by the function prologue to a constant position relative to the function’s stack frame. Cunning use of register conventions means that this behavior is local to the function and doesn’t affect either the calling code or any nested function calls.
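One way this might look in practice is sketched below. The routine is hypothetical: it extends the stack by a0 bytes, where a0 is assumed to be a multiple of 8. The register names are assumed to be defined as in Table 2.1, and the assembler is assumed to manage delay slots.

```asm
# Hypothetical subroutine that reserves a0 bytes of stack at run time.
dynamic_demo:
        subu    sp, sp, 8       # fixed part of the frame
        sw      fp, 4(sp)       # fp is s8, so the caller's value must be preserved
        move    fp, sp          # fp now marks a constant position in the frame
        subu    sp, sp, a0      # extend the stack by a run-time amount
        sw      zero, 0(sp)     # the new space is reached through sp...
        lw      t0, 4(fp)       # ...and the fixed part of the frame through fp
        move    sp, fp          # discard the run-time extension
        lw      fp, 4(sp)       # restore the caller's fp
        addu    sp, sp, 8       # release the fixed part of the frame
        jr      ra
```

Because fp is saved and restored like any other s register, the caller never notices that it was used.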

ra : On entry to any subroutine, ra holds the address to which control should be returned, so a subroutine typically ends with the instruction jr ra.

Subroutines that themselves call subroutines must first save ra, usually on the stack.
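Putting several of these conventions together, a subroutine that calls another might be sketched like this. The routine, the external subroutine helper, and the 24-byte frame with its offsets are all illustrative assumptions; the register names are assumed to be defined as in Table 2.1, and the assembler is assumed to manage delay slots.

```asm
# Hypothetical routine: returns helper(a0) + a0.
sum_with_helper:
        subu    sp, sp, 24      # make a stack frame (size is illustrative)
        sw      ra, 20(sp)      # we call another subroutine, so save ra
        sw      s0, 16(sp)      # we are about to write s0, so save the caller's value
        move    s0, a0          # a0 may be destroyed by the call; s0 will not be
        jal     helper          # helper may overwrite t0-t9, a0-a3, v0, v1, and ra
        addu    v0, v0, s0      # s0 still holds our original argument
        lw      s0, 16(sp)      # restore the caller's s0
        lw      ra, 20(sp)      # restore our own return address
        addu    sp, sp, 24      # release the stack frame
        jr      ra              # return; the result is in v0
```

A leaf subroutine, one that calls nothing, can skip the save and restore of ra and simply end with jr ra.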

There is a corresponding set of standard uses for floating-point registers too, which we’ll summarize in Section 7.5. We’ve described here the original conventions promulgated by MIPS; some evolution has occurred in recent times, but we’ll keep that back until Section 10.8, which discusses the details of some newer standards for calling conventions.
