• Aucun résultat trouvé

T

N/A
N/A
Protected

Academic year: 2022

Partager "T"

Copied!
140
0
0

Texte intégral

(1)

COMPUTER PROGRAMMING

FASCICLE 1

MMIX

DONALD E.KNUTH StanfordUniversity

ADDISON{WESLEY

6

77

(2)

Seealsohttp://www-s-faulty.stanford.edu/~knuth/mmix.htmlfordownloadable

software,andhttp://mmixmasters.soureforge.netforgeneralnewsaboutMMIX.

Copyright 1999byAddison{Wesley

Allrightsreserved.Nopartofthispubliationmaybereprodued,storedinaretrieval

system,ortransmitted, in anyform, orby anymeans, eletroni, mehanial,photo-

opying, reording, or otherwise, without the prior onsent of the publisher, exept

that the oÆial eletroni le may be used to print single opies for personal (not

ommerial)use.

Zerothprinting(revision15),15February2004

(3)

fasile / fas_

e

k e

l / n ::: 1: asmallbundle::: aninoresene onsistingof

a ompated ymeless apitatethan a glomerule

::: 2: oneof the divisions ofa bookpublishedin parts

|P. B.GOVE, Webster'sThirdNew InternationalDitionary (1961)

This is the first of a series of updates that I plan to make available at

regularintervals as IontinueworkingtowardtheultimateeditionsofTheArt

ofComputerProgramming.

IwasinspiredtopreparefasileslikethisbytheexampleofCharlesDikens,

whoissuedhisnovelsinserialform;hepublisheda dozeninstallmentsofOliver

Twist before havinganyideawhat would beomeof BillSikes! I was thinking

also of James Murray, who began to publish 350-page portions of the Oxford

English Ditionary in 1884, nishing the letter B in 1888 and the letter C in

1895. (Murraydiedin1915whileworkingontheletterT;mytaskis,fortunately,

muh simplerthanhis.)

UnlikeDikensandMurray,I haveomputerstohelpmeeditthematerial,

sothat Ianeasilymakehangesbeforeputtingeverything togetherinitsnal

form. Although I'mtrying mybestto write omprehensiveaountsthat need

nofurtherrevision, Iknowthateverypagebringsmehundredsofopportunities

tomakemistakesandtomissimportantideas. Mylesareburstingwithnotes

aboutbeautifulalgorithmsthathavebeendisovered,butomputersienehas

grown to thepointwhere I annot hopeto beanauthorityonall thematerial

I wish to over. ThereforeI need extensivefeedbak from readersbeforeI an

nalizetheoÆial volumes.

Inotherwords,IthinkthesefasileswillontainalotofGoodStu,andI'm

exited about the opportunity to presenteverything I write to whoeverwants

to read it, but I also expet that beta-testers like you an help me make it

Way Better. As usual, I will gratefully pay a reward of $2.56 to the rst

person who reports anything that is tehnially, historially, typographially,

orpolitiallyinorret.

Charles Dikensusually published his work onea month, sometimes one

aweek;JamesMurraytendedtonisha 350-pageinstallmentaboutoneevery

18months. Mygoal,Godwilling,istoproduetwo128-pagefasilesperyear.

Most of the fasiles will represent new material destined for Volumes 4 and

higher; but sometimes I will be presenting amendments to one or more of the

earliervolumes. Forexample,Volume 4will needtorefertotopis thatbelong

in Volume 3, but weren't inventedwhen Volume 3 rst ame out. Withluk,

theentire workwillmakesense eventually.

(4)

FasileNumberOneisaboutMMIX,thelong-promisedreplaementforMIX.

Thirty years have passedsine theMIX omputer was designed, and omputer

arhiteture has been onverging during those years towardsa rather dierent

styleofmahine. ThereforeIdeidedin1990toreplaeMIXwithanewomputer

thatwouldontainevenless saturatedfatthanitspredeessor.

Exerise 1.3.1{25 in the rst three editions of Volume 1 spoke of an ex-

tendedMIXalledMixMaster,whihwasupwardompatiblewiththeoldversion.

ButMixMaster itself haslong beenhopelessly obsolete. It allowed forseveral

gigabytes of memory, but one ouldn't even use it with ASCII ode to print

lowerase letters. And ouh, its standard subroutine alling onvention was

irrevoably based onself-modifying instrutions! Deimal arithmeti and self-

modifying ode were popularin 1962, but they sure havedisappeared quikly

asmahineshavegottenbiggerandfaster. FortunatelythenewRISCmahines

haveaveryappealingstruture,soI'vehada hanetodesignanewomputer

thatisnotonlyuptodatebut alsofun.

Many readers are no doubt thinking, \Why does Knuth replae MIX by

anothermahine insteadofjuststikingto ahigh-levelprogramminglanguage?

Hardlyanybodyuses assemblersthese days." Suhpeople areentitledto their

opinions, and theyneed notbother reading the mahine-language parts of my

books. But the reasons for mahine language that I gave in the prefae to

Volume 1,writtenin theearly1960s,remainvalidtoday:

Oneoftheprinipal goals ofmybooksis toshowhowhigh-levelonstru-

tionsare atually implemented in mahines, notsimply to show how they

are applied. I explain oroutine linkage, tree strutures, random number

generation, high-preision arithmeti, radix onversion, paking of data,

ombinatorialsearhing, reursion,et.,from thegroundup.

The programs needed in my books are generally so short that their main

pointsanbegraspedeasily.

Peoplewhoare morethan asuallyinterestedin omputers shouldhaveat

least some idea of what the underlying hardware is like. Otherwise the

programstheywrite willbeprettyweird.

Mahinelanguageisneessaryinanyase,asoutputofsomeofthesoftware

thatI desribe.

Expressingbasi methods like algorithmsfor sorting andsearhing in ma-

hinelanguagemakesitpossibletoarryoutmeaningfulstudiesoftheeets

ofaheand RAMsizeandotherhardwareharateristis (memoryspeed,

pipelining, multiple issue, lookaside buers, thesize of ahe bloks, et.)

whenomparingdierentshemes.

Moreover, if I did use a high-level language, what language should it be? In

the 1960s I would probably have hosen Algol W; in the 1970s, I would then

have had to rewrite my books using Pasal; in the 1980s, I would surely have

hangedeverything toC; in the1990s,I would havehadto swith to C ++

and

thenprobablyto Java. Inthe2000s, yet anotherlanguage will nodoubtbede

(5)

rigueur. I annot aord thetime to rewritemy books as languages goin and

outoffashion; languagesaren'tthepointofmybooks,thepointisratherwhat

youan doin yourfavoritelanguage. Mybooksfousontimeless truths.

ThereforeIwillontinuetouseEnglishasthehigh-levellanguageinTheArt

of Computer Programming, and I will ontinue to use a low-level language

to indiate how mahines atually ompute. Readers who only want to see

algorithmsthatarealreadypakagedinaplug-inway,usingatrendylanguage,

shouldbuyotherpeople'sbooks.

Thegoodnewsis thatprogramming forMMIX ispleasantandsimple. This

fasilepresents

1) a programmer's introdution to the mahine (replaing Setion 1.3.1 of

Volume1);

2) theMMIXassembly language(replaingSetion1.3.2);

3) newmaterialonsubroutines,oroutines,andinterpretiveroutines(replaing

Setions1.4.1,1.4.2,and1.4.3).

Ofourse, MIX appearsin manyplaesthroughout Volumes1{3,anddozens of

programsneed to berewritten for MMIX. Readers whowould like to help with

thisonversion proessareenouraged tojoin the MMIXmasters, ahappy group

ofvolunteersbasedatmmixmasters.soureforge.net.

I am extremely grateful to all the people whohelped me with the design

of MMIX. In partiular, John Hennessy and Rihard L. Sites deserve speial

thanksfortheirativepartiipationandsubstantialontributions. Thanksalso

toVladimirIvanoviforvolunteeringtobetheMMIX grandmaster/webmaster.

Stanford, California D.E.K.

May 1999

You an, if you want, rewrite forever.

| NEIL SIMON,Rewrites: A Memoir (1996)

(6)

Chapter 1|Basi Conepts . . . 1

1.3. MMIX . . . 2

1.3.1. DesriptionofMMIX . . . 2

1.3.2. TheMMIXAssemblyLanguage . . . 28

1.4. SomeFundamentalProgrammingTehniques . . . 52

1.4.1. Subroutines . . . 52

1.4.2. Coroutines . . . 66

1.4.3. InterpretiveRoutines . . . 73

Answers to Exerises . . . 94

IndexandGlossary . . . 127

(7)

1.3. MMIX

Inmanyplaesthroughoutthisbookwewillhaveoasiontorefertoaom-

puter'sinternalmahinelanguage. Themahineweuseisa mythialomputer

alled \MMIX." MMIX|pronounedEM-miks|is very muh likenearly every

general-purposeomputer designedsine1985,exeptthatitis, perhaps,nier.

Thelanguage of MMIX ispowerfulenoughto allowbriefprogramsto bewritten

formostalgorithms,yetsimple enoughsothatitsoperationsareeasilylearned.

The reader is urged to study this setion arefully, sine MMIX language

appears in so many parts of this book. There should be no hesitation about

learningamahinelanguage;indeed,theauthoronefounditnotunommonto

bewritingprogramsinahalfdozendierentmahinelanguagesduringthesame

week! Everyonewithmorethanaasualinterestinomputerswillprobablyget

toknowatleastonemahine languagesooneror later. Mahinelanguagehelps

programmersunderstand whatreallygoesoninsidetheiromputers. Andone

onemahinelanguage hasbeenlearned,theharateristis ofanotherare easy

to assimilate. Computer sieneis largelyonerned withan understanding of

howlow-leveldetailsmakeitpossibletoahievehigh-levelgoals.

Software for runningMMIX programs on almost any real omputer an be

downloaded from the website for this book (seepage ii). The omplete soure

odefortheauthor'sMMIXroutinesappearsinthebookMMIXware[LetureNotes

in Computer Siene 1750 (1999)℄; that book will be alled \the MMIXware

doument"inthefollowingpages.

1.3.1. Desriptionof MMIX

MMIXisa polyunsaturated,100%natural omputer. Likemostmahines, ithas

anidentifyingnumber|the2009. Thisnumberwas foundbytaking 14atual

omputers very similar to MMIX and on whih MMIX ould easily be simulated,

thenaveragingtheirnumberswith equalweight:

CrayI +IBM801 +RISCII+ClipperC300+AMD29K+Motorola88K

+IBM601+Inteli960+Alpha21164+POWER2+MIPSR4000

+HitahiSuperH4+StrongARM110+Spar64

=14

=28126=14=2009: (1)

The same number may also be obtained in a simpler way by taking Roman

numerals.

Bits and bytes. MMIX works with patterns of 0s and 1s, ommonly alled

binarydigits or bits,and itusually deals with 64 bitsat a time. For example,

the64-bitquantity

1001111000110111011110011011100101111111010010100111110000010110 (2)

isa typialpattern that themahine mightenounter. Longpatternslikethis

anbeexpressedmore onvenientlyifwegroupthebits fourat atimeand use

(8)

hexadeimaldigits torepresenteah group. Thesixteen hexadeimaldigitsare

0=0000;

1=0001;

2=0010;

3=0011;

4=0100;

5=0101;

6=0110;

7=0111;

8=1000;

9=1001;

a=1010;

b=1011;

=1100;

d=1101;

e=1110;

f=1111:

(3)

Weshallalwaysuseadistintivetypefaeforhexadeimaldigits,asshownhere,

sothat theywon'tbeonfusedwiththedeimaldigits0{9;andwewillusually

alsoputthesymbol

#

justbeforeahexadeimalnumber,tomakethedistintion

evenlearer. Forexample,(2) beomes

#

9e3779b97f4a716 (4)

in hexadeimalese. Upperase digits ABCDEF are oftenused instead of abdef ,

beause

#

9E3779B97F4A7C16 looks better than

#

9e3779b97f4a716 in some

ontexts;thereisnodiereneinmeaning.

A sequene of eight bits, or two hexadeimal digits, is ommonly alled

a byte. Most omputers now onsider bytes to be their basi, individually

addressable units of information; we will see that an MMIX program an refer

toasmanyas2 64

bytes,eahwithitsownaddressfrom

#

0000000000000000 to

#

ffffffffffffffff. Letters, digits, and puntuationmarks of languages like

English areoften representedwithone byte perharater, using theAmerian

Standard Code for Information Interhange (ASCII). For example, the ASCII

equivalent of MMIX is

#

4d4d4958. ASCII is atually a 7-bit ode with ontrol

haraters

#

00{

#

1f,printingharaters

#

20{

#

7e,anda\delete"harater

#

7f

[see CACM 8 (1965), 207{214; 11 (1968), 849{852; 12 (1969), 166{178℄. It

wasextendedduringthe1980stoaninternationalstandard8-bitodeknownas

Latin-1or ISO8859-1,therebyenodingaentedletters: p^ate is

#

70e274e9.

\Ofthe 256th squadron?"

\Ofthe ghting256th Squadron," Yossarian replied.

::: \That'stwo tothe ghtingeighthpower."

| JOSEPH HELLER, Cath-22 (1961)

A16-bitodethatsupportsnearlyevery modernlanguagebeameaninter-

nationalstandard during the1990s. This ode, known as Uniodeor ISO/IEC

10646UCS-2,inludes notonlyGreekletterslikeS ands (

#

03a3 and

#

033),

CyrilliletterslikeWandw (

#

0429 and

#

0449),Armenian letterslike and

(

#

0547 and

#

0577), Hebrew letters like Y (

#

05e9), Arabi letters like –

(

#

0634), and Indian letters like f (

#

0936) or x (

#

09b6) or S (

#

0b36) or ‰

(

#

0bb7), et., but alsotensof thousandsof East Asian ideographssuh as the

Chinese harater for mathematis and omputing, (

#

7b97). It even has

speial odes for Roman numerals: MMIX =

#

216f216f21602169. Ordinary

ASCII or Latin-1 haraters are represented by simply giving them a leading

byteofzero: p^ate is

#

007000e2007400e9,al'Uniode.

(9)

Wewillusetheonvenienttermwydeto desribea 16-bitquantitylikethe

wideharatersof Uniode,beausetwo-byte quantities arequiteimportantin

pratie. Wealsoneedonvenientnamesforfour-byteandeight-bytequantities,

whihweshallall tetrabytes (or\tetras")andotabytes (or\otas"). Thus

2bytes=1 wyde;

2wydes=1 tetra;

2tetras=1 ota:

Oneotabyteequalsfour wydesequalseightbytes equalssixty-fourbits.

Bytesandmultibytequantitiesan,ofourse,representnumbersaswellas

alphabetiharaters. Usingthebinarynumbersystem,

anunsignedbytean expressthenumbers 0::255;

anunsignedwydeanexpressthenumbers0 ::65,535;

anunsignedtetraanexpressthenumbers0 ::4,294,967,295;

anunsignedotaanexpressthenumbers0 :: 18,446,744,073,709,551,615.

Integersarealsoommonlyrepresentedbyusingtwo'somplementnotation,in

whihtheleftmostbitindiatesthesign: Iftheleadingbitis1,wesubtrat2 n

to

gettheintegerorrespondingtoann-bitnumberinthisnotation. Forexample,

1isthesignedbyte

#

ff;itisalsothesignedwyde

#

ffff,thesignedtetrabyte

#

ffffffff,andthesignedotabyte

#

ffffffffffffffff. Inthis way

asignedbytean expressthenumbers 128 ::127;

asignedwydeanexpressthenumbers 32;768::32,767;

asignedtetraanexpressthenumbers 2;147;483;648::2,147,483,647;

asignedotaanexpressthenumbers 9;223;372;036;854;775;808::

9,223,372,036,854,775,807.

Memoryand registers. Fromaprogrammer'sstandpoint,anMMIXomputer

has 2 64

ells of memory and 2 8

general-purpose registers, together with 2 5

speial registers (see Fig. 13). Data is transferred from the memory to the

registers,transformedin theregisters, andtransferred fromtheregistersto the

memory. Theells ofmemoryarealled M[0℄,M[1℄,:::,M[2 64

1℄;thusifxis

anyotabyte,M[x℄isabyteofmemory. Thegeneral-purposeregistersarealled

$0,$1,:::,$255;thus ifxisanybyte,$xisanotabyte.

The 2 64

bytes of memory are grouped into 2 63

wydes, M

2

[0℄ = M

2 [1℄ =

M[0℄M[1℄,M

2 [2℄=M

2

[3℄=M[2℄M[3℄,:::;eahwydeonsistsoftwoonseutive

bytesM[2k℄M[2k+1℄=M[2k℄2 8

+M[2k+1℄,andisdenotedeitherbyM

2 [2k℄

orbyM

2

[2k+1℄. Similarlythereare2 62

tetrabytes

M

4

[4k℄=M

4

[4k+1℄==M

4

[4k+3℄=M[4k℄M[4k+1℄:::M[4k+3℄;

and2 61

otabytes

M

8

[8k℄=M

8

[8k+1℄==M

8

[8k+7℄=M[8k℄M[8k+1℄:::M[8k+7℄:

Ingeneral if x is any otabyte, thenotations M

2 [x℄, M

4

[x℄, and M

8

[x℄ denote

thewyde, thetetra, and the ota that ontain byte M[x℄; we ignore the least

(10)

$0:

$1:

$2:

.

.

. .

.

. .

.

. .

.

. .

.

. .

.

. .

.

. .

.

.

$254:

$255:

rA:

rB:

.

.

. .

.

. .

.

. .

.

. .

.

. .

.

. .

.

. .

.

.

rZZ:

M[0℄ M[1℄ M[2℄ M[3℄ M[4℄ M[5℄ M[6℄ M[7℄ M[8℄

M[2 64

1℄

M[2 64

2℄

M[2 64

3℄

M[2 64

4℄

M[2 64

5℄

M[2 64

6℄

M[2 64

7℄

M[2 64

8℄

M[2 64

9℄

Fig. 13. The MMIX omputer, as seen by a programmer, has 256 general-purpose

registersand 32speial-purposeregisters, togetherwith2 64

bytes ofvirtualmemory.

Eahregisterholds64bitsofdata.

signiantlgtbitsofxwhenreferringtoM

t

[x℄. Forompleteness,wealsowrite

M

1

[x℄=M[x℄,andwedeneM[x℄=M[xmod2 64

℄whenx<0or x2 64

.

The 32 speial registers of MMIX are alled rA, rB, :::, rZ, rBB, rTT,

rWW, rXX, rYY, and rZZ. Like theirgeneral-purpose ousins, theyeah hold

an otabyte. Their uses will be explained later; for example, we will see that

rAontrolsarithmetiinterrupts whilerRholdstheremainderafter division.

Instrutions. MMIX's memory ontains instrutions as well as data. An in-

strutionor\ommand"isatetrabytewhosefourbytesareonventionallyalled

OP,X,Y,andZ. OPistheoperationode(or\opode,"forshort);X,Y,andZ

speifytheoperands. Forexample,

#

20010203isaninstrutionwithOP=

#

20,

X=

#

01, Y=

#

02, and Z=

#

03, andit means\Set $1 to thesum of$2 and

$3." Theoperandbytesarealwaysregardedasunsigned integers.

Eah of the 256 possible opodes has a symboli form that is easy to re-

member. Forexample,opode

#

20isADD . Wewilldealalmostexlusivelywith

symboli opodes; the numeri equivalentsan be found, ifneeded, in Table1

below,andalsoin theendpapers ofthisbook.

The X,Y, and Zbytes alsohavesymboli representations, onsistentwith

the assembly language that we will disuss in Setion 1.3.2. For example,

the instrution

#

20010203 is onventionally written `ADD $1,$2,$3', and the

additioninstrutioningeneraliswritten`ADD$X,$Y,$Z'. Mostinstrutionshave

threeoperands,butsomeofthemhaveonlytwo,andafewhaveonlyone. When

therearetwooperands,therstisXandtheseondisthetwo-bytequantityYZ;

thesymboli notationthen hasonly one omma. For example, theinstrution

(11)

`INCL $X,YZ ' inreases register $Xby theamount YZ. Whenthere is onlyone

operand,itistheunsignedthree-bytenumberXYZ,andthesymboli notation

hasno omma at all. For example, we will see that `JMP +4*XYZ' tells MMIX

to nd itsnext instrutionby skipping ahead XYZtetrabytes; the instrution

`JMP+1000000'hasthehexadeimalform

#

f003d090,beauseJMP=

#

f0and

250000=

#

03d090.

We will desribe eah MMIX instrution bothinformally and formally. For

example,theinformal meaningof `ADD$X,$Y,$Z' is\Set $Xto thesumof $Y

and$Z";theformaldenitionis`s($X) s($Y)+s($Z)'. Heres(x)denotesthe

signedinteger orrespondingto thebitpattern x,aordingto theonventions

oftwo'somplementnotation. Anassignmentlikes(x) N meansthatxisto

besettothebitpatternforwhihs(x)=N. (Suhanassignmentausesinteger

overow if N is too large or too small to t in x. For example, an ADD will

overowifs($Y)+s($Z) isless than 2 63

or greaterthan2 63

1. Whenwe're

disussing an instrution informally, we will often gloss over the possibility of

overow;theformaldenition,however,willmakeeverythingpreise. Ingeneral

theassignments(x) NsetsxtothebinaryrepresentationofNmod2 n

,where

nisthenumberofbitsinx,anditsignalsoverowifN < 2 n 1

orN 2 n 1

;

seeexerise5.)

Loading and storing. Although MMIX has256 dierentopodes, we willsee

thattheyfallintoafeweasilylearnedategories. Let'sstartwiththeinstrutions

thattransferinformationbetweentheregistersandthememory.

Eah of the following instrutions has a memory address A obtained by

adding$Yto$Z. Formally,

A= u($Y)+u($Z)

mod2 64

(5)

isthesumoftheunsignedintegersrepresentedby$Yand$Z,reduedtoa64-bit

numberbyignoringanyarrythatoursattheleftwhenthosetwointegersare

added. Inthisformulathenotationu(x)isanalogoustos(x),but itonsidersx

tobeanunsignedbinarynumber.

LDB$X,$Y,$Z(loadbyte): s($X) s M

1 [A℄

.

LDW$X,$Y,$Z(loadwyde): s($X) s M

2 [A℄

.

LDT$X,$Y,$Z(loadtetra): s($X) s M

4 [A℄

.

LDO$X,$Y,$Z(loadota): s($X) s M

8 [A℄

.

Theseinstrutionsbringdata frommemoryintoregister$X,hangingthedata

ifneessary froma signed byte, wyde,or tetrabyteto a signed otabyte of the

samevalue. Forexample,supposetheotabyteM

8

[1002℄=M

8

[1000℄is

M[1000℄M[1001℄:::M[1007℄=

#

0123456789abdef: (6)

Thenif$2=1000and$3=2,wehaveA=1002,and

LDB $1,$2,$3 sets$1

#

0000000000000045;

LDW $1,$2,$3 sets$1

#

0000000000004567;

LDT $1,$2,$3 sets$1

#

0000000001234567;

LDO $1,$2,$3 sets$1

#

0123456789abdef:

(12)

Butif$3=5,sothat A=1005,

LDB $1,$2,$3 sets $1

#

ffffffffffffffab;

LDW $1,$2,$3 sets $1

#

ffffffffffff89ab;

LDT $1,$2,$3 sets $1

#

ffffffff89abdef;

LDO $1,$2,$3 sets $1

#

0123456789abdef:

Whena signedbyte orwydeor tetrais onverted to asigned ota,itssign bit

is\extended"into allpositionsto theleft.

LDBU$X,$Y,$Z(loadbyteunsigned): u($X) u M

1 [A℄

.

LDWU$X,$Y,$Z(loadwydeunsigned): u($X) u M

2 [A℄

.

LDTU$X,$Y,$Z(loadtetraunsigned): u($X) u M

4 [A℄

.

LDOU$X,$Y,$Z(loadotaunsigned): u($X) u M

8 [A℄

.

These instrutionsare analogousto LDB ,LDW ,LDT,andLDO , buttheytreatthe

memory data as unsigned; bit positions at the left of the register are set to

zero when a short quantity is being lengthened. Thus, in the example above,

LDBU$1,$2,$3with$2+$3=1005would set$1

#

00000000000000ab.

The instrutions LDO and LDOU atually have exatly the same behavior,

beausenosignextension orpadding withzeros isneessary whenanotabyte

is loaded into a register. Buta good programmer will use LDO when the sign

is relevant and LDOU when it is not; then readers of the program an better

understandthesignianeofwhatisbeingloaded.

LDHT$X,$Y,$Z(loadhightetra): u($X) u M

4 [A℄

2 32

.

Here the tetrabyte M

4

[A℄is loaded into the left half of $X, and therighthalf

is set to zero. For example, LDHT $1,$2,$3 sets $1

#

89abdef00000000,

assuming(6)with$2+$3=1005.

LDA$X,$Y,$Z(loadaddress): u($X) A.

This instrution, whih puts a memory address into a register, is essentially

thesame as the ADDU instrutiondesribed below. Sometimesthe words\load

address"desribeitspurposebetterthanthewords\addunsigned."

STB$X,$Y,$Z(store byte): s M

1 [A℄

s($X).

STW$X,$Y,$Z(store wyde): s M

2 [A℄

s($X).

STT$X,$Y,$Z(store tetra): s M

4 [A℄

s($X).

STO$X,$Y,$Z(store ota): s M

8 [A℄

s($X).

These instrutions go the other way, plaing register data into the memory.

Overowispossible ifthe(signed)numberintheregisterliesoutsidetherange

of the memory eld. For example, suppose register $1 ontains the number

65536=

#

ffffffffffff0000. Thenif$2=1000,$3=2,and(6)holds,

STB $1,$2,$3 sets M

8 [1000℄

#

0123006789abdef (withoverow);

STW $1,$2,$3 sets M

8 [1000℄

#

0123000089abdef (withoverow);

STT $1,$2,$3 sets M

8 [1000℄

#

ffff000089abdef;

STO $1,$2,$3 sets M

8 [1000℄

#

ffffffffffff0000:

(13)

STBU$X,$Y,$Z(storebyteunsigned):

u M

1 [A℄

u($X)mod2 8

.

STWU$X,$Y,$Z(storewydeunsigned):

u M

2 [A℄

u($X)mod2 16

.

STTU$X,$Y,$Z(storetetraunsigned):

u M

4 [A℄

u($X)mod2 32

.

STOU$X,$Y,$Z(storeotaunsigned): u M

8 [A℄

u($X).

These instrutions have exatly the same eet on memory as their signed

ounterpartsSTB ,STW ,STT ,andSTO ,but overowneverours.

STHT$X,$Y,$Z(storehightetra): u M

4 [A℄

u($X)=2 32

.

Thelefthalf ofregister$Xisstoredinmemorytetrabyte M

4 [A℄.

STCOX,$Y,$Z(storeonstantotabyte): u M

8 [A℄

X.

Aonstantbetween0and255 isstoredinmemoryotabyte M

8 [A℄.

Arithmetioperators. Mostof MMIX'soperationstakeplaestritlybetween

registers. We might as well begin our study of the register-to-register opera-

tionsbyonsideringaddition,subtration,multipliation,anddivision,beause

omputersaresupposedto beabletoompute.

ADD$X,$Y,$Z(add): s($X) s($Y)+s($Z).

SUB$X,$Y,$Z(subtrat): s($X) s($Y) s($Z).

MUL$X,$Y,$Z(multiply): s($X) s($Y)s($Z).

DIV$X,$Y,$Z(divide): s($X)

s($Y)=s($Z)

[$Z6=0℄,and

s(rR ) s($Y)mods($Z).

Sums,dierenes,and produtsneednofurther disussion. TheDIVommand

formsthequotientandremainderasdenedinSetion1.2.4;theremaindergoes

into thespeial remainder register rR,where it an be examinedby using the

instrutionGET$X,rRdesribedbelow. Ifthedivisor$Ziszero,DIVsets$X 0

andrR $Y(seeEq.1.2.4{(1));an\integer dividehek"alsoours.

ADDU$X,$Y,$Z(addunsigned): u($X) u($Y)+u($Z)

mod2 64

.

SUBU$X,$Y,$Z(subtratunsigned): u($X) u($Y) u($Z)

mod2 64

.

MULU$X,$Y,$Z(multiplyunsigned): u(rH$X) u($Y)u($Z).

DIVU $X,$Y,$Z (divide unsigned): u($X)

u(rD$Y)=u($Z)

, u(rR)

u(rD$Y)modu($Z),ifu($Z)>u(rD); otherwise$X rD, rR $Y.

Arithmetionunsigned numbersneverausesoverow. A full16-byte produt

isformedbytheMULUommand,andtheupperhalfgoesintothespeialhimult

register rH. For example, when the unsigned number

#

9e3779b97f4a716 in

(2)and(4)aboveismultipliedbyitselfweget

rH

#

618864680b583ea; $X

#

1bb32095dd51e4: (7)

Inthis asethevalueofrHhasturnedoutto beexatly 2 64

minus theoriginal

number

#

9e3779b97f4a716;this isnota oinidene! Thereasonis that(2)

atuallygives the rst64 bits of thebinary representation of the golden ratio

1

= 1, if we plae a binary radix point at the left. (See Table 2 in

AppendixA.) Squaringgivesusanapproximationtothebinaryrepresentation

of 2

=1

1

,withtheradixpointnowattheleftofrH.

(14)

Division with DIVU yields the 8-byte quotient and remainder of a 16-byte

dividend with respet to an 8-byte divisor. The upper half of the dividend

appears in the speial dividend register rD, whih is zero at the beginning of

a program; this register an be set to any desired value with the ommand

PUT rD,$Z desribed below. If rD is greater than or equal to the divisor,

DIVU $X,$Y,$Z simply sets $X rD and rR $Y. (This ase always arises

when$Ziszero.) ButDIVU neverausesaninteger dividehek.

The ADDU instrution omputes a memory address A, aording to deni-

tion(5); therefore,as disussedearlier,wesometimesgiveADDU thealternative

nameLDA . Thefollowingrelatedommandsalsohelpwithaddressalulation.

2ADDU$X,$Y,$Z(times2andaddunsigned):

u($X) u($Y)2+u($Z)

mod2 64

.

4ADDU$X,$Y,$Z(times4andaddunsigned):

u($X) u($Y)4+u($Z)

mod2 64

.

8ADDU$X,$Y,$Z(times8andaddunsigned):

u($X) u($Y)8+u($Z)

mod2 64

.

16ADDU$X,$Y,$Z(times16andaddunsigned):

u($X) u($Y)16+u($Z)

mod2 64

.

It is faster to exeutetheommand 2ADDU $X,$Y,$Y than to multiplyby 3,if

overowisnotanissue.

NEG$X,Y,$Z(negate): s($X) Y s($Z).

NEGU$X,Y,$Z(negateunsigned): u($X) Y u($Z)

mod2 64

.

In these ommands Y is simply an unsigned onstant, not a register number

(justasXwasanunsignedonstantintheSTCOinstrution). UsuallyYiszero,

inwhihaseweanwritesimplyNEG $X,$ZorNEGU $X,$Z.

SL$X,$Y,$Z(shift left): s($X) s($Y)2 u($Z)

.

SLU$X,$Y,$Z(shift leftunsigned): u($X) u($Y)2 u($Z)

mod2 64

.

SR$X,$Y,$Z(shift right): s($X)

s($Y)=2 u($Z)

.

SRU$X,$Y,$Z(shift rightunsigned): u($X)

u($Y)=2 u($Z)

.

SL and SLU both produe the same result in $X, but SL might overow while

SLU neverdoes. SRextends thesignwhenshiftingright,but SRUshiftszerosin

from theleft. ThereforeSRand SRU produethe sameresultin $Xif andonly

if$Yis nonnegativeor $Zis zero. TheSLand SRinstrutions aremuh faster

thanMUL andDIVbypowersof2. AnSLUinstrutionismuhfaster thanMULU

byapowerof2,althoughitdoesnotaetrHasMULUdoes. AnSRUinstrution

ismuhfasterthanDIVUbyapowerof2,althoughitisnotaetedbyrD. The

notationyz isoftenusedtodenotetheresultofshiftingabinaryvaluey to

theleftbyzbits;similarly,yzdenotesshiftingtotheright.

CMP$X,$Y,$Z(ompare):

s($X)

s($Y)>s($Z)

s($Y)<s($Z)

.

CMPU$X,$Y,$Z(ompareunsigned):

s($X)

u($Y)>u($Z)

u($Y)<u($Z)

.

These instrutions eah set $X to either 1, 0, or 1, depending on whether

register$Yislessthan, equalto,or greaterthanregister$Z.

(15)

Conditionalinstrutions. Severalinstrutions basetheirationsonwhether

aregisterispositive,ornegative,orzero,et.

CSN$X,$Y,$Z(onditionalsetifnegative): ifs($Y)<0,set$X $Z.

CSZ$X,$Y,$Z(onditionalsetifzero): if $Y=0,set$X $Z.

CSP$X,$Y,$Z(onditionalsetifpositive): ifs($Y)>0,set$X $Z.

CSOD$X,$Y,$Z(onditionalsetifodd): ifs($Y)mod2=1,set$X $Z.

CSNN$X,$Y,$Z(onditionalsetifnonnegative): ifs($Y)0,set$X $Z.

CSNZ$X,$Y,$Z(onditionalsetifnonzero): if $Y6=0,set $X $Z.

CSNP$X,$Y,$Z(onditionalsetifnonpositive): ifs($Y)0,set$X $Z.

CSEV$X,$Y,$Z(onditionalsetifeven): ifs($Y)mod2=0,set$X $Z.

Ifregister$Ysatisesthestatedondition,register$Zisopied toregister $X;

otherwise nothing happens. A register is negative if and only if its leading

(leftmost)bitis1. Aregisterisoddifandonlyifitstrailing(rightmost)bitis1.

ZSN$X,$Y,$Z(zeroorset ifnegative): $X $Z[s($Y)<0℄.

ZSZ$X,$Y,$Z(zeroorset ifzero): $X $Z[$Y=0℄.

ZSP$X,$Y,$Z(zeroorset ifpositive): $X $Z[s($Y)>0℄.

ZSOD$X,$Y,$Z(zeroorset ifodd): $X $Z[s($Y)mod2=1℄.

ZSNN$X,$Y,$Z(zeroorset ifnonnegative): $X $Z[s($Y)0℄.

ZSNZ$X,$Y,$Z(zeroorset ifnonzero): $X $Z[$Y6=0℄.

ZSNP$X,$Y,$Z(zeroorset ifnonpositive): $X $Z[s($Y)0℄.

ZSEV$X,$Y,$Z(zeroorset ifeven): $X $Z[s($Y)mod2=0℄.

Ifregister$Ysatisesthestatedondition,register$Zisopied toregister $X;

otherwiseregister$Xissetto zero.

Bitwise operations. Weoften nd it useful to think of an otabyte xas a

vetor v(x)of 64individual bits, and to performoperationssimultaneously on

eahomponentoftwo suhvetors.

AND$X,$Y,$Z(bitwise and): v($X) v($Y)^v($Z).

OR$X,$Y,$Z(bitwise or): v($X) v($Y)_v($Z).

XOR$X,$Y,$Z(bitwise exlusive-or): v($X) v($Y)v($Z).

ANDN$X,$Y,$Z(bitwise and-not): v($X) v($Y)^v($Z).

ORN$X,$Y,$Z(bitwise or-not): v($X) v($Y)_v($Z).

NAND$X,$Y,$Z(bitwise not-and): v($X) v($Y)^v($Z).

NOR$X,$Y,$Z(bitwise not-or): v($X) v($Y)_v($Z).

NXOR$X,$Y,$Z(bitwise not-exlusive-or): v($X) v($Y)v($Z).

Here v denotes the omplement of vetor v, obtained by hanging 0 to 1 and

1to0. Thebinaryoperations^,_,and,denedbytherules

0^0=0;

0^1=0;

1^0=0;

1^1=1;

0_0=0;

0_1=1;

1_0=1;

1_1=1;

00=0;

01=1;

10=1;

11=0;

(8)

are applied independently to eah bit. Anding is the same as multiplying or

takingtheminimum;oringisthesameastakingthemaximum. Exlusive-oring

isthesame asaddingmod 2.

(16)

MUX$X,$Y,$Z(bitwisemultiplex): v($X) v($Y)^v(rM) _ v($Z)^v(rM ) .

TheMUX operationombinestwo bitvetorsbylooking atthespeialmultiplex

maskregisterrM,hoosingbitsof$YwhererMis1andbitsof$ZwhererMis0.

SADD$X,$Y,$Z(sidewaysadd): s($X) s P

v($Y)^v($Z) .

TheSADD operationountsthenumberofbitpositionsinwhihregister$Yhas

a1 whileregister$Zhasa 0.

Bytewiseoperations. Similarly,weanregardanotabytexasavetorb(x)

ofeightindividual bytes, eah ofwhihis aninteger between0 and 255;or we

anthinkofitasavetor w(x)offourindividualwydes,ora vetort(x)oftwo

unsignedtetras. Thefollowingoperationsdealwithallomponentsatone.

BDIF$X,$Y,$Z(bytedierene): b($X) b($Y) .

b($Z).

WDIF$X,$Y,$Z(wydedierene): w($X) w($Y) .

w($Z).

TDIF$X,$Y,$Z(tetradierene): t($X) t($Y) .

t($Z).

ODIF$X,$Y,$Z(otadierene): u($X) u($Y) .

u($Z).

Here .

denotestheoperationofsaturating subtration,

y .

z = max(0;y z): (9)

These operations haveimportant appliations to text proessing, as wellas to

omputer graphis(when thebytesor wydesrepresentpixel values). Exerises

27{30disusssome oftheirbasi properties.

We an alsoregardan otabyte as an 88 Boolean matrix,that is, as an

88arrayof0sand1s. Letm(x)bethematrixwhoserowsfromtoptobottom

are thebytes of xfrom left to right; and let m T

(x) bethe transposed matrix,

whoseolumns arethebytesofx. Forexample,ifx=

#

9e3779b97f4a716is

theotabyte (2) ,wehave

m(x)= 0

B

B

B

B

B

B

B

B

B

1 0 0 1 1 1 1 0

0 0 1 1 0 1 1 1

0 1 1 1 1 0 0 1

1 0 1 1 1 0 0 1

0 1 1 1 1 1 1 1

0 1 0 0 1 0 1 0

0 1 1 1 1 1 0 0

0 0 0 1 0 1 1 0 1

C

C

C

C

C

C

C

C

C

A

; m

T

(x)= 0

B

B

B

B

B

B

B

B

B

1 0 0 1 0 0 0 0

0 0 1 0 1 1 1 0

0 1 1 1 1 0 1 0

1 1 1 1 1 0 1 1

1 0 1 1 1 1 1 0

1 1 0 0 1 0 1 1

1 1 0 0 1 1 0 1

0 1 1 1 1 0 0 0 1

C

C

C

C

C

C

C

C

C

A

: (10)

Thisinterpretation ofotabytessuggests two operationsthat arequitefamiliar

tomathematiians,but wewillpauseamomenttodenethemfrom srath.

IfAisanmnmatrixandB isannsmatrix,andifÆandarebinary

operations,thegeneralizedmatrixprodut A Æ

B isthemsmatrixCdened

by

C

ij

=(A

i1 B

1j ) Æ (A

i2 B

2j

)Æ Æ (A

in B

nj

) (11)

for 1 i m and 1 j s. [See K.E. Iverson, A Programming Language

(Wiley, 1962), 23{24; we assume that Æ is assoiative.℄ An ordinary matrix

produtisobtainedwhenÆis+andis,butweobtainimportantoperations

(17)

onBooleanmatriesifwelet Æbe_or:

(A _

B)

ij

=A

i1 B

1j _A

i2 B

2j

__A

in B

nj

; (12)

(A

B)

ij

=A

i1 B

1j A

i2 B

2j

A

in B

nj

: (13)

NotiethatiftherowsofAeahontainatmostone1,atmostonetermin(12)

or(13) isnonzero. Thesame is trueiftheolumns of B eah ontainat most

one1. ThereforeA _

B andA

B bothturnouttobethesameastheordinary

matrixprodutA +

B=ABin suh ases.

MOR$X,$Y,$Z(multipleor): m T

($X) m T

($Y) _

m

T

($Z);

equivalently,m($X) m($Z) _

m($Y). (Seeexerise 32.)

MXOR$X,$Y,$Z(multiple exlusive-or): m T

($X) m T

($Y)

m

T

($Z);

equivalently,m($X) m($Z)

m($Y).

Theseoperationsessentially seteahbyteof$Xbylookingattheorresponding

byte of$Z and usingits bits to selet bytes of $Y; theseleted bytes are then

oredorxoredtogether. If,forexample,wehave

$Z=

#

0102040810204080; (14)

thenbothMOR andMXOR willset register$Xto thebytereversal ofregister $Y:

Thekthbytefromtheleftof$Xwillbesettothekthbytefromtherightof$Y,

for1k8. Ontheother hand if $Z=

#

00000000000000ff, MOR andMXOR

willsetallbytesof$Xtozeroexeptfortherightmostbyte,whihwillbeome

eithertheORortheXORofalleightbytesof$Y. Exerises33{37illustratesome

ofthemanypratialappliations oftheseversatile ommands.

Floatingpointoperators. MMIXinludesafullimplementationofthefamous

IEEE/ANSIStandard754foroatingpointarithmeti. Completedetailsofthe

oatingpointoperationsappearin Setion4.2 andin theMMIXware doument;

aroughsummary willsuÆeforourpurposeshere.

Every otabyte x represents a oating binary number f(x) determined as

follows: Theleftmostbitofxisthesign(0=`+',1=` ');thenext11bitsare

theexponent E;theremaining52bitsarethefration F.Thevaluerepresented

isthen

0:0,ifE=F=0(zero);

2 1074

F,ifE=0and F6=0 (denormal);

2 E 1023

(1+F=2 52

),if0<E<2047(normal);

1,ifE=2047andF=0(innite);

NaN(F=2 52

),ifE=2047andF6=0(Not-a-Number).

The\short" oating point numberf(t) represented bya tetrabyte t is similar,

but its exponent part has only8 bits and itsfration hasonly 23; the normal

ase0<E<255ofa shortoatrepresents2 E 127

(1+F=2 23

).

FADD$X,$Y,$Z(oatingadd): f($X) f($Y)+f($Z).

FSUB$X,$Y,$Z(oatingsubtrat): f($X) f($Y) f($Z).

FMUL$X,$Y,$Z(oatingmultiply): f($X) f($Y)f($Z).

FDIV$X,$Y,$Z(oatingdivide): f($X) f($Y)=f($Z).

(18)

FREM$X,$Y,$Z(oatingremainder): f($X) f($Y)remf($Z).

FSQRT$X,$Zor FSQRT$X,Y,$Z(oatingsquareroot): f($X) f($Z) 1=2

.

FINT$X,$Zor FINT$X,Y,$Z(oatinginteger): f($X) intf($Z).

FCMP$X,$Y,$Z(oatingompare):s($X) [f($Y)>f($Z)℄ [f($Y)<f($Z)℄.

FEQL$X,$Y,$Z(oatingequalto): s($X) [f($Y)=f($Z)℄.

FUN$X,$Y,$Z(oatingunordered): s($X) [f($Y)kf($Z)℄.

FCMPE$X,$Y,$Z(oatingomparewithrespettoepsilon):

s($X)

f($Y)f($Z) f(rE)

f($Y)f($Z) f(rE)

,see4.2.2{(21).

FEQLE$X,$Y,$Z(oatingequivalentwithrespettoepsilon):

s($X)

f($Y)f($Z) f(rE)

,see4.2.2{(24).

FUNE$X,$Y,$Z(oatingunorderedwithrespettoepsilon):

s($X)

f($Y)kf($Z) f(rE)

.

FIX$X,$Z orFIX$X,Y,$Z(onvertoatingtoxed): s($X) intf($Z).

FIXU$X,$Zor FIXU$X,Y,$Z(onvert oatingtoxedunsigned):

u($X) intf($Z)

mod2 64

.

FLOT$X,$Zor FLOT$X,Y,$Z(onvert xedtooating): f($X) s($Z).

FLOTU$X,$Zor FLOTU$X,Y,$Z(onvertxedto oatingunsigned):

f($X) u($Z).

SFLOT$X,$Zor SFLOT$X,Y,$Z(onvertxedto shortoat):

f($X) f(T) s($Z).

SFLOTU$X,$ZorSFLOTU$X,Y,$Z(onvertxedtoshort oatunsigned):

f($X) f(T) u($Z).

LDSF$X,$Y,$ZorLDSF $X,A(loadshortoat): f($X) f(M

4 [A℄).

STSF$X,$Y,$ZorSTSF $X,A(storeshort oat): f(M

4

[A℄) f($X).

Assignment to a oating point quantity uses the urrent rounding mode to

determinetheappropriatevaluewhenanexatvalueannot beassigned. Four

roundingmodesaresupported: 1 (ROUND_OFF),2 (ROUND_UP),3 (ROUND_DOWN),

and4(ROUND_NEAR). TheYeldofFSQRT ,FINT ,FIX ,FIXU ,FLOT ,FLOTU,SFLOT ,

andSFLOTUanbeusedtospeifyaroundingmodeotherthantheurrentone,

ifdesired. Forexample,FIX $X,ROUND_UP,$Zsetss($X)

f($Z)

. Operations

SFLOTandSFLOTUrstroundasifstoringintoananonymoustetrabyteT,then

theyonvertthat numbertootabyteform.

The `int' operationrounds to an integer. The operationyremz isdened

tobey nz, wherenis thenearestintegerto y=z,or thenearesteven integer

inaseofa tie. Speialrulesapplywhentheoperandsareinniteor NaN,and

speialonventionsgovernthe signofa zeroresult. The values+0:0 and 0:0

havedierentoatingpointrepresentations,butFEQLallsthemequal. Allsuh

tehnialitiesareexplainedintheMMIXware doument,andSetion4.2explains

whythetehnialities areimportant.

Immediate onstants. Programs often need to deal with small onstant

numbers. For example, we might want to add or subtrat 1 from a register,

or we mightwant to shiftby 32, et. Insuh ases it's a nuisane to load the

small onstant frommemory into anotherregister. So MMIX providesa general

mehanism by whih suh onstants an be obtained \immediately" from an

(19)

instrution itself: Every instrution we have disussed so far has a variant in

whih $Z is replaed by the number Z, unless the instrution treats $Z as a

oatingpointnumber.

For example, `ADD $X,$Y,$Z' has a ounterpart `ADD $X,$Y,Z', meaning

s($X) s($Y)+Z; `SRU$X,$Y,$Z' hasa ounterpart `SRU$X,$Y,Z', meaning

u($X)

u($Y)=2 Z

; `FLOT $X,$Z ' has a ounterpart `FLOT $X,Z ', meaning

f($X) Z. But`FADD $X,$Y,$Z'hasnoimmediateounterpart.

The opode for `ADD $X,$Y,$Z' is

#

20 and the opode for `ADD $X,$Y,Z'

is

#

21; weusethesamesymbolADDin bothases forsimpliity. Ingeneralthe

opodefortheimmediatevariantofanoperationisonegreaterthantheopode

fortheregistervariant.

Several instrutions also feature wyde immediate onstants, whih range

from

#

0000 =0 to

#

ffff =65535. These onstants, whih appearin theYZ

bytes, an be shifted into the high, medium high, medium low, or low wyde

positionsofanotabyte.

SETH$X,YZ(sethighwyde): u($X) YZ2 48

.

SETMH$X,YZ(setmediumhigh wyde): u($X) YZ2 32

.

SETML$X,YZ(setmediumlowwyde): u($X) YZ2 16

.

SETL$X,YZ(setlowwyde): u($X) YZ.

INCH$X,YZ(inreasebyhighwyde): u($X) u($X)+YZ2 48

mod2 64

.

INCMH$X,YZ(inreasebymediumhigh wyde):

u($X) u($X)+YZ2 32

mod2 64

.

INCML$X,YZ(inreasebymediumlowwyde):

u($X) u($X)+YZ2 16

mod2 64

.

INCL$X,YZ(inreasebylowwyde): u($X) u($X)+YZ

mod2 64

.

ORH$X,YZ(bitwiseor withhighwyde): v($X) v($X)_v(YZ48).

ORMH$X,YZ(bitwiseorwith mediumhighwyde):

v($X) v($X)_v(YZ32).

ORML$X,YZ(bitwiseorwith mediumlowwyde):

v($X) v($X)_v(YZ16).

ORL$X,YZ(bitwiseor withlowwyde): v($X) v($X)_v(YZ).

ANDNH$X,YZ(bitwise and-nothighwyde): v($X) v($X)^v(YZ 48).

ANDNMH$X,YZ(bitwise and-notmediumhighwyde):

v($X) v($X)^v(YZ32).

ANDNML$X,YZ(bitwise and-notmediumlowwyde):

v($X) v($X)^v(YZ16).

ANDNL$X,YZ(bitwise and-notlowwyde): v($X) v($X)^v(YZ ).

Usingat mostfouroftheseinstrutions,weangetanydesiredotabyteintoa

registerwithoutloadinganythingfromthememory. Forexample,theommands

SETH$0,#0123; INCMH$0,#4567; INCML $0,#89ab; INCL $0,#def

put

#

0123456789abdef intoregister $0.

TheMMIX assembly language allows usto write SET as anabbreviation for

SETL ,andSET$X,$YasanabbreviationfortheommonoperationOR$X,$Y,0.

(20)

Jumps and branhes. Instrutions are normally exeuted in their natural

sequene. Inotherwords,theommandthatisperformedafterMMIXhasobeyed

thetetrabytein memoryloationisnormallythetetrabyte foundinmemory

loation+4. (Thesymboldenotestheplaewherewe're\at.") Butjump

andbranhinstrutions allowthissequenetobeinterrupted.

JMPRA(jump): RA.

Here RA denotes a three-byte relative address, whih ould be written more

expliitlyas+4XYZ ,namelyXYZtetrabytesfollowingtheurrentloation.

Forexample,`JMP+4*2 'isasymboliformforthetetrabyte

#

f0000002;ifthis

instrutionappearsin loation

#

1000, thenext instrutiontobeexeuted will

betheonein loation

#

1008. Wemightinfat write`JMP#1008 ';butthenthe

valueofXYZwould depend ontheloationjumpedfrom.

Relative osets an also be negative, in whih ase the opode inreases

by1 andXYZis theosetplus2 24

. Forexample, `JMP-4*2 ' isthetetrabyte

#

f1fffffe. Opode

#

f0tells theomputerto\jumpforward"andopode

#

f1

tells it to \jump bakward," but we write both as JMP . In fat, we usually

writesimply `JMPAddr 'when wewanttojumpto loationAddr ,and theMMIX

assemblyprogramguresouttheappropriateopodeandtheappropriatevalue

ofXYZ.Suhajumpwillbepossible unlesswetrytostraymorethanabout67

millionbytesfromourpresentloation.

GO$X,$Y,$Z(go): u($X) +4,then A.

TheGOinstrutionallowsustojumptoanabsoluteaddress,anywhereinmem-

ory;thisaddressAisalulatedbyformula(5) ,exatlyasintheload andstore

ommands. Beforegoingtothespeiedaddress,theloationoftheinstrution

that would ordinarilyhave ome nextis plaed into register $X. Therefore we

ould return to that loation later by saying, for example, `GO $X,$X,0', with

Z=0as animmediateonstant.

BN$X,RA(branhifnegative): ifs($X)<0,set RA.

BZ$X,RA(branhifzero): if $X=0,set RA.

BP$X,RA(branhifpositive): ifs($X)>0,set RA.

BOD$X,RA (branhifodd): ifs($X)mod2=1,set RA.

BNN$X,RA (branhifnonnegative): ifs($X)0,set RA.

BNZ$X,RA (branhifnonzero): if $X6=0,set RA.

BNP$X,RA (branhifnonpositive): ifs($X)0,set RA.

BEV$X,RA (branhifeven): ifs($X)mod2=0,set RA.

A branh instrution is a onditional jump that depends on the ontents of

register$X. TherangeofdestinationaddressesRA ismorelimited thanitwas

withJMP,beauseonlytwobytesareavailabletoexpresstherelativeoset;but

stillweanbranhtoanytetrabytebetween 2 18

and+2 18

4.

PBN$X,RA (probablebranhifnegative): ifs($X)<0,set RA.

PBZ$X,RA (probablebranhifzero): if $X=0,set RA.

PBP$X,RA (probablebranhifpositive): ifs($X)>0,set RA.

PBOD$X,RA(probable branhifodd): ifs($X)mod2=1,set RA.

PBNN$X,RA(probable branhifnonnegative): ifs($X)0,set RA.

(21)

PBNZ$X,RA(probablebranhifnonzero): if $X6=0,set RA.

PBNP$X,RA(probablebranhifnonpositive): ifs($X)0,set RA.

PBEV$X,RA(probablebranhifeven): ifs($X)mod2=0,set RA.

High-speedomputersusuallyworkfastestiftheyanantiipatewhenabranh

will be taken, beause foreknowledge helps them look ahead and get ready for

futureinstrutions. ThereforeMMIXenouragesprogrammerstogivehintsabout

whetherbranhing islikelyornot. Whenevera branh isexpeted tobetaken

morethanhalfofthetime,a wiseprogrammerwillsayPBinsteadofB.

*Subroutine alls. MMIX also hasseveral instrutions that failitateeÆient

ommuniationbetweensubprograms,viaaregisterstak. Thedetailsaresome-

whattehnialandwewilldeferthemuntilSetion1.4;aninformaldesription

willsuÆehere. Shortprogramsdonotneedtousethesefeatures.

PUSHJ$X,RA (pushregisters and jump): push(X)and set rJ +4, then

set RA.

PUSHGO$X,$Y,$Z(pushregistersandgo): push(X)andsetrJ +4,then

set A.

Thespeialreturn-jumpregisterrJissettotheaddressofthetetrabytefollowing

thePUSH ommand. Theation\push(X)"means,roughly speaking, thatloal

registers $0 through $X are saved and made temporarily inaessible. What

used to be $(X+1) is now $0, what used to be $(X+2) is now $1, et. But

allregisters$k for krGremainunhanged; rGis thespeialglobal threshold

register,whosevaluealwaysliesbetween32and255,inlusive.

Register$kisalledglobal ifkrG. Itisalledloalifk<rL;hererListhe

speialloal thresholdregister,whihtellshowmanyloalregistersareurrently

ative. Otherwise, namelyif rL k <rG, register $k is alled marginal,and

$k is equal to zerowheneverit is used as a soureoperandin a ommand. If

a marginal register $k is used as a destination operand in a ommand, rL is

automatially inreased to k+1 before the ommand is performed, thereby

making$k loal.

POPX,YZ(popregistersandreturn): pop(X),then rJ+4YZ.

Here \pop(X)" means, roughly speaking, that all but X of the urrent loal

registersbeomemarginal,andthentheloalregistershiddenbythemostreent

\push"thathasnotyetbeen\popped"arerestoredtotheirformervalues. Full

detailsappearin Setion1.4,togetherwithnumerousexamples.

SAVE$X,0(saveproessstate): u($X) ontext.

UNSAVE$Z(restoreproessstate): ontext u($Z).

The SAVE instrution stores all urrent registers in memory at the top of the

registerstak, andputs theaddressofthetopmost storedotabyteinto u($X).

Register$Xmustbeglobal; thatis,X mustberG. Alloftheurrentlyloal

and global registers are saved, together with speial registers like rA, rD, rE,

rG, rH, rJ, rM, rR, and several others that we have not yet disussed. The

UNSAVE instrution takes the address of suh a topmost otabyte and restores

theassoiatedontext, essentially undoing apreviousSAVE. The valueofrL is

set to zeroby SAVE ,but restored by UNSAVE. MMIX has speial registers alled

(22)

theregister stak oset (rO) andregister stak pointer (rS), whih ontrol the

PUSH , POP , SAVE , and UNSAVE operations. (Again, full details an befound in

Setion1.4.)

*System onsiderations. Several opodes, intended primarily for ultrafast

and/or parallel versions of the MMIX arhiteture, are of interest only to ad-

vanedusers,but weshouldatleastmentionthemhere. Someoftheassoiated

operations are similar to the \probable branh" ommands, in the sense that

theygivehintstothemahineabouthowtoplanaheadformaximumeÆieny.

Mostprogrammersdonotneedtousetheseinstrutions,exeptperhapsSYNCID .

LDUNC$X,$Y,$Z(loadotaunahed): s($X) s M

8 [A℄

.

STUNC$X,$Y,$Z(storeotaunahed): s M

8 [A℄

s($X).

These ommands perform the same operationsas LDO and STO, but they also

inform the mahine that theloaded or stored otabyte and its near neighbors

willprobablynotbereadorwritten inthenearfuture.

PRELDX,$Y,$Z(preloaddata).

SaysthatmanyofthebytesM[A℄throughM[A+X℄willprobablybeloaded or

storedinthenearfuture.

PRESTX,$Y,$Z(prestoredata).

Says that all of the bytes M[A℄ through M[A+X℄ will denitely be written

(stored)beforetheyarenextread(loaded).

PREGOX,$Y,$Z(prefethtogo).

Says that manyof thebytes M[A℄ throughM[A+X℄ will probably be used as

instrutionsin thenearfuture.

SYNCIDX,$Y,$Z(synhronizeinstrutions anddata).

SaysthatallofthebytesM[A℄throughM[A+X℄mustbefethedagainbefore

beinginterpreted as instrutions. MMIX is allowed to assumethat a program's

instrutionsdonothangeaftertheprogramhasbegun,unlesstheinstrutions

havebeenpreparedbySYNCID. (Seeexerise57.)

SYNCDX,$Y,$Z(synhronizedata).

Says that all of bytes M[A℄ throughM[A+X℄ must bebroughtup to date in

the physial memory, so that other omputers and input/output devies an

readthem.

SYNCXYZ (synhronize).

Restrits parallel ativities so that dierent proessors an ooperate reliably;

seeMMIXware fordetails. XYZmustbe0,1,2,or 3.

CSWAP$X,$Y,$Z(ompareand swapotabytes).

Ifu(M

8

[A℄)=u(rP),whererPisthespeialpreditionregister,setu(M

8 [A℄)

u($X)and u($X) 1. Otherwiseset u(rP) u(M

8

[A℄) and u($X) 0. This

isanatomi(indivisible) operation,usefulwhenindependentomputerssharea

ommonmemory.

LDVTS$X,$Y,$Z(loadvirtualtranslationstatus).

Thisinstrution,desribedinMMIXware,isfortheoperatingsystemonly.

(23)

*Interrupts. The normal ow of instrutions from one tetrabyte to the next

an be hanged not only by jumps and branhes but also by less preditable

events like overow or external signals. Real-world mahines must also ope

withsuhthingsasseurityviolationsandhardwarefailures. MMIXdistinguishes

two kindsof programinterruptions: \trips" and \traps." A tripsends ontrol

toa trip handler,whih ispart oftheuser'sprogram;atrap sendsontrol toa

trap handler,whih ispartoftheoperatingsystem.

Eight kinds of exeptional onditions an arise when MMIX is doing arith-

meti, namely integer divide hek (D), integer overow (V), oat-to-x over-

ow (W), invalid oating operation (I), oating overow (O), oating under-

ow (U), oating division by zero (Z), and oating inexat (X). The speial

arithmeti status register rA holds urrent information about all these exep-

tions. Theeightbitsofitsrightmost byte arealleditseventbits,andtheyare

namedD_BIT(

#

80),V_BIT(

#

40),:::,X_BIT (

#

01),inorderDVWIOUZX.

Theeight bits justto theleft of theeventbits in rA are alled theenable

bits; theyappearin the same orderDVWIOUZX. Whenan exeptional ondi-

tionours duringsome arithmeti operation, MMIX looksat theorresponding

enable bitbefore proeedingto thenext instrution. Iftheenable bit is 0,the

orrespondingeventbitissetto1;otherwisethemahineinvokesatriphandler

by \tripping" to loation

#

10 for exeption D,

#

20 for exeption V, :::,

#

80

forexeptionX. Thus theevent bitsofrA reordtheexeptionsthathavenot

ausedtrips. (Ifmorethanoneenabledexeptionours,theleftmostonetakes

preedene. For example,simultaneous Oand XishandledbyO.)

ThetwobitsofrAjusttotheleftoftheenablebitsholdtheurrentrounding

mode, mod 4. The other46 bits ofrA should bezero. A programan hange

thesettingofrAat anytime,usingthePUTommand disussedbelow.

TRIPX,Y,ZorTRIP X,YZorTRIP XYZ(trip).

Thisommandforesatripto thehandlerat loation

#

00.

Wheneveratripours,MMIXusesvespeialregisterstoreordtheurrent

state: thebootstrapregisterrB,thewhere-interruptedregisterrW,theexeution

register rX,the Yoperand register rY, andtheZ operandregister rZ.First rB

isset to$255,then$255issettorJ,andrWissetto +4. Thelefthalf ofrX

issetto

#

80000000, andtherighthalfisset totheinstrutionthattripped. If

theinterruptedinstrutionwas nota storeommand,rYis setto$YandrZ is

set to $Z(or to Z in ase ofan immediate onstant); otherwise rYis set to A

(thememoryaddress of thestoreommand) andrZ is set to$X (thequantity

tobestored). Finallyontrolpasses tothehandlerbysettingto thehandler

address(

#

00or

#

10 oror

#

80).

TRAPX,Y,ZorTRAP X,YZorTRAP XYZ(trap).

ThisommandisanalogoustoTRIP ,butitforesatraptotheoperatingsystem.

SpeialregistersrBB, rWW,rXX, rYY,and rZZtaketheplaeof rB,rW,rX,

rY,andrZ;thespeialtrap address register rTsuppliestheaddressofthetrap

handler,whih is plaed in . Setion 1.3.2desribesseveral TRAP ommands

that provide simple input/output operations. The normal way to onlude a

(24)

programis tosay `TRAP0 '; this instrutionisthe tetrabyte 00000000, so you

mightrunintoitbymistake.

The MMIXware doument gives further details about external interrupts,

whih are governed by the speial interrupt mask register rK and interrupt

requestregister rQ.Dynamitraps,whih arisewhenrK^rQ6=0,arehandled

ataddress rTTinsteadof rT.

RESUME0(resumeafter interrupt).

If s(rX) is negative, MMIX simply sets rW and takes its next instrution

fromthere. Otherwise,iftheleading byteofrX iszero,MMIX sets rW 4

and exeutes the instrution in the lower half of rX as if it had appeared in

that loation. (This feature an be used even if no interrupt has ourred.

Theinserted instrutionmust notitself beRESUME.) OtherwiseMMIX performs

speialationsdesribedin theMMIXwaredoumentandofinterestprimarilyto

theoperatingsystem;seeexerise 1.4.3{14.

The omplete instrutionset. Table1showsthesymbolinamesofall256

opodes,arrangedbytheirnumerivaluesinhexadeimalnotation. Forexample,

ADDappearsin theupperhalfoftherowlabeled

#

2xandintheolumnlabeled

#

0 at the top,so ADD is opode

#

20; ORL appearsin the lower half of therow

labeled

#

Exandintheolumnlabeled

#

Batthebottom,soORLisopode

#

EB.

Table 1 atually says `ADD[I℄ ', not `ADD ', beause the symbol ADD really

standsfortwoopodes. Opode

#

20arisesfromADD$X,$Y,$Zusingregister$Z,

while opode

#

21 arises from ADD $X,$Y,Z using the immediate onstant Z.

Whenadistintionisneessary,wesaythatopode

#

20isADDandopode

#

21

isADDI(\add immediate");similarly,

#

F0 isJMPand

#

F1 isJMPB(\jumpbak-

ward"). Thisgiveseveryopodeauniquename. However,theextraIandBare

generallydroppedforonveniene whenwewriteMMIXprograms.

Wehavedisussednearly allof MMIX'sopodes. Twoofthestragglersare

GET$X,Z (getfrom speialregister): u($X) u(g[Z℄),where0Z<32.

PUTX,$Z (putintospeialregister): u(g[X℄) u($Z),where0X<32.

Eahspeialregisterhasaodenumberbetween0and31. Wespeakofregisters

rA,rB,:::,asaidstohumanunderstanding;butregisterrAisreallyg[21℄from

themahine'spointofview,andregisterrBisreallyg[0℄,et. Theodenumbers

appearin Table2onpage21.

GETommandsareunrestrited,butertainthingsannotbePUT: Novalue

anbeputintorGthatisgreaterthan255,lessthan32,orlessthantheurrent

setting of rL. No valuean be put into rA that is greater than

#

3ffff. If a

program tries to inrease rL with the PUT ommand, rL will stay unhanged.

Moreover,aprogramannot PUTanythinginto rC,rN,rO,rS,rI, rT,rTT,rK,

rQ,rU,orrV;these\extraspeial"registershaveodenumbersintherange8{18.

Mostofthespeialregistershavealreadybeenmentionedinonnetionwith

spei instrutions,but MMIX also hasa \lokregister" or yle ounter, rC,

whih keeps advaning; an interval ounter, rI, whih keeps dereasing, and

whih requests an interruptwhen itreaheszero; a serial number register, rN,

whih gives eah MMIX mahine a unique number; a usage ounter, rU, whih

(25)

Table1

THE OPCODES OF MMIX

#

0

#

1

#

2

#

3

#

4

#

5

#

6

#

7

TRAP5 FCMP FUN FEQL FADD4 FIX4 FSUB4 FIXU4

#

0x

#

0x

FLOT [I℄4 FLOTU[I℄4 SFLOT [I℄4 SFLOTU[I℄4

FMUL4 FCMPE4 FUNE FEQLE4 FDIV40 FSQRT40 FREM4 FINT4

#

1x

#

1x

MUL [I℄10 MULU [I℄10 DIV[I℄60 DIVU [I℄60

ADD [I℄ ADDU [I℄ SUB[I℄ SUBU [I℄

#

2x

#

2x

2ADDU [I℄ 4ADDU[I℄ 8ADDU [I℄ 16ADDU[I℄

CMP [I℄ CMPU [I℄ NEG[I℄ NEGU [I℄

#

3x

#

3x

SL[I℄ SLU [I℄ SR[I℄ SRU [I℄

BN[B℄+ BZ [B℄+ BP[B℄+ BOD [B℄+

#

4x

#

4x

BNN [B℄+ BNZ [B℄+ BNP[B℄+ BEV [B℄+

PBN [B℄3 PBZ [B℄3 PBP[B℄3 PBOD [B℄3

#

5x

#

5x

PBNN [B℄3 PBNZ [B℄3 PBNP [B℄3 PBEV [B℄3

CSN [I℄ CSZ [I℄ CSP[I℄ CSOD [I℄

#

6x

#

6x

CSNN [I℄ CSNZ [I℄ CSNP [I℄ CSEV [I℄

ZSN [I℄ ZSZ [I℄ ZSP[I℄ ZSOD [I℄

#

7x

#

7x

ZSNN [I℄ ZSNZ [I℄ ZSNP [I℄ ZSEV [I℄

LDB [I℄+ LDBU [I℄+ LDW[I℄+ LDWU [I℄+

#

8x

#

8x

LDT [I℄+ LDTU [I℄+ LDO[I℄+ LDOU [I℄+

LDSF [I℄+ LDHT [I℄+ CSWAP [I℄2+2 LDUNC [I℄+

#

9x

#

9x

LDVTS [I℄ PRELD[I℄ PREGO [I℄ GO [I℄3

STB [I℄+ STBU [I℄+ STW[I℄+ STWU [I℄+

#

Ax

#

Ax

STT [I℄+ STTU [I℄+ STO[I℄+ STOU [I℄+

STSF [I℄+ STHT [I℄+ STCO [I℄+ STUNC [I℄+

#

Bx

#

Bx

SYNCD [I℄ PREST[I℄ SYNCID [I℄ PUSHGO[I℄3

OR[I℄ ORN [I℄ NOR[I℄ XOR [I℄

#

Cx

#

Cx

AND [I℄ ANDN [I℄ NAND [I℄ NXOR [I℄

BDIF [I℄ WDIF [I℄ TDIF [I℄ ODIF [I℄

#

Dx

#

Dx

MUX [I℄ SADD [I℄ MOR[I℄ MXOR [I℄

SETH SETMH SETML SETL INCH INCMH INCML INCL

#

Ex

#

Ex

ORH ORMH ORML ORL ANDNH ANDNMH ANDNML ANDNL

JMP [B℄ PUSHJ[B℄ GETA [B℄ PUT [I℄

#

Fx

#

Fx

POP3 RESUME5 [UN℄SAVE20+ SYNC SWYM GET TRIP5

#

8

#

9

#

A

#

B

#

C

#

D

#

E

#

F

=2ifthebranhistaken,=0ifthebranhisnottaken

inreasesby1wheneverspeiedopodesareexeuted;andavirtualtranslation

register,rV,whihdenesamappingfromthe\virtual"64-bitaddressesusedin

programstothe\atual"physialloationsof installedmemory. These speial

registers help make MMIX a omplete, viable mahine that ould atually be

builtand run suessfully; but they are not of importane to us in this book.

TheMMIXware doumentexplainsthem fully.

GETA$X,RA(getaddress): u($X) RA.

This instrutionloads a relative address into register $X, using the same on-

ventionsastherelativeaddressesinbranhommands. Forexample,GETA$0,

willset$0totheaddressoftheinstrutionitself.

(26)

Table2

SPECIAL REGISTERS OF MMIX

ode saved? put?

rA arithmetistatusregister . . . 21

p p

rB bootstrapregister(trip) . . . 0

p p

rC yleounter . . . 8

rD dividendregister . . . 1

p p

rE epsilon register . . . 2

p p

rF failureloationregister . . . 22

p

rG globalthresholdregister . . . 19

p p

rH himultregister . . . 3

p p

rI intervalounter . . . 12

rJ return-jumpregister . . . 4

p p

rK interruptmaskregister . . . 15

rL loalthresholdregister. . . 20

p p

rM multiplexmaskregister . . . 5

p p

rN serialnumber. . . 9

rO registerstakoset . . . 10

rP preditionregister. . . 23

p p

rQ interruptrequestregister. . . 16

rR remainderregister. . . 6

p p

rS registerstakpointer . . . 11

rT trapaddressregister. . . 13

rU usageounter. . . 17

rV virtualtranslationregister . . . 18

rW where-interruptedregister(trip) . . . 24

p p

rX exeutionregister(trip) . . . 25

p p

rY Yoperand(trip) . . . 26

p p

rZ Zoperand(trip) . . . 27

p p

rBB bootstrapregister(trap) . . . 7

p

rTT dynamitrapaddressregister. . . 14

rWW where-interruptedregister(trap) . . . 28

p

rXX exeutionregister(trap) . . . 29

p

rYY Yoperand(trap) . . . 30

p

rZZ Zoperand(trap) . . . 31

p

SWYMX,Y,Zor SWYMX,YZor SWYMXYZ (sympathizewithyourmahinery).

The last of MMIX's 256 opodes is, fortunately, the simplest of all. In fat, it

is often alled a no-op, beause it performs no operation. It does, however,

keepthemahine runningsmoothly,justas real-world swimming helpsto keep

programmershealthy. BytesX,Y, andZareignored.

Timing. In later parts of this book we will often want to ompare dierent

MMIX programsto see whih is faster. Suh omparisons aren't easy to make,

ingeneral,beausetheMMIX arhiteturean beimplementedinmanydierent

ways. Although MMIX is a mythial mahine, its mythial hardware exists in

heap,slowversions aswell asinostlyhigh-performanemodels. Therunning

timeofaprogramdependsnotonlyonthelokratebutalsoonthenumberof

(27)

funtionalunits thatanbeativesimultaneouslyandthedegreetowhihthey

arepipelined; itdependsonthetehniquesused toprefeth instrutions before

theyareexeuted; itdependsonthe sizeof therandom-aess memorythat is

used to give the illusion of 2 64

virtual bytes; and it depends on the sizes and

alloationstrategies ofahesandotherbuers,et.,et.

For pratial purposes,the runningtime ofanMMIX program an oftenbe

estimated satisfatorily by assigning a xed ost to eah operation, based on

the approximate running time that would be obtained on a high-performane

mahine withlots of main memory;so that's what wewill do. Eah operation

willbeassumedto takeanintegernumberof, where(pronouned\oops")*

is a unit that represents the lok yle time in a pipelined implementation.

Althoughthevalueofdereasesastehnologyimproves,wealwayskeepupwith

thelatestadvanesbeausewemeasure timeinunits of,notin nanoseonds.

Therunningtimeinourestimateswillalsobeassumedtodependonthenumber

ofmemoryreferenesor mems that aprogram uses;thisis thenumberofload

andstore instrutions. For example, wewill assume that eah LDO (load ota)

instrutionosts+,where istheaverageostofa memoryreferene. The

totalrunningtimeofaprogrammightbereportedas,say,35+1000,meaning

\35memsplus1000oops." Theratio=hasbeeninreasingsteadilyformany

years; nobody knows forsure whether this trend will ontinue, but experiene

hasshownthat and deservetobeonsideredindependently.

Table 1,whih isrepeatedalsoin theendpapers ofthis book,displays the

assumedrunningtimetogetherwitheahopode. Notiethatmostinstrutions

takejust1,whileloadsandstorestake+. Abranhorprobablebranhtakes

1if preditedorretly, 3 ifpredited inorretly. Floatingpoint operations

usuallytake4eah,althoughFDIVandFSQRTost40. Integer multipliation

takes10;integer divisionweighsinat 60.

Eventhough wewill often use the assumptions of Table 1 for seat-of-the-

pantsestimatesofrunningtime,wemustrememberthattheatualrunningtime

might be quite sensitive to the ordering of instrutions. For example, integer

divisionmightostonly oneyle ifwean nd60other thingsto dobetween

thetime we issue the ommandand the time we need theresult. Several LDB

(loadbyte)instrutions mightneedtoreferenememoryonlyone,iftheyrefer

to the same otabyte. Yet the result of a load ommand is usually notready

for use in the immediately following instrution. Experiene has shown that

some algorithmswork well with ahe memory, and others donot; therefore

is not really onstant. Even the loation of instrutions in memory an have

a signiant eet on performane, beause some instrutions an be fethed

togetherwithothers. ThereforetheMMIXwarepakageinludesnotonlyasimple

simulator, whih alulates running times by the rules of Table 1, but also a

omprehensivemeta-simulator,whihrunsMMIXprogramsunderawiderangeof

dierenttehnologialassumptions. Usersofthemeta-simulatoranspeifythe

* TheGreekletterupsilon()iswiderthananitalilettervee(v),buttheauthoradmits

thatthisdistintionisrathersubtle.Readerswhoprefertosayveeinsteadofoopsarefreeto

doastheywish. Thesymbolis,however,anupsilon.

(28)

harateristisofthememorybusandtheparametersofsuhthingsasahesfor

instrutions and data, virtual address translation, pipelining andsimultaneous

instrutionissue,branhpredition,et. Givenaongurationleandaprogram

le, the meta-simulator determines preisely how long the speied hardware

wouldneedtoruntheprogram. Onlythemeta-simulatoranbetrustedtogive

reliable information about a program's atual behavior in pratie; but suh

results an bediÆult to interpret, beause innitely manyongurations are

possible. That'swhyweoftenresorttothemuhsimplerestimatesofTable1.

No benhmark resultshouldever betakenat fae value.

|BRIANKERNIGHANandCHRISTOPHERVAN WYK (1998)

MMIX versus reality. A person who understands the rudiments of MMIX

programminghasaprettygoodideaofwhattoday'sgeneral-purposeomputers

an doeasily; MMIX is very muh likeallof them. ButMMIX hasbeenidealized

in several ways, partly beausethe author hastried to design a mahine that

is somewhat \aheadof its time" so that it won't beome obsolete too quikly.

Therefore a briefomparison between MMIX and the omputers atually being

builtattheturnofthemillenniumisappropriate. Themaindierenesbetween

MMIXandthosemahinesare:

Commerialmahines donotignore thelow-order bitsof memoryaddresses,

asMMIXdoeswhenaessingM

8

[A℄;theyusuallyinsistthatAbeamultiple

of8. (Wewillndmanyusesforthose preiouslow-orderbits.)

Commerial mahines are usually deient in theirsupport of integer arith-

meti. Forexample,theyalmostneverproduethetruequotientbx=yand

trueremainderxmodywhenxisnegativeoryisnegative;theyoftenthrow

awaytheupperhalf of aprodut. Theydon'ttreatleft andright shiftsas

strit equivalentsofmultipliationand divisionbypowers of2. Sometimes

theydonotimplementdivisioninhardwareatall;andwhentheydohandle

division, theyusuallyassumethat theupperhalfofthe128-bitdividendis

zero. Suhrestritionsmakehigh-preisionalulationsmore diÆult.

CommerialmahinesdonotperformFINT andFREMeÆiently.

Commerial mahines do not (yet?) have thepowerfulMOR and MXOR opera-

tions. Theyusuallyhavea halfdozenor soadhoinstrutionsthat handle

onlythemostommonspeialasesof MOR.

Commerialmahinesrarelyhavemorethan64general-purposeregisters. The

256 registers of MMIX signiantly derease programlength, beausemany

variables and onstants of a program an live entirely in those registers

instead of in memory. Furthermore, MMIX's register stak is more exible

thantheomparablemehanismsin existingomputers.

All of these pluses forMMIX have assoiated minuses, beauseomputer design

always involves tradeos. The primary design goal for MMIX was to keep the

mahine as simple and lean and onsistent and forward-looking as possible,

withoutsariingspeedandrealismtoogreatly.

Références

Documents relatifs

version of genlex order; similarly , the method of \plain hanges&#34; (Algorithmj. 7.2.1.2P) visits permutations in a genlex order of the

Moreover, Algorithm H produes partitions in lexiographi order of

[r]

Note: This examination consists of ten questions on one page. a) Considering two arbitrary vectors a and b in three-dimensional Cartesian space, how can the angle between

No calculators are to be used on this exam. Note: This examination consists of ten questions on one page. Give the first three terms only. a) What is an orthogonal matrix?

Le rang ~minent de ces deux ill, stres math~maticiens a ddcid~ le Comit~ de Direction iz r~aliser pour cette lois une exdcution des m~dailles comportant l'dffijie

Die Coefficienten der bestandig convergirenden P0tenzrcihe, in die F(x) entwickelt werden kann, wOrden sicht leieht ergeben, wGnn diese Frage bejaht werden

ARKIV FOR