HAL Id: hal-01241394
https://hal.archives-ouvertes.fr/hal-01241394
Submitted on 10 Dec 2015
OCamlDoom: ML for 3D action games
François Pessaux
To cite this version:
François Pessaux. OCamlDoom: ML for 3D action games. ACM SIGPLAN Workshop on ML, Sep
1998, Baltimore, United States. ⟨hal-01241394⟩
François Pessaux, INRIA Rocquencourt, Francois.Pessaux@inria.fr

Abstract

This paper describes a 3D graphics engine with texture mapping for Doom-style computer games entirely written in Objective Caml. This work demonstrates the applicability of ML for interactive computer graphics.

1 Introduction
The traditional area of application for ML is symbolic processing: theorem proving, compilers, ... It has been claimed that ML is ill-suited to other areas by lack of speed and access to low-level machine features. Recent work on network protocols implemented in ML [1, 4] has refuted this claim. In this paper, we attack another stronghold of C (and even assembly language) programming: real-time 3D graphics as found in action computer games such as id Software's infamous "Doom".
We describe a 3D graphics engine with texture mapping entirely written in Objective Caml. The purpose is twofold: first, give a tutorial introduction to the main algorithms used in this kind of graphics engine; second, study the adequacy of ML for such applications. We show that ML's datatypes and recursive functions are a natural match for those algorithms. We also study the performance obtained with the Objective Caml compiler and compare it against that of a C implementation of the same algorithms. The Caml implementation delivers approximately 75% of the performance of the C implementation, and achieves a highly respectable frame rate of 100 frames per second on a 333 MHz Pentium II (see footnote 1).
The implementation described in this paper is not state-of-the-art: it is clearly not as optimized as it should be in order to make a real game. More powerful algorithms exist to get a more realistic rendering, but they are too complex for a tutorial introduction. Finally, we only deal with the rendering of the scenery, but not with sprites, ballistic aspects, nor sound. Still, the model we talk about is a good starting point before going deeper into 3D graphics programming.
The remainder of this paper is organized as follows. We first introduce the basic idea behind Doom-like engines. Then we inspect the core data structure of the renderer, i.e. the BSP tree, and see how it can be implemented in ML. In section 3 we go deeper into the rendering mechanism to understand how the view is built before being displayed. Section 4 shows performance figures and compares them with the C version of the same program. Section 5 discusses possible enhancements to make the program more efficient.

2 Basics of pseudo-3D rendering
For performance reasons, games such as Doom do not perform full 3D rendering, but restrict themselves to a particular class of models that we call pseudo-3D. In these models, the virtual world where players move is represented as a 2-dimensional map viewed from above, with the third dimension (the height) added afterwards. For instance, here is how a simple rectangular room is represented:
[Figure: a rectangular room seen from above, with its lines and vertices labeled.]

Footnote 1: 25 to 30 frames per second is generally considered acceptable, as it corresponds to TV-quality animation.
Such a room, also called a sector, is composed of several vertices (4 in our example) connected by oriented lines, also called linedefs. These lines define the walls of the room where the player moves. The height is given as a separate attribute of the sector.
To display this room on the screen, each linedef is projected onto the player's field of vision, according to its position in space. Because a linedef represents in fact a surface of constant height, once projected on the plane representing the visible part of the space, it leads to a simple polygon. (This is the reason why these graphics engines are called polygon-based renderers.)
[Figure: a wall seen from the eye between the two extremities of the field of vision, projected onto the plane representing the screen; the resulting polygon to draw is the wall rendered on the screen.]
Before going deeper into the projection-rendering process, let us address the problem of knowing which walls are visible from a specific position. This problem is important for two reasons. First, it conditions the correct rendering of overlapping walls: we draw walls starting with those that are closest to the player, and never draw over a wall that has already been drawn. This way, we can stop rendering as soon as the screen is filled; this would not be possible with a simple painter's algorithm, which forces us to draw all walls. Second, we avoid computing and displaying walls that lie outside of the player's field of vision.
To address this issue, we represent the scenery by a binary space partitioning (BSP) tree. This is a static data structure computed at the time when the virtual world is designed, and saved in the file describing that world.

2.1 The BSP tree
A BSP tree represents a recursive, hierarchical partitioning of an n-dimensional space by hyperplanes. In our case, the dimension is 2, and we will partition it with hyperplanes that are 1-dimensional objects, that is, lines.
The process for building such a tree is rather simple: choose a partition line l_p and determine the set L of linedefs in the map which lie on the left of l_p, as well as the set R of linedefs in the map which lie on the right of l_p. Recursively represent L and R by BSPs. Finally, create a node, labeled by l_p, with left and right children the two BSPs representing L and R.
[Figure: a world composed of two sectors, with linedefs numbered 0 to 8.]
Consider the world depicted above. It is composed of two sectors. Note that at the junction of these two sectors, there are two linedefs of opposite direction, not only one. This is because a sector must be a closed polygon.

To decompose this world into a BSP, we choose a partition line along linedef 4. Hence, the top node of the BSP tree is labeled by 4, the left child represents the set containing linedefs {0, 1, 2, 3}, and the right child represents {5, 6, 7, 8}.

In some cases, the chosen partition line can cross some linedefs. Hence, it is impossible to determine whether these linedefs are on the left or on the right of the partitioning line. For instance, assume we partition the world above along linedef 3. Then, segment 6 is neither on the left nor on the right of the partition line:
[Figure: the same world, with the partition line drawn along linedef 3 and crossing linedef 6.]
In this case, we split the crossed linedefs in two by adding a new vertex at the intersection point. Hence, the left child of the top node (labeled 3) is {6', 7, 8} and the right child is {0, 1, 2, 3, 4, 5, 6}.
[Figure: the same world after the split, with linedef 6 cut into 6 and 6'.]
The choice of partition lines influences the shape of the tree. In some applications, the criterion for choosing partition lines can be very important. For the sake of simplicity, in our case, we choose partition lines at random among the linedefs of the world.
To complete the description of the algorithm, we need a criterion to know when to stop partitioning. This criterion depends on the intended use of the BSP. For example, one can decide to stop when a limit is reached on the number of linedefs contained in a leaf, or the tree depth, or the number of splits, or the number of leaves, ... In our application, we stop partitioning when:

- all linedefs in a leaf belong to the same sector, and
- all linedefs in a leaf form a convex polygon (see footnote 2).

The first point makes sure that all linedefs in a leaf have the same height (because they belong to the same sector, and the height is an attribute of the sector).
The second point makes sure that when drawing walls in a leaf, we can draw them in any order without risk of overlapping. To see that this is not the case when the polygon is not convex, consider the following example, where the viewing direction is the bold arrow.
[Figure: a non-convex set of linedefs numbered 0 to 4, with the viewing direction shown as a bold arrow.]
If we draw linedef 1 before linedef 0, the view looks correct, but in the inverse order the closer linedef 0 is masked by linedef 1 and the result is visually wrong.
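The construction described in this section can be sketched as follows. This is a minimal illustration, not the BSP compiler used with OCamlDoom: the names `point`, `seg`, `bsp`, `side`, `split_point` and `build` are hypothetical, segments are stored directly instead of as indices, the partition line is simply the first segment of the list rather than a random one, and partitioning stops when at most one segment remains instead of applying the same-sector and convexity criteria.

```ocaml
type point = { x : float; y : float }
type seg = { a : point; b : point }           (* oriented segment a -> b *)
type bsp = Leaf of seg list | Node of seg * bsp * bsp

(* Cross product test: > 0 if p is on the left of the oriented line a -> b
   (y axis pointing up), < 0 if on the right, 0 if on the line. *)
let side a b p =
  (b.x -. a.x) *. (p.y -. a.y) -. (b.y -. a.y) *. (p.x -. a.x)

(* Intersection of the partition line with segment s, assuming the two
   endpoints of s lie strictly on opposite sides of the line. *)
let split_point part s =
  let da = side part.a part.b s.a and db = side part.a part.b s.b in
  let t = da /. (da -. db) in
  { x = s.a.x +. t *. (s.b.x -. s.a.x);
    y = s.a.y +. t *. (s.b.y -. s.a.y) }

let rec build = function
  | [] -> Leaf []
  | [s] -> Leaf [s]
  | part :: rest ->
    let classify (l, r) s =
      let da = side part.a part.b s.a and db = side part.a part.b s.b in
      if da >= 0. && db >= 0. then (s :: l, r)        (* entirely on the left *)
      else if da <= 0. && db <= 0. then (l, s :: r)   (* entirely on the right *)
      else begin
        (* Crossed segment: split it at the intersection point. *)
        let m = split_point part s in
        let first = { a = s.a; b = m } and second = { a = m; b = s.b } in
        if da > 0. then (first :: l, second :: r)
        else (second :: l, first :: r)
      end
    in
    let (l, r) = List.fold_left classify ([], []) rest in
    Node (part, build l, build r)
```

Each recursive call receives strictly fewer segments than its caller, since the partition segment is removed and a split segment contributes one piece to each side, so the construction terminates.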
Now that we know how to build a BSP tree, let us discuss its relevance to our graphics engine. The first problem was to perform hidden surface removal, i.e. to display the closest walls first. To achieve this, all we need to do is walk the tree in the following way:
1. Determine the player's position.
2. If the tree is a leaf, then display each line of this leaf.
3. If the tree is a node, classify the player's position according to the partition line (i.e. determine whether he is on its left or on its right).
   - If on the left, then recurse on the left subtree, display the partition line, recurse on the right subtree.
   - If on the right, then recurse on the right subtree, display the partition line, recurse on the left subtree.
   - If on the partition line, choose to proceed like one of the previous cases.

Footnote 2: In some cases, it is possible that there are not enough linedefs at a leaf to form a closed polygon. We also accept non-closed polygons as long as they are "convex" in the following sense: the angle between consecutive linedefs l_n and l_(n+1) must not exceed 180°. This can easily be determined by the sign of a dot product.
This traversal ensures that if two walls overlap from the player's standpoint, then the closer one will be visited first.
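The classification of the player's position in step 3 amounts to a sign test. The sketch below is one possible way to write the `get_point_position` function that the tree-walking code of section 2.2 relies on; its body is an assumption, not the paper's code. It assumes a y axis pointing up, so that "left" means counter-clockwise from the oriented line (x1, y1) -> (x2, y2); with a y axis pointing down the two sides are swapped.

```ocaml
type position = P_Left | P_Right | P_On

(* Position of the point (px, py) relative to the oriented line
   going from (x1, y1) to (x2, y2). *)
let get_point_position px py x1 y1 x2 y2 =
  (* z component of the cross product (end - start) x (point - start) *)
  let cross = (x2 -. x1) *. (py -. y1) -. (y2 -. y1) *. (px -. x1) in
  if cross > 0.0 then P_Left
  else if cross < 0.0 then P_Right
  else P_On
```

For a line going from (0, 0) to (0, 1), a player at (-1, 0) is classified as `P_Left` and a player at (1, 0) as `P_Right`.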
Now we still have to address the problem of determining which walls are in the player's field of vision. To handle this in a simple but effective way, we add two bounding boxes (axis-aligned) at each node of the tree, one for each subtree. These boxes contain the coordinates of the minimal rectangle bounding the linedefs of the left child (respectively right child). So, before recursing in a subtree, we simply check the player's field of vision against the bounding box. This can be achieved quickly and with sufficient precision using quadrants [7]. To achieve this, we split the space (let's assume it is centered on the player's position) in four parts along the x and y axes. Then we consider the direction of the leftmost ray in the player's field of vision according to the direction (by blocks of 90°) the player is currently looking at.
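As an illustration of the two ingredients just introduced, here is a hedged sketch of how a bounding box and a quadrant index could be computed. The function names `bbox_of_segments` and `quadrant` are hypothetical, segments are given as pairs of (x, y) points, and angles are assumed to be in degrees; only the (x1, y1, x2, y2) layout of the box follows the paper.

```ocaml
(* Minimal axis-aligned rectangle (x1, y1, x2, y2) bounding a list of
   segments, with x1 <= x2 and y1 <= y2. *)
let bbox_of_segments segs =
  List.fold_left
    (fun (x1, y1, x2, y2) ((ax, ay), (bx, by)) ->
       (min x1 (min ax bx), min y1 (min ay by),
        max x2 (max ax bx), max y2 (max ay by)))
    (infinity, infinity, neg_infinity, neg_infinity)
    segs

(* Index (0 to 3) of the 90-degree block containing an angle in degrees. *)
let quadrant angle =
  let a = mod_float angle 360.0 in
  let a = if a < 0.0 then a +. 360.0 else a in
  int_of_float (a /. 90.0) land 3
```

In a BSP compiler, such boxes would be computed once per subtree at build time, so that the renderer only performs comparisons at run time.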
If we only consider that the user's cone of vision is the central ray pointing toward the direction the player is looking at, then the following figure shows the conditions relating the player's position and the bounding box coordinates under which this bounding box can be visible (i.e. the ray can intersect the box).
[Figure: the plane split into four quadrants around the player (axes X and Y). Each quadrant shows a bounding box with corners (x1, y1), (x2, y1), (x2, y2), (x1, y2) and a pair of visibility conditions: "PlayerX < x2 and PlayerY > y1", "PlayerY > y1 and PlayerX > x1", "PlayerX < x2 and PlayerY < y2", "PlayerX > x1 and PlayerY < y2".]
In fact, the player's cone of vision is wider than a simple ray. We assume that the angle of this cone is lower than 180°. By considering the leftmost extremity of this cone, we make sure that no ray in this cone crosses a quadrant on the left of the one the player is looking at. But, depending on the real angle where the player is looking, some rays of the cone can travel into the quadrant just on the right (not further, because we assumed the cone to be narrower than 180°). Then, to be correct, the visibility test must not perform, for each quadrant, the two tests given in each part of the figure, but rather the only test common to the quadrant and the quadrant immediately on its right. This test eliminates, for one quadrant, all bounding boxes lying in the half-plane other than the one defined by the quadrant and the one immediately on its right.
Hence, the visibility test is performed by:

    let is_right_visible playerX playerY playerLeftAngle (x1, y1, x2, y2) =
      (* One test per 90-degree block of the leftmost ray's direction *)
      if playerLeftAngle < 90.0 then playerX <= x2
      else if playerLeftAngle < 180.0 then playerY < y2
      else if playerLeftAngle < 270.0 then playerX >= x1
      else playerY >= y1

2.2 BSP tree structure
The BSP tree structure is elegantly described by the following ML datatype:

    type tree =
      | Leaf of int list
      | Node of node
    and node = {
      (* Label of partition line for the current node *)
      partline : int ;
      (* Bounding box for the left subtree *)
      leftbbox : (float * float * float * float) ;
      (* Left subtree : contains linedefs on *)
      (* the left of the partition line *)
      left : tree ;
      (* Bounding box for the right subtree *)
      rightbbox : (float * float * float * float) ;
      (* Right subtree : contains linedefs on *)
      (* the right of the partition line *)
      right : tree
    }
Note that linedefs are represented as integers, indices into a global array; this makes it easier to save the structure to disk.
The recursive function to walk down the tree in the order described in section 2.1 is also easily written in ML:

    let rec front_to_back_tree_parsing tree =
      if not (screen_full ()) then begin
        match tree with
        | Leaf linedefs -> List.iter draw_linedef linedefs
        | Node nd ->
            (* Get partition line start point *)
            let v1 = lines.(nd.partline).start
            (* Get partition line end point *)
            and v2 = lines.(nd.partline).stop in
            match get_point_position !playerX !playerY
                    v1.x v1.y v2.x v2.y with
            | P_Left ->
                if is_left_visible nd.leftbbox
                then front_to_back_tree_parsing nd.left ;
                draw_linedef nd.partline ;
                if is_right_visible nd.rightbbox
                then front_to_back_tree_parsing nd.right
            | P_Right | P_On ->
                if is_right_visible nd.rightbbox
                then front_to_back_tree_parsing nd.right ;
                draw_linedef nd.partline ;
                if is_left_visible nd.leftbbox
                then front_to_back_tree_parsing nd.left
      end

3 More on rendering
We now know how linedefs are passed to the renderer. It is now necessary to render each linedef individually as a wall on the screen.
The first step is to transform linedef coordinates into the player's coordinate system. This is simply achieved by a translation plus a rotation on the coordinates of the linedef's vertices. Then the linedef obtained has to be clipped in case it would be behind the player.
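The translation plus rotation can be sketched as below. This is an illustrative version only: the `player` record, the explicit `player` argument, and the convention that the player looks along the positive x axis after the transform (so that depth is the x coordinate) are assumptions, not details given in the paper.

```ocaml
type player = { px : float; py : float; angle : float (* radians *) }

(* Express the world point (x, y) in the player's coordinate system:
   translate so that the player is at the origin, then rotate by -angle
   so that the viewing direction becomes the +x axis. *)
let rotate_translate player (x, y) =
  let dx = x -. player.px and dy = y -. player.py in
  let c = cos player.angle and s = sin player.angle in
  (dx *. c +. dy *. s, dy *. c -. dx *. s)

(* Under the convention above, a point is behind the player when its
   depth is not strictly positive. *)
let is_behind_us x = x <= 0.0
```

A linedef whose two transformed extremities both satisfy `is_behind_us` can be discarded at once; if only one does, the linedef has to be clipped.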
At this point, we know that a part of the wall the linedef represents is potentially visible. We need to project it on the virtual plane representing the screen using a simple mechanism of perspective, in order to get the horizontal coordinates of this wall on the screen surface. We now know where this wall will extend horizontally on the surface of the screen, but we still need to clip against the player's field (cone) of vision to remove non-visible parts (see following figure).
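[Figure: view from above (axes X and Y). A wall from (xstart, ystart) to (xend, yend) is clipped to (new_xstart, new_ystart) and (new_xend, new_yend) and projected onto the surface of the screen, located at the focal distance from the eye.]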
Once this is done, we know the horizontal extent of the wall on the screen. Now the problem is to get the height of this wall. This height can change along the wall because of the perspective effect. In fact, computing the height of the starting and ending wall points is sufficient; a linear interpolation will give us the height at any point in between.
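A minimal sketch of the projection and of the height interpolation follows. The focal distance of 160 pixels, the sign convention for screen columns, and the names `projected_height` and `column_heights` are assumptions made for the example; the `project` function here is only one possible reading of the projection step. Viewer-space points are assumed to have a strictly positive depth x and a lateral coordinate y.

```ocaml
let screen_width = 320
let focal = 160.0   (* assumed focal distance, in pixels *)

(* Perspective projection of a viewer-space point to a screen column:
   points on the viewing axis land on the middle column. *)
let project x y = screen_width / 2 - int_of_float (focal *. y /. x)

(* On-screen height, in pixels, of a wall of the given height at depth x. *)
let projected_height wall_height x = focal *. wall_height /. x

(* Height of every column between two projected extremities
   (col0, h0) and (col1, h1), with col0 <= col1, by linear interpolation. *)
let column_heights (col0, h0) (col1, h1) =
  let n = col1 - col0 in
  List.init (n + 1) (fun i ->
    let t = if n = 0 then 0.0 else float_of_int i /. float_of_int n in
    (col0 + i, h0 +. t *. (h1 -. h0)))
```

For example, `column_heights (0, 100.0) (4, 50.0)` yields the heights 100, 87.5, 75, 62.5 and 50 for columns 0 to 4.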
The description above is sufficient for drawing plain walls like those of a closed room. In the worlds we try to model, it is possible to have two adjacent rooms, each sharing one of its sides with the other. This corresponds to "doors" and "windows" between rooms. In this case, the limit linedef must not be drawn, otherwise it would look like the player cannot walk from one room to the other. A solution could be to check whether the linedef we want to draw is a two-sector junction, and not to render it in this case. In fact, this solution is not correct because our worlds can have two adjacent sectors with different ceiling and/or floor altitudes. So in some cases the junction can look like a step on the floor or on the ceiling (see the following figure).
[Figure: a screen showing two adjacent sectors: walls 1, 2, 4 and 5, the main part of wall 6, the upper and lower parts of wall 3 at the junction, floor 1, ceilings 1 and 2, and one screen column highlighted.]
For this reason each linedef contains 3 different textures: the upper texture, which is used if the wall shows an upper visible part; the lower texture, which is used if the wall shows a lower visible part; and the main texture, which is used to paint the wall if it is plain (that is, if it is not a junction between two sectors). We can notice that if a lower or upper texture is needed, the main one is not used. This corresponds to the case where the wall is a sector junction.
When lower and upper textures are used, instead of drawing one plain wall, we need to draw two walls: the upper part of the wall, and its lower part (see figure above). Of course, one of these parts can be null (or even both; in this case, the wall is simply a junction between two sectors with the same ceiling and floor altitudes).
Hence, from this description, we see that we need to compute, for each wall, the vertical start and stop positions of each extremity of this wall (intermediate values are computed by linear interpolation) before being able to really draw it.
So, the linedef drawing routine in ML looks like:

    let draw_linedef linedef =
      let line = lines.(linedef) in
      (* Transform line extremities coords *)
      (* into the viewer space system. *)
      let (xstart, ystart) = rotate_translate line.start in
      let (xend, yend) = rotate_translate line.stop in
      (* Check if the wall is completely invisible *)
      if (is_behind_us xstart) && (is_behind_us xend) then ()
      else
        let (new_xstart, new_xend, new_ystart, new_yend) =
          clip xstart xend ystart yend in
        (* Project coords on the computer screen space *)
        let scr_xstart = project new_xstart new_ystart
        and scr_xend = project new_xend new_yend in
        (* Check if the wall is completely out of our *)
        (* screen. In this case, do not compute anymore *)
        if (scr_xstart = scr_xend) || (scr_xend < 0)
           || (scr_xstart > screen_width) then ()
        else if line.maintx <> 255 then
          add_wall new_xstart new_xend scr_xstart scr_xend
        else begin
          add_wall new_xstart new_xend scr_xstart scr_xend ;
          if line.lowertx <> 255 then
            add_wall new_xstart new_xend scr_xstart scr_xend
        end
This pseudocode is a simplified version of the real code implemented in OCamlDoom. We can see that we took the convention of 255 meaning "no texture used". In the real implementation, before calling add_wall, we need to compute a few coefficients used for texture mapping, and the vertical wall extremities.
Displaying a wall (done by add_wall in the above pseudocode) consists in drawing it vertical line by vertical line over its whole visible length. Each time a vertical line is drawn in a screen column, if this line totally fills the column, we mark this column. If a column is already marked as filled, of course, we don't draw on it again (this is part of hidden face removal). If a column is partially filled, we don't mark it, but we record which part is still unfilled.
A strong invariant of Doom-like worlds is that the unfilled part of a column is always a contiguous space comprised between the bottom and the top of the screen. Hence we only need to record two integers per column. Notice that adding a wall also draws the ceiling (resp. floor) visible in this column. Because of simplifications used in our engine, floors and ceilings are not textured, so a simple plain vertical line drawing routine is sufficient in this case.
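The per-column bookkeeping can be sketched with two integer arrays. This is a hypothetical illustration of the invariant, not the engine's code: the names `top`, `bottom`, `visible_span`, `fill_column`, `fill_from_top` and `fill_from_bottom` are invented for the example, and rows are counted from the top of a 320x200 screen.

```ocaml
let screen_width = 320
let screen_height = 200

(* For each column c, the rows still unfilled are top.(c) .. bottom.(c) - 1. *)
let top = Array.make screen_width 0
let bottom = Array.make screen_width screen_height
let filled_columns = ref 0

let screen_full () = !filled_columns = screen_width

(* Part of the rows y0 .. y1 - 1 that may still be drawn in column c. *)
let visible_span c y0 y1 =
  let y0 = max y0 top.(c) and y1 = min y1 bottom.(c) in
  if y0 < y1 then Some (y0, y1) else None

(* A plain wall, with its ceiling above and floor below, fills the column. *)
let fill_column c =
  if top.(c) < bottom.(c) then begin
    top.(c) <- bottom.(c);
    incr filled_columns
  end

(* An upper wall part plus ceiling fills the column from the top down to row y. *)
let fill_from_top c y =
  if top.(c) < bottom.(c) then begin
    top.(c) <- min bottom.(c) (max top.(c) y);
    if top.(c) >= bottom.(c) then incr filled_columns
  end

(* A lower wall part plus floor fills the column from row y down to the bottom. *)
let fill_from_bottom c y =
  if top.(c) < bottom.(c) then begin
    bottom.(c) <- max top.(c) (min bottom.(c) y);
    if top.(c) >= bottom.(c) then incr filled_columns
  end
```

Because the unfilled rows always form one interval, every update only moves one of the two bounds, and the tree walk can stop as soon as `screen_full ()` holds.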
3.1 Texture mapping

In order to get a more realistic rendering, we need to apply textures on these walls. Several texturing techniques exist, varying in quality and complexity. We simply choose to tile an arbitrary bitmap. First we need to know which column of the bitmap maps onto each column of the wall. Then, according to the distance from the player to the wall, we can determine factors for scaling this column vertically to make it fit the vertical dimension of the wall. Hence, the vertical line drawing consists, for each point of the wall on the screen, in fetching the color of the source point in the texture and writing it on the screen.
[Figure: a column of the texture mapped onto a column of the wall.]
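    let vertical_textured_line_draw column top bottom hindex vindex vincr
        current_bitmap current_bitmap_height current_bitmap_width =
      (* Offsets in the 320-pixel-wide buffer of the first point of *)
      (* the line and of its end (assumed to be computed from bottom) *)
      let index_start = (top * 320 + column)
      and index_end = (bottom * 320 + column) in
      (* Draw one point, then move one row down in the buffer *)
      (* and vincr rows down in the texture *)
      let rec draw index vi =
        if index < index_end then begin
          let color = current_bitmap.(((truncate vi) mod
                                        current_bitmap_height) *
                                      current_bitmap_width + hindex) in
          String.unsafe_set double_buffer index color ;
          draw (index + 320) (vi +. vincr)
        end in
      draw index_start !vindex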
Because of restrictions on the world we model, each time we draw a column for a wall, we can notice that for this column the z coordinate in space remains constant. This is the reason why this kind of engine is known to use constant-Z texture mapping.
3.2 Screen drawing

To avoid flicker while drawing, we prefer to build the view image in an offline buffer and then blit this temporary image into the video memory of the graphics card. This buffer is a simple character string representing a 320x200 array of points, with 8 bits per point (i.e. 256 colors). Note that on slow computers (i.e. when building a frame view takes longer than the time the video card needs to refresh the screen), it can be useful to synchronize this blit with the end of video refresh.

This low-level access to the video hardware is done using SVGALib under Linux. This library provides a set of C primitives to manipulate SVGA video cards. We only need three of those primitives: obtain the address of the video memory; access the color table of the video card; and wait for the end of screen refresh. The only other part of our engine that is written in C is a simple "blit" function that copies the ML string representing the off-screen buffer into the video memory.

Note that a version running under X Window also exists and was used to extract the snapshots shown in this paper. The amount of C code is exactly the same; it only uses X primitives instead of SVGALib primitives.
3.3 Editing worlds
Given the complexity of the data structures, it is obvious that world descriptions cannot be made by hand. For this reason, we also developed a basic editor in Objective Caml, using our CamlTk library [8]. It allows entering world descriptions using mouse, windows and buttons. Besides this editor, a BSP compiler also exists, which takes as input the world description and builds the associated BSP tree. Because of the recursive structure of the tree, and its complex data structure, a functional language with high-level data types such as ML is very attractive for writing such a tool.

4 Performance
In this section, we compare the performance of our ML engine with a C implementation of the same algorithms. The C implementation is functionally equivalent to the ML version, except that it handles only 64x64 bitmaps for textures, while the ML version handles arbitrary bitmaps (GIF images). This allows replacing, in the C version, some modulos and multiplications by masks and shifts. To make the comparison more precise, we patched the C engine in order to [...] size constants and replaced shifts and masks by their corresponding arithmetic operations). The ML implementation is compiled by the Objective Caml 1.07 native-code compiler, with default settings. Array bound checking is not globally turned off; we only used the "unsafe" version of array access primitives in the vertical textured line drawing function. The C implementation is compiled with gcc -O2.

To compare source sizes, even if it is difficult to compare a program written in two different languages, we can say that once comments and useless blank lines are removed the C version is about 1000 lines of source, and the ML version is about 700 lines of code. (Note that identifiers have roughly the same names in both versions.)
The main performance criterion is the frame rate, i.e. the number of screen images rendered per second, assuming they are rendered continuously without intervening pauses. The higher the frame rate, the smoother the animation. The frame rate obviously depends on the complexity of the scenery; the measurements below are for a simple, but not trivial, scenery.
                        Objective Caml    C
    Pentium 166 MHz     34 fps            47 fps
    Pentium II 333 MHz  64 fps            81 fps

These figures show that the Caml version is competitive with the C version, despite the fact that we use full ML functionalities such as datatypes, lists and their iterators, and recursion. On the Pentium II, Caml achieves 80% of the performance of C.
Both the C implementation and the Caml implementation give visually satisfying animations, without perceptible pauses or "hiccups". A frame rate of 25 to 30 frames per second is usually considered comfortable.
The first implementations (both in C and ML) used floating-point operations everywhere, for the sake of simplicity. Profiling shows that about 80% of the running time is spent in the vertical textured line draw function shown above. The inner loop of this function is executed approximately once for each screen point, i.e. 64000 times per frame. Examination of the assembly code generated by Objective Caml and by GCC shows that by far the most expensive operation in the inner loop is the truncate float-to-integer conversion. This operation is extremely costly on the Intel x86 architecture, as it involves changing the rounding mode of the FPU to "truncate towards zero", then performing the conversion, then restoring the rounding mode. This takes a whopping 55 to 60 cycles on the Pentium II.
To address this bottleneck, we replaced floating-point arithmetic by fixed-point arithmetic in the vertical line texturing function (vertical_textured_line_draw). This leads to a significant speed improvement of the engine, as shown by the following figures:
                        Objective Caml    C
    Pentium 166         45 fps            61 fps
    Pentium II 333      100 fps           125 fps

Garbage collection accounts for a very small part of the execution time (less than 5%). This is because the renderer allocates all data structures once and for all at initialization time. Most of the time is spent in the vertical textured line drawing routine. The next most time-consuming routine is add_wall, i.e. the one which computes the dimensions of each wall on the screen and calls the vertical line drawing when needed. The remainder of the renderer, and in particular [...]
Of course, our engine is far from the performance obtained with the original Doom engine, and far from that needed by a real game. But as we said previously, it was not intended to be a real game. It was just written to be understandable by beginners, and to be a demonstration. To get a more efficient engine, in both the ML and C versions, several enhancements can and should be made:
1. use fixed-point arithmetic everywhere instead of floats,
2. tabulate trigonometric functions like sin, cos, tan, instead of calling them each time we need them,
3. use fixed powers of 2 as dimensions for textures, which would lead to using masks and shifts instead of modulos, multiplications and divisions,
4. explicitly share common sub-expressions while computing data needed by the rendering.
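The first three enhancements can be illustrated together. The sketch below is an assumption-laden example, not the engine's code: it uses a 16.16 fixed-point format, measures angles in 1/1024 of a full turn, and the names `to_fix`, `fix_to_int`, `texture_rows`, `fix_sin` and `fix_cos` are hypothetical.

```ocaml
(* 16.16 fixed point: the integer v stands for the value v / 65536. *)
let fix_shift = 16
let to_fix f = int_of_float (f *. 65536.0)
let fix_to_int v = v asr fix_shift

(* Texture row used for each of n consecutive screen rows. The texture
   height must be a power of 2, so that tiling is a mask instead of a
   modulo, and stepping is an integer addition instead of a float
   addition followed by a truncation. *)
let texture_rows ~tex_height ~start ~step n =
  let mask = tex_height - 1 in
  let step = to_fix step in
  let rec go i v acc =
    if i = n then List.rev acc
    else go (i + 1) (v + step) ((fix_to_int v land mask) :: acc)
  in
  go 0 (to_fix start) []

(* Sine table in fixed point, one entry per 1/1024 of a turn. *)
let table_size = 1024
let pi = 4.0 *. atan 1.0
let sin_table =
  Array.init table_size (fun i ->
    to_fix (sin (2.0 *. pi *. float_of_int i /. float_of_int table_size)))

let fix_sin a = sin_table.(a land (table_size - 1))
let fix_cos a = fix_sin (a + table_size / 4)
```

With a 64-row texture, starting at row 62.0 with a step of 0.5, `texture_rows` wraps around from row 63 to row 0 without any `mod` or `truncate` in the loop.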
To get a more impressive result, it could also be interesting to add textures onto floors and ceilings. This complicates the engine because floors and ceilings have to be drawn as horizontal lines (not as vertical ones, as in the current description of the algorithm). In this way, we keep the constant-Z invariant we noticed above. Hence, an intermediate structure has to be used to record the vertical lines making up the walls and the horizontal lines making up floors and ceilings. The difficult part of building this structure lies in recording horizontal runs for floors and ceilings; vertical runs for walls are computed in the same way as before (we just record them instead of drawing them on the fly).
A new texture mapper, more complex but more powerful, is also needed. We then have to use correct perspective texture mapping techniques. Schematically, for each screen-projected polygon we want to texture, knowing a screen point location, we must be able to recover its corresponding position on the polygon in space. Hence, knowing how the texture is applied on the polygon in space, we can determine which point of this texture is used, and so which color to use for this screen point. Because of the constant-Z invariant, depending on whether we are drawing a vertical run or a horizontal run, some computation can be extracted from the inner loop of the mapper, hence reducing the amount of time needed to texture one run.
A version of our renderer incorporating those enhancements is currently under development.

6 Conclusion

Our OCamlDoom renderer demonstrates that ML can also be used for interactive graphical applications, where response time is an important factor and heavy computations are performed in real time. For these applications, using sophisticated data structures and algorithms is as important as raw computing power in achieving good performance; what ML loses in raw execution speed on numerical computations is compensated by the ease with which it handles complex data structures.
On OCamlDoom, the Objective Caml native-code compiler delivers about three fourths of the performance of an optimizing C compiler. This is consistent with the general claim that good, modern functional compilers such as Objective Caml or GHC stay within a factor of two of C compilers. The difference in execution speed is acceptable in practice, [...] graphical applications in Caml: a real-time mouse-driven image "warper", and several image processing algorithms. All of them deliver entirely satisfactory performance.
References

[1] Edoardo Biagioni, Robert Harper, Peter Lee, and Brian G. Milnes. Signatures for a Network Protocol Stack: A Systems Application of Standard ML. Lisp and Functional Programming, ACM Press, 1994.
[2] Matthew S. Fell. The unofficial DOOM specs, April 1994. WEB: http://doomgate.cs.buffalo.edu/docs/FAQ/DOOM.FAQ.Specs.html
[3] J. Foley, A. van Dam, S. Feiner, J. Hughes. Computer Graphics: Principles and Practice, second edition, 1990. Addison-Wesley Publishing Company.
[4] Mark Hayden. The Ensemble System, Cornell University Technical Report, TR98-1662, January 1998.
[5] Xavier Leroy, Jérôme Vouillon, and Damien Doligez. Objective Caml, INRIA 1998. Software and documentation available at http://caml.inria.fr
[6] François Pessaux. BSP Trees pour la 3D Mappée, Nov 1996. Available at http://pauillac.inria.fr/~pessaux/bsparticle.html
[7] François Pessaux. Réalisation d'un moteur graphique en pseudo-3D mappée, January 1997. Software and documentation available at http://pauillac.inria.fr/~pessaux/engine.html
[8] François Pessaux and François Rouaix, Projet Cristal. The CamlTk interface, INRIA Rocquencourt. Software and documentation available at http://caml.inria.fr/~rouaix/camltk-readme.html
[9] Mel Slater. A Comparison of Three Shadow Volume Algorithms, The Visual Computer, (1992), Vol. 9(1), 25-38.
[10] comp.graphics.algorithms newsgroup FAQ. Available at http://wuarchive.wustl.edu/graphics/graphics/faq/comp.graphics.algorithms-faq
[11] BSP Tree Frequently Asked Questions. http://reality.sgi.com/bspfaq/index.shtml
[12] The source code for Doom is now available on the WEB (December 1997). ftp://ftp.idsoftware.com/idstuff/source/doomsrc.zip