
THE NTS PROJECT

Philip Taylor, Jiří Zlatuška, Karel Skoupý. Cahiers GUTenberg, no. 35-36 (2000), p. 53-77.

<http://cahiers.gutenberg.eu.org/fitem?id=CG_2000___35-36_53_0>

© Association GUTenberg, 2000, all rights reserved.

Access to the articles of Cahiers GUTenberg (http://cahiers.gutenberg.eu.org/) implies acceptance of the general terms of use (http://cahiers.gutenberg.eu.org/legal.html). Any commercial use or systematic printing constitutes a criminal offence. Any copy or printout of this file must include this copyright notice.


The NTS project: from conception to implementation

Philip Taylor [1], Jiří Zlatuška [2] and Karel Skoupý [3]

[1] Webmaster, RHBNC, University of London, United Kingdom; [email protected]
[2] Rector, Masaryk University, Brno, Czech Republic; [email protected]
[3] Independent programmer, Brno, Czech Republic; [email protected]

Introduction

For today's talk, I had hoped that Karel Skoupý, the Czech implementor of NTS, would be able to be here to present the results of his work and to answer any questions that you might have. Sadly, that will not be the case: Karel is working desperately hard to complete NTS before the commencement of the TUG 2000 conference, and I therefore have to deputize for him and attempt to answer any questions on his behalf.

Let me start by presenting an overview of today's talk and presentation; I will attempt to cover seven separate areas, including (of course) the mandatory questions and answers at the end. The seven areas to be covered are:

• A brief history of NTS
• TeX, ε-TeX & NTS compared
• The choice of Java as the language of implementation
• An overview of the classes, objects and methods of NTS
• A summary of the status quo
• A demonstration of NTS, and comparison with TeX
• Questions & answers

and you will soon realize that my expertise lies very much in the earlier areas; the implementation details of NTS are very much Karel's area, and I apologize in advance for any errors which I may make in presenting (in particular) the overview of classes, objects and methods.

The present authors would like to record their grateful thanks to all members of the NTS and ε-TeX teams, past and present, without whom neither this paper nor NTS itself could


1. A brief history of NTS

The NTS project is the result of the foresight of just one man: Joachim Lammarsch, one of the founders and for many years the President of DANTE e.V. During the period leading up to the DANTE meeting in Hamburg of 1992, Joachim circulated a message to all potentially interested parties asking who would be interested in a project which was to continue where Knuth had left off. A number of us responded positively to this question, and those who did were invited by Joachim to attend the DANTE meeting in Hamburg. During the course of the meeting, one of the scheduled sessions was devoted solely to "The Future of TeX". Those present debated at great length whether TeX should remain forever as intended by Knuth, or whether its future was too important to be determined by just one man (even one of the intellect and status of Knuth). The outcome of these deliberations was that TeX itself must remain solely Knuth's responsibility, but that a TeX-like system or systems should be created by an independent group, to continue developments whilst TeX itself remained frozen forever. It was also agreed that the name of this proposed new system should be NTS (an acronym for New Typesetting System) to indicate that this new system was not TeX itself, but was instead a new system which, whilst acknowledging unreservedly Knuth's rôle in its evolution, was free of the constraints which Knuth had placed on the evolution of TeX itself.

Once the main decision had been taken, other decisions followed more or less automatically. It was agreed, for example, that whilst DANTE e.V. would provide initial funding for the project, the project itself would be deemed to be trans-national, transcending the artificial boundaries of any one TeX User Group and drawing its membership from TeX users and other interested parties throughout the world. Rainer Schöpf was invited to chair the group, and other members included Joachim Lammarsch, Joachim (Johnny) Schrod, Bernd Raichle, Peter Breitenlohner, Jiří Zlatuška and myself (Philip Taylor).

Thereafter, the group met on a regular but occasional basis, almost invariably at such a time as to co-incide with a DANTE conference. The membership did not remain static, and Rainer stood down as Chairman after the first year, needing more time for other projects such as LaTeX-3 and real work [tm]. I took over as Chairman, and we agreed that within the group two separate projects should be investigated, one evolutionary and one revolutionary. Peter Breitenlohner was to head the evolutionary (ε-TeX) group, whilst Jiří Zlatuška would head the revolutionary group (NTS proper). Joachim Lammarsch, as fons et origo, would remain as Managing Director, and Bernd Raichle, whose technical skills and TeX expertise were invaluable, was 2nd-in-command of both projects. Sadly we also said 'farewell' to Joachim Schrod at about this time: Joachim's […] to agree with all of the decisions taken within the group and preferred to resign rather than be closely involved with a project with whose aims he could not entirely agree.

2. TeX, ε-TeX & NTS compared

2.1. TeX

There is surely little need in a talk addressed to members of GUTenberg to define what is, or is not, TeX. TeX is, by definition, Knuth's program for performing typesetting of the highest quality, and this program is his and his alone. No-one other than Knuth himself may make any changes to the program (other than in the area of so-called system dependencies), and it is Knuth's publicly stated intention that TeX should evolve no further: Don has made all the improvements to TeX that he deems necessary, and any further work which he does on TeX (at ever-increasing intervals) is restricted to eliminating any genuine bugs which have been discovered since he last updated the source. TeX is currently at V3.14159, and Knuth wishes TeX to become absolutely frozen at the moment of his death, at which point it will be deemed to be Vπ.

2.2. ε-TeX

The ε-TeX project, which is as portable as TeX itself and which uses exactly the same tools and languages (Web, Pascal, Weave, Tangle, etc.), sought (and seeks) to extend TeX in a manner which is both conservative and innovative at the same time. It is conservative because it intentionally uses tex.web as the master source, and implements all changes through the medium of a change file, yet is innovative because it adds much-needed functionality to TeX and extends TeX in a way which is intended to meet the needs and demands of sophisticated TeX users who find themselves working at the very limit of TeX's abilities.

The ε-TeX project was conceived and (for some years) executed by members of the NTS group, under the leadership of Peter Breitenlohner and under the technical direction of myself. During this period, ε-TeX evolved from α- and β-releases via ε-TeX V1 to ε-TeX V2. By the time it had reached V2.0, ε-TeX had added over thirty new primitives to the set already provided by TeX, and had extended the functionality of a number of others. Despite these extensions, ε-TeX was (and remains) 100% TeX-compatible, and this, together with its portability, is surely ε-TeX's greatest strength.

Indeed, so important was compatibility considered when ε-TeX was being developed that if no special action is taken when launching ε-TeX it then behaves identically to TeX itself, and with the sole exception of the banner line cannot be distinguished from TeX. It goes without saying that, in this mode, ε-TeX passes the so-called trip test with flying colours!

If access is required to ε-TeX's many extensions, then at the point of launch it is necessary to indicate this explicitly. This is accomplished (on command-line based systems) by launching Ini-ε-TeX with an asterisk where an ampersand would otherwise be allowed by TeX, as in

e-initex *e-plain \dump

as compared to (for example)

initex &plain \dump

The presence of the asterisk triggers ε-TeX into so-called extended mode, and this information is then stored in any format file which is dumped at the end of that ε-TeX instantiation. If such a format is then loaded into Vir-ε-TeX, the latter will then automatically start in extended mode, as in the following:

e-initex *e-plain \dump

e-virtex &e-plain <source file>

Once in extended mode, the user has access to all of ε-TeX's many extensions, yet if none of these is used ε-TeX continues to behave in a manner identical to that of TeX itself. Thus all legacy documents which do not, by accident, attempt to invoke one of the new ε-TeX primitives will behave and typeset identically under both TeX and ε-TeX.

But ε-TeX has one further truc up its sleeve: as well as compatibility and extended modes, ε-TeX offers a so-called enhanced mode in which strict compatibility is sacrificed in the interests of even greater functionality. As of V2.0, ε-TeX possessed only a single enhancement: the implementation of TeX--XeT, based on Knuth and MacKay's original TeX-XeT but completely integrated within ε-TeX (and thus requiring no special IDV driver). Since the implementation of TeX--XeT requires that maths nodes be overloaded, 100% compatibility has to be sacrificed, yet the differences are so subtle that most ε-TeX users who chose to exploit its enhanced mode would still notice no difference in output of their legacy (mono-directional) documents.

To enter enhanced mode, specific user action is required: the ε-TeX document being processed must specifically enable enhanced mode, either at the beginning of the document or at a point at which access to enhanced mode is required. For TeX--XeT, this is accomplished by setting one of ε-TeX's so-called state variables, as in:

\TeXXeTstate = 1

In general, once ε-TeX is operating in enhanced mode, it is not possible to force […] mode, never from compatibility mode). In certain circumstances, however, and in documents carefully written to localize all side-effects, it may be possible to cause ε-TeX to revert to extended mode. For the example above, this would be achieved by using:

\TeXXeTstate = 0

at a later point in the document, but users are cautioned that because of the asynchronous nature of (ε-)TeX's page-breaking operations, there may still be some undesirable interactions if any modified maths nodes are still on one of ε-TeX's internal lists having not (yet) been flushed out. Thus for all practical purposes the user should assume that, once in enhanced mode, ε-TeX will remain in enhanced mode for the remainder of the instantiation.

One last point remains to be discussed under the heading of ε-TeX before passing on to NTS proper: with effect from ε-TeX V2.1, Peter Breitenlohner assumed sole responsibility for ε-TeX. Peter has indicated that, while he still wishes to develop ε-TeX further, he no longer wishes to do so under the ægis of the NTS group, and with some considerable sadness we have acquiesced to his wishes. We wish Peter all the best with ε-TeX, and are confident that he will continue to maintain and support it with the same zeal and interest as he has in the past.

2.3. NTS

For over five years, the NTS group were forced by circumstances to devote their attention almost exclusively to the development of the evolutionary system known as ε-TeX. This situation was brought about by the very nature of the group itself: it was composed entirely of volunteers, none of whom were in a position to expend great tracts of time on a project (NTS) which was of little if any interest to their real employers. Recognizing that NTS could never become a reality if it was to be developed solely by volunteers working in their own time, the group decided that NTS should be put on ice until such time as funds could be found to allow a full-time programmer to be employed.

During 1997/98, that much-longed-for possibility became a reality. DANTE e.V. agreed to contribute the magnificent sum of DM 30000 to the project, sufficient to allow a programmer to be employed full-time to work on the project. Jiří Zlatuška, as Dean of the Faculty of Informatics at Masaryk University (Brno, CZ), was in the fortunate position of not only being able to recommend to the group a highly competent programmer (Karel Skoupý) but also being able to arrange a tripartite contract to allow DANTE funds to be routed via the University and thence to Karel himself. We met with Karel, discussed the project with him, and despite the almost vertical learning curve […]

For some time, Karel did little but read. He read The TeXbook, TeX-the-program, a great deal about Java, and much else besides. Then, in the Spring of 1998, Karel and Jiří came to my home in England, and Karel outlined his proposals for NTS. Jiří & I were much impressed with the expertise which Karel had clearly acquired, and with very few changes agreed that he should continue to develop his ideas. By the time we next met, Karel was to be in a position to demonstrate working code.

The next review took place in Brno, at the University, and on this occasion Joachim Lammarsch also took part. Joachim had greater familiarity with, and exposure to, Java than either Jiří or myself, and his presence at that review was invaluable. One of the most striking points which came out of the review was that Karel had elected to program for efficiency rather than for clarity, and there were a number of places where we felt obliged to ask him to re-think his approach (for example, we asked Karel to eschew the use of integers as general-purpose variables, and instead to use them only where integer arithmetic was required). Karel responded positively to our suggestions, although he clearly retained his reservations, and agreed to adopt our rather more defensive and didactic programming style.

When the contract with Karel was first discussed, all involved in the project believed that we could get from theory to a full implementation in one calendar year. As the end of the year approached, it became only too obvious that we had been (typically, many would say, in the IT/software world) very naïve in our analysis and far too confident in Karel's ability to complete the project on schedule. Indeed, by the end of the first year, although TeX's mouth had been re-programmed in Java, NTS was still unable to perform even the simplest typesetting, and an enormous amount of work clearly remained to be carried out.

Despite the obvious disappointment with which members of DANTE received the news that NTS would not be delivered on schedule, their confidence in the project remained on the whole unshaken and they generously voted to continue funding the project for a further period. During 1999, an anonymous benefactor pledged a further DM 7500 to support the project (this benefactor, to whom the group are deeply indebted, is a private individual, not a user group or other corporate body), and at the 1999 EuroTeX meeting other TeX user groups also undertook to support the project financially. It is particularly pleasant to be able to thank the members of GUTenberg personally, since GUTenberg have pledged EU 3000 for three years in support of the project. Thank you GUTenberg!

So what is NTS, and why is it taking so long to reach completion? Unlike ε-TeX, which is conservative and evolutionary, NTS is truly revolutionary in that it attempts (for the first time, as far as we are aware) to re-implement the algorithms and functionality of TeX-the-typesetting-system without in any way copying the coding (or even the data structures, though to a far lesser extent) of TeX-the-program. Whilst TeX is written in Pascal-Web, NTS is written in Java. And whilst TeX-the-program is a deeply entangled (though carefully structured) and highly daunting monolithic¹ program, NTS is intended to consist of a series of loosely coupled modules, any or all of which can be replaced by functionally equivalent module(s) with the same interface semantics.

The success of this latter approach was borne out fairly early on, since Karel wisely decided to cut his NTS teeth on the far less daunting task of re-engineering TFtoPL and PLtoTF in Java. The module which interprets the TFM file for the purposes of TFtoPL is exactly the same module as performs that function for NTS, and thus software re-usability, that much-vaunted modern desideratum, has been achieved in practice.

Whilst the ultimate goal of NTS is to provide a complete and integrated, yet functionally distinct, set of typesetting tools, the short-term aim is to provide a complete re-implementation of TeX in a flexible and extensible manner. Early experience suggests that we are well on our way to achieving that aim, and (despite certain caveats that will appear later on) this has, in part, been achieved by the careful choice and use of programming language.

3. The choice of Java as the language of implementation

Right from its conception, the NTS project was not only a project which would concentrate on providing a new, more powerful, successor to TeX, but was also an effort to re-program TeX-the-program as such.

The reason for this was simple: TeX-the-program forms an example of a monolithic Pascal program based heavily on optimized data structures which allowed Knuth to cope with the memory limitations of computers in existence more than twenty years ago. At that time, the reasons for choosing both the language and the programming techniques were understandable. Pascal developed from the family of procedural languages based on a formal syntax and well-defined semantics starting with Algol 58, Algol 60, and Algol 68, the latter providing one of the brightest peaks of the development of programming languages but which was unfortunately too complex to survive. Pascal appeared as a branch of development which combined the basic programming tools of structured programming as a methodology for programming, motivated by program correctness proof techniques and an approach to building large programs from manageably smaller components, and the tools for using abstract data structures instead of just the data types provided by the underlying computer hardware. Using a deliberately restricted set of constructions, Pascal could be seen as an abstract machine code for a general set of computers, which also allowed for its portability and universal adoption both as far as implementations for various types of computers were concerned, and its general use as a language of choice for teaching programming.

1. Knuth would almost certainly take great exception to the use of the word monolithic, since he evidently took enormous care to divide the program into small and quasi-independent modules. Unfortunately, whilst the logic, structure and orthogonality of those modules is undeniable, the program as a whole is a masterpiece of efficiency, re-using code and/or data structures whenever possible, and as a result the program in totality is very difficult for others to modify or extend. Indeed, Peter Breitenlohner's ability to add functionality to TeX via the medium of a change-file is, as far as we know, the only significant attempt to extend TeX-the-program in any non-trivial way other than the equally significant but in many ways far more restricted changes made by Hàn Thế Thành in his PDF-generating variant pdfTeX.

For Knuth, Pascal was a language especially well-suited for expressing general-purpose algorithms in a way suitable for publishing. In the decades of the 60's and 70's, the issues associated with particular ways of expressing algorithms and developing proper programming style using high-level programming languages were among the topics which formed core problems for research in Computer Science. Knuth added the concept of literate programming, allowing the expression of his algorithms as a stream of constructions which follow the logic of the ideas needed for understanding a particular program construction, rather than the logic of the syntax of the programming language used; as Knuth stated his goal, to write programs in such a way that "instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do". Pascal as such did not suit Knuth's needs sufficiently, but literate programming tools based on macro generation allowed him to introduce an extension to Pascal's programming constructs (the otherwise clause in the case statement), to break the structure of declaration and code sections (using the tangling feature of the WEB system), and also to provide tools for allowing very careful memory optimization and implementation independence (using only integer data types and using macro constructions to generate code fragments necessary for the packing/unpacking of data used).

Within the two decades which followed the birth of TeX, the premises on which these decisions were based have become questionable, and indeed formed an obstacle to efforts to provide a successor to TeX which could extend the latter's capabilities sufficiently far. Pascal has been succeeded by a line of languages or systems which gained much smaller practical acceptance, and similarly to […] usage and/or implementation (and Knuth himself has moved to CWEB literate programming based on C which ultimately yielded a language that provided him with indescribable joy in programming). The need for careful packing of data into as little space as possible was removed by the emergence of computer architectures supporting much larger memory sizes and the practical availability and affordability of installed memory sizes unthinkable twenty years ago in any context other than perhaps secondary disk storage. Data structure optimization has become an obstacle preventing program modification and placing a kind of time bomb into the code which explodes when seemingly small and straightforward modifications need to be made. Similar problems with modifiability are caused by the monolithic structure of Pascal code as used by Knuth within TeX-the-program. Particular algorithms used within its body are extremely hard to modify or extend because of the lack of narrow and clearly defined interfaces between individual parts of the program code and because of the general accessibility of shared global data structures from within any part of the code.

The attempt to start NTS development has therefore been linked with a deliberate decision to re-create TeX-the-program so that the programming language and programming methodology allow for the removal of unnecessary data optimization, the removal of explicit storage allocation and the use of mechanisms already present in modern programming languages, and also to allow for a modular code structure with clearly defined interfaces and data paths which will allow for easier modification and for experiments with the resulting code.

The idea of re-creating TeX using a more modern programming technique first came from Joachim Schrod, one of the inaugural members of the NTS group, who also came with a prototype example of what he meant by such a re-implementation effort, showing how to extract the macro-generation language of TeX as a LISP program. The general work plan for the NTS development effort has since then consisted of assuming that NTS version 0 would be created as a faithful 100% (or for any practical purpose as close as possible to that) TeX-compatible program, and only from such code would further development activities continue by modifying and/or extending this code.

The choice of a suitable programming language for the re-implementation efforts had been a crucial unresolved problem until early 1998. Discussions oscillated around three different programming methodologies, each of which would provide a different set of advantages concerning the programming methodology, the availability of compatible implementations across a wide spectrum of hardware platforms and operating systems, and the existence of a sufficiently large base of programmers who would form the brains trust for future NTS extensions and experiments with different typesetting paradigms and user in- […]

Functional programming as represented by LISP or CLOS (the Common LISP Object System) had been the language for Joachim Schrod's early attempts. The principal motivation for using this language consisted in the fact that lists of lists are the basic data structures manipulated within TeX, and hence the basic internal programming paradigm should be easy to represent. Symbolic data structures as used within LISP allow for easy meta-programming and prototyping, two techniques very handy for experimental development. Within the university environment, LISP has traditionally been the language for implementation of experimental student projects with high programming productivity, thereby satisfying the essential requirements for viability of its use. Even though CLOS systems do provide a sufficiently stable and compatible implementation of this paradigm, the cross-platform compatibility is less than ideal.

Logic programming as represented by PROLOG or constraint-satisfaction programming languages based on expressing programs within a subset of symbolic logic has been another prototype language family with a high level of abstraction and a high productivity rate. Symbolic data structures (terms) allow for high flexibility in writing very easily modifiable code, and backtracking mechanisms used for underlying implementation of state-space search could provide interesting possibilities for searching for semi-optimal solutions of sets of constraints through which very complex conditions on the resulting typeset material could be expressed. Even though the programming structures are as far from the underlying hardware data structures as possible, the actual implementations of this paradigm vary significantly, to such an extent that compatibility problems among the dialects make logic programming a very problematic choice if eventual cross-platform and cross-system compatibility is sought.

Procedural languages, C and C++ in particular, present a group of languages with considerably lower productivity in writing the program code and its resulting size. Even though the languages as such can be well-defined, practical differences in the libraries included or in operating system interfaces make it a real mess to produce universally usable code which would run across different platforms with compatibility comparable to that of TeX itself. Another potential problem was seen in notorious problems with non-trivial modifications of programs employing access to general common shared structures, a common programming technique used in connection with these languages.

Eventually, Java has emerged as a compromise satisfying many of the essential requirements, offering interesting future opportunities, and being complicated by relatively few drawbacks. As far as programming methodology is concerned, Java combines C-based procedural programming with object based programming style. Objects serve as the basic program components allowing for struc- […] and the development of future modifications by the substitution of certain objects of which the programs consist by other components. Objects also serve as a consistent replacement of traditional data structures and thus remove the traditional drawback of global shared data structures. Sun came up with Java as a company-based standard providing uniform system interfaces and a high level of security of Java applications. These claims remain to be demonstrated in reality and not just as wishful thinking and bold P.R. statements, yet the development of tested and certified Java interpreters incorporated into ubiquitous WWW browsers made it highly probable that the requirement of general compatibility could be achieved and a significant base of Java programmers formed. Last but not least, Java as a WWW-based and Internet-aware language makes it possible to think of NTS as a network-based program which will eventually allow the combination of elements downloaded from the network and the standardization of interfaces and techniques used across huge groups of geographically dispersed users.

The trend associated with the network as an important element of the future computing environment has contributed to the choice of Java as the NTS implementation language. Karel Skoupý joined the NTS team in early 1998 as the programmer whose initial task has been to deconstruct TeX-the-program into an object-based programming code preserving the essential functional features of TeX as such but providing the grounds for future modifications and extensions.

After some 18 months of (re-)design and programming, a functional prototype of core components of TeX has been developed and made accessible for initial experimentation; early results were presented at previous conferences (e.g. EuroTeX'99) and it is hoped that the final code for NTS V0 will be demonstrated at TUG 2000. What you will see today is very close to that code!

4. An overview of the classes, objects and methods of NTS

The implementation language of NTS is Java. It is strictly object oriented, and all of the program code is encapsulated in object methods. The objects are instances of certain classes, and cluster together to form packages, discussion of which forms the majority of what follows. At the time of writing, not all packages were complete; in particular, mathematics remains to be implemented, although the majority of the design work for this remaining task is virtually […]

4.1. Package base

The main purpose of this package is to define the elementary data types used in the rest of the system. It is a minimal element of the NTS package hierarchy, meaning that no class here is dependent on any classes from other NTS packages.

The most important classes are as follows:

Dimen represents a dimension measured in printers' points (or mu units). The fact that the internal representation is the same as that of dimensions in TeX ensures the strict compatibility needed. The public interface of this class tries to be completely independent of its internal representation. To the outside world, a Dimen looks like a fraction of points. There are methods for conversion from integer, from fraction (given by its integral numerator and denominator), from floating point number, and vice versa. It also provides basic arithmetic operations. In the case of binary operations, versions for combination with other convertible numeric types are supported too. The general public interface allows for a complete change of the precision or even the internal unit of representation without affecting the other code.
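To make the separation between public interface and internal representation concrete, here is a minimal sketch of a Dimen-like class. It is not the NTS source: the method names (valueOf, plus, minus, times, toDouble) are hypothetical, and the internal unit is assumed to be TeX's scaled point (2^-16 pt). Rounding of decimal input is simplified and is not guaranteed to match TeX digit for digit.

```java
/** Hypothetical sketch of a Dimen-like value class; not the actual NTS source. */
final class Dimen {
    /** Scaled points per point: TeX keeps dimensions as multiples of 2^-16 pt. */
    private static final int UNITY = 1 << 16;

    /** Internal representation in scaled points, never exposed to callers. */
    private final int sp;

    private Dimen(int sp) { this.sp = sp; }

    /** Conversion from a whole number of points. */
    static Dimen valueOf(int points) {
        return new Dimen(Math.multiplyExact(points, UNITY));
    }

    /** Conversion from the fraction num/den of a point. */
    static Dimen valueOf(int num, int den) {
        return new Dimen(Math.toIntExact(Math.round((double) num * UNITY / den)));
    }

    /** Conversion from a floating point number of points. */
    static Dimen valueOf(double points) {
        return new Dimen(Math.toIntExact(Math.round(points * UNITY)));
    }

    // Basic arithmetic; overflow raises ArithmeticException instead of wrapping.
    Dimen plus(Dimen other)  { return new Dimen(Math.addExact(sp, other.sp)); }
    Dimen minus(Dimen other) { return new Dimen(Math.subtractExact(sp, other.sp)); }
    Dimen times(int factor)  { return new Dimen(Math.multiplyExact(sp, factor)); }

    /** Conversion back to a floating point number of points. */
    double toDouble() { return (double) sp / UNITY; }

    @Override public boolean equals(Object o) {
        return o instanceof Dimen && ((Dimen) o).sp == sp;
    }
    @Override public int hashCode() { return sp; }
    @Override public String toString() { return toDouble() + "pt"; }

    public static void main(String[] args) {
        Dimen d = Dimen.valueOf(3, 2).plus(Dimen.valueOf(1));
        System.out.println(d); // prints 2.5pt
    }
}
```

Because callers only ever see points, fractions and doubles, the UNITY constant (or even the type of the private field) could be changed without touching client code, which is the property the paragraph above describes.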

Glue reflects another type familiar from TeX. It has its natural dimension and the amount and order of stretchability and shrinkability. It provides arithmetic methods such as adding two Glues, multiplying by a scalar number and also versions for other convertible types.
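As an illustration of what "amount and order" means for arithmetic, the sketch below adds two glue values. The class shape and names are assumptions made for this example, dimensions are held as plain scaled-point integers to keep it self-contained, and the combination rule (a component of higher infinity order overrides a lower one, equal orders add) follows our understanding of how TeX advances a glue register; the real NTS class may differ.

```java
/** Hypothetical sketch of a Glue-like value class; not the actual NTS source. */
final class Glue {
    // Orders of infinity, from finite up to filll.
    static final int NORMAL = 0, FIL = 1, FILL = 2, FILLL = 3;

    final int natural;                    // natural width, in scaled points
    final int stretch, stretchOrder;      // amount and order of stretchability
    final int shrink, shrinkOrder;        // amount and order of shrinkability

    Glue(int natural, int stretch, int stretchOrder, int shrink, int shrinkOrder) {
        this.natural = natural;
        this.stretch = stretch;
        this.stretchOrder = stretchOrder;
        this.shrink = shrink;
        this.shrinkOrder = shrinkOrder;
    }

    /** Adds two glues component by component. */
    Glue plus(Glue o) {
        int[] st = combine(stretch, stretchOrder, o.stretch, o.stretchOrder);
        int[] sh = combine(shrink, shrinkOrder, o.shrink, o.shrinkOrder);
        return new Glue(natural + o.natural, st[0], st[1], sh[0], sh[1]);
    }

    /** Multiplies every amount by a scalar; the orders are unchanged. */
    Glue times(int n) {
        return new Glue(natural * n, stretch * n, stretchOrder, shrink * n, shrinkOrder);
    }

    /** Returns {amount, order}: equal orders add, otherwise the higher order wins. */
    private static int[] combine(int a, int aOrd, int b, int bOrd) {
        if (a == 0) aOrd = NORMAL;        // a zero amount carries no order
        if (b == 0) bOrd = NORMAL;
        if (aOrd == bOrd) return new int[] { a + b, aOrd };
        return aOrd > bOrd ? new int[] { a, aOrd } : new int[] { b, bOrd };
    }
}
```

For example, adding a glue with finite stretch 5 to one with stretch 1 of order FIL yields stretch 1 of order FIL, while the natural widths and the finite shrink amounts simply add.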

Num represents an integral number. It is just an integer, but wrapped into an object so it can be stored directly in the table of equivalents and can be distinguished from ordinary integers in the code. It serves mainly as the representation of the value of numeric registers (\count) (and is somewhat symmetric to Dimen and Glue for \dimen and \skip registers).

All the basic classes above provide methods for obtaining character string representations which can be displayed on screen, in the log file or used by the \the primitive.

LevelEqTable is the last important and relatively complex class. It is used to implement TeX's table of equivalents and the hash table. Whilst TeX uses an associative hash table only for the meanings of control sequences, NTS stores many other kinds of equivalents in an associative manner. Any object can be associated with a particular combination of kind and key. Different kinds are defined for different types of equivalents: one is for control sequence meanings, another is for each class of register, still another for catcodes, etc. The key is (according to the kind of equivalent) either an object (e.g. a control sequence name) or a number (most others). This associative approach for storing register […]

Although NTS is compatible with TeX in providing only 256 registers of each sort, this limitation is artificially added and can be easily removed in the future.

As the name suggests, besides storing equivalents, the LevelEqTable also maintains pushing and popping of levels which result from grouping in the input language and the corresponding saving and restoring of associated values.

Although the registers were moved from a static array to an associative table, there is still another type of value which is not associative but which is subject to saving and restoring. These are parameters (such as \tolerance, \hsize, ...); the current value of a parameter is stored in one concrete place. The LevelEqTable provides an interface for these external equivalents too, and maintains saving and restoring for them.
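The combination of associative storage and grouping levels can be sketched as follows. This is an illustration under stated assumptions, not the NTS implementation: the class name, the use of a String for the kind, and the undo-map technique are all choices made for the example, and global assignments and external (parameter) equivalents are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch of a level-aware table of equivalents. */
final class LevelEqTableSketch {
    /** Combination of a kind (e.g. "count", "catcode", "macro") and a key. */
    private static final class Slot {
        final String kind;
        final Object key;
        Slot(String kind, Object key) { this.kind = kind; this.key = key; }
        @Override public boolean equals(Object o) {
            return o instanceof Slot && ((Slot) o).kind.equals(kind)
                    && ((Slot) o).key.equals(key);
        }
        @Override public int hashCode() { return Objects.hash(kind, key); }
    }

    private final Map<Slot, Object> current = new HashMap<>();
    /** One undo map per open group: slot -> value to restore (null = undefined). */
    private final Deque<Map<Slot, Object>> saved = new ArrayDeque<>();

    /** Called when a group is opened. */
    void pushLevel() { saved.push(new HashMap<>()); }

    /** Called when a group is closed: restores every value changed inside it. */
    void popLevel() {
        for (Map.Entry<Slot, Object> e : saved.pop().entrySet()) {
            if (e.getValue() == null) current.remove(e.getKey());
            else current.put(e.getKey(), e.getValue());
        }
    }

    /** Local assignment: remembers the outer value once per group. */
    void put(String kind, Object key, Object value) {
        Slot s = new Slot(kind, key);
        Map<Slot, Object> undo = saved.peek();
        if (undo != null && !undo.containsKey(s)) undo.put(s, current.get(s));
        current.put(s, value);
    }

    Object get(String kind, Object key) { return current.get(new Slot(kind, key)); }
}
```

With such a structure the number of registers is bounded only by what callers choose to store, which is why a limit of 256 per sort has to be imposed deliberately for compatibility.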

4.2. Package io

This package contains classes and interfaces for reading characters from an input file and writing to the log file. Either or both of those files may equally well represent the user's console. The package is independent of the other NTS packages as well as of the package base.

CharCode is an interface and is very interesting (at least, we think so!). There had been considerable discussion as to whether or not to represent character codes by some Java primitive type or by a class. It was decided that a class should be used so as to clearly distinguish it from other usages of primitive types. Eventually (during development), it turned out that an even more abstract representation (as an interface) best matches its purpose. It declares methods for getting the corresponding character or numeric value, comparing with another CharCode, character or number for matching, making the corresponding uppercase or lowercase CharCode, writing on a character-oriented output file and several predicates. Most of the methods are there because the TeX language uses characters heavily not only for typesetting but also as numeric values and as parts of keywords; in addition, certain characters have an influence on scanning and on log output (\endlinechar, \escapechar, \newlinechar).
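A much reduced sketch may make the shape of such an interface clearer. The names CharCodeSketch and PlainCharCode and the method names are assumptions, not the real NTS declarations, and case conversion here simply uses Java's own rules rather than TeX's \uccode table.

```java
/** Hypothetical reduction of a CharCode-style interface. */
interface CharCodeSketch {
    char toChar();                          // corresponding character
    int numValue();                         // numeric value
    boolean match(CharCodeSketch other);    // comparison for matching
    CharCodeSketch toUpperCase();           // corresponding uppercase code
    boolean isLetter();                     // one of several predicates
}

/** The simple case: a wrapper around one ordinary character. */
final class PlainCharCode implements CharCodeSketch {
    private final char ch;

    PlainCharCode(char ch) { this.ch = ch; }

    public char toChar() { return ch; }
    public int numValue() { return ch; }
    public boolean match(CharCodeSketch other) { return other.numValue() == ch; }
    public CharCodeSketch toUpperCase() {
        // Simplification: Java's case mapping, not TeX's \uccode.
        return new PlainCharCode(Character.toUpperCase(ch));
    }
    public boolean isLetter() { return Character.isLetter(ch); }
}
```

Code written against the interface never sees the underlying char, so a second implementation could be introduced without touching its callers.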

Currently the implementation of CharCode used in NTS is just a class containing an ordinary character. But there exists the possibility to use very different representations (e.g. named characters) without changing any NTS code. Such objects can pass through the whole system provided that at the end there exists an output object which recognizes them and handles them properly. Even several independent implementations of CharCode could co-exist in some future [...]

Name has the same relation to CharCode as String does to char. It is used to represent the names of control sequences, \jobname, font names and file names scanned from the input.

InputLine represents one line from the input file or from the user's console. There are methods for getting the next CharCode or just peeking to see what the next code is without altering the current reading position. It interprets extended character codes (such as ^^M), ignores trailing blanks and appends \endlinechar if needed. Another class, LineInput, serves as an input sequence of InputLines from a file or console.

Log is an important interface for printing information on a log file or on the user's console. It declares methods for printing values of primitive types, Strings, CharCodes and Loggables (see below). Several methods are declared to control output line breaking. Class StandardLog implements the Log interface in the standard TeX way.

Loggable is a very simple interface which declares one method for printing on a Log. It is very handy because most of the important classes in NTS implement this interface and so their logging is conveniently handled.
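The relationship between the two interfaces can be sketched as follows. All names here (LogSketch, LoggableSketch, BufferLog, DimenSketch, addOn) are invented for the illustration; the log writes into a string buffer rather than a file, and the dimension prints whole points only.

```java
/** Hypothetical cut-down Log: accepts primitives, strings and Loggables. */
interface LogSketch {
    LogSketch add(String s);
    LogSketch add(int n);
    LogSketch add(LoggableSketch x);
    LogSketch endLine();
}

/** Hypothetical Loggable: one method, which prints the object on a log. */
interface LoggableSketch {
    void addOn(LogSketch log);
}

/** A log that collects its output in memory. */
final class BufferLog implements LogSketch {
    private final StringBuilder out = new StringBuilder();

    public LogSketch add(String s) { out.append(s); return this; }
    public LogSketch add(int n) { out.append(n); return this; }
    public LogSketch add(LoggableSketch x) { x.addOn(this); return this; }
    public LogSketch endLine() { out.append('\n'); return this; }

    String contents() { return out.toString(); }
}

/** A value that knows how to describe itself (65536 sp = 1 pt). */
final class DimenSketch implements LoggableSketch {
    private final int scaledPoints;

    DimenSketch(int scaledPoints) { this.scaledPoints = scaledPoints; }

    public void addOn(LogSketch log) {
        // Simplification: integer division, so fractions of a point are dropped.
        log.add(scaledPoints / 65536).add("pt");
    }
}
```

A call such as `log.add("width=").add(new DimenSketch(10 * 65536)).endLine()` then needs no knowledge of how a dimension is formatted.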

4.3. Package command

Classes in the package command form the interpreter for the TeX input language. Although it is a large package, it has nothing to do with typesetting per se. In fact, at least one third of the TeX source is not about typesetting at all. The package is responsible for the process of scanning input tokens, expanding them and for most of the mode-independent processing such as macro definitions and register assignments.

Token is an abstract class. It declares methods for getting the meaning of a Token, assigning a new meaning (if allowed), matching another Token, and a number of predicates which tell if it is redefinable, is a brace, a letter, and so on. There are several kinds of Tokens, and they form a small hierarchy of subclasses. Typical examples include CtrlSeqToken, ActiveCharToken, SpaceToken, LetterToken, LeftBraceToken, ...

Tokenizer is able to provide a sequence of Tokens. There are various subclasses of Tokenizer such as: LineInputTokenizer for tokenization of the input file, MacroExpansion for macro bodies with supplied parameters, InsertedTokenList for a token list from a token register inserted into the input stream, or BackedToken for just one backed-up token. Tokenizers are pushed onto a TokenizerStack, the analogue of TeX's input stack.

Command is an abstract class which represents each TeX command. Mostly [...] table of equivalents, but there are important exceptions such as Macro or the meaning of a character. In TeX, tokens and command codes are represented by the same type and they are often interpreted in both ways, which may lead to confusion. NTS strictly separates the concept of token and command. As outlined above, a Token is a piece of input which can have some meaning. The type of this meaning is the Command discussed here.

A Command has methods for execution and expansion. Only some Commands can be expanded, and this property is indicated by another predicate method. The heart of NTS is a cycle very similar to TeX's main_control. In one step, a token is fetched from the input and its meaning is examined. If it is expandable, the appropriate method for expansion is called. If there is some result of expansion, the method is responsible for pushing it onto the input stack. If it is not expandable, the method for executing the command is called.
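The cycle just described can be sketched in a few lines. This is only an illustration under simplifying assumptions: every name is hypothetical, tokens are collapsed into their meanings (whereas NTS keeps Token and Command separate), and the input stack holds plain iterators instead of Tokenizers.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical command: either expandable or executable. */
abstract class CommandSketch {
    boolean expandable() { return false; }
    void expand(InterpreterSketch in) { }
    void exec(InterpreterSketch in) { }
}

final class InterpreterSketch {
    // Stand-in for the input stack: the innermost source is on top.
    private final Deque<Iterator<CommandSketch>> inputStack = new ArrayDeque<>();
    final StringBuilder output = new StringBuilder();

    void push(List<CommandSketch> tokens) { inputStack.push(tokens.iterator()); }

    /** Fetches the next item, dropping exhausted sources; null at the end. */
    private CommandSketch next() {
        while (!inputStack.isEmpty()) {
            if (inputStack.peek().hasNext()) return inputStack.peek().next();
            inputStack.pop();
        }
        return null;
    }

    /** One step: fetch, examine the meaning, then expand or execute. */
    void mainLoop() {
        for (CommandSketch c = next(); c != null; c = next()) {
            if (c.expandable()) c.expand(this);
            else c.exec(this);
        }
    }
}

/** Executable command: contributes one character to the output. */
final class CharSketch extends CommandSketch {
    private final char ch;
    CharSketch(char ch) { this.ch = ch; }
    @Override void exec(InterpreterSketch in) { in.output.append(ch); }
}

/** Expandable command: pushes its body back onto the input stack. */
final class MacroSketch extends CommandSketch {
    private final List<CommandSketch> body;
    MacroSketch(CommandSketch... body) { this.body = Arrays.asList(body); }
    @Override boolean expandable() { return true; }
    @Override void expand(InterpreterSketch in) { in.push(body); }
}
```

Note that the loop itself never looks inside a macro: the expansion method alone decides what is pushed back onto the input.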

There is one curious fact about expandable commands: they are executed if their expansion is suppressed by \noexpand. In this case they behave exactly like \relax (they do not do anything apart from re-setting TeX's internal state and terminating any active look-ahead) and they even pretend that they are \relax when examined by \show. For that reason, the whole subtree of expandable commands is derived from the Relax command.

Another important part of the Command interface is the set of methods used for indicating the availability of, and getting, a value of a certain type. This is useful when the command occurs on the right-hand side of an assignment, for example, and therefore the registers, parameters and a few others can provide numeric, dimension, glue or token list values.

CommandBase is a superclass of Command. It defines only static methods which are related to scanning various elements of input (such as numbers, dimensions, file names, keywords, ...), maintaining the table of equivalents, input stack and several instances of Log output. As we will see later, there are more objects than just Commands which require such services and so are derived from this abstract class for convenience.

4.4. Package node

Now at last we are getting to typesetting! The classes in this package represent material to be typeset. There are also general interfaces to font metrics and output generators. The package is relatively low in the hierarchy; the classes are dependent only on the base and io packages. That gives them a good chance to be re-used in a different typesetting system which may provide a completely [...]

Node is an interface which defines the elementary building block of typesetting material. It has methods to get its sizes (even when affected by some stretching or shrinking), to describe itself on a log file and to be typeset. There is a hierarchy of classes which implement the Node interface. Some of these are elementary, for example: RuleNode, HKernNode, VKernNode, HSkipNode, VSkipNode, PenaltyNode; other objects are complex and can contain lists of subsidiary nodes: HBoxNode, VBoxNode.

Packer is used when we need to compute the sizes of complex boxes which are built out of lists of nodes. This process is called packaging in TeX. The algorithm is essentially the same for horizontal and vertical lists of boxes, only the horizontal and vertical dimensions are flipped for different cases. The abstract class Packer defines an abstract algorithm and provides a placeholder to get the appropriate box dimensions. There are then special subclasses for horizontal and vertical boxes which are in turn sub-classed outside this package to give the right kind of warnings if something is not decent.
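The "one algorithm, flipped dimensions" arrangement is an instance of what is usually called the template-method pattern, and might be sketched as below. The names are hypothetical, and the sketch deliberately ignores depth, glue stretching and shrinking, and the warnings mentioned above.

```java
import java.util.List;

/** Hypothetical node with natural sizes only (no depth, no glue). */
final class BoxSketch {
    final int width, height;
    BoxSketch(int width, int height) { this.width = width; this.height = height; }
}

/** The shared algorithm; subclasses only say which dimension is which. */
abstract class PackerSketch {
    /** Size of a node along the direction in which the list is stacked. */
    abstract int along(BoxSketch b);

    /** Size of a node across that direction. */
    abstract int across(BoxSketch b);

    /** Returns {sum of sizes along, maximum size across}. */
    final int[] pack(List<BoxSketch> nodes) {
        int total = 0, max = 0;
        for (BoxSketch b : nodes) {
            total += along(b);
            max = Math.max(max, across(b));
        }
        return new int[] { total, max };
    }
}

/** Horizontal list: widths add up, the tallest node sets the height. */
final class HPackerSketch extends PackerSketch {
    int along(BoxSketch b) { return b.width; }
    int across(BoxSketch b) { return b.height; }
}

/** Vertical list: heights add up, the widest node sets the width. */
final class VPackerSketch extends PackerSketch {
    int along(BoxSketch b) { return b.height; }
    int across(BoxSketch b) { return b.width; }
}
```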

FontMetric is an abstract interface for font metric information objects. At the moment, there are only the familiar tfm files, but that is not a restriction of NTS: it is prepared for any kind of font metric which can be adapted to this interface. There are methods for getting an identification and various numeric or dimension parameters for TeX compatibility. But first of all there are methods to get a Node for a particular CharCode, to get a normal inter-word space and to get a special object which is able to produce the representation of characters, ligatures and kerns for a given sequence of CharCodes.

TypeSetter has similar characteristics to FontMetric. It defines an interface for general typesetting output. There are methods for typesetting a character or a rule at the current position and adjusting this position.

4.5. Package builder

This package takes care of the areas concerned with TeX's horizontal, vertical and maths modes. Whilst in TeX there is just one global integer variable which indicates one of the seven possible modes (the three mentioned above are each internal or external, and there is one "no mode") on the top of the semantic stack, NTS uses objects which build typesetting material for different modes.

The package is more dependent on the TeX paradigm than is node but is still independent of the TeX language. It is relatively simple and small. Some amount of complexity must be solved when typesetting commands and different modes interact, but that issue is addressed in another package.

Builder is the root of the hierarchy of classes for different modes. It declares [...] methods for adding a node, kern or skip to the list of nodes which is currently being built. It makes the appropriate versions (horizontal or vertical) of kerns and skips and performs other adjustments if needed. Currently only the modes known from TeX are supported but there is provision for other types of mode (chemical, picture, ...).

4.6. Package typo

The package typo is a superstructure of the package command. It contains all the Command subclasses which deal with typesetting currently developed (there will be a package maths for mathematical typesetting commands but it did not exist when this text was written). It utilizes the packages builder and node as well.

TypoCommand is similar to CommandBase but is intended for typesetting commands. It is an intermediate abstract Command class which defines several useful static methods. It maintains a stack of Builders and the current font metric. There are methods for scanning a font metric or box specification from the input, and adding a character or space to the current Builder.

Many classes in this package are derived directly from classes in the command package because they can inherit some useful behaviour from them. They cannot be included in the command package because they need some information which is available only in the typo package (usually by calling some static method of TypoCommand). There are basically two kinds of these: one is \if primitives such as \ifhmode or \ifvbox which just need some information about the current Builder or a certain box register; another is the commands which are mode independent but typographic, such as \setbox, \wd and \chardef.

BuilderCommand is an abstract superclass for commands which are mode dependent. In an open system such as NTS we want new features to be capable of being easily added. There is, for example, a superclass Command which defines a particular set of methods which can be implemented by the new commands in any sensible way. This kind of polymorphism is directly supported by the chosen programming language.

But what to do if in future we want to add a new mode by developing a Builder which offers some new functionality not declared in the Builder interface and some specialized commands which can utilize this new functionality? If we do not want to extend the basic interface (at least until a new version) or even perhaps cannot do it (we are developing a plug-in), the only chance is to examine the type of the current Builder and to use the infamous cast operator [...]

But there is another problem with existing mode-dependent commands. How should they behave in the new mode? For this purpose, the BuilderCommand maintains a hash table which associates an Action with each combination of Builder class and Command instance. The association is defined at the level of the NTS configuration and it automatically follows the class hierarchy of Builders. Thanks to this versatility, it is very easy to specify the behaviour of commands in different modes for modified systems.

Action is a subclass of CommandBase so it inherits many methods for scanning the input, dealing with log files and error messages. Actions are usually implemented as inner classes of the corresponding BuilderCommand.

A fragment of the NTS configuration data looks like:

    RulePrim hrule = new RulePrim
        ("hrule", default_rule, Dimen.NULL, Dimen.ZERO, Dimen.ZERO);
    RulePrim vrule = new RulePrim
        ("vrule", Dimen.NULL, default_rule, Dimen.NULL, Dimen.ZERO);

    hrule.defineAction(VertBuilder.class, hrule.NORMAL);
    hrule.defineAction(ParBuilder.class, hrule.FINISH_PAR);
    hrule.defineAction(HBoxBuilder.class, hrule.BAD_HRULE);
    vrule.defineAction(HorizBuilder.class, vrule.NORMAL);
    vrule.defineAction(VertBuilder.class, vrule.START_PAR);

The BuilderCommand corresponding to the TeX primitive \hrule defines three actions: it performs the normal operation in vertical mode, finishes the current paragraph (if any) in horizontal mode and complains inside an \hbox. The \vrule performs normally in any horizontal mode and enters a new paragraph in vertical mode. There is in fact only one class (RulePrim) which has two instances with names hrule and vrule and different parameters; they are assigned different Actions for the same modes. All the Actions NORMAL, START_PAR, FINISH_PAR and BAD_HRULE are instances of inner classes inside RulePrim or its superclass.
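How an association can "automatically follow the class hierarchy of Builders" is perhaps easiest to see in code. The following is a guess at the mechanism, not the NTS implementation: each command keeps its own map (the text describes a single table keyed by Builder class and Command instance), the builder subclass relationships are assumed, and all names carry a Sketch suffix to mark them as invented.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed builder hierarchy, for the illustration only.
class BuilderSketch { }
class HorizBuilderSketch extends BuilderSketch { }
class ParBuilderSketch extends HorizBuilderSketch { }
class HBoxBuilderSketch extends HorizBuilderSketch { }

/** Hypothetical action; it returns a label so that it can be observed. */
interface ActionSketch {
    String run();
}

class BuilderCommandSketch {
    private final Map<Class<?>, ActionSketch> actions = new HashMap<>();

    void defineAction(Class<? extends BuilderSketch> mode, ActionSketch action) {
        actions.put(mode, action);
    }

    /**
     * Walks up from the concrete class of the current builder and returns
     * the first action found, or null if none is defined for this mode.
     */
    ActionSketch actionFor(BuilderSketch current) {
        for (Class<?> c = current.getClass(); c != null; c = c.getSuperclass()) {
            ActionSketch action = actions.get(c);
            if (action != null) return action;
        }
        return null;
    }
}
```

With such a lookup, an action defined for a general horizontal builder applies to every more specific horizontal builder unless one of them is given its own entry, and a newly added Builder subclass inherits sensible behaviour without any command being edited.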

Other examples of BuilderCommand are: HBoxPrim, VBoxPrim, VTopPrim, LowerPrim, MoveLeftPrim, BoxPrim, KernPrim, CharPrim, ExSpacePrim, AccentPrim, AnySkipPrim.

Group is another subclass of CommandBase. Its subclasses cover the various types of group in TeX. There are groups such as SimpleGroup for a pair of braces, SemiSimpleGroup for the \begingroup and \endgroup, HBoxGroup, VBoxGroup or VTopGroup. Group itself is defined and the stack of Groups is maintained in [...]

Groups have one problem in common with Builder. Their closing commands behave differently in combination with certain types of Group. The right brace cannot match \begingroup and \endgroup cannot match the left brace. The problem is solved in exactly the same way as for combinations of commands and Builders.

4.7. Package tfm

The package tfm implements a particular type of font metric information (the TeX font metric file) for use in NTS. It can be used as an example for implementing other types of font metric.

TeXFm is a class which represents the low-level, almost raw format of a TeX font metric file. Some complications are hidden but its public interface reflects just the information which is available in the file. It uses several auxiliary classes because the tfm format is too complex to be captured by only one program file. As an example, the whole process of reading a tfm file is done by the class TeXFmLoader which creates an instance of TeXFm. TeXFm itself has methods for getting information concerning the characters, ligatures and kernings for pairs of characters, extensible recipes and sequences of enlarging characters. Another method is provided for printing its representation as a property list. This is used by a small Java application tftopl which, thanks to TeXFm, shares most of the code with NTS.

TeXFontMetric is an adaptation of TeXFm which implements the FontMetric interface from the node package. It is a wrapper which uses the natural methods of TeXFm and provides the methods required by the rest of NTS. This approach is probably useful for future implementations of other types of font metric. We cannot expect that some third party will provide the exact interface even if a Java class is supplied for access.

4.8. Package dvi

This package implements the dvi format as one of the possible output formats for NTS. In many aspects it will be similar to the package tfm but it was too early to say more as development of this part had just started when this text was written.

4.9. Package tex

This package is an umbrella for the other NTS packages, and it is by far the messiest part of the system. All the classes and packages so far are designed [...] classes and packages as possible. But in TeX itself, there are so many unclear dependencies. That was one reason for starting the whole NTS project in the first place. The classes in this package join all the independent units together, and in addition all the weird cases were exported from the clean design of other packages to here if this was possible. That is the main reason that the code sometimes looks rather messy here.

Besides this, there are classes for maintaining the error pool so that commands are not dependent on the way in which the error messages are given.

The most interesting part of this package is the class Primitives which contains the configuration of the whole system. There was already an example in package typo.

4.10. Modularity and configurability

To develop a system which is as modular as possible was one of the main desiderata. In the current TeX implementation, there are a lot of dependencies. Experience shows that it is very difficult and dangerous to make some non-trivial changes since these can lead to a number of possibly unclear side-effects.

The approach taken in developing NTS has been to make all dependencies explicit and clear. All classes have a well-defined interface of public methods which is used for all communication. There are no uncontrolled changes of global variables. This manner of programming is greatly supported by the Java object-oriented language.

Another motivation for making code units independent is to allow substitutions of some modules by other modules with the same interface but a different underlying implementation. Independent classes or packages can also be used as building blocks for another system. The NTS packages are therefore designed rather as class libraries with a strict hierarchy.

An interesting problem concerned with the decomposition of TeX into independent units is the problem of cyclic dependencies. There are many of them. A simple example is the relation between TeX's eyes and stomach. The stomach is fed by commands which originate at the eyes, but the action of the eyes depends on \catcode settings which originate from the stomach.

This makes it particularly difficult to maintain a non-cyclic hierarchy of packages. On the other hand, it is very desirable if we want to use only some of them in another application. The method that NTS uses to avoid such cyclic dependencies is via abstract interfaces. If some class needs information or an action which is not available at the current level of hierarchy, it defines [...] constructor or of some method). The parameterisation is then made at some higher level, usually in the umbrella (or maybe NTS's brain): the package tex.
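Taking the eyes and stomach example, the technique might look as follows. This is a sketch under assumptions: the interface and class names are invented, the scanner does nothing but count letters, and the default category codes are a simplification (11 and 12 are TeX's numbers for "letter" and "other").

```java
import java.util.HashMap;
import java.util.Map;

// ---- lower level (the "eyes"): declares what it needs, no more ----

/** Hypothetical interface through which the scanner asks for catcodes. */
interface CatCodeSourceSketch {
    int catCode(char c);
}

final class ScannerSketch {
    static final int LETTER = 11, OTHER = 12;

    private final CatCodeSourceSketch cats;

    // The dependency arrives as a constructor parameter.
    ScannerSketch(CatCodeSourceSketch cats) { this.cats = cats; }

    /** Counts the leading characters of s whose catcode is LETTER now. */
    int letterPrefix(String s) {
        int i = 0;
        while (i < s.length() && cats.catCode(s.charAt(i)) == LETTER) i++;
        return i;
    }
}

// ---- higher level (the "stomach"): supplies the implementation ----

final class CatCodeTableSketch implements CatCodeSourceSketch {
    private final Map<Character, Integer> table = new HashMap<>();

    public int catCode(char c) {
        Integer explicit = table.get(c);
        if (explicit != null) return explicit;
        // Simplified defaults: letters are letters, everything else is other.
        return Character.isLetter(c) ? ScannerSketch.LETTER : ScannerSketch.OTHER;
    }

    /** What a \catcode assignment would end up calling. */
    void set(char c, int code) { table.put(c, code); }
}
```

The scanner's package never mentions the table class, so the dependency runs in one direction only, yet a change made by the stomach is seen by the eyes on the very next character.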

5. A summary of the status quo

NTS was envisaged (more than a little naïvely, as has already been suggested) as taking one year from commencement to full implementation. It is now two years since formal commencement, and work is not yet complete. How far have we got, and what were the reasons for the delays?

The good news is that work is very nearly complete: Karel has tackled the task in a very logical order, starting with TeX's eyes and mouth (the scanner and tokeniser), then macro expansion, then command execution where this did not involve typesetting, through to list creation, and page-building. NTS is now able to process and typeset (that is, generate DVI for) any document which does not involve mathematics or alignments, although it cannot (at the time of writing) yet hyphenate words. In fact, only three real challenges remain: mathematics (mathematical typesetting, of course, rather than mathematics per se), alignments and hyphenation. Karel has already completed a large part of the research and design phase for these.

However, there is another area in which some work remains to be carried out, and that is the area of system interactions. Of course, TeX itself does not interact with the system in any potentially dangerous way (with the notable exception of being able to open an arbitrary file for writing, provided that the user running TeX has appropriate permissions). But TeX does interact with the environment in rather more subtle ways, for example to ascertain the path or paths which it will search for each class of file (\input files, .tfm files, and so forth).

Most implementations of TeX perform this interaction through the medium of so-called environment variables (e.g. TeX_Inputs, TeX_Fonts and so forth). These environment variables are typically set by the installer of TeX for a given system, and can usually be modified by individual users to suit their particular needs. Whether these environment variables are actually variables, or logical names, or part of (e.g.) the Windows NT environment settings is irrelevant to the user: all that matters is that there is a standard way (standard, that is, for each platform and implementation of TeX) of informing TeX where the relevant files are to be found.

The problem is that Java is a portable language. And truly portable languages must behave identically no matter on which platform they are installed. And [...] not standardised across platforms, Java shall have no access to environment variables. Disaster!

It therefore looks at the moment as if NTS's environment will have to be configured independently of that of TeX, using a Java-specific configuration system, and there will be no way of allowing NTS to inherit TeX's run-time environment settings. But this area is still under review, and it is still possible that some satisfactory compromise will be found. Recent improvements to the Java system have acknowledged the need for a so-called policy file, which by default is ignored but which (if permitted by the security settings) can be read by the Java run-time system during initialisation. Such a file could be generated in a very straightforward way from existing environment variables, although (for obvious, bootstrapping, reasons) the program to generate it could not be written in pure Java!
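One Java-specific mechanism which could carry such settings is the system property, which can be given on the command line with the -D option and read with System.getProperty. The sketch below uses that mechanism rather than the policy file mentioned above; the property name nts.inputs and the class SearchPathSketch are invented for the illustration and are not part of NTS.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper which turns a path list into search directories. */
final class SearchPathSketch {
    /** Splits on a non-empty separator, skipping empty entries. */
    static List<String> parse(String pathList, String separator) {
        List<String> dirs = new ArrayList<>();
        if (pathList == null) return dirs;
        int start = 0;
        while (start <= pathList.length()) {
            int end = pathList.indexOf(separator, start);
            if (end < 0) end = pathList.length();
            if (end > start) dirs.add(pathList.substring(start, end));
            start = end + separator.length();
        }
        return dirs;
    }

    /**
     * Reads the assumed property, e.g. java -Dnts.inputs=<path list> ...,
     * falling back to the current directory when it is not set.
     */
    static List<String> inputDirs() {
        return parse(System.getProperty("nts.inputs", "."), File.pathSeparator);
    }
}
```

A small platform-specific launcher script could copy an existing environment variable into such a property, which keeps the non-portable step outside the Java code.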

So mathematics, alignments, hyphenation and environmental enquiries remain to be implemented, virtually all else is complete; how satisfied are we with the work done so far?

In general, we are extremely satisfied; Karel has done an excellent job of re-engineering and re-implementing a TeX-compatible system in a modular and open way. Compatibility remains uncompromised: the DVI files and log files (and even the console output) of NTS and TeX are identical (obviously modulo such necessary differences as the NTS banner reading "This is NTS" rather than "This is TeX").

But there is also one area about which we are deeply concerned, and it is only fair that we should reveal our concerns to the sponsors of the project (such as GUTenberg). That area is performance. And the performance is abysmal.

When we first went to Knuth with our plans for NTS, we said that we intended to perform the re-implementation in two phases: phase 1 would use a modern, rapid-prototyping, language to validate the design; the second phase would involve a further re-implementation using a language selected for efficiency. Don reassured us that this second phase would never prove necessary: "by the time you are ready to perform the second re-implementation, technology will have advanced so much that a second re-implementation will not be needed. Computer performance continues to rocket, year after year, and shews no signs of starting to reach a plateau" is a paraphrase (from memory) of Don's words.

Well, in one sense, Don was right: computer performance does continue to rocket, and still shews no signs of starting to reach a plateau. Yet, despite this, NTS is, on large benchmarks, over 100 times slower than TeX, even using the much-vaunted "just-in-time" compiler. And so, we are faced with a crucial decision: do we continue to use Java, and just wait for the hardware to speed up [...] first 1 GHz Pentium-class machine should ship this year, a factor of 200:1). Or do we use the Java implementation just for test purposes, but re-implement Karel's design in a radically more efficient language? Or do we simply admit defeat, say "we tried", and leave it to others to see if they can be more successful than we?

These are hard questions, and there will be considerable soul-searching before we can decide on the answer; all I can say at the moment is that GUTenberg, as one of our major sponsors, will also be one of the first to know.

6. Epilogue

Although my talk has ended on a rather downbeat note, I'd like to try to lift your spirits by asking (and answering) one vital question: what lesson(s) can be learned from our experience(s)?

The first mistake was surely to under-estimate the time necessary for the initial re-implementation. Had we followed Knuth's (?apocryphal?) algorithm for estimating the time needed to develop a major software system, we would have added 1 and then gone up to the next order of magnitude. Thus Knuth's algorithm would have suggested (had we heeded it) that we would need not one year but two decades!

In fact, we probably need about three years to complete fully what we thought could be completed in one. Is it possible to explain why?

Rather interestingly, I think the answer is "yes" (which may suggest that I am still as naïve as I was when I started the project!). According to Karel, almost all of the extra time has been spent making NTS 100% TeX-compatible. Note, 100%, not 99.9%. It was this last 0.1% that ate up so much of the lost time. Little things, like making sure that the console output was identical, even if console output is ephemeral and can never be compared other than by memory. Little things, like making sure that NTS's behaviour at boundary conditions is identical to that of TeX, even if TeX's behaviour in such conditions is sometimes flawed and at worst completely insane. Little things, like making sure that DVI files produced by NTS are binary-identical with those produced by TeX, not just syntactic- and semantic-compatible.

What made this situation worse was that Karel's brief was not to write a TeX-simulator; had that been his task, he could probably have completed the work in eight months or less. His brief was, in fact, to write a flexible, extensible, modular TeX simulator, which meant that every time he discovered somewhere that TeX behaved less than ideally, he had to implement two routines: (1) the [...] circumstances, and (2) a TeX-compatible routine, that introduced whatever anomalous behaviour TeX itself would exhibit in those circumstances. Thus someone taking the NTS source in the future will find that all the necessary logical, predictable, behaviour has already been implemented; it has simply been sub-classed out of sight in the interests of TeX-compatibility.

What other lessons can be learned? Well, it is certainly worth re-visiting the question of implementation language. Was Java the right choice? In hindsight, the answer appears to be "no", much as it hurts to admit it. There are three primary reasons for this. (1) Java is not as type-safe as we had thought, at least if one wants both type-safety and efficiency at the same time. Whereas in Pascal one can write:

type group = (simple_group, semi_simple_group, ...)

and thereafter use the identifiers simple_group (etc.) in the absolute certainty that they can never be used in a context where (e.g.) an integer is expected, this is not the case for Java. There are no enumerated types, and thus if one wants type-safety to be checked and enforced at the compiler level, one is virtually forced to use objects to represent even the simplest enumerated type.[2] And objects, of course, carry considerable baggage with them, and their use (in excess) has a heavy performance impact. (2) Java lacks generic types, and thus in a situation in which one wants to manipulate (say) lists of different types of object, one is forced either to write type-specific code for each type of object or to use the only truly generic object (Object itself), and then to use casts. In the latter case, type-checking is deferred from compile-time to run-time, with an accompanying lack of (a) compile-time type-safety, and (b) efficiency. (3) Java imposes considerable performance overheads. If NTS were ten times slower than TeX, I might be prepared to argue that (a) Java performance will continue to improve, and therefore we should be within touching distance of TeX's performance before too long; and (b) a performance degradation is acceptable if both maintainability and extensibility are considerably enhanced. But I cannot, in all honesty, offer these defences in the present situation: if NTS remains 100 times slower than TeX, its chances of ever being used in earnest are vanishingly remote.
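Points (1) and (2) can be made concrete with a short sketch, written in the idiom which the text describes: objects standing in for an enumerated type, and a container whose elements are known only as Object. The class names are invented, and the sketch deliberately avoids any enumeration or generic-type facility of the language so as to mirror the situation discussed here.

```java
import java.util.ArrayList;
import java.util.List;

/** Objects standing in for Pascal's (simple_group, semi_simple_group, ...). */
final class GroupKindSketch {
    static final GroupKindSketch SIMPLE = new GroupKindSketch("simple_group");
    static final GroupKindSketch SEMI_SIMPLE = new GroupKindSketch("semi_simple_group");

    private final String name;

    // Private constructor: no further values can be created elsewhere,
    // and none of them can be mistaken for an integer.
    private GroupKindSketch(String name) { this.name = name; }

    @Override public String toString() { return name; }
}

/** A stack whose elements are typed only as Object. */
final class GroupStackSketch {
    private final List<Object> items = new ArrayList<>();

    // Anything at all is accepted here; the compiler cannot object.
    void push(Object kind) { items.add(kind); }

    /** The cast is checked when the program runs, not when it is compiled. */
    GroupKindSketch pop() {
        return (GroupKindSketch) items.remove(items.size() - 1);
    }
}
```

Pushing a value of the wrong type compiles without complaint and fails only when pop is reached, which is the deferral from compile-time to run-time complained of above; and each enumerated value is a full object rather than a small integer.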

Java's strengths, on the other hand, remain virtually unchallenged; it is portable (and obviates any need for system dependencies and/or local modifications), it has attracted a large user (= programmer) base, and it does offer seamless network connectivity. At the moment, we are uncertain which (if any) other language could offer these advantages while avoiding Java's limitations. Generic Java, particularly if supported by Sun, would make great sense; Eiffel looks interesting, too. All I can say at the moment is that Karel will finish NTS V0 using Sun's Java; if, after that, there is general consensus that the project should continue, we will investigate the option of translating (probably automatically) NTS from Java to another, more efficient, language. And beyond that is too far to see!

[2] The java.util package does recognise the need for enumerated types, but unless and [...]

And one final problem, which has dogged this project, and which (sadly) doesn't seem likely to disappear. That is a problem of communication. The team are geographically diverse, with representatives from at least five nations (UK, CZ, DE, PL, NL); our programmer is based in CZ, where the only other team member is more than fully occupied running a major university (Jiří is now Rector of Masaryk University). Thus Karel lacks the day-to-day support of others with whom to discuss progress and problems other than by e-mail and at occasional group meetings. Almost certainly, communication problems have also led to various misunderstandings within the group, which are frequently seen as being politically motivated. Politics have cast a shadow over this project, of that there is no doubt; yet equally without doubt every member wants the project to succeed. I believe that the goodwill which exists outweighs the difficulties which can occur, and that we will be able to bring this project to a state where NTS is complete and usable.

But Don advised us that we should be prepared at some point to do what he has done: to say "enough is enough" and to allow others to carry the torch forwards. I'm sure we aren't ready to do that yet (there are far too many exciting challenges to be met), yet the time will undoubtedly come when NTS will itself be regarded as passé, and others will be keen to take on the challenge of carrying computer typesetting (in the finest TeX tradition) forward in as-yet unforeseen ways. I hope that amongst those who take up this challenge, members of GUTenberg will figure prominently: you have amongst you many who have contributed enormously to the furtherance of TeX, some of whom I have had the pleasure of knowing as friends as well as colleagues. On behalf of the NTS project, I thank you most sincerely for your support; I hope that you enjoy the demonstration of NTS which follows, after which I will try to answer any questions which you may have.
