Cahiers GUTenberg
THE NTS PROJECT
Philip Taylor, Jiří Zlatuška, Karel Skoupý
Cahiers GUTenberg, no. 35-36 (2000), p. 53-77.
<http://cahiers.gutenberg.eu.org/fitem?id=CG_2000___35-36_53_0>
© Association GUTenberg, 2000, all rights reserved.
Access to the articles of Cahiers GUTenberg (http://cahiers.gutenberg.eu.org/)
implies agreement with the general conditions of use
(http://cahiers.gutenberg.eu.org/legal.html).
Any commercial use or systematic printing constitutes a criminal offence.
Any copy or printout of this file must contain this copyright notice.
The NTS project:
from conception to implementation
Philip Taylor [1], Jiří Zlatuška [2] and Karel Skoupý [3]
[1] Webmaster, RHBNC, University of London, United Kingdom;
[2] Rector, Masaryk University, Brno, Czech Republic;
[3] Independent programmer, Brno, Czech Republic;
Introduction
For today's talk, I had hoped that Karel Skoupý, the Czech implementor of NTS, would be able to be here to present the results of his work and to answer any questions that you might have. Sadly, that will not be the case: Karel is working desperately hard to complete NTS before the commencement of the TUG 2000 conference, and I therefore have to deputize for him and attempt to answer any questions on his behalf.
Let me start by presenting an overview of today's talk and presentation; I will attempt to cover seven separate areas, including (of course) the mandatory questions and answers at the end. The seven areas to be covered are:
A brief history of NTS
TeX, ε-TeX & NTS compared
The choice of Java as the language of implementation
An overview of the classes, objects and methods of NTS
A summary of the status quo
A demonstration of NTS, and comparison with TeX
Questions & answers
and you will soon realize that my expertise lies very much in the earlier areas; the implementation details of NTS are very much Karel's area, and I apologize in advance for any errors which I may make in presenting (in particular) the overview of classes, objects and methods.
The present authors would like to record their grateful thanks to all members of the NTS and ε-TeX teams, past and present, without whom neither this paper nor NTS itself could
1. A brief history of NTS
The NTS project is the result of the foresight of just one man: Joachim Lammarsch, one of the founders and for many years the President of DANTE e.V. During the period leading up to the DANTE meeting in Hamburg of 1992, Joachim circulated a message to all potentially interested parties asking who would be interested in a project which was to continue where Knuth had left off. A number of us responded positively to this question, and those who did were invited by Joachim to attend the DANTE meeting in Hamburg. During the course of the meeting, one of the scheduled sessions was devoted solely to "The Future of TeX". Those present debated at great length whether TeX should remain forever as intended by Knuth, or whether its future was too important to be determined by just one man (even one of the intellect and status of Knuth). The outcome of these deliberations was that TeX itself must remain solely Knuth's responsibility, but that a TeX-like system or systems should be created by an independent group, to continue developments whilst TeX itself remained frozen forever. It was also agreed that the name of this proposed new system should be NTS (an acronym for "New Typesetting System") to indicate that this new system was not TeX itself, but was instead a new system which, whilst acknowledging unreservedly Knuth's rôle in its evolution, was free of the constraints which Knuth had placed on the evolution of TeX itself.
Once the main decision had been taken, other decisions followed more or less automatically. It was agreed, for example, that whilst DANTE e.V. would provide initial funding for the project, the project itself would be deemed to be transnational, transcending the artificial boundaries of any one TeX User Group and drawing its membership from TeX users and other interested parties throughout the world. Rainer Schöpf was invited to chair the group, and other members included Joachim Lammarsch, Joachim (Johnny) Schrod, Bernd Raichle, Peter Breitenlohner, Jiří Zlatuška and myself (Philip Taylor).
Thereafter, the group met on a regular but occasional basis, almost invariably at such a time as to co-incide with a DANTE conference. The membership did not remain static, and Rainer stood down as Chairman after the first year, needing more time for other projects such as LaTeX-3 and "real work" [tm]. I took over as Chairman, and we agreed that within the group two separate projects should be investigated, one evolutionary and one revolutionary. Peter Breitenlohner was to head the evolutionary (ε-TeX) group, whilst Jiří Zlatuška would head the revolutionary group (NTS proper). Joachim Lammarsch, as fons et origo, would remain as Managing Director, and Bernd Raichle, whose technical skills and TeX expertise were invaluable, was 2nd-in-command of both projects. Sadly we also said "farewell" to Joachim Schrod at about this time: Joachim was unable to agree with all of the decisions taken within the group and preferred to resign rather than be closely involved with a project with whose aims he could not entirely agree.
2. TeX, ε-TeX & NTS compared
2.1. TeX
There is surely little need in a talk addressed to members of GUTenberg to define what is, or is not, TeX. TeX is, by definition, Knuth's program for performing typesetting of the highest quality, and this program is his and his alone. No-one other than Knuth himself may make any changes to the program (other than in the area of so-called "system dependencies"), and it is Knuth's publicly stated intention that TeX should evolve no further: Don has made all the improvements to TeX that he deems necessary, and any further work which he does on TeX (at ever-increasing intervals) is restricted to eliminating any genuine bugs which have been discovered since he last updated the source. TeX is currently at V3.14159, and Knuth wishes TeX to become absolutely frozen at the moment of his death, at which point it will be deemed to be Vπ.
2.2. ε-TeX
The ε-TeX project, which is as portable as TeX itself and which uses exactly the same tools and languages (Web, Pascal, Weave, Tangle, etc.), sought (and seeks) to extend TeX in a manner which is both conservative and innovative at the same time. It is conservative because it intentionally uses tex.web as the master source, and implements all changes through the medium of a change file, yet is innovative because it adds much-needed functionality to TeX and extends TeX in a way which is intended to meet the needs and demands of sophisticated TeX users who find themselves working at the very limit of TeX's abilities.
The ε-TeX project was conceived and (for some years) executed by members of the NTS group, under the leadership of Peter Breitenlohner and under the technical direction of myself. During this period, ε-TeX evolved from α- and β-releases via ε-TeX V1 to ε-TeX V2. By the time it had reached V2.0, ε-TeX had added over thirty new primitives to the set already provided by TeX, and had extended the functionality of a number of others. Despite these extensions, ε-TeX was (and remains) 100% TeX-compatible, and this, together with its portability, is surely ε-TeX's greatest strength.
Indeed, so important was compatibility considered when ε-TeX was being developed that if no special action is taken when launching ε-TeX it then behaves identically to TeX itself, and with the sole exception of the banner line cannot be distinguished from TeX. It goes without saying that, in this mode, ε-TeX passes the so-called "trip test" with flying colours!
If access is required to ε-TeX's many extensions, then at the point of launch it is necessary to indicate this explicitly. This is accomplished (on command-line based systems) by launching Ini-ε-TeX with an asterisk where an ampersand would otherwise be allowed by TeX, as in
e-initex *e-plain \dump
as compared to (for example)
initex &plain \dump
The presence of the asterisk triggers ε-TeX into so-called "extended mode", and this information is then stored in any format file which is dumped at the end of that ε-TeX instantiation. If such a format is then loaded into Vir-ε-TeX, the latter will then automatically start in extended mode, as in the following:
e-initex *e-plain \dump
e-virtex &e-plain <source file>
Once in extended mode, the user has access to all of ε-TeX's many extensions, yet if none of these is used ε-TeX continues to behave in a manner identical to that of TeX itself. Thus all legacy documents which do not, by accident, attempt to invoke one of the new ε-TeX primitives will behave and typeset identically under both TeX and ε-TeX.
But ε-TeX has one further truc up its sleeve: as well as compatibility and extended modes, ε-TeX offers a so-called "enhanced mode" in which strict compatibility is sacrificed in the interests of even greater functionality. As of V2.0, ε-TeX possessed only a single enhancement: the implementation of TeX--XeT, based on Knuth and MacKay's original TeX-XeT but completely integrated within ε-TeX (and thus requiring no special IDV driver). Since the implementation of TeX--XeT requires that maths nodes be overloaded, 100%-compatibility has to be sacrificed, yet the differences are so subtle that most ε-TeX users who chose to exploit its enhanced mode would still notice no difference in output of their legacy (mono-directional) documents.
To enter enhanced mode, specific user action is required: the ε-TeX document being processed must specifically enable enhanced mode, either at the beginning of the document or at a point at which access to enhanced mode is required. For TeX--XeT, this is accomplished by setting one of ε-TeX's so-called state variables, as in:
\TeXXeTstate = 1
In general, once ε-TeX is operating in enhanced mode, it is not possible to force
mode, never from compatibility mode). In certain circumstances, however, and in documents carefully written to localize all side-effects, it may be possible to cause ε-TeX to revert to extended mode. For the example above, this would be achieved by using:
\TeXXeTstate = 0
at a later point in the document, but users are cautioned that because of the asynchronous nature of (e-)TeX's page-breaking operations, there may still be some undesirable interactions if any modified maths nodes are still on one of ε-TeX's internal lists having not (yet) been flushed out. Thus for all practical purposes the user should assume that, once in enhanced mode, ε-TeX will remain in enhanced mode for the remainder of the instantiation.
One last point remains to be discussed under the heading of ε-TeX before passing on to NTS proper: with effect from ε-TeX V2.1, Peter Breitenlohner assumed sole responsibility for ε-TeX. Peter has indicated that, while he still wishes to develop ε-TeX further, he no longer wishes to do so within the ægis of the NTS group, and with some considerable sadness we have acquiesced to his wishes. We wish Peter all the best with ε-TeX, and are confident that he will continue to maintain and support it with the same zeal and interest as he has in the past.
2.3. NTS
For over five years, the NTS group were forced by circumstances to devote their attention almost exclusively to the development of the evolutionary system known as ε-TeX. This situation was brought about by the very nature of the group itself: it was composed entirely of volunteers, none of whom were in a position to expend great tracts of time on a project (NTS) which was of little if any interest to their real employers. Recognizing that NTS could never become a reality if it was to be developed solely by volunteers working in their own time, the group decided that NTS should be put on ice until such time as funds could be found to allow a full-time programmer to be employed.
During 1997/98, that much-longed-for possibility became a reality. DANTE e.V. agreed to contribute the magnificent sum of DM30000 to the project, sufficient to allow a programmer to be employed full-time to work on the project. Jiří Zlatuška, as Dean of the Faculty of Informatics at Masaryk University (Brno, CZ), was in the fortunate position of not only being able to recommend to the group a highly competent programmer (Karel Skoupý) but also being able to arrange a tripartite contract to allow DANTE funds to be routed via the University and thence to Karel himself. We met with Karel, discussed the project with him, and despite the almost vertical learning curve
For some time, Karel did little but read. He read The TeXbook, TeX-the-program, a great deal about Java, and much else besides. Then, in the Spring of 1998, Karel and Jiří came to my home in England, and Karel outlined his proposals for NTS. Jiří & I were much impressed with the expertise which Karel had clearly acquired, and with very few changes agreed that he should continue to develop his ideas. By the time we next met, Karel was to be in a position to demonstrate working code.
The next review took place in Brno, at the University, and on this occasion Joachim Lammarsch also took part. Joachim had greater familiarity with, and exposure to, Java than either Jiří or myself, and his presence at that review was invaluable. One of the most striking points which came out of the review was that Karel had elected to program for efficiency rather than for clarity, and there were a number of places where we felt obliged to ask him to re-think his approach (for example, we asked Karel to eschew the use of integers as general-purpose variables, and instead to use them only where integer arithmetic was required). Karel responded positively to our suggestions, although he clearly retained his reservations, and agreed to adopt our rather more defensive and didactic programming style.
When the contract with Karel was first discussed, all involved in the project believed that we could get from theory to a full implementation in one calendar year. As the end of the year approached, it became only too obvious that we had been (typically, many would say, in the IT/software world) very naïve in our analysis and far too confident in Karel's ability to complete the project on schedule. Indeed, by the end of the first year, although TeX's "mouth" had been re-programmed in Java, NTS was still unable to perform even the simplest typesetting, and an enormous amount of work clearly remained to be carried out.
Despite the obvious disappointment with which members of DANTE received the news that NTS would not be delivered on schedule, their confidence in the project remained on the whole unshaken and they generously voted to continue funding the project for a further period. During 1999, an anonymous benefactor pledged a further DM7500 to support the project (this benefactor, to whom the group are deeply indebted, is a private individual, not a user group or other corporate body), and at the 1999 EuroTeX meeting other TeX user groups also undertook to support the project financially. It is particularly pleasant to be able to thank the members of GUTenberg personally, since GUTenberg have pledged EU3000 for three years in support of the project. Thank you GUTenberg!
So what is NTS, and why is it taking so long to reach completion? Unlike ε-TeX, which is conservative and evolutionary, NTS is truly revolutionary in that it attempts (for the first time, as far as we are aware) to re-implement the algorithms and functionality of TeX-the-typesetting-system without in any way copying the coding (or even the data structures, though to a far lesser extent) of TeX-the-program. Whilst TeX is written in Pascal-Web, NTS is written in Java. And whilst TeX-the-program is a deeply entangled (though carefully structured) and highly daunting monolithic¹ program, NTS is intended to consist of a series of loosely coupled modules, any or all of which can be replaced by functionally equivalent module(s) with the same interface semantics.
The success of this latter approach was borne out fairly early on, since Karel wisely decided to cut his NTS teeth on the far less daunting task of re-engineering TFtoPL and PLtoTF in Java. The module which interprets the TFM file for the purposes of TFtoPL is exactly the same module as performs that function for NTS, and thus software re-usability, that much-vaunted modern desideratum, has been achieved in practice.
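The kind of re-use described here can be pictured as two independent tools written against one shared interface. The following is only a minimal sketch under stated assumptions: the names FontMetricSource, InMemoryMetrics, PropertyListDumper and WordMeasurer are hypothetical and are not taken from the NTS sources, the in-memory map stands in for a real TFM parser, and the listing format is merely PL-like rather than the exact property-list syntax.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface standing in for "the module which interprets the TFM file".
interface FontMetricSource {
    /** Width of a character in internal units, or -1 if the font lacks it. */
    int widthOf(int charCode);

    /** Character codes present in the font, in insertion order. */
    Iterable<Integer> charCodes();
}

// Stand-in implementation backed by a map instead of a real TFM file parser.
final class InMemoryMetrics implements FontMetricSource {
    private final Map<Integer, Integer> widths = new LinkedHashMap<>();

    InMemoryMetrics put(int charCode, int width) {
        widths.put(charCode, width);
        return this;
    }

    @Override public int widthOf(int charCode) {
        return widths.getOrDefault(charCode, -1);
    }

    @Override public Iterable<Integer> charCodes() {
        return widths.keySet();
    }
}

// Consumer 1: a TFtoPL-like tool that lists what the metric module knows.
final class PropertyListDumper {
    static String dump(FontMetricSource metrics) {
        StringBuilder sb = new StringBuilder();
        for (int c : metrics.charCodes()) {
            sb.append("(CHARACTER ").append(c)
              .append(" (CHARWD ").append(metrics.widthOf(c)).append("))\n");
        }
        return sb.toString();
    }
}

// Consumer 2: a typesetter-like client that only needs widths.
final class WordMeasurer {
    static int measure(FontMetricSource metrics, String word) {
        int total = 0;
        for (char ch : word.toCharArray()) {
            int w = metrics.widthOf(ch);
            if (w >= 0) total += w;   // characters missing from the font are skipped
        }
        return total;
    }
}
```

Because both consumers see only FontMetricSource, a different implementation with the same interface semantics could be substituted without touching either of them.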
Whilst the ultimate goal of NTS is to provide a complete and integrated, yet functionally distinct, set of typesetting tools, the short-term aim is to provide a complete re-implementation of TeX in a flexible and extensible manner. Early experience suggests that we are well on our way to achieving that aim, and (despite certain caveats that will appear later on) this has, in part, been achieved by the careful choice and use of programming language.
3. The choice of Java
as the language of implementation
Right from its conception, the NTS project was not only a project which would concentrate on providing a new, more powerful, successor to TeX, but was also an effort to re-program TeX-the-program as such.
The reason for this was simple: TeX-the-program forms an example of a monolithic Pascal program based heavily on optimized data structures which allowed Knuth to cope with the memory limitations of computers in existence more than twenty years ago. At that time, the reasons for choosing both the language and the programming techniques were understandable. Pascal developed from the family of procedural languages based on a formal syntax and well-defined semantics starting with Algol 58, Algol 60, and Algol 68, the latter providing one of the brightest peaks of the development of programming languages but which was unfortunately too complex to survive. Pascal appeared as a branch of development which combined the basic programming tools of structured programming as a methodology for programming, motivated by program correctness proof techniques and an approach to building large programs from manageably smaller components, and the tools for using abstract data structures instead of just the data types provided by the underlying computer hardware. Using a deliberately restricted set of constructions, Pascal could be seen as an abstract "machine code" for a general set of computers which also allowed for its portability and universal adoption both as far as implementations for various types of computers were concerned, and its general use as a language of choice for teaching programming.
1. Knuth would almost certainly take great exception to the use of the word "monolithic", since he evidently took enormous care to divide the program into small and quasi-independent modules. Unfortunately, whilst the logic, structure and orthogonality of those modules is undeniable, the program as a whole is a masterpiece of efficiency, re-using code and/or data structures whenever possible, and as a result the program in totality is very difficult for others to modify or extend. Indeed, Peter Breitenlohner's ability to add functionality to TeX via the medium of a change-file is, as far as we know, the only significant attempt to extend TeX-the-program in any non-trivial way other than the equally significant but in many ways far more restricted changes made by Hàn Thế Thành in his PDF-generating variant pdfTeX.
For Knuth, Pascal was a language especially well-suited for expressing general-purpose algorithms in a way suitable for publishing. In the decades of the 60's and 70's, the issues associated with particular ways of expressing algorithms and developing proper programming style using high-level programming languages were among the topics which formed core problems for research in Computer Science. Knuth added the concept of "literate programming" allowing the expression of his algorithms as a stream of constructions which follow the logic of the ideas needed for understanding a particular program construction, rather than the logic of the syntax of the programming language used as
Knuth stated his goal, to write programs in such a way that "instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do". Pascal as such did not suit Knuth's needs sufficiently, but literate programming tools based on macro generation allowed him to introduce an extension to Pascal's programming constructs (the otherwise clause in the case statement), to break the structure of declaration and code sections (using the "tangling" feature of the WEB system), and also to provide tools for allowing very careful memory optimization and implementation independence (using only integer data types and using macro constructions to generate code fragments necessary for the packing/unpacking of data used).
Within the two decades which followed the birth of TeX, the premises on which these decisions were based have become questionable, and indeed formed an obstacle to efforts to provide a successor to TeX which could extend the latter's capabilities sufficiently far. Pascal has been succeeded by a line of languages or systems which gained much smaller practical acceptance, and similarly to
usage and/or implementation (and Knuth himself has moved to CWEB literate programming based on C which ultimately yielded a language that provided him with "indescribable joy" in programming). The need for careful packing of data into as little space as possible was removed by the emergence of computer architectures supporting much larger memory sizes and the practical availability and affordability of installed memory sizes unthinkable twenty years ago in any context other than perhaps secondary disk storage. Data structure optimization has become an obstacle preventing program modification and placing a kind of time bomb into the code which explodes when seemingly small and straightforward modifications need to be made. Similar problems with modifiability are caused by the monolithic structure of Pascal code as used by Knuth within TeX-the-program. Particular algorithms used within its body are extremely hard to modify or extend because of the lack of narrow and clearly defined interfaces between individual parts of the program code and because of the general accessibility of shared global data structures from within any part of the code.
The attempt to start NTS development has therefore been linked with a deliberate decision to re-create TeX-the-program so that the programming language and programming methodology allow for the removal of unnecessary data optimization, the removal of explicit storage allocation and the use of mechanisms already present in modern programming languages, and also to allow for a modular code structure with clearly defined interfaces and data paths which will allow for easier modification and for experiments with the resulting code.
The idea of re-creating TeX using a more modern programming technique first came from Joachim Schrod, one of the inaugural members of the NTS group who also came with a prototype example of what he meant by such a re-implementation effort, showing how to extract the macro-generation language of TeX as a LISP program. The general work plan for the NTS development effort has since then consisted of assuming that NTS version 0 would be created as a faithful 100% (or for any practical purpose as close as possible to that) TeX-compatible program, and only from such code would further development activities continue by modifying and/or extending this code.
The choice of a suitable programming language for the re-implementation efforts had been a crucial unresolved problem until early 1998. Discussions oscillated around three different programming methodologies, each of which would provide a different set of advantages concerning the programming methodology, the availability of compatible implementations across a wide spectrum of hardware platforms and operating systems, and the existence of a sufficiently large base of programmers who would form the brains trust for future NTS extensions and experiments with different typesetting paradigms and user in-
Functional programming as represented by LISP or CLOS (the Common LISP Object System) had been the language for Joachim Schrod's early attempts. The principal motivation for using this language consisted in the fact that lists of lists are the basic data structures manipulated within TeX, and hence the basic internal programming paradigm should be easy to represent. Symbolic data structures as used within LISP allow for easy meta-programming and prototyping, two techniques very handy for experimental development. Within the university environment, LISP has traditionally been the language for implementation of experimental student projects with high programming productivity, and hence satisfying the essential requirements for viability of its use. Even though CLOS systems do provide a sufficiently stable and compatible implementation of this paradigm, the cross-platform compatibility is less than ideal.
Logic programming as represented by PROLOG or constraint-satisfaction programming languages based on expressing programs within a subset of symbolic logic has been another prototype language family with a high level of abstraction and a high productivity rate. Symbolic data structures (terms) allow for high flexibility in writing very easily modifiable code, and backtracking mechanisms used for underlying implementation of state-space search could provide interesting possibilities for searching for semi-optimal solutions of sets of constraints through which very complex conditions on the resulting typeset material could be expressed. Even though the programming structures are as far from the underlying hardware data structures as possible, the actual implementations of this paradigm vary significantly to such an extent that compatibility problems among the dialects make logic programming a very problematic choice if eventual cross-platform and cross-system compatibility is sought.
Procedural languages, C and C++ in particular, present a group of languages with considerably lower productivity in writing the program code and its resulting size. Even though the languages as such can be well-defined, practical differences in the libraries included or in operating system interfaces make it a real mess to produce universally usable code which would run across different platforms with compatibility comparable to that of TeX itself. Another potential problem was seen in notorious problems with non-trivial modifications of programs employing access to general common shared structures, a common programming technique used in connection with these languages.
Eventually, Java emerged as a compromise satisfying many of the essential requirements, offering interesting future opportunities, and being complicated by relatively few drawbacks. As far as programming methodology is concerned, Java combines C-based procedural programming with object based programming style. Objects serve as the basic program components allowing for struc-
and the development of future modifications by the substitution of certain objects of which the programs consist by other components. Objects also serve as a consistent replacement of traditional data structures and thus remove the traditional drawback of global shared data structures. Sun came up with Java as a company-based standard providing uniform system interfaces and a high level of security of Java applications. These claims remain to be demonstrated in reality and not just as wishful thinking and bold P.R. statements, yet the development of tested and certified Java interpreters incorporated into ubiquitous WWW browsers made it highly probable that the requirement of general compatibility could be achieved and a significant base of Java programmers formed. Last but not least, Java as a WWW-based and Internet-aware language makes it possible to think of NTS as a network-based program which will eventually allow the combination of elements downloaded from the network and the standardization of interfaces and techniques used across huge groups of geographically dispersed users.
The trend associated with the network as an important element of the future computing environment has contributed to the choice of Java as the NTS implementation language. Karel Skoupý joined the NTS team in early 1998 as the programmer whose initial task has been to deconstruct TeX-the-program into an object-based programming code preserving the essential functional features of TeX as such but providing the grounds for future modifications and extensions.
After some 18 months of (re-)design and programming, a functional prototype of core components of TeX has been developed and made accessible for initial experimentation; early results were presented at previous conferences (e.g. EuroTeX '99) and it is hoped that the final code for NTS V0 will be demonstrated at TUG 2000. What you will see today is very close to that code!
4. An overview of the classes,
objects and methods of NTS
The implementation language of NTS is Java. It is strictly object oriented, and all of the program code is encapsulated in object methods. The objects are instances of certain classes, and cluster together to form packages, discussion of which forms the majority of what follows. At the time of writing, not all packages were complete; in particular, mathematics remains to be implemented, although the majority of the design work for this remaining task is virtually
4.1. Package base
The main purpose of this package is to define the elementary data types used in the rest of the system. It is a minimal element of the NTS package hierarchy, meaning that no class here is dependent on any classes from other NTS packages.
The most important classes are as follows:
Dimen represents a dimension measured in printers' points (or mu units). The fact that the internal representation is the same as that of dimensions in TeX ensures the strict compatibility needed. The public interface of this class tries to be completely independent of its internal representation. To the outside world, a Dimen looks like a fraction of points. There are methods for conversion from integer, from fraction (given by its integral numerator and denominator), from floating point number, and vice versa. It also provides basic arithmetic operations. In the case of binary operations, versions for combination with other convertible numeric types are supported too. The general public interface allows for a complete change of the precision or even the internal unit of representation without affecting the other code.
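To make the idea of a representation-independent dimension concrete, here is a minimal sketch. It is not the NTS class: the method names (ofPoints, ofFraction, ofDouble, plus, times) are hypothetical, and the rounding used is plain Java rounding rather than TeX's own rules. The only TeX fact relied upon is that TeX stores dimensions as integer multiples of the scaled point, with 65536 sp to 1 pt.

```java
// Hypothetical sketch of a Dimen-like value class; not the NTS source.
final class Dimen {
    // Assumed internal unit: TeX's scaled point (65536 sp = 1 pt).
    // Callers never see this constant, so it could change without affecting them.
    private static final long UNITS_PER_POINT = 65536;

    private final int sp;   // value in internal units

    private Dimen(int sp) { this.sp = sp; }

    /** Conversion from a whole number of points. */
    static Dimen ofPoints(int points) {
        return new Dimen(Math.toIntExact(points * UNITS_PER_POINT));
    }

    /** Conversion from a fraction num/den of a point (truncating division). */
    static Dimen ofFraction(int num, int den) {
        return new Dimen(Math.toIntExact(num * UNITS_PER_POINT / den));
    }

    /** Conversion from a floating point number of points (nearest unit). */
    static Dimen ofDouble(double points) {
        return new Dimen(Math.toIntExact(Math.round(points * UNITS_PER_POINT)));
    }

    /** Conversion back to a floating point number of points. */
    double toDouble() { return (double) sp / UNITS_PER_POINT; }

    // Arithmetic uses the *Exact helpers so that overflow is reported, not hidden.
    Dimen plus(Dimen other)  { return new Dimen(Math.addExact(sp, other.sp)); }
    Dimen minus(Dimen other) { return new Dimen(Math.subtractExact(sp, other.sp)); }
    Dimen times(int factor)  { return new Dimen(Math.multiplyExact(sp, factor)); }

    @Override public boolean equals(Object o) {
        return o instanceof Dimen && ((Dimen) o).sp == sp;
    }

    @Override public int hashCode() { return Integer.hashCode(sp); }

    /** String form in points, e.g. for a log file. */
    @Override public String toString() { return toDouble() + "pt"; }
}
```

Since the constructor and the unit constant are private, client code can only build and combine values through the conversion and arithmetic methods, which is what allows the precision or unit to be changed later.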
Glue reflects another type familiar from TeX. It has its natural dimension and the amount and order of stretchability and shrinkability. It provides arithmetic methods such as adding two Glues, multiplying by a scalar number and also versions for other convertible types.
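A glue value with "amount and order" of stretch and shrink might be sketched as follows. This is an illustration only, not the NTS class: the nested Flex helper and all method names are hypothetical, plain integers stand in for Dimen values to keep the example self-contained, and the addition rule (equal orders add, otherwise the higher order of infinity wins) is a simplified reading of how TeX combines glue.

```java
// Hypothetical sketch of a Glue-like value class; not the NTS source.
final class Glue {
    /** Orders of infinity, lowest first: finite, fil, fill, filll. */
    enum Order { NORMAL, FIL, FILL, FILLL }

    /** An amount of stretch or shrink together with its order. */
    static final class Flex {
        final int amount;
        final Order order;

        Flex(int amount, Order order) {
            this.amount = amount;
            // A zero amount carries no meaningful order; normalise it.
            this.order = (amount == 0) ? Order.NORMAL : order;
        }

        Flex plus(Flex other) {
            int cmp = order.compareTo(other.order);
            if (cmp == 0) return new Flex(amount + other.amount, order);
            return cmp > 0 ? this : other;   // the higher order dominates
        }

        Flex times(int factor) { return new Flex(amount * factor, order); }
    }

    final int natural;    // natural size, in internal units
    final Flex stretch;
    final Flex shrink;

    Glue(int natural, Flex stretch, Flex shrink) {
        this.natural = natural;
        this.stretch = stretch;
        this.shrink = shrink;
    }

    /** Component-wise sum of two glues. */
    Glue plus(Glue other) {
        return new Glue(natural + other.natural,
                        stretch.plus(other.stretch),
                        shrink.plus(other.shrink));
    }

    /** Multiplication of all three components by a scalar. */
    Glue times(int factor) {
        return new Glue(natural * factor, stretch.times(factor), shrink.times(factor));
    }
}
```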
Num represents an integral number. It is just an integer, but wrapped into an object so it can be stored directly in the table of equivalents and can be distinguished from ordinary integers in the code. It serves mainly as the representation of the value of numeric registers (\count) (and is somewhat symmetric to Dimen and Glue for \dimen and \skip registers).
All the basic classes above provide methods for obtaining character string representations which can be displayed on screen, in the log file or used by the \the primitive.
LevelEqTable is the last important and relatively complex class. It is used to implement TeX's table of equivalents and the hash table. Whilst TeX uses an associative hash table only for the meanings of control sequences, NTS stores many other kinds of equivalents in an associative manner. Any object can be associated with a particular combination of kind and key. Different kinds are defined for different types of equivalents: one is for control sequence meanings, another is for each class of register, still another for catcodes, etc. The key is (according to the kind of equivalent) either an object (e.g. a control sequence name) or a number (most others). This associative approach for storing register
Although NTS is compatible with TeX in providing only 256 registers of each sort, this limitation is artificially added and can be easily removed in the future.
As the name suggests, besides storing equivalents, the LevelEqTable also maintains pushing and popping of levels which result from grouping in the input language and the corresponding saving and restoring of associated values.
Although the registers were moved from a static array to an associative table, there is still another type of value which is not associative but which is subject to saving and restoring. These are parameters (such as \tolerance, \hsize, ...); the current value of a parameter is stored in one concrete place. The LevelEqTable provides an interface for these external equivalents too, and maintains saving and restoring for them.
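The combination of associative storage with group levels can be sketched as below. This is a hedged illustration, not the NTS implementation: the method names (put, get, pushLevel, popLevel), the use of strings for kinds, and the save-on-first-change strategy are all assumptions, and global assignments as well as the external (parameter) equivalents mentioned above are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of a level-aware table of equivalents; not the NTS source.
final class LevelEqTable {
    /** Composite lookup key: a kind (e.g. "count", "catcode") plus a key object. */
    private static final class Slot {
        final String kind;
        final Object key;

        Slot(String kind, Object key) { this.kind = kind; this.key = key; }

        @Override public boolean equals(Object o) {
            return o instanceof Slot
                && ((Slot) o).kind.equals(kind)
                && Objects.equals(((Slot) o).key, key);
        }

        @Override public int hashCode() { return Objects.hash(kind, key); }
    }

    // Marker meaning "this slot had no value when the group first changed it".
    private static final Object ABSENT = new Object();

    private final Map<Slot, Object> current = new HashMap<>();
    // One frame per open group, holding the values to restore when it closes.
    private final Deque<Map<Slot, Object>> saved = new ArrayDeque<>();

    /** Called when a group opens. */
    void pushLevel() { saved.push(new HashMap<>()); }

    /** Called when a group closes: restore everything changed inside it. */
    void popLevel() {
        Map<Slot, Object> frame = saved.pop();
        for (Map.Entry<Slot, Object> e : frame.entrySet()) {
            if (e.getValue() == ABSENT) current.remove(e.getKey());
            else current.put(e.getKey(), e.getValue());
        }
    }

    /** Local assignment: remember the outer value once per group, then overwrite. */
    void put(String kind, Object key, Object value) {
        Slot slot = new Slot(kind, key);
        if (!saved.isEmpty() && !saved.peek().containsKey(slot)) {
            saved.peek().put(slot, current.containsKey(slot) ? current.get(slot) : ABSENT);
        }
        current.put(slot, value);
    }

    /** Current equivalent, or null if none has been assigned. */
    Object get(String kind, Object key) { return current.get(new Slot(kind, key)); }
}
```

Because the key is an arbitrary object, nothing in this sketch limits the number of registers of a given kind; a limit such as 256 would have to be imposed by the caller.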
4.2. Package io
This package contains classes and interfaces for reading characters from an input file and writing to the log file. Either or both of those files may equally well represent the user's console. The package is independent of the other NTS packages as well as of the package base.
CharCode is an interface and is very interesting (at least, we think so!). There had been considerable discussion as to whether or not to represent character codes by some Java primitive type or by a class. It was decided that a class should be used so as to clearly distinguish it from other usages of primitive types. Eventually (during development), it turned out that an even more abstract representation (as an interface) best matches its purpose. It declares methods for getting the corresponding character or numeric value, comparing with another CharCode, character or number for matching, making the corresponding uppercase or lowercase CharCode, writing on a character-oriented output file and several predicates. Most of the methods are there because the TeX language uses characters heavily not only for typesetting but also as numeric values and as parts of keywords; in addition, certain characters have an influence on scanning and on log output (\endlinechar, \escapechar, \newlinechar).
Currently the implementation of CharCode used in NTS is just a class containing an ordinary character. But there exists the possibility to use very different representations (e.g. named characters) without changing any NTS code. Such objects can pass through the whole system provided that at the end there exists an output object which recognizes them and handles them properly. Even several independent implementations of CharCode could co-exist in some future
Name has the same relation to CharCode as String does to char. It is used to represent the names of control sequences, \jobname, font names and file-names scanned from the input.
InputLine represents one line from the input file or from the user's console. There are methods for getting the next CharCode or just peeking to see what the next code is without altering the current reading position. It interprets extended character codes (such as ^^M), ignores trailing blanks and appends \endlinechar if needed. Another class, LineInput, serves as an input sequence of InputLines from a file or console.
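The behaviour just described can be sketched as follows. This is only an illustration under simplifying assumptions: codes are plain ints rather than CharCodes, only the single-character ^^c form is decoded (the two-hex-digit form is omitted), and the class and method names are invented for the example.

```java
// Hypothetical sketch of a line reader with one-code look-ahead; not the NTS source.
final class InputLineSketch {
    private final String text;
    private int pos = 0;

    // A negative endLineChar means "append nothing", as with a negative \endlinechar.
    InputLineSketch(String raw, int endLineChar) {
        int end = raw.length();
        while (end > 0 && raw.charAt(end - 1) == ' ') {
            end--;                                    // ignore trailing blanks
        }
        String body = raw.substring(0, end);
        this.text = endLineChar >= 0 ? body + (char) endLineChar : body;
    }

    boolean atEnd() {
        return pos >= text.length();
    }

    // Returns the next code without moving the reading position (-1 at end of line).
    int peek() {
        return atEnd() ? -1 : codeAt(pos);
    }

    // Returns the next code and moves past it (-1 at end of line).
    int next() {
        if (atEnd()) {
            return -1;
        }
        int code = codeAt(pos);
        pos += isExtended(pos) ? 3 : 1;
        return code;
    }

    // True if a ^^c sequence starts at index i.
    private boolean isExtended(int i) {
        return i + 2 < text.length()
            && text.charAt(i) == '^' && text.charAt(i + 1) == '^'
            && text.charAt(i + 2) < 128;
    }

    // ^^c denotes c+64 when c < 64 and c-64 otherwise, so ^^M is code 13.
    private int codeAt(int i) {
        if (!isExtended(i)) {
            return text.charAt(i);
        }
        int c = text.charAt(i + 2);
        return c < 64 ? c + 64 : c - 64;
    }
}
```

Keeping peek() free of side effects is what lets a scanner decide how to treat the next code before committing to read it.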
Log is an important interface for printing information on a log file or on the user's console. It declares methods for printing values of primitive types, Strings, CharCodes and Loggables (see below). Several methods are declared to control output line breaking. Class StandardLog implements the Log interface in the standard TeX way.
Loggable is a very simple interface which declares one method for printing on a Log. It is very handy because most of the important classes in NTS implement this interface and so their logging is conveniently handled.
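The relationship between the two interfaces can be sketched as below. The method names (add, addOn, endLine) and the classes StringLog and ScaledDimen are assumptions made for this example; only the shape of the design (a Log that accepts Loggables, and a Loggable with a single method) comes from the description above.

```java
// Hypothetical sketch; method names are illustrative, not the NTS API.
interface Loggable {
    void addOn(Log log);             // the single method: print yourself on a Log
}

interface Log {
    Log add(String s);
    Log add(int n);
    Log add(Loggable x);
    Log endLine();
}

// A Log that simply collects its output in memory.
final class StringLog implements Log {
    private final StringBuilder out = new StringBuilder();

    public Log add(String s) { out.append(s); return this; }
    public Log add(int n) { out.append(n); return this; }
    public Log add(Loggable x) { x.addOn(this); return this; }
    public Log endLine() { out.append('\n'); return this; }
    @Override public String toString() { return out.toString(); }
}

// Any class that knows how to describe itself need only implement Loggable.
final class ScaledDimen implements Loggable {
    private final int sp;

    ScaledDimen(int sp) { this.sp = sp; }

    public void addOn(Log log) { log.add(sp).add("sp"); }
}
```

With this arrangement a call such as log.add("\\hsize=").add(dimen).endLine() works for any Loggable, and the Log never needs to know the concrete class of what it prints.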
4.3. Package command
Classes in the package command form the interpreter for the TeX input language. Although it is a large package, it has nothing to do with typesetting per se. In fact, at least one third of the TeX source is not about typesetting at all. The package is responsible for the process of scanning input tokens, expanding them and for most of the mode-independent processing such as macro definitions and register assignments.
Token is an abstract class. It declares methods for getting the meaning of a Token, assigning a new meaning (if allowed), matching another Token, and a number of predicates which tell if it is redefinable, is a brace, a letter, and so on. There are several kinds of Tokens, and they form a small hierarchy of subclasses. Typical examples include CtrlSeqToken, ActiveCharToken, SpaceToken, LetterToken, LeftBraceToken, ...
Tokenizer is able to provide a sequence of Tokens. There are various subclasses of Tokenizer such as: LineInputTokenizer for tokenization of the input file, MacroExpansion for macro bodies with supplied parameters, InsertedTokenList for a token list from a token register inserted into the input stream, or BackedToken for just one backed-up token. Tokenizers are pushed onto a TokenizerStack, the analogue of TeX's input stack.
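The way a stack of tokenizers yields one continuous stream of tokens can be sketched as follows. For brevity, tokens are plain Strings here rather than Token objects, a tokenizer signals exhaustion by returning null, and ListTokenizer is an invented stand-in for classes such as MacroExpansion or InsertedTokenList; none of these details is taken from the NTS source.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: a source of tokens that returns null when it is exhausted.
interface Tokenizer {
    String nextToken();
}

// Stand-in for any tokenizer that replays a fixed list of tokens.
final class ListTokenizer implements Tokenizer {
    private final Iterator<String> it;

    ListTokenizer(List<String> tokens) { this.it = tokens.iterator(); }

    public String nextToken() { return it.hasNext() ? it.next() : null; }
}

final class TokenizerStack {
    private final Deque<Tokenizer> stack = new ArrayDeque<>();

    void push(Tokenizer t) { stack.push(t); }

    // Reads from the topmost tokenizer, discarding tokenizers as they run dry.
    String nextToken() {
        while (!stack.isEmpty()) {
            String tok = stack.peek().nextToken();
            if (tok != null) {
                return tok;
            }
            stack.pop();
        }
        return null;                 // all input has been consumed
    }
}
```

Pushing a new tokenizer in the middle of reading (for a macro body, say) makes its tokens appear before the rest of the interrupted source, which then resumes where it left off.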
Command is an abstract class which represents each TeX command. Mostly
table of equivalents, but there are important exceptions such as Macro or the meaning of a character. In TeX, tokens and command codes are represented by the same type and they are often interpreted in both ways which may lead to confusion. NTS strictly separates the concept of token and command. As outlined above, a Token is a piece of input which can have some meaning. The type of this meaning is the Command discussed here.
A Command has methods for execution and expansion. Only some Commands can be expanded, and this property is indicated by another predicate method. The heart of NTS is a cycle very similar to TeX's main_control. In one step, a token is fetched from the input and its meaning is examined. If it is expandable, the appropriate method for expansion is called. If there is some result of expansion, the method is responsible for pushing it onto the input stack. If it is not expandable, the method for executing the command is called.
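The cycle can be sketched in a few lines. Everything here is a simplification made for illustration: tokens are Strings, the input stack is a single deque, the table of equivalents is a map, a token without an entry is taken to mean "typeset this character", and the class names Interpreter, Macro and CharCommand are invented.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the fetch / expand-or-execute cycle; not the NTS source.
abstract class Command {
    boolean isExpandable() { return false; }
    void expand(Interpreter in) { }       // only called when isExpandable() is true
    void exec(Interpreter in) { }
}

final class Macro extends Command {
    private final List<String> body;

    Macro(String... body) { this.body = Arrays.asList(body); }

    @Override boolean isExpandable() { return true; }

    // The expansion method itself pushes its result back onto the input.
    @Override void expand(Interpreter in) {
        for (int i = body.size() - 1; i >= 0; i--) {
            in.input.push(body.get(i));
        }
    }
}

final class CharCommand extends Command {
    private final String text;

    CharCommand(String text) { this.text = text; }

    @Override void exec(Interpreter in) { in.output.append(text); }
}

final class Interpreter {
    final Deque<String> input = new ArrayDeque<>();         // stands in for the input stack
    final Map<String, Command> meanings = new HashMap<>();  // stands in for the table of equivalents
    final StringBuilder output = new StringBuilder();       // stands in for typeset material

    Command meaningOf(String token) {
        Command c = meanings.get(token);
        return c != null ? c : new CharCommand(token);
    }

    void mainLoop() {
        while (!input.isEmpty()) {
            Command c = meaningOf(input.pop());   // fetch a token and examine its meaning
            if (c.isExpandable()) {
                c.expand(this);
            } else {
                c.exec(this);
            }
        }
    }
}
```

Note that the loop itself contains no knowledge of any particular command; all behaviour lives in the Command subclasses.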
There is one curious fact about expandable commands: they are executed if their expansion is suppressed by \noexpand. In this case they behave exactly like \relax (they do not do anything apart from re-setting TeX's internal state and terminating any active look-ahead) and they even pretend that they are \relax when examined by \show. For that reason, the whole subtree of expandable commands is derived from the Relax command.
Another important part of the Command interface is the set of methods used for indicating availability and getting some value of a certain type. This is useful when the command occurs on the right-hand side of an assignment, for example, and therefore the registers, parameters and a few others can provide numeric, dimension, glue or token list values.
CommandBase is a superclass of Command. It defines only static methods which are related to scanning various elements of input (such as numbers, dimensions, file names, keywords, ...), maintaining the table of equivalents, input stack and several instances of Log output. As we will see later, there are more objects than just Commands which require such services and so are derived from this abstract class for convenience.
4.4. Package node
Now at last we are getting to typesetting! The classes in this package represent material to be typeset. There are also general interfaces to font metrics and output generators. The package is relatively low in the hierarchy; the classes are dependent only on the base and io packages. That gives them a good chance to be re-used in a different typesetting system which may provide a completely
Node is an interface which defines the elementary building block of typesetting material. It has methods to get its sizes (even when affected by some stretching or shrinking), to describe itself on a log file and to be typeset. There is a hierarchy of classes which implement the Node interface. Some of these are elementary, for example: RuleNode, HKernNode, VKernNode, HSkipNode, VSkipNode, PenaltyNode; other objects are complex and can contain lists of subsidiary nodes: HBoxNode, VBoxNode.
Packer is used when we need to compute the sizes of complex boxes which are built out of lists of nodes. This process is called packaging in TeX. The algorithm is essentially the same for horizontal and vertical lists of boxes, only the horizontal and vertical dimensions are flipped for different cases. The abstract class Packer defines an abstract algorithm and provides a placeholder to get the appropriate box dimensions. There are then special subclasses for horizontal and vertical boxes which are in turn sub-classed outside this package to give the right kind of warnings if something is not decent.
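The idea of one algorithm with flipped dimensions can be sketched as follows. This is a deliberately reduced model: boxes have only a width and a height (no depth, glue, stretching or shrinking), and the names Box, HPacker, VPacker, along and across are assumptions made for the example.

```java
import java.util.List;

// Hypothetical, much simplified box: no depth, no glue.
final class Box {
    final int width;
    final int height;

    Box(int width, int height) {
        this.width = width;
        this.height = height;
    }
}

abstract class Packer {
    // Size of a box along the direction in which the list grows.
    abstract int along(Box b);

    // Size of a box across that direction.
    abstract int across(Box b);

    // Builds the enclosing box from the accumulated sizes.
    abstract Box make(int along, int across);

    // The shared algorithm: sum the sizes along the list, take the maximum across it.
    final Box pack(List<Box> list) {
        int total = 0;
        int max = 0;
        for (Box b : list) {
            total += along(b);
            max = Math.max(max, across(b));
        }
        return make(total, max);
    }
}

final class HPacker extends Packer {
    int along(Box b) { return b.width; }
    int across(Box b) { return b.height; }
    Box make(int along, int across) { return new Box(along, across); }
}

final class VPacker extends Packer {
    int along(Box b) { return b.height; }
    int across(Box b) { return b.width; }
    Box make(int along, int across) { return new Box(across, along); }
}
```

Because pack is written once in the abstract class, a further subclass can add behaviour (warnings about overfull boxes, for instance) without touching the algorithm.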
FontMetric is an abstract interface for font metric information objects. At the moment, there are only the familiar tfm files but that is not a restriction of NTS: it is prepared for any kind of font metric which can be adapted to this interface. There are methods for getting an identification and various numeric or dimension parameters for TeX compatibility. But first of all there are methods to get a Node for a particular CharCode, to get a normal inter-word space and to get a special object which is able to produce the representation of characters, ligatures and kerns for a given sequence of CharCodes.
TypeSetter has similar characteristics to FontMetric. It defines an interface for general typesetting output. There are methods for typesetting a character or a rule at the current position and adjusting this position.
4.5. Package builder
This package takes care of the areas concerned with TeX's horizontal, vertical and maths modes. Whilst in TeX there is just one global integer variable which indicates one of the seven possible modes (the three mentioned above are each internal or external, and there is one "no mode") on the top of the semantic stack, NTS uses objects which build typesetting material for different modes.
The package is more dependent on the TeX paradigm than is node but is still independent of the TeX language. It is relatively simple and small. Some amount of complexity must be solved when typesetting commands and different modes interact but that issue is addressed in another package.
Builder is the root of the hierarchy of classes for different modes. It declares methods for adding a node, kern or skip to the list of nodes which is currently being built. It makes the appropriate versions (horizontal or vertical) of kerns and skips and performs other adjustments if needed. Currently only the modes known from TeX are supported but there is provision for other types of mode (chemical, picture, ...).
4.6. Package typo
The package typo is a superstructure of the package command. It contains all the Command subclasses which deal with typesetting currently developed (there will be a package maths for mathematical typesetting commands but it did not exist when this text was written). It utilizes the packages builder and node as well.
TypoCommand is similar to CommandBase but is intended for typesetting commands. It is an intermediate abstract Command class which defines several useful static methods. It maintains a stack of Builders and the current font metric. There are methods for scanning a font metric or box specification from the input, and adding a character or space to the current Builder.
Many classes in this package are derived directly from classes in the command package because they can inherit some useful behaviour from them. They cannot be included in the command package because they need some information which is available only in the typo package (usually by calling some static method of TypoCommand). There are basically two kinds of these: one is \if primitives such as \ifhmode or \ifvbox which just need some information about the current Builder or a certain box register; another is the commands which are mode independent but typographic such as \setbox, \wd and \chardef.
BuilderCommand is an abstract superclass for commands which are mode dependent. In an open system such as NTS we want new features to be capable of being easily added. There is, for example, a superclass Command which defines a particular set of methods which can be implemented by the new commands in any sensible way. This kind of polymorphism is directly supported by the chosen programming language.
But what to do if in future we want to add a new mode by developing a Builder which offers some new functionality not declared in the Builder interface and some specialized commands which can utilize this new functionality? If we do not want to extend the basic interface (at least until a new version) or even perhaps cannot do it (we are developing a plug-in), the only chance is to examine the type of the current Builder and to use the infamous cast operator
But there is another problem with existing mode-dependent commands. How should they behave in the new mode? For this purpose, the BuilderCommand maintains a hash table which associates an Action with each combination of Builder class and Command instance. The association is defined at the level of the NTS configuration and it automatically follows the class hierarchy of Builders. Thanks to this versatility, it is very easy to specify the behaviour of commands in different modes for modified systems.
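One way in which such a lookup could follow the class hierarchy is sketched below. It is only an illustration under stated assumptions: each command keeps its own map rather than sharing one table keyed on the (Builder class, Command instance) pair, Action is reduced to a one-method interface returning a String, PageBuilder is an invented subclass, and the Builder classes are empty stand-ins that borrow names from the text.

```java
import java.util.HashMap;
import java.util.Map;

// Empty stand-ins for the mode classes; PageBuilder is invented for the example.
class Builder { }
class VertBuilder extends Builder { }
class PageBuilder extends VertBuilder { }
class HorizBuilder extends Builder { }

// Hypothetical, reduced form of an Action: it just reports what it would do.
interface Action {
    String perform(Builder current);
}

class BuilderCommand {
    private final String name;
    private final Map<Class<?>, Action> actions = new HashMap<>();

    BuilderCommand(String name) { this.name = name; }

    void defineAction(Class<? extends Builder> mode, Action action) {
        actions.put(mode, action);
    }

    // Tries the builder's own class first, then each superclass in turn.
    Action actionFor(Builder current) {
        for (Class<?> c = current.getClass(); c != null; c = c.getSuperclass()) {
            Action a = actions.get(c);
            if (a != null) {
                return a;
            }
        }
        return null;
    }

    String exec(Builder current) {
        Action a = actionFor(current);
        return a != null ? a.perform(current) : "no action defined for \\" + name;
    }
}
```

In this sketch a mode added later as a subclass of an existing Builder inherits that Builder's actions without any change to the command, and can override them simply by registering its own.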
Action is a subclass of CommandBase so it inherits many methods for scanning the input, dealing with log files and error messages. Actions are usually implemented as inner classes of the corresponding BuilderCommand.
A fragment of the NTS configuration data looks like:
    RulePrim hrule = new RulePrim("hrule", default_rule, Dimen.NULL,
                                  Dimen.ZERO, Dimen.ZERO);
    RulePrim vrule = new RulePrim("vrule", Dimen.NULL, default_rule,
                                  Dimen.NULL, Dimen.ZERO);

    hrule.defineAction(VertBuilder.class, hrule.NORMAL);
    hrule.defineAction(ParBuilder.class, hrule.FINISH_PAR);
    hrule.defineAction(HBoxBuilder.class, hrule.BAD_HRULE);

    vrule.defineAction(HorizBuilder.class, vrule.NORMAL);
    vrule.defineAction(VertBuilder.class, vrule.START_PAR);
The BuilderCommand corresponding to the TeX primitive \hrule defines three actions: it performs the normal operation in vertical mode, finishes the current paragraph (if any) in horizontal mode and complains inside an \hbox. The \vrule performs normally in any horizontal mode and enters a new paragraph in vertical mode. There is in fact only one class (RulePrim) which has two instances with names hrule and vrule and different parameters; they are assigned different Actions for the same modes. All the Actions NORMAL, START_PAR, FINISH_PAR and BAD_HRULE are instances of inner classes inside RulePrim or its superclass.
Other examples of BuilderCommand are: HBoxPrim, VBoxPrim, VTopPrim, LowerPrim, MoveLeftPrim, BoxPrim, KernPrim, CharPrim, ExSpacePrim, AccentPrim, AnySkipPrim.
Group is another subclass of CommandBase. Its subclasses cover the various types of group in TeX. There are groups such as SimpleGroup for a pair of braces, SemiSimpleGroup for the \begingroup and \endgroup, HBoxGroup, VBoxGroup or VTopGroup. Group itself is defined and the stack of Groups is maintained in
Groups have one problem in common with Builder. Their closing commands behave differently in combination with certain types of Group. The right brace cannot match \begingroup and \endgroup cannot match the left brace. The problem is solved in exactly the same way as for combinations of commands and Builders.
4.7. Package tfm
The package tfm implements a particular type of font metric information (the TeX font metric file) for use in NTS. It can be used as an example for implementing other types of font metric.
TeXFm is a class which represents the low-level almost raw format of a TeX font metric file. Some complications are hidden but its public interface reflects just the information which is available in the file. It uses several auxiliary classes because the tfm format is too complex to be captured by only one program file. As an example, the whole process of reading a tfm file is done by the class TeXFmLoader which creates an instance of TeXFm. TeXFm itself has methods for getting information concerning the characters, ligatures and kernings for pairs of characters, extensible recipes and sequences of enlarging characters. Another method is provided for printing its representation as a property list. This is used by a small Java application tftopl which (thanks to TeXFm) shares most of the code with NTS.
TeXFontMetric is an adaptation of TeXFm which implements the FontMetric interface from the node package. It is a wrapper which uses the natural methods of TeXFm and provides the methods required by the rest of NTS. This approach is probably useful for future implementations of other types of font metric. We cannot expect that some third party will provide the exact interface even if a Java class is supplied for access.
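The wrapper arrangement can be sketched as follows. The classes RawTfm and TfmFontMetric and the method widthSp are invented for the example, and the FontMetric interface shown is a one-method slice, not the real one. The only fact relied upon is that character widths in a tfm file are stored as fix_words, that is, as multiples of 2^-20 of the design size; TeX's own rounding rules are not reproduced.

```java
// Hypothetical stand-in for the raw, file-oriented class.
final class RawTfm {
    private final int designSizeSp;      // design size in scaled points
    private final int[] widthFixWords;   // per character code, in units of 2^-20 design size

    RawTfm(int designSizeSp, int[] widthFixWords) {
        this.designSizeSp = designSizeSp;
        this.widthFixWords = widthFixWords;
    }

    int designSizeSp() { return designSizeSp; }
    int widthFixWord(int code) { return widthFixWords[code]; }
}

// Hypothetical slice of the interface that the rest of the system expects.
interface FontMetric {
    int widthSp(int charCode);           // width in scaled points
}

// The wrapper: uses the natural methods of the raw class to satisfy the interface.
final class TfmFontMetric implements FontMetric {
    private final RawTfm tfm;

    TfmFontMetric(RawTfm tfm) { this.tfm = tfm; }

    public int widthSp(int charCode) {
        // Scale the fix_word by the design size; long arithmetic avoids overflow.
        long scaled = (long) tfm.widthFixWord(charCode) * tfm.designSizeSp();
        return (int) (scaled >> 20);
    }
}
```

A font metric format supplied by a third party could be accommodated in the same way, by writing a second wrapper and leaving both the foreign class and the rest of the system untouched.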
4.8. Package dvi
This package implements the dvi format as one of the possible output formats for NTS. In many aspects it will be similar to the package tfm but it was too early to say more as development of this part had just started when this text was written.
4.9. Package tex
This package is an umbrella for the other NTS packages, and it is by far the messiest part of the system. All the classes and packages so far are designed
classes and packages as possible. But in TeX itself, there are so many unclear dependencies. That was one reason for starting the whole NTS project in the first place. The classes in this package join all the independent units together, and in addition all the weird cases were exported from the clean design of other packages to here if this was possible. That is the main reason that the code sometimes looks rather messy here.
Besides this, there are classes for maintaining the error pool so that commands are not dependent on the way in which the error messages are given.
The most interesting part of this package is the class Primitives which contains the configuration of the whole system. There was already an example in package typo.
4.10. Modularity and configurability
To develop a system which is as modular as possible was one of the main desiderata. In the current TeX implementation, there are a lot of dependencies. Experience shows that it is very difficult and dangerous to make some non-trivial changes since these can lead to a number of possibly unclear side-effects. The approach taken in developing NTS has been to make all dependencies explicit and clear. All classes have a well-defined interface of public methods which is used for all communication. There are no uncontrolled changes of global variables. This manner of programming is greatly supported by the Java object-oriented language.
Another motivation for making code units independent is to allow substitutions of some modules by other modules with the same interface but a different underlying implementation. Independent classes or packages can also be used as building blocks for another system. The NTS packages are therefore designed rather as class libraries with a strict hierarchy.
An interesting problem concerned with the decomposition of TeX into independent units is the problem of cyclic dependencies. There are many of them. A simple example is the relation between TeX's eyes and stomach. The stomach is fed by commands which originate at the eyes, but the action of the eyes depends on \catcode settings which originate from the stomach.
This makes it particularly difficult to maintain a non-cyclic hierarchy of packages. On the other hand, it is very desirable if we want to use only some of them in another application. The method that NTS uses to avoid such cyclic dependencies is via abstract interfaces. If some class needs information or an action which is not available at the current level of hierarchy, it defines
constructor or of some method). The parameterisation is then made at some higher level, usually in the umbrella (or maybe NTS's brain): the package tex.
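The eyes and stomach example can be used to sketch how an abstract interface breaks such a cycle. All the names below (CatCodeSource, Eyes, CatCodeTable) are invented for the illustration; the category codes 11 (letter) and 12 (other character) are those of TeX.

```java
import java.util.Arrays;

// Lower level (the "eyes"): declares what it needs and knows nothing of who provides it.
interface CatCodeSource {
    int catCode(char c);
}

final class Eyes {
    private static final int LETTER = 11;       // TeX's category code for letters
    private final CatCodeSource cats;

    // The dependency is supplied from above, as a constructor parameter.
    Eyes(CatCodeSource cats) { this.cats = cats; }

    boolean isLetter(char c) { return cats.catCode(c) == LETTER; }
}

// Higher level (the "stomach" side): owns the table that commands may change.
final class CatCodeTable implements CatCodeSource {
    private static final int OTHER = 12;        // TeX's category code for other characters
    private final int[] table = new int[256];

    CatCodeTable() {
        Arrays.fill(table, OTHER);
        for (char c = 'a'; c <= 'z'; c++) table[c] = 11;
        for (char c = 'A'; c <= 'Z'; c++) table[c] = 11;
    }

    public int catCode(char c) { return c < 256 ? table[c] : OTHER; }

    void setCatCode(char c, int code) { table[c] = code; }
}
```

The lower-level class compiles without any reference to the higher-level one; only the code that connects them (in NTS, the package tex) needs to know both.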
5. A summary of the status quo
NTS was envisaged (more than a little naïvely, as has already been suggested) as taking one year from commencement to full implementation. It is now two years since formal commencement, and work is not yet complete. How far have we got, and what were the reasons for the delays?
The good news is that work is very nearly complete: Karel has tackled the task in a very logical order, starting with TeX's eyes and mouth (the scanner and tokeniser), then macro expansion, then command execution where this did not involve typesetting, through to list creation, and page-building. NTS is now able to process and typeset (that is, generate DVI) for any document which does not involve mathematics or alignments, although it cannot (at the time of writing) yet hyphenate words. In fact, only three real challenges remain: mathematics (mathematical typesetting, of course, rather than mathematics per se), alignments and hyphenation. Karel has already completed a large part of the research and design phase for these.
However, there is another area in which some work remains to be carried out, and that is the area of system interactions. Of course, TeX itself does not interact with the system in any potentially dangerous way (with the notable exception of being able to open an arbitrary file for writing, provided that the user running TeX has appropriate permissions). But TeX does interact with the environment in rather more subtle ways, for example to ascertain the path or paths which it will search for each class of file (\input files, .tfm files, and so forth).
Most implementations of TeX perform this interaction through the medium of so-called environment variables (e.g. TeX_Inputs, TeX_Fonts and so forth). These environment variables are typically set by the installer of TeX for a given system, and can usually be modified by individual users to suit their particular needs. Whether these environment variables are actually variables, or logical names, or part of (e.g.) the Windows NT environment settings is irrelevant to the user: all that matters is that there is a standard way (standard, that is, for each platform and implementation of TeX) of informing TeX where the relevant files are to be found.
The problem is that Java is a portable language. And truly portable languages must behave identically no matter on which platform they are installed. And
not standardised across platforms, Java shall have no access to environment variables. Disaster!
It therefore looks at the moment as if NTS's environment will have to be configured independently of that of TeX, using a Java-specific configuration system, and there will be no way of allowing NTS to inherit TeX's run-time environment settings. But this area is still under review, and it is still possible that some satisfactory compromise will be found. Recent improvements to the Java system have acknowledged the need for a so-called policy file, which by default is ignored but which (if permitted by the security settings) can be read by the Java run-time system during initialisation. Such a file could be generated in a very straightforward way from existing environment variables, although (for obvious, bootstrapping, reasons) the program to generate it could not be written in pure Java!
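As an illustration of what a Java-specific configuration might look like, the sketch below reads a search path from a Java system property instead of an environment variable. This is not the policy-file mechanism mentioned above, nor is it known to be what NTS does: it is simply one conventional Java technique, and the class name SearchPath and the property name used in the usage note are invented.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: turns a path-like system property into a list of directories.
final class SearchPath {
    // The fallback must not be null; it is used when the property is not set.
    static List<File> fromProperty(String key, String fallback) {
        String value = System.getProperty(key, fallback);
        List<File> dirs = new ArrayList<>();
        // File.pathSeparator is ":" or ";" depending on the platform.
        for (String part : value.split(Pattern.quote(File.pathSeparator))) {
            if (!part.isEmpty()) {
                dirs.add(new File(part));
            }
        }
        return dirs;
    }
}
```

A wrapper script could then pass an existing setting through on the command line, for example with an option of the form -Dnts.texinputs=... (the property name being, again, an assumption), which is one way of bridging the two environments without the Java code itself reading environment variables.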
So mathematics, alignments, hyphenation and environmental enquiries remain to be implemented; virtually all else is complete. How satisfied are we with the work done so far?
In general, we are extremely satisfied; Karel has done an excellent job of re-engineering and re-implementing a TeX-compatible system in a modular and open way. Compatibility remains uncompromised: the DVI files and log files (and even the console output) of NTS and TeX are identical (obviously modulo such necessary differences as the NTS banner reading "This is NTS" rather than "This is TeX").
But there is also one area about which we are deeply concerned, and it is only fair that we should reveal our concerns to the sponsors of the project (such as GUTenberg). That area is performance. And the performance is abysmal.
When we first went to Knuth with our plans for NTS, we said that we intended to perform the re-implementation in two phases: phase 1 would use a modern, rapid-prototyping, language to validate the design; the second phase would involve a further re-implementation using a language selected for efficiency. Don reassured us that this second phase would never prove necessary: "by the time you are ready to perform the second re-implementation, technology will have advanced so much that a second re-implementation will not be needed. Computer performance continues to rocket, year after year, and shews no signs of starting to reach a plateau" is a paraphrase (from memory) of Don's words.
Well, in one sense, Don was right: computer performance does continue to rocket, and still shews no signs of starting to reach a plateau. Yet, despite this, NTS is, on large benchmarks, over 100 times slower than TeX, even using the much-vaunted "just-in-time" compiler. And so, we are faced with a crucial decision: do we continue to use Java, and just wait for the hardware to speed up
first 1 GHz Pentium-class machine should ship this year a factor of 200:1). Or do we use the Java implementation just for test purposes, but re-implement Karel's design in a radically more efficient language? Or do we simply admit defeat, say "we tried", and leave it to others to see if they can be more successful than we?
These are hard questions, and there will be considerable soul-searching before we can decide on the answer; all I can say at the moment is that GUTenberg, as one of our major sponsors, will also be one of the first to know.
6. Epilogue
Although my talk has ended on a rather downbeat note, I'd like to try to lift your spirits by asking (and answering) one vital question: what lesson(s) can be learned from our experience(s)?
The first mistake was surely to under-estimate the time necessary for the initial re-implementation. Had we followed Knuth's (?apocryphal?) algorithm for estimating the time needed to develop a major software system, we would have added 1 and then gone up to the next order of magnitude. Thus Knuth's algorithm would have suggested (had we heeded it) that we would need not one year but two decades!
In fact, we probably need about three years to complete fully what we thought could be completed in one. Is it possible to explain why?
Rather interestingly, I think the answer is "yes" (which may suggest that I am still as naïve as I was when I started the project!). According to Karel, almost all of the extra time has been spent making NTS 100% TeX-compatible. Note, 100%, not 99.9%. It was this last 0.1% that ate up so much of the lost time. Little things, like making sure that the console output was identical, even if console output is ephemeral and can never be compared other than by memory. Little things, like making sure that NTS's behaviour at boundary conditions is identical to that of TeX, even if TeX's behaviour in such conditions is sometimes flawed and at worst completely insane. Little things, like making sure that DVI files produced by NTS are binary-identical with those produced by TeX, not just syntactic- and semantic-compatible.
What made this situation worse was that Karel's brief was not to write a TeX-simulator; had that been his task, he could probably have completed the work in eight months or less. His brief was, in fact, to write a flexible, extensible, modular TeX simulator, which meant that every time he discovered somewhere that TeX behaved less than ideally, he had to implement two routines: (1) the
circumstances, and (2) a TeX-compatible routine, that introduced whatever anomalous behaviour TeX itself would exhibit in those circumstances. Thus someone taking the NTS source in the future will find that all the necessary logical, predictable, behaviour has already been implemented; it has simply been sub-classed out of sight in the interests of TeX-compatibility.
What other lessons can be learned? Well, it is certainly worth re-visiting the question of implementation language. Was Java the right choice? In hindsight, the answer appears to be "no", much as it hurts to admit it. There are three primary reasons for this. (1) Java is not as type-safe as we had thought, at least if one wants both type-safety and efficiency at the same time. Whereas in Pascal one can write:
type group = (simple_group, semi_simple_group, ...)
and thereafter use the identifiers simple_group (etc.) in the absolute certainty that they can never be used in a context where (e.g.) an integer is expected, this is not the case for Java. There are no enumerated types, and thus if one wants type-safety to be checked and enforced at the compiler level, one is virtually forced to use objects to represent even the simplest enumerated type [2]. And objects, of course, carry considerable baggage with them, and their use (in excess) has a heavy performance impact. (2) Java lacks generic types, and thus in a situation in which one wants to manipulate (say) lists of different types of object, one is forced either to write type-specific code for each type of object or to use the only truly generic object (Object itself), and then to use casts. In the latter case, type-checking is deferred from compile-time to run-time, with an accompanying lack of (a) compile-time type-safety, and (b) efficiency.
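Both points can be illustrated with a short sketch written in the style that the Java of the time required. The class names are invented for the example. The first class shows the usual object-based substitute for an enumerated type; the second shows how a list of plain Objects lets a wrong element through the compiler, to be caught only by the cast at run time. (Later versions of Java, from 5.0 onwards, added both enum and generic types; the raw List below is used deliberately to reproduce the earlier situation.)

```java
import java.util.ArrayList;
import java.util.List;

// Object-based substitute for an enumerated type: one object per constant and
// no public constructor, so no other values can ever exist.
final class GroupKind {
    static final GroupKind SIMPLE = new GroupKind("simple");
    static final GroupKind SEMI_SIMPLE = new GroupKind("semi-simple");

    private final String name;

    private GroupKind(String name) { this.name = name; }

    @Override public String toString() { return name; }
}

final class UntypedListDemo {
    // Returns true if the mistake below is detected only when the cast is executed.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean castFailsAtRunTime() {
        List kinds = new ArrayList();        // holds plain Objects
        kinds.add(GroupKind.SIMPLE);
        kinds.add("not a GroupKind");        // accepted by the compiler
        try {
            for (Object o : kinds) {
                GroupKind k = (GroupKind) o; // type check deferred to run time
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

An int can never be passed where a GroupKind is expected, which gives the compile-time safety of the Pascal declaration, but at the price of an object per constant and an explicit cast (with its run-time check) wherever such values are kept in a general-purpose collection.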
(3) Java imposes considerable performance overheads. If NTS were ten times slower than TeX, I might be prepared to argue that (a) Java performance will continue to improve, and therefore we should be within touching distance of TeX's performance before too long; and (b) a performance degradation is acceptable if both maintainability and extensibility are considerably enhanced. But I cannot, in all honesty, offer these defences in the present situation: if NTS remains 100 times slower than TeX, its chances of ever being used in earnest are vanishingly remote.
Java's strengths, on the other hand, remain virtually unchallenged; it is portable (and obviates any need for system dependencies and/or local modifications), it has attracted a large user (= programmer) base, and it does offer seamless network connectivity. At the moment, we are uncertain which (if any) other language could offer these advantages while avoiding Java's limitations. Generic Java, particularly if supported by Sun, would make great sense; Eiffel looks interesting, too. All I can say at the moment is that Karel will finish NTS V0 using Sun's Java; if, after that, there is general consensus that the project should continue, we will investigate the option of translating (probably automatically) NTS from Java to another, more efficient, language. And beyond that is too far to see!
[2] The java.util package does recognise the need for enumerated types, but unless and
And one final problem, which has dogged this project, and which (sadly) doesn't seem likely to disappear. That is a problem of communication. The team are geographically diverse, with representatives from at least five nations (UK, CZ, DE, PL, NL); our programmer is based in CZ, where the only other team member is more than fully occupied running a major university (Jiří is now Rector of Masaryk University). Thus Karel lacks the day-to-day support of others with whom to discuss progress and problems other than by e-mail and at occasional group meetings. Almost certainly, communication problems have also led to various misunderstandings within the group, which are frequently seen as being politically motivated. Politics have cast a shadow over this project, of that there is no doubt; yet equally without doubt every member wants the project to succeed. I believe that the goodwill which exists outweighs the difficulties which can occur, and that we will be able to bring this project to a state where NTS is complete and usable.
But Don advised us that we should be prepared at some point to do what he has done, to say "enough is enough", and to allow others to carry the torch forwards. I'm sure we aren't ready to do that yet (there are far too many exciting challenges to be met), yet the time will undoubtedly come when NTS will itself be regarded as passé, and others will be keen to take on the challenge of carrying computer typesetting (in the finest TeX tradition) forward in as-yet unforeseen ways. I hope that amongst those who take up this challenge, members of GUTenberg will figure prominently: you have amongst you many who have contributed enormously to the furtherance of TeX, some of whom I have had the pleasure of knowing as friends as well as colleagues. On behalf of the NTS project, I thank you most sincerely for your support; I hope that you enjoy the demonstration of NTS which follows, after which I will try to answer any questions which you may have.