HAL Id: hal-00696562
https://hal.archives-ouvertes.fr/hal-00696562
Submitted on 5 Jul 2012
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Metagrammars as Logic Programs
Denys Duchier, Yannick Parmentier, Simon Petitjean
To cite this version:
Denys Duchier, Yannick Parmentier, Simon Petitjean. Metagrammars as Logic Programs. 7th International
Conference on Logical Aspects of Computational Linguistics (LACL 2012, demo session), Jul
2012, Nantes, France. pp.1-4. ⟨hal-00696562⟩
Denys Duchier, Yannick Parmentier, and Simon Petitjean
LIFO, Université d'Orléans, Bâtiment 3IA
6, Rue Léonard De Vinci - BP 6759
F-45067 Orléans Cedex 2, France,
firstname.lastname@univ-orleans.fr
Abstract. In this paper, we introduce the eXtensible MetaGrammar
(XMG), which corresponds to both a language for specifying formal
grammars, and a compiler for this language. XMG has been developed
over the last decade to provide linguists with a declarative and yet
expressive way to specify grammars. It has been applied to the design of
actual tree-based grammars for French, German or English. XMG relies
on a modular architecture, which makes it possible to extend the
formalism with additional levels of descriptions and/or linguistic properties.
Thus, on top of syntax, XMG can also be used for the description of other
linguistic information such as semantics, or morphology (the latter being
currently explored for Ikota, an African language spoken in Gabon).
1 Introduction
Since Chomsky's seminal work on generative grammar [1], many formal systems
have been proposed to describe the syntax of natural language (see e.g. [2]).
These mainly differ in terms of expressivity and computational complexity, and
generally rely either on rewriting rules (e.g. Tree-Adjoining Grammar), or on
constraints (e.g. Head-driven Phrase Structure Grammar).¹
An interesting family of formal grammars is that of lexicalized grammars [3]. Such
grammars associate each elementary structure (i.e. grammar rule) with a lexical
item (called anchor). Lexicalized grammars offer two main advantages. Firstly,
the grammar can be seen as a function mapping lexical items (i.e. words)
to uninstantiated grammatical structures (the grammar is then called a lexicon).
Secondly, a subgrammar can be extracted from the input grammar according to
the sentence to parse, thus speeding up parsing.
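A minimal Python sketch of these two advantages, under stated assumptions: the lexicon is modelled as a plain dictionary from words to names of tree templates, and a subgrammar is obtained by keeping only the entries for the words of the input sentence. The template names and the function extract_subgrammar are hypothetical and serve only as an illustration; they are not taken from XMG or from an actual LTAG.

```python
# Hypothetical toy lexicon: each word (anchor) maps to the names of the
# uninstantiated structures (tree templates) it can anchor.
LEXICON = {
    "Jean": ["proper_noun"],
    "mange": ["transitive_canonical", "transitive_relative"],
    "une": ["determiner"],
    "pomme": ["common_noun"],
    "dort": ["intransitive_canonical"],
}


def extract_subgrammar(sentence, lexicon):
    """Keep only the lexicon entries whose anchor occurs in the sentence."""
    words = sentence.split()
    return {word: list(lexicon[word]) for word in words if word in lexicon}


SUBGRAMMAR = extract_subgrammar("Jean mange une pomme", LEXICON)
```

Here SUBGRAMMAR contains no entry for "dort", so a parser working from it would never have to consider the intransitive template for this sentence.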
An example of lexicalized grammar is Lexicalized Tree-Adjoining Grammar
(LTAG). In this formalism, the grammar is made of (thousands of)
uninstantiated elementary trees (called tree templates), where the leaf nodes contain at
least one anchor node (labelled with ⋄). These anchor nodes are attached to
adequate lexical items at parsing. As an illustration, consider Fig. 1 depicting two
tree templates to be anchored with a transitive verb such as manger (to eat).
¹ We do not discuss the distinction between constituency and dependency grammar.
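[Figure: two tree templates. First template: root S dominating N↓ V⋄ N↓, illustrated by
"Jean mange une pomme" (John eats an apple). Second template: a relative-clause tree with
node labels N⋆, S, C (que), S, N↓, V⋄, illustrated by "La pomme que Jean mange"
(The apple that John eats).]
Fig. 1. Elementary structures of an LTAG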
From a linguistic point of view, lexicalized grammars make it possible to express
generalizations over lexical entries by gathering tree templates whose anchors have
similar syntactic properties into tree families. From a computational point of
view, lexicalized grammars are made of a huge number of structures, due to
redundancy within the lexicon (e.g. tree templates sharing common subtrees).
The concept of metagrammar was introduced by Candito [4] in order to deal
with structural redundancy by capturing generalizations over tree templates.
Instead of directly describing the syntax of language via a formal grammar, the
linguist specifies the structures of this formal grammar using a dedicated
framework. This specification of the grammar is called a metagrammar and is
automatically processed to generate the grammar. Many metagrammatical frameworks
have been proposed for LTAG [4,5,6,7]. Here we introduce one of these, namely
eXtensible MetaGrammar (XMG) [6]. XMG differs from other metagrammar
approaches by its declarative specification language, and its modular architecture.
The latter made it possible to extend the concept of metagrammars to other
levels of description (e.g. morphology) and linguistic principles (e.g. constraints
on word order), as we shall see below.
2 The XMG language
As mentioned above, the XMG language allows for a declarative specification
of linguistic structures (including tree descriptions). More precisely, XMG offers
a unification-based language à la Prolog to specify what a grammar is. This
specification is then processed by the XMG compiler in order to produce a
computational grammar (e.g. an LTAG), which can be saved in an XML file.
Capturing redundancy using abstractions. XMG relies on the concept of
abstraction to allow the linguist to refer to reusable grammatical units (e.g.
(combinations of) tree fragments for LTAG). Formally, an XMG specification corresponds
to declarative rules, which can be defined using the following abstract syntax:
Rule    := Name → Content
Content := Contribution | Name | Content ∨ Content | Content ∧ Content
Here, Contribution refers to a linguistic fragment of information of a given type
([...] description language when describing syntax with LTAG). This language relies on
unification variables to share information between distinct XMG rules (i.e.
distinct grammatical units) or between distinct contributions (i.e. between syntax
and semantics). The scope of these variables is by default restricted to the rule,
but can be extended via import/export declarations. As a toy example of these
variables and of XMG concrete syntax, consider the rules CanonicalSubject
and Subject below; the latter specifies a generalization over the two possible
realizations of a subject shown in Fig. 1 (-> is dominance and >> precedence).
class CanonicalSubject %% (comment) a class is an XMG rule in the abstract syntax
export ?x ?y
declare ?x ?y ?z ?u
{<syn> %% contribution of type <syn>
{node ?x [cat=S]; node ?z [cat=N]; node ?y (type=anchor)[cat=V]; node ?u [cat=N];
?x -> ?z; ?x -> ?y; ?x -> ?u; ?z >> ?y; ?y >> ?u }
}
class Subject {CanonicalSubject[] | RelSubject[]}
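To make the abstract syntax more concrete, the following Python sketch models Contribution, Name, ∨ and ∧ as small data classes and enumerates the alternatives a rule stands for. It is only an illustration under simplifying assumptions: contributions are opaque strings, there is no unification and no variable scoping, recursive rules are not handled, and the body given to RelSubject is a placeholder since the text does not define it. The names Call, Or, And and expand are hypothetical and are not part of XMG.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Contribution:
    kind: str   # contribution type, e.g. "syn"
    value: str  # opaque payload standing in for a description


@dataclass(frozen=True)
class Call:
    name: str   # reference to another rule (Name in the abstract syntax)


@dataclass(frozen=True)
class Or:
    left: object
    right: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


def expand(content, rules):
    """Return every alternative as a tuple of contributions."""
    if isinstance(content, Contribution):
        return [(content,)]
    if isinstance(content, Call):
        return expand(rules[content.name], rules)
    if isinstance(content, Or):
        # A disjunction offers the alternatives of either side.
        return expand(content.left, rules) + expand(content.right, rules)
    if isinstance(content, And):
        # A conjunction combines each left alternative with each right one.
        pairs = product(expand(content.left, rules), expand(content.right, rules))
        return [left + right for left, right in pairs]
    raise TypeError(f"unsupported content: {content!r}")


RULES = {
    "CanonicalSubject": Contribution("syn", "canonical subject fragment"),
    "RelSubject": Contribution("syn", "placeholder relative subject fragment"),
    "Subject": Or(Call("CanonicalSubject"), Call("RelSubject")),
}

ALTERNATIVES = expand(Call("Subject"), RULES)
```

Under these assumptions, expanding Subject yields two alternatives, one per realization of the subject; conjoining Subject with a further contribution would yield two alternatives of two contributions each.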
Towards user-defined description languages. Metagrammars bring interesting
insights in grammar engineering by offering an abstract view on language, made
of combinations of grammatical units. So far, these units were described using a
set of hard-coded description languages. To reach extensibility, we are exploring
another approach: permitting user-defined description languages (similarly
to the grammar, these must be described). Some parts of the compiler thus have
to be generated automatically.
3 The XMG compiler
General architecture. As mentioned above, the XMG language is nothing other
than a logic language. Its compiler thus shares some features with a compiler for
logic programs. First, the classes composing the metagrammar (defined using
the XMG language introduced above) are converted into clauses of an Extended
Definite Clause Grammar (EDCG) [8], which corresponds to a DCG having
multiple accumulators. This underlying EDCG makes explicit the accumulation of
contributions of multiple types (e.g. syntax, semantics). Then, this EDCG is
evaluated according to axioms defined in the metagrammar (comparable to
Prolog queries). This produces a list of tuples of contributions (the arity of these
tuples is the number of contribution types). Finally, each tuple of this list is
optionally post-processed. For instance, tuples whose syntactic contribution is
a tree description are fed to a solver in order to produce syntactic trees.
During this solving step, it is possible to apply linguistic well-formedness principles
(these can use information from other contributions of the tuple).
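The evaluation step can be pictured with the following Python sketch, which imitates goals threading several accumulators and uses generators to mimic Prolog-style backtracking. It is a rough analogy under stated assumptions, not the EDCG machinery of XMG: the two dimensions "syn" and "sem", the string payloads, and the helpers contribute, conj, disj and evaluate are all hypothetical, and no unification or post-processing is performed.

```python
DIMENSIONS = ("syn", "sem")  # assumed contribution types for this sketch


def contribute(dim, item):
    """Goal that appends `item` to the accumulator of dimension `dim`."""
    def goal(acc):
        updated = dict(acc)
        updated[dim] = acc[dim] + (item,)
        yield updated
    return goal


def conj(*goals):
    """Goal that runs all goals in sequence, threading the accumulators."""
    def goal(acc):
        def run(index, state):
            if index == len(goals):
                yield state
            else:
                for next_state in goals[index](state):
                    yield from run(index + 1, next_state)
        yield from run(0, acc)
    return goal


def disj(*goals):
    """Goal that tries each alternative in turn (a choice point)."""
    def goal(acc):
        for alternative in goals:
            yield from alternative(acc)
    return goal


def evaluate(axiom):
    """Evaluate an axiom and return one tuple of contributions per solution."""
    empty = {dim: () for dim in DIMENSIONS}
    return [tuple(solution[dim] for dim in DIMENSIONS) for solution in axiom(empty)]


# Placeholder payloads standing in for real descriptions.
canonical = conj(contribute("syn", "canonical tree description"),
                 contribute("sem", "placeholder predicate"))
relative = conj(contribute("syn", "relative tree description"),
                contribute("sem", "placeholder predicate"))

RESULTS = evaluate(disj(canonical, relative))
```

Each element of RESULTS is a tuple whose arity equals the number of dimensions, which mirrors the list of tuples of contributions described above; a post-processing step would then consume, for instance, the first component of each tuple.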
XMG 2. The first version of XMG (XMG 1.x) was developed between 2003
and 2010 in the Oz programming language, and included only three
description languages: one for specifying syntactic trees (either LTAG tree templates or
Interaction Grammar tree descriptions), one for specifying semantic
representations [...] in order to extend XMG with the ability to define an arbitrary number of types of
contributions (and thus of user-defined description languages).²
4 Current state and future work
XMG can be used to describe tree structures, feature structures, predicates, or
properties of the Property Grammar formalism. Version 2 of the XMG language
supersedes Version 1 (being backward-compatible). XMG 2 can be used to
compile grammars designed with XMG 1, including the French LTAG and French
Interaction Grammar, whose XMG metagrammars are available on-line (along
with toy examples of XMG input/output).³ When describing LTAG tree
templates, XMG 2 offers specific linguistic principles, namely ordering between sister
nodes, uniqueness of a given node label, and node merging via polarities.
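As a rough illustration of what two of these principles constrain, the Python sketch below checks label uniqueness and sister ordering on an already built tree. It is a simplification under stated assumptions: trees are nested (label, children) pairs, the checks are applied after the fact rather than during solving, and the function names are hypothetical; it does not reflect how XMG 2 implements its principles.

```python
def count_label(tree, label):
    """Count the nodes of `tree` that carry `label`."""
    node_label, children = tree
    own = 1 if node_label == label else 0
    return own + sum(count_label(child, label) for child in children)


def satisfies_uniqueness(tree, label):
    """True if at most one node of the tree carries `label`."""
    return count_label(tree, label) <= 1


def sisters_in_order(tree, first, second):
    """True if no node has a first `second` daughter before its first `first` one."""
    _, children = tree
    labels = [child[0] for child in children]
    if first in labels and second in labels:
        if labels.index(first) > labels.index(second):
            return False
    return all(sisters_in_order(child, first, second) for child in children)


# Toy tree for the first template of Fig. 1: S dominating N, V, N.
TREE = ("S", [("N", []), ("V", []), ("N", [])])
```

On TREE, the label V is unique whereas N is not, and the first N daughter precedes V, so an ordering constraint "N before V" is met while "V before N" is violated.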
XMG 2 is being actively developed in order to allow for cross-framework
grammar engineering, along the lines of [9], but also for linguistic experimentation
by dynamically defining one's own grammar formalism, as mentioned in Section 2.
XMG 2 has been used recently to describe the morphology of verbs in Ikota,
an agglutinative Bantu language spoken in Gabon [10]. The idea behind this
work is to specify morphemes as contributions in terms of lexical phonology and
inflection (morpho-syntactic features). In a next step, we plan to extend this
metagrammar (i.e. this abstract linguistic account of morphology) to syntax.
References
1. Chomsky, N.: Syntactic Structures. Mouton, The Hague (1957)
2. Abeillé, A.: Les Nouvelles Syntaxes. Armand Colin, Paris (1993)
3. Schabes, Y., Abeillé, A., Joshi, A.K.: Parsing strategies with 'lexicalized'
grammars: application to Tree Adjoining Grammars. In: 12th COLING. (1988) 578–583
4. Candito, M.: A Principle-Based Hierarchical Representation of LTAGs. In: 16th
COLING. (1996) 194–199
5. Xia, F.: Automatic Grammar Generation from two Different Perspectives. PhD
thesis, University of Pennsylvania (2001)
6. Duchier, D., Le Roux, J., Parmentier, Y.: The Metagrammar Compiler: An NLP
Application with a Multi-paradigm Architecture. In: MOZ. (2004) 175–187
7. Villemonte De La Clergerie, É.: Building factorized TAGs with meta-grammars.
In: TAG+10, New Haven, CT, United States (2010) 111–118
8. Van Roy, P.: Extended DCG notation: A tool for applicative programming in Prolog.
Technical Report UCB/CSD 90/583, UC Berkeley (1990)
9. Duchier, D., Parmentier, Y., Petitjean, S.: Cross-framework Grammar Engineering
using Constraint-driven Metagrammars. In: CSLP. (2011) 32–43
10. Duchier, D., Magnana Ekoukou, B., Parmentier, Y., Petitjean, S., Schang, E.:
Describing Morphologically-rich Languages using Metagrammars: a Look at Verbs
in Ikota. In: 4th Workshop on African Language Technology - LREC. (2012)
² Both implementations (XMG 1.x and XMG 2.x) are freely available on-line at
https://sourcesup.renater.fr/xmg and https://launchpad.net/xmg respectively.
3