C AHIERS DE
TOPOLOGIE ET GÉOMÉTRIE DIFFÉRENTIELLE CATÉGORIQUES
D ANA M AY L ATCH
The connection between the fundamental
groupoid and a unification algorithm for syntactil algebras (extended abstract)
Cahiers de topologie et géométrie différentielle catégoriques, tome 32, no3 (1991), p. 203-242
<http://www.numdam.org/item?id=CTGDC_1991__32_3_203_0>
© Andrée C. Ehresmann et les auteurs, 1991, tous droits réservés.
L’accès aux archives de la revue « Cahiers de topologie et géométrie différentielle catégoriques » implique l’accord avec les conditions générales d’utilisation (http://www.numdam.org/conditions). Toute utilisation commerciale ou impression systématique est constitutive d’une infraction pénale. Toute copie ou impression de ce fichier doit contenir la présente mention de copyright.
Article numérisé dans le cadre du programme
THE CONNECTION BETWEEN THE
FUNDAMENTAL GROUPOID AND A UNIFICATION ALGORITHM FOR SYNTACTIL ALGEBRAS
(Extended Abstract)
by Dana May LATCH*
CAHIERS DE TOPOLOGIE ET GÉOMÉTRIE DIFFERENTIELLE
CA TÉGORIQUES
VOL. XXXII-3 (1991)
RÉSUMÉ:
Commenqant
par unegrammaire tcontexte-libre’,
nous construisons une
algebre
de chemins de1’ algebre syntaxique
de cettegrammaire.
Nous d6montrons que(a)
d6cider s’il existe un ensemble fini
d’homotopies
engen- drant legroupoide
fondamental deI’alg6bre
de cheminset
(b)
d6cider s’il existe unalgorithme
fini d’unifica-tion pour
1’ algebre
de termes de1’ algebre syntaxique,
sont
6quivalents.
Ces deuxprobl6mes equivalent
auprobl6me d’6quivalence
degrammaires ’contexte-libres’, qui
est non-d6cidable(au
sens deTuring).
Nous donnonsdes conditions
suffisantes, qui impliquent
1’existence d’un ensemble finid’homotopies engendrant
legroupdfde
fondamental.
1. Introduction
This paper deals with the connection between two very differ-
ent areas: unification
theory
from computer science andalge-
braic
topology
ofalgebras
/ smallcategories
from mathemat- ics.Unfortunately,
the first part of this connection deals with a class of newTuring
undecidableproblems (that is,
decisionproblems
that areequivalent
todeciding
thehalting
of a
Turing machine).
We add to a class ofequivalent Turing
undecidable
problems
for context-freelanguages, problems dealing
with unificationtheory
andhomotopy theory.
It iswell known
(for example,
see[Ha78], [HoU179],
or[DaWe83]
that the
following equivalent problems
are undecidable. Let:G , G1
andG2
be context-free grammars(CFG’s).
*on leave 1985-1988 at CUNY; Work partially supported by
NSF Grants #MCS81-04217; #DCR83-02897 and by CUNY Grants
#PSC-CUNY-668293.
(1) Ambiguity
Problem:Deciding
whether G isambiguous (that is,
whether there exists more than one way ofgenerating
an
expression string using G) .
(2)
Intersection Problem:Determining
whether the context-free
languages L(G1)
andL(G2) generated by G1
andG2 respectively,
have anyexpression strings
in common(that is, determining
whetherL(G1) n L(G2) = Ø) .
(3)
GrammarEquivalence
Problem:Determining
whetherG1
andG2 ,
areequivalent (that
is whetherL(G1)
=L(G2)) -
In this paper we show that the
following
decisionproblems
are
equivalent
to the above undecidableproblems
forcontext-free
languages:
(4) Unification
Problem:Determining
whether there exists a set of mostgeneral
unifiers(mgu’s)
for any two terms inT(G) ,
the monoidal termalgebra generated by
G[Ru 87].
(5) Homotopy
Generation Problem:Determining whether,
for thepath algebra P(G) generated by G ,
the strong funda- mentalgroupoid
isisomorphic
to the weak fundamentalgroupoid (that is,
whether there existenough
strong"generating homotopies"
inP(G)) .
(6) Homotopy
Extension Problem:Determining whether,
foreach
pair
ofpaths
inP(G) ,
there exists a minimal setof strong
homotopy
extensions to elements of the funda- mentalgroupoid
The Unification and
Homotopy
Extension Problems areequiva-
lent to the
Ambiguity
and IntersectionProblems,
while theHomotopy
Generation Problem isequivalent
to the undecidableAmbiguity
Generation Problem which contains theAmbiguity
Problem:
(1’) Ambiguity
Generation Problem:Determining
whether there exists a(finite)
set ofambiguity pairs
that generatesthe
ambiguity/pre-order
congruence relation in thepath algebra P(G) .
Fortunately,
the second part of the connection between unifi- cationtheory
andhomotopy theory,
deals with sufficientconditions
[La89a]
on thepath algebra P(G)
of a CFG G sothat a unification
algorithm
for the termalgebra T(G) exists,
andequivalently,
so that stronggenerating
homoto-pies
for the fundamentalgroupoid
of thepath algebra P(G)
exist. For a
path algebra P(G)
that satisfies these suffi-cient
conditions, by examining
thegeometric "path
structure"of
P(G) along
with the"punctuation
structure" ofG ,
we are able to describe[La89a]
how to generate the unificationalgorithm
forT(G) (using
a "diamondcompletion" procedure).
In
[La89a],
a nontrivial CFG classGFP ,
used to generate205
expressions
for a class of formal functionalprogramming (FFP) languages [Ba78],
is shown tosatisfy
these conditions.In this paper, for
GFP ,
we present bothgenerating
mostgeneral
unifiers for the termalgebra T(GFP)
andgenerating homotopies
for the strong fundamentalgroupoid
of thepath algebra P(GFP) -
1.1 Unification
Theory
Unification
theory plays
animportant
role in computerscience;
forexample,
inconjunction
with Robinson’s resolu- tion method[Ro65]
for Horn clauselogic,
unification is the best method of evaluation of programs in PROLOG[ShSt87].
The literature
dealing
with unification is enormous and uni- fication is a basic tool in many areas of computerscience, including: computational complexity,
data structures andalg- orithms,
automated theoremproving, logic programming, higher
order
logic,
functionalprogramming languages
andpolymorph- ism,
feature structures, naturallanguage processing,
equa- tional theories for termrewriting
systems, latticetheory,
type
inference,
machinelearning,
etc. Anintroductory
sur-vey to unification
theory
isgiven
in[Kn89].
J.Siekmann presents a formal
theory
for and an extensivesurvey of the
theory
ofequational
unification in[Si89].
D.Rydeheard explores applying
categorytheory
to unificationalgorithms
with respect to the emptyequational theory [BuRy- 86]
and extends thiscategorical approach
tocombining
unifi-cation
algorithms
in[RySt87].
In[Baa89],
F.Baader shows that unification for commutative theories can be character- isedcategorically
as those which have semi-additive categor- ies of "substitutionmorphisms".
In[BHS89], H.-J.Bürchert, A.Herold,
and M.Schmidt-Schaussexplore decidability
ques- tions of unification for various classes ofequational
theo-ries, by exploiting properties
of their associated monads.However, probably
because of theundecidability
of theUnification Problem for context-free term
algebras,
there hasbeen little research on the
development
of unificationalgo-
rithms for these
algebras.
In[La89a]
sufficient conditionsare
given
for the existence of unificationalgorithms
forcontext-free
algebras,
andapplications
of thesealgorithms
are
presented
in[La89b], [LaSi88]
and[LSR90].
In[La89a]
and
[La89b],
thesealgorithms
are shown to exist for theclass
GFP,
ofambiguous
CFG’s for FFPlanguages [Ba78].
An
important
reason fordeveloping
unificationalgo-
rithms for
syntactic algebras
based onambiguous
CFG’s forFFP
languages
is that the collection of these unificationalgorithms
acts as a basic tool in thedevelopment
of a pro-gram verification and
analysis system
for functional programs(see [LaSi88]
and[LSR90]).
Theinput
to the verification andanalysis
system is a nonconstructive denotational seman-tics for the
primitives (that is,
theprimitive
functions andfunctional/combining forms)
of the FFPlanguage
and an opera- tional semantics in the form of a(possible nondeterministic)
collection of rewrite rules. For a
language primitive,
theoperational
semantics can contain head recursiverules,
tail recursiverules,
or rules whichemploy
combinations of head and tail recursion.The
(nonconstructive)
semantics describes the transform-ational effect of a
primitive
on itsoperand(s) (possibly resulting
in recursiveapplications
of thelanguage primi- tives).
Theperformance
of this transformationmight
be con-sidered a
"macro-operation",
or machineinstruction,
of theinterpreter, occuring
in asingle
"instructioncycle".
Theoperational
rewrite rules describe the"micro-operations"
that
implement
alanguage primitive,
where a basic "machine-cycle"
isoccupied by
asingle
rewriteoperation.
Variously
defined rewrite rule collections can reflect theoperational
nature of diverse targetarchitectures,
while the nonconstructive semantics hides this level of detail from thelanguage
user.Typically,
anoperational
semantics willassign
a fixed recursive structure tocompound objects. But,
at the
operational level,
it may beappropriate
to considerlists
(a
basic data structure in functionalprogramming
lan-guages) variously
asLisp-like lists,
tail-constructedlists,
a mixture of the two, or concatenated
sublists, depending
onthe context. For
example,
the last function that selects therightmost
element of a list has a minimal cost with aright-
constructed list
1,x,> (where 11
represents anexpression
list and x, denotes a
single expression component),
and amaximal cost with a left-constructed list
x2l2>.
For thefunctional form
apply-to
all(which
is similar to the LISPhigher-order primitive map)
thatapplies
a functionf
to eachcomponent of an
operand
listx1 ... xn> returning
a list ofapplications (f : x1) ... (f : xn)>,
the format of the con-structed
operand
list does not matter. Forapply-to-all,
theinterpreter
must construct a list with ntop-level applica-
tions
independent
of the construction of theoperand
listx1 ...
xn> ;hence,
the number n of thetop-level
components of theoperand
list is theinput
parameter to the costmeasure.
Different structural
interpretations,
reflected in dis-tinct
operational semantics,
couldrequire
separate valida- tions. The verification andanalysis
system,presented
in[LaSi88]
and[LSR90], permits ambiguity
in theoperational
"deconstruction" of
compound
structures.Thus,
forexample,
lists can be treated
variously by
different rewrite rules asleft-constructed, right-constructed
or a mixture of the two.Validation of an
operational
semantics thatpermits
this kindof
ambiguity
may be considered the certification of a class of deterministicinterpreter models,
or of one or more non-deterministic
models,
one for each subset of the rewrite rulesadequate
toimplement
the denotational semanticspeci-
fication.
The user of the FFP verification and
analysis
systemmust
provide
a set of rewrite rules based on a monoidal con-text-free term
algebra.
The system must be able to determinea finite set of
mgu’s [Si89]
for anypair
ofleft-hand-sides,
because it must bepossible
torecognize
when two or morerewrite rules
(for
the sameprimitive)
can be used to eval-uate the same
applicative expression. Moreover,
each left- hand-sidecorresponds
to a sentential form derivation in the grammar, and represents the set of allexpressions
that matchit.
Describing
allexpressions
that match more than oneleft-hand-side is
equivalent
both togiving
a unificationalgorithm
for the termalgebra (that is, finding ,
a set ofmgu’s)
and tofinding
a minimalpartition
of the intersection of the left-hand-sideexpression
sets or context-free sublan- guages[La89a]. Unfortunately,
since we are notdoing simple syntactic
unification(where
unifiable terms must have thesame
structure),
two left-hand-sides may have more than one, indeed an infinite numberof, mgu’s.
Each mgu orequivalent
intersection component represents a separate case that must be examined
by
the verifier. Morespecifically, then,
eachpair
of terms must haveonly
afinite
number ofmgu’s (that is,
there exist anequational
unificationalgorithm
of finitetype
[Si89]).
Theequational
unificationtheory
identifiestwo
differently
constructedexpression
lists whenever theircomponents are identified and is the
pre-order/ambiguity
congruence in the
syntactic path algebra
of the FFP grammar.In
[La89a],
sufficient conditions for the existence of such unificationalgorithms
aregiven. However,
there is nohope
of
determining
necessaryc,onditions, because,
in Section2.3,
the Unification Problem for context-free term
algebras
isshown to be
equivalent
to the undecidable GrammarEquivalence
Problem.
1.2
Homotopy Theory
Let A be a small category and Ab be the category of abelian groups. In
[La79]
it is shown that derived functors of coli-mit
form the
homology theory for C,
the category of small cate-gories,
and that derived functors of limitform the
cohomology theory
for if . Thehomological
dimensionof
A ,
denotedhd(A) ,
is thehighest non-vanishing
dimensionof the
homology (i.e. hd(A)
=max{k
Icolimk # 0})
and thecohomological
dimension ofA ,
denotedhd(A) ,
is thehighest non-vanishing
dimension of thecohomology (that is, cd(A)
=max {k
Ilimk # 0}) .
In[La73]
and[LaMi74],
it isshown
that,
if thecardinality
of A isN n ,
thenIn
addition,
if A is a directed poset with smallest cardinal-ity
of a cofinal subsetbeing Nn,
thencd(A) =
n + 1 .A search for
why
thecardinality
of a small categoryimplies
such a"topological"
theorem resulted in:. A weak
homotopy equivalence
between 19 and the categoryW of CW
complexes
that was inducedby
thecomposition
of the nerve functor N : C -> x from 19 to the category
x of
simplicial
sets, with thegeometric
realisation functor l_l: K -> W(see,
forexample, [La79]
and[LTW79]);
. A characterisation of a class of weak
homotopy
inver-ses
[FriLa81]
for the nerve functor N : C -> K .Because 19 does not have
enough
stronghomotopies (that is,
natural
transformations),
there is nohope
ofgiving
a "con-structive" definition for the fundamental
groupoid n(A)
ofA . In Section
3,
if A is the monoidalpath algebra P(G)
ofa CFG
G ,
we show that:.
Homotopy
Generation Problem isequivalent
to theundecidable
Ambiguity
GenerationProblem;
.
Homotopy
Extension Problem isequivalent
to theundecidable Unification Problem.
However,
if thepath algebra P(G) generated
from the CFG G satisfies the sufficient conditionsgiven
in[La89a],
thenthe fundamental
groupoid II(PG)
isconstructible,
since it isgenerated by
a finite set of stronghomotopies (see
Section3.4
below).
Organisation
of the paper: Section 2develops
the defini-tions of various
syntactic algebras
for CFG’s and defines the concepts of thegeneration
of theambiguity
congruence rela- tion for thepath algebra
and thegeneration
ofmgu’s
for thecontext-free term
algebra.
Also in Section 2 sufficient con- ditions for the existence ofambiguity
generators and the existence of unificationalgorithms
for these context-freealgebras
arepresented.
In Section3,
thehomotopy
defini-tions for the
category C
of smallcategories
aredeveloped, along
with the connection betweengiving generating
homoto-pies
for the fundamentalgroupoid
of thepath algebra P(G)
ofa CFG G and
giving
generatorpairs
for theambiguity
congru-ence on
P(G).
Also in Section3,
we present sufficient con-ditions on
P(G)
forconstructing
the(strong)
fundamentalgroupoid
ofP(G).
Acknowledgments:
This paper is in memory ofEvelyn
Nelsonwho showed me that undecidable
problems
need not behopeless problems.
The author would like to thank JohnChemiavsky, Geoffrey Frank, Quan Nguyen,
RonSigal
and Don Stanat forfundamental contributions to the unification
algorithm.
Shealso extends her
appreciation
to EllisCooper,
AlexHeller,
Michael
Singer,
and Jim Stasheff for many useful conversa- tions about the"homotopy
connection".2. Context-free
Syntactic Algebras
This section includes:
(1)
Areview,
similar to thedevelopment
in[D 2Qu7 8],
of acontext-free grammar
(CFG)
as a derivation system forexpressions, along
with anexample
classGFP
of CFG’sfor
expressions
in FFPlanguages [Ba78];
(2)
A universalalgebraic
definition of thetyped
deriva-tion/path algebra
of a CFG[Ru87], along
with monoidalhomomorphisms
between thepath algebra
and thetyped syntactic
termalgebra;
(3)
Theequivalence
between finiteambiguity presentations [La89a]
in thepath algebra
and unificationalgorithms
in the
syntactic
termalgebra, along
with a discussionof
(Turing) undecidability [HoU179]
of the existence of suchpresentations
andalgorithms.
2.1 A Traditional Presentation of Context-free
Languages
Let T be a finite set of
symbols,
and letTm
denote theset of all finite
strings
of msymbols
from T . IN repre-sents the
singleton
setcontaining only
the emptystring
E .The set T* of all
strings
ofsymbols
over T is thedisjoint
union u
I’ ,
m >0 ,
where "u’ denotes thedisjoint
union.The set
T* together
with the associativeoperation
of concat-enation is a free monoid with empty
string
E as theidentity.
Definition 2.1.1: Let T be a finite set of
symbols.
A lan-guage L over T is any subset of
T* ;
thatis, L c T* .
An element e E L is called anexpression (or
asentence)
ofL. D
Definition 2.1.2: A
context-free
grammar is afour-tuple (T, N, P, (y)
where:(a)
T is a finite set of terminaclsymbols;
(b)
N is a finite set of nonterminalsymbols;
(c)
T and N aredisjoint;
that isT n N = O;
(d)
P is the finite set ofproductions
orsyntactic
rewriterules;
P c N x(T
uN)* ;
(e)
a E N is thestarting
nonterminal. DNotation 2.1.3: If
p E P
and p =(A, S(p)) ,
then we writeA
->p . S (p)..
The left-hand-side of p is A and theright-
hand-side is
S(p) .
Theright-hand-side S(p)
can assume thefollowing "shape":
with
k > 0, Ai ~ N,
andSj ~ T*.
11The process of
generating
anexpression according
to a CFG Gis the successive
rewriting
of sententialforms (that is,
elements of(T
uN)*) through
the use ofproductions
of thegrammar,
starting
with the start nonterminal (y . The sequence of sentential forms andproductions required
togenerate an
expression
constitutes a derivation of the expre- ssionaccording
to the grammar G .Definition 2.1.4: Let G =
(T, N, P, a)
be a CFG.(a)
If A-->P S(p)
is aproduction
inG ,
and wo = u A vand wl = u
S(p)
v are sentential forms(with
u and vpossibly empty),
we say that w, isimmediately
derivedfrom wo in
G ,
and we indicate this relationby writing
Wo
*----> P. w1 .
The collectionFI(G)
of all immediate der- ivations is called thefree (monoidal) graph
on G .(b)
If(WO,
wl , ... ,Wn)
is a sequence of sentential forms such thatwe say that w. is derivable from wo and indicate this relation
by writing
wo *--4d w. , whereThe
sequence d
is called a derivation of w. from woaccording
to G . Thebeginning
sentential form wo is called the domain ofd ,
while the last sentential form Wn denotedby S(d) ,
is called the codomain of d .(c)
For each sentential form w e(T
uN)* ,
let w *-1w W (or
moresimply
w *-->w)
denote theidentity
derivationon w . The
free
categoryF(G)
of the free monoidalgraph, F1(G)
is the collection all derivations inG ,
closed withrespect
toidentity
derivations andcomposi-
tion of derivations
[Mac71].
(d)
A derivation wo *--> d wn isleftmost, whenever,
for eachj, 1 S j S n ,
the nonterminal rewritten at theJ1h
step is the leftmost nonterminal in the sentential formwj-1 -
(e)
The grammar G is said to beambiguous
if there exist aterminal
string
e ET*
and distinct leftmost derivationso * -->d1 e and a
* --->d2
e. 0Definition 2.1.5: Let G =
(T, N, P, 6)
be a CFG andu * --->d w .
(a)
The set I dl l ={u
* -->d w* -->d1, S(d1)}
consists of allderivation extensions of d .
(Within
the comma category ofF(G) [Mac71],
I d1 is the set of allmorphisms
withdomain
d .)
(b) I dI T
={u
*__4d w *->d 1 e leeT*}
is the set of allterminal derivation extensions of d .
Similarly, I dl T
=IU
.->d W*->d1 S(d1) I d1
isleftmost}
is theset of all
leftmost
derivation extensions of d .(c)
Let I w l denote the set of all derivations thatbegin
with w E
(T
uN)’ ;
thatis,
the set of all derivationextensions of the
identity
derivation w * -> w . If A eN,
I A I represents the set of all derivations with domain A .(d)
If Ae N ,
letLG(A)
denote thelanguage
of all terminalstrings
derivable fromA ;
thatis,
the set of all deri- vation codomains of derivations inl A lT .
(e)
Thelanguage LG(G)
is called thelanguage generated by
Gand is denoted
by L(G) .
If e EL(G) ,
we say that e isa
string,
anexpression
or a wordgenerated by
G. 0
An FFP
language 2 [Ba78]
contains three basic kinds ofexpressions: itoms, applications,
and sequences. Atomicexpressions
include numbers and boolean constants, as well as a collection of "names" for thelanguage primitives.
Anexpression ( f : e)
is called anapplication.
It represents theapplication
of the function(or
programcomponent) f
tothe
operand (or input)
component e ; bothf
and e can be anyexpression
in 2 .Sequences
are definedrecursively
as listsof FFP
expressions
bracketedby
’’ and ’>’. No unbracketed list is a well-formedexpression
in 2 . Thefollowing example develops
a context-freedescription
of thesyntactic
structure of a class of FFP
languages.
Example
2.1.6: Thefollowing
table describes a classGFP
ofCFG’s for
expressions
in FFPlanguages, by listing
theprin- cipal productions appearing
in any CFG inGFP .
The grammars in this class arepartially specified;
theremaining produc-
tions
appearing
in any CFG inGFP
which generate atomic ex-pression
collections are notgiven.
The set of atoms is thelanguage L(AT) (generated by GFP)
and is denotedby
I AT I(When
no confusionarises,
we refer to this class of CFG’s as"the grammar
GFP .")
Because the unification
algorithm
takes asinput
apair
of terms
(in
atyped syntactic algebra)
that contain vari-ables for
expressions,
atomicexpressions
andexpression lists,
the grammarGFP
contains anarbitrary
number of(term- inal) expression
variablesV(E) = { xi},
, atomic variablesV(AT) = {ai}
andexpression
list variablesV(L) = {lj}.
Alist variable
1.
represents a(possibly empty)
sublist of alist. For
example,
if 8 is a substitution such that the inst- antiation oflj,
denotedby 81j’
is the emptystring
e ,then the instantiated sequence
expression 8xi lj> = 8Xi> ; similarly
forljxj> .
IfOli
= e,...e" thenThe grammar
Gpp
isambiguous,
because lists can be con-structed
by gluing
asingle component
onto the left-end orthe
right-end
of analready
constructed list. Forexample,
the distinct leftmost derivations:
derive the same sentential
form, S(d1) =
E L E> =S (dr) .
In
fact, l ELE > l T
is the set of all derivations tosequence
expressions having
at least two components. 0Table 1: A Grammar for FFP Languages
Terminals: T = IATI u V(E) u V(AT) u V(L) u { "",">","(",")",":"}
Nonterminals: N={ E, AT, L}
Start Nonterminal: E Productions:
2.2
Typed Syntactic Algebras
Each derivation A *--4d
S (d)
in a CFG G can be viewed both as apath d
=p(d)
in a free monoidal universalalgebra [Ru87]
and as a term
t(d)
in the freetyped
termalgebra (once
vari-ables have been "derived" for the nonterminals in
S(d)) . Furthermore,
each term tcorresponds
to a "retracted" deriva- tionr(t)
with each variablereplaced by
itscorresponding nonterminal/type
so that the sentential formS(r(t))
is "var-iable-free."
For the remainder of the paper, we assume that G =
(T, N, P, a)
is a CFG and that G contains terminal vari- ableproductions
A --> vv(A)n
for each nonterminal A E N . The variablev(A)n
is said to have nonterminal typeA ;
thatis,
the nonterminal collection Nrepresents
the types. When the context isclear,
we use vn to denote the A-variablev(A)n .
Definition 2.2.1: For each CFG
G ,
there is asyntactic
con-gruence
relation,
theinterchange
congruence9(G),
for thefree derivation
category F(G) .
Twointerchange
congruent derivations differonly
in the order thatproductions
areapplied. 9(G)
isgenerated by
theinterchange
derivationpairs;
thatis,
ifWj
E(T
uN)*, j = 1,2,3 , A,
~N ,
i =
1,2 ,
andA; -->pi, S(Pi),
i =1,2 ,
then thefollowing
derivation
pair
is agenerating pair
ing(G) :
Each congruence class has a
unique
leftmost derivation whichcorresponds
to aunique
parse tree[HoU179];
hence the quo- tient categoryF(G)/g(G)
is called the"algebra"
of parse treesgenerated by
G. Under theinterchange
congruence, con- catenation of derivations ismonoidal;
thatis, g(G)
is thesmallest congruence so that
F(G)/1(G)
is monoidal(see,
forexample, [Be75]
or[Ne80]).
oDefinition 2.2.2: The
free path algebra P(G) generated by
theCFG G
[Ru87]
is the monoidal comma category[Mac71]
ofF(G)/1(G) .
(a)
The carrier sets(that is,
the sets used to define thealgebraic operations) correspond
to the nonterminal der- ivation sets(modulo g(G)) ;
thatis,
if AE N ,
then(b)
Theoperations correspond
to theproductions
as follows:(bl)
Each terminalproduction
A--4p ep , ep
E T* corres-ponds
to anullary operation
(b2)
Ifa
production,
thenis the
operation,
which we call ap-concatenation,
definedby:
Also we use the
following composition /
concatenation notation forp-concatenations:
Each
path
d inP(G) corresponds
to aunique
leftmostderivation from a nonterminal A to the sentential form
S(d) .
Example
2.2.3: In the free monoidalalgebra P(GFP) ,
theleftmost E-derivation
(a
derivation with domainE):
corresponds
to thealgebraically
constructed elementWhen no confusion
arises,
we use the derivation format forpaths
inP(G)
and suppress the "hat" notion.The
path algebra P(Gpp)
has thep-concatenations
thatare
given
below in Table 2. 0Table 2: FFP p -concatenation
Definition 2.2.4: The
free
termalgebra generated by
the CFGG
[Ru87]
is thetyped syntactic subalgebra T(G)
ofP(G) . (a)
The carrier setscorrespond
to the terminal derivation setswhere A e N .
(b)
Theoperations correspond
to theproductions
as follows:(bl)
Each non-variable terminalproduction
A-> ap, 3¡,
eT* corresponds
to anullary operation
The collection
AT(G)
of all non-variablenullary
opera- tions is the set of atoms ofT(G) .
(b2)
Each variableproduction
A ->, vicorresponds
to anullary operation
The collection
V(G)
of all variablenullary operations
is the set of
typed
variables ofT(G) .
(b3)
Ifa
production,
thenis called
p-concatenation
and is the restriction of thep-concatenation
inP(G) to l A l T x ... x l A l T.
Definition 2.2.5:
(a)
The extensionhomomorphism
t :P(G)
->T(G)
is definedas follows:
9 If A *->d
S(d)
is apath
inI AI ,
thent(d)
is theterm in
lAlT
where the leftmost terminal derivation extension
S (d)
*->vS(t(d)
is obtained fromS (d) by "repla- cing"
each nonterminalA; occuring
inS (d)
with thenext indexed
A,,-variable Ai
->v Vn+1 notoccuring
inS (d) .
(b)
The retractionhomomorphism
r :T(G)
->P(G)
is definedas follows:
. If
t=A*--4tS(t)
is a term inl A l T ,
thenr(t)
isthe "retract" derivation in I A I
obtained from t
by "replacing"
each variableAi->vn occuring
inS(t)
with its nonterminal typeA, (that is,
itsidentity
derivationAi *--->Ai) .
0Example
2.2.6: Forexample,
for the grammarGFP ,
thepath d,
has derivation extension term
t(dl)
While the term
has retract
The retract
homomorphism
r :T(G)
->P(G)
is used tomake
precise
the concept of substitution in the termalgebra T(G) .
Definition 2.2.7: A substitution in the free term
algebra T(G)
is a set 0 of variable-termspairs:
where,
for each variable v. of typeA1, Ai *->tn S(q)
is anA,-term.
For any A-term t = A *->tS(t)
inT(G)
and substi- tution8,
the instantiated term Ot is the derivation exten-sion
where
d(9)
is ther(t)-concatenation
inl S(r(t))l T formed using
the derivation collection(Ai *-> tn S(tn)} .
0Example
2.2.8: Forexample,
inT(GFP) ,
if theterm t1
and if 0 is the substitution
then the instantiated E-term
6t1
is the derivation concat-enation
(d, d2 d3) o r(t1) :
Definition 2.2.9:
Composition of
substitutions[Si89]
cor-responds
tocomposition
ofpaths
inP(G) , using
the retrac-tion
homomorphism
r :T(G)
->P(G) .
We use the notation02
o91
torepresent
thecomposition
of substitutions. If A *->S(t)
is an A-term inT(G) ,
then the instantiated term(02 - 0,)t
is the derivationcomposition
Composition
of substitutions is associative. D2.3 Unification and
Ambiguity
inSyntactic Algebras
Recall that a CFG G is
ambiguous
if there exist a terminalstring
e E T* and distinct leftmost derivations a*--->d1
e anda
*->d2
e . It ispossible
for a G to havesymbols
and pro- ductions that cannot occur in any derivation a *-> d e withdomain,
the start nonterminal J . There are effectivealgo-
rithms for
deleting
uselesssymbols
andproductions
from Gwithout
affecting
thelanguage L(G)
it generates. The"effectiveness
proofs" presented in,
forexample, [HoU179]
or[LePa8l], produce algebraically equivalent expression
sets.We assume that the CFG’s under consideration in the remainder of the paper are "useful"
(that is,
have no uselesssymbols
or
productions)
and generate a nontriviallanguage (that is, L(G) # (
e}).
In this case, the definition ofambiguity
ismore
general:
Proposition
2.3.1: Theuseful
CFG G isambiguous if
andonly if
there exists a nonterminal A and there exist distinctleft-most
A-derivations A*->d1
w and A*-->d2
W inP(G) .
DDefinition 2.3.2:
(a)
For each CFGG ,
there is asyntactic
congruence rela-tion,
theambiguity
congruenceA(G),
for the free deri- vation categoryF(G) .
Two derivationsAl *--->d1 S(dl)
and
A2 * -> d2 S(d2)
areambiguity
congruent, denotedd1 = d2 ,
if andonly
ifA,
=A2
andS(dl)
=S(d2) .
This
ambiguity
relationA(G) :
. is the
pre-order
congruence[Mac71]
onF(G) ;
. includes the
interchange
congruenceg(G) ;
. identifies any two derivations
having
the same domainsand codomains.
(b)
Thequotient algebra P(G) / 4(G)
is thetyped
monoidalalgebra S(G)
of sententialforms generated by
G . Eachambiguity
congruence class[A
*-->dS(d)]
inS(G)
can beviewed as a sentential form w =
S(d)
of typeA ,
since the actual derivation d is not "remembered" inS(G) .
(c)
Thequotient algebra T(G) / A(G)
is thetyped
monoidalalgebra E(G)
ofexpressions (terms) generated by
G .(d)
Theambiguity
congruenceA(G)
is said to be(finitely) generated by
a(finite)
setA , A C A(G),
ifA(G) = [A] ,
where[A]
is the smallest monoidal congru-ence
containing
the set A of derivationpairs.
oFor a nontrivial useful CFG
G ,
a second characterisa- tion ofambiguity
is:Proposition
2.3.3: A CFG G isambiguous if
andonly if
either
of
thefollowing equivalent
conditions hold:(1)
Theinterchange
congruence9(G)
is a propersubcongru-
ence
of
theambiguity
congruenceA(G)
(2)
There is a nonempty set Aof pairs of
distinct leftmost derivations that generates theambiguity
congruenceA(G) .
DIn order to demonstrate that the Unification and Homo- topy Extension Problems are
equivalent
to the undecidableAmbiguity
and IntersectionProblems,
we present the concept of unification inT(G)
and then show thatgiving
a finiteunification
algorithm
forT(G)
isequivalent
togiving
a fi-nite set A of generators for the
ambiguity
congruenceA(G) .
Recall that G is a CFG which is
useful,
nontrivial and hastyped
variableproductions {A
->vvn}
for each nonter-minal type A . Definition 2.3.4:
(a)
Lettl and t2
be A-terms inT(G) .
A substitutionpair (81,82)
is aunifier
for the(tl,t2)
if andonly
if(81t1 , 82t2)
is anambiguity
derivationpair
inA(G);
that