here

(1)

Baoluo Meng¹, Andrew Reynolds¹, Cesare Tinelli¹, and Clark Barrett²

1 Department of Computer Science, The University of Iowa

2 Department of Computer Science, Stanford University

Abstract. Relational logic is useful for reasoning about computational problems with relational structures, including high-level system design, architectural configurations of network systems, ontologies, and verification of programs with linked data structures. We present a modular extension of an earlier calculus for the theory of finite sets to a theory of finite relations with such operations as transpose, product, join, and transitive closure. We implement this extension as a theory solver of the SMT solver CVC4. Combining this new solver with the finite model finding features of CVC4 enables several compelling use cases.

For instance, native support for relations enables a natural mapping from Alloy, a declarative modeling language based on first-order relational logic, to SMT constraints. It also enables a natural encoding of several description logics with concrete domains, allowing the use of an SMT solver to analyze, for instance, Web Ontology Language (OWL) models. We provide an initial evaluation of our solver on a number of Alloy and OWL models which shows promising results.

Keywords: Relational Logic, SMT, Alloy, OWL

1 Introduction

Many computational problems require reasoning about relational structures. Examples include high-level system design, architectural configuration of network systems, reasoning about ontologies, and verification of programs with linked data structures. Re- lational logic is an appealing choice for reasoning about such problems. In this paper, we consider a many-sorted relational logic where relations of aritynare defined as sets ofn-tuples with parametrized sorts for tuple elements. We define a version of this logic as a first-order theory of finite relations where relation terms are built from relation constants and variables, set operators, and relational operators such as join, transpose, product, and transitive closure.

In previous work [3], Bansal et al. developed a decision procedure for a theory of finite sets with cardinality constraints. The theory of finite relations presented here is an extension of that theory to relational constraints. We present a calculus for the satisfiability of quantifier-free formulas in this theory. Our calculus is in general refutation- sound and model-sound. It is also terminating and refutation-complete for a restricted class of quantifier-free formulas that has useful applications.

The calculus is explicitly designed to be implementable as a theory solver in SMT solvers based on the DPLL(T) architecture [14]. We have implemented a modular component for it in our SMT solverCVC4[4], allowing CVC4to solve constraints on relations over elements from any of the theories it supports. This relational extension of

(2)

CVC4’s native input language enables natural mappings to SMT formulas from several modeling languages based on relations. This includes Alloy, a formal language based on first-order relational logic, as well as ontology languages such as OWL. A significant potential advantage of these mappings is that they bring to these languages the power of SMT solvers to reason natively about a variety of interpreted types, something that is challenging for existing reasoners for these languages.

1.1 Related Work

Alloy is a well-known declarative modeling language based on a relational logic with transitive closure, set cardinality, and integer arithmetic operators [13]. Alloy specifications, or models, can be analyzed for consistency or other entailment properties with the Alloy Analyzer, a static analyzer based on encoding models as propositional logic formulas, and passing them to an off-the-shelf propositional satisfiability (SAT) solver. This approach limits the analysis to models with explicit and concrete cardinality bounds on the relations involved; hence it is appropriate only for proving the consistency of a model or for disproving that a given property, encoded as a formula, holds for a model. Despite these limitations, Alloy and its analyzer have been quite useful for lightweight modeling and analysis of software systems. An earlier attempt to solve Alloy constraints without artificial cardinality bounds on relations was made by Ghazi and Taghdiri [8] using SMT solvers. They developed a translation from a subset of Alloy’s language to the SMT-LIB language [5] and used the SMT solver Yices [7]

to solve the resulting constraints. That approach can prove some properties of certain Alloy models, but it still requires explicitlyfinitizingrelations when dealing with transitive closure, limiting the kind of properties that can be proven in models that contain applications of transitive closure. Later, the same authors introduced more general methods [9, 10], implemented in the AlloyPE tool, to axiomatize relational operators as SMT formulas without finitization while covering the entire core language of Alloy.

However, since quantified relational logic is in general undecidable and quantifiers are heavily used in the translation, the resulting SMT formulas translated from Alloy are often difficult or even impossible for SMT solvers to solve, especially when transitive closure is involved.

Description Logics (DLs) [1, 2] are decidable fragments of relational logic explicitly developed for efficient knowledge representation and reasoning. They consider on purpose only unary and binary relations. The main building blocks of DLs are individuals, concepts, roles as well as operations over these, where concepts represent sets of individuals, roles represent binary relations between individuals, and operations include membership, subset, relational composition (join) and equality. Restricted use of quantifiers is allowed in DLs to encode more expressive constraints on roles and concepts.

OWL [20], a standardized semantic web ontology language, represents an important application of DLs to ontological modeling. It consists of entities similar to those in DLs except for superficial differences in concrete syntax and for the inclusion of additional features that make reasoning about OWL models undecidable in general. Many efficient reasoners have been built for reasoning about OWL ontologies written in restricted fragments of OWL. These include Konclude [16], FaCT++ [18], and Chainsaw [19].

(3)

1.2 Formal Preliminaries

We define our theory of relations and our calculus in the context of many-sorted first- order logic with equality. We assume the reader is familiar with the following notions from that logic: signature, term, literal, formula, free variable, interpretation, and satisfiability of a formula in an interpretation (see, e.g., [6] for more details). LetΣbe a many-sorted signature. We will use≈as the (infix) logical symbol for equality—which has typeσ×σfor all sortsσinΣ and is always interpreted as the identity relation.

We assume all signaturesΣcontain the Boolean sortBool, always interpreted as the binary set{true,false}, and a Boolean constant symboltruefortrue. Without loss of generality, we assume≈is the only predicate symbol inΣ, as all other predicates may be modeled as functions with return sortBool. We will commonly write, e.g.P(x), as shorthand forP(x)≈truewhereP(x)has sortBool. We writes 6≈tas an abbreviation of¬s ≈t. Ifeis a term or a formula, we denote byVars(e)the set ofe’s free variables, extending the notation to tuples and sets of terms or formulas as expected.

We writeϕ[x]to indicate that all the free variables of a formulaϕare from tuplex.

Ifϕis aΣ-formula andI aΣ-interpretation, we writeI |= ϕifI satisfiesϕ. If tis a term, we denote byt^I the value oftinI. Atheoryis a pairT = (Σ,I), where Σis a signature andIis a class ofΣ-interpretations that is closed under variable reas- signment (i.e., everyΣ-interpretation that differs from one inIonly in how it interprets the variables is also inI).Iis also referred to as themodels ofT. AΣ-formulaϕis satisfiable(resp.,unsatisfiable)inT if it is satisfied by some (resp., no) interpretation inI. A setΓ ofΣ-formulasentails inT aΣ-formulaϕ, writtenΓ |=T ϕ, if every interpretation inIthat satisfies all formulas inΓ satisfiesϕas well. We write|=T ϕas an abbreviation for∅ |=T ϕ. We writeΓ |=ϕto denote thatΓ entailsϕin the class of allΣ-interpretations. TwoΣ-formulas areequisatisfiable inT if for every modelAof T that satisfies one, there is a model ofT that satisfies the other and differs fromAat most over the free variables not shared by the two formulas. When convenient, we will treat a finite set of formulas as the conjunction of its elements and vice versa.

2 A Relational Extension to a Theory of Finite Sets

A many-sorted theory of finite sets with cardinalityT_S is described in detail in our previous work [3]. The theoryT_S includes a set sort constructorSet(α)parametrized by the sort of the set elements. The theoryTScan be combined with any other theoryT Nelson-Oppen-style, by instantiating the parameterαwith any sort inT. The signature of theoryTSincludes function and predicate symbols for set union (t), intersection (u), difference (\), the empty set ([ ]), singleton set construction ([ ]), set inclusion (v), and membership (@−), all interpreted as expected. A sound, complete and terminating tableaux-style calculus for the theoryTS is implemented in theCVC4SMT solver [4].

In this section, we describe an extensionTRof this theory which, however, does not include a set cardinality operator or cardinality constraints.³The new theoryTRextends TS with a parametric tuple sortTup_n(α1, . . . , αn)for everyn >0 and various relational operators defined over sets of tuples, that is, over values whose sort is an instance

3A further extension of the theory to cardinality constraints is planned for future work.

(4)

Set symbols:

[ ] :Set(α) t,u,\:Set(α)×Set(α)→Set(α) @−:α×Set(α)→Bool [ ] :α→Set(α) v:Set(α)×Set(α)→Bool

Relation symbols:

h , . . . , i:α1× · · · ×αn→Tup_n(α1, . . . , αn)

∗:Relm(α)×Reln(β)→Relm+n(α,β) o

n:Relp+1(α, γ)×Relq+1(γ,β)→Relp+q(α,β)withp+q >0

−1:Relm(α1, . . . , αm)→Relm(αm, . . . , α1) ⁺:Rel2(α, α)→Rel2(α, α)

wherem, n >0,α= (α1, . . . , αm),β= (β1, . . . , βn), andReln(γ) =Set(Tup_n(γ)).

Fig. 1: SignatureΣRof our relational theoryTR.

of Set(Tup_n(α₁, . . . , α_n)). We call any sortσof the form Set(Tup_n(σ₁, . . . , σ_n))a relational sort (of arityn)and abbreviate it asRel_n(σ₁, . . . , σ_n).

The full signatureΣRofTRis defined in Figure 1. Note that the function symbols

∗,on, and⁻¹are not only parametric but also overloaded, as they apply to relational sortsRelk(σ)of different aritiesk. The models ofTRare the expansions of the models ofTS that interpreth, . . . , ias then-tuple constructor,∗as relational product,onas relational join, ⁻¹as the transpose operator, and ⁺as the transitive closure operator.

Arelation term is aΣR-term of some relational sort. Atuple term is aΣR-term of some tuple sort. ATR-constraint is a (dis)equality of the form(¬)s≈t, wheresand t areΣR-terms. We writes 6@− t and[t1, . . . , tn] withn > 1 as an abbreviation of

¬(s@−t≈true)and[t1]t · · · t[tn], respectively.

3 A Calculus for the Relational Extension

In this section, we describe a tableaux-style calculus for determining the satisfiability of finite sets ofTR-constraints. The calculus consists of a set of derivation rules similar to those in the calculus from [3] that deal with set constraints as well as new rules to processTR-constraints. For simplicity, we will implicitly assume that the sort of any set or relation term is flat (i.e., set or relation elements are not themselves sets or relations) and allow only variables as terms of element sorts. Nested sets and relations and more complex element terms can be processed in a standard way by using a Nelson-Oppen- style approach which we will not discuss here.

The derivation rules modify astate data structure, where a state is either the dis- tinguished stateunsator a setSofT_R-constraints. The rules are provided in Figure 2 and 3 in guarded assignment form. In such form, the premises of a rule refer to the current stateSand the conclusion describes howSis changed by the rule’s application.

Rules with two or more conclusions, separated by the symbolk, are non-deterministic branching rules. In the rules, we writeS, cas an abbreviation ofS ∪ {c}, and denote by T(S)the set of all terms and subterms occurring inS. We define the following closure

(5)

INTERUP

x@−s∈ S^∗ x@−t∈ S^∗ sut∈ T(S) S:=S, x@−sut

INTERDOWN

x@−sut∈ S^∗ S:=S, x@−s, x@−t UNIONUP

x@−u∈ S^∗ u∈ {s, t} stt∈ T(S) S:=S, x@−stt

UNIONDOWN

x@−stt∈ S^∗ S:=S, x@−s k S:=S, x@−t DIFFUP

x@−s∈ S^∗ s\t∈ T(S) S:=S, x@−t k S:=S, x@−s\t

DIFFDOWN

x@−s\t∈ S^∗ S:=S, x@−s, x6@−t SINGLEUP

[x]∈ T(S) S:=S, x@−[x]

SINGLEDOWN

x@−[y]∈ S^∗ S:=S, x≈y

EMPTYUNSAT

x@−[ ]∈ S^∗ unsat SETDISEQ

s6≈t∈ S^∗

S :=S, z@−s, z6@−t k S:=S, z6@−s, z@−t

EQUNSAT

(t6≈t)∈ S^∗ unsat

Fig. 2: Basic rules for set intersection, union, difference, singleton, disequality and contradiction. In SETDISEQ,zis a fresh variable.

operator forSwhere|=tupdenotes entailment in theΣR-theory of tuples:⁴ S^∗={s≈t|s, t∈ T(S), S |=tups≈t} ∪

{s6≈t|s, t∈ T(S), S |=tups≈s⁰∧t≈t⁰for somes⁰6≈t⁰ ∈ S} ∪ {s@−t|s, t∈ T(S), S |=tups≈s⁰∧t≈t⁰for somes⁰@−t⁰ ∈ S}

The setS^∗is computable by extending congruence closure procedures with a rule for deducing consequences of tuple equalities of the formhs1, . . . , sni ≈ ht1, . . . , tni.

A derivation ruleapplies to a stateS if all the conditions in the rule’s premises hold for S and the rule application is not redundant. An application of a rule to a stateSwith a conclusionS ∪ {ϕ1[x₁,z], . . . , ϕ_n[x_n,z]}, wherezare the fresh variables introduced by the rule’s application (if any), isredundant ifS already contains ϕ₁[x₁,t], . . . , ϕ_n[x_n,t]for some termst.

For simplicity and without loss of generality, we consider only initial statesS₀that contain no variables of tuple sortsTup_n(σ₁, . . . , σ_n), since such variables can be replaced by a tuplehx1, . . . , xniwhere eachxi is a variable of sortσi. We also assume thatS0contains no atoms of the forms vt, since they can be replaced bys≈sut, or disequalities hs1, . . . , sni 6≈ ht1, . . . , tnibetween tuple terms, since those can be treated by guessing a disequalitysi 6≈tibetween two of their respective components.

All derivation rules preserve these restrictions.

Figure 2 presents the basic rules for the core set constraints in our theory. For each set operator, Figure 2 contains a downwardrule and anupward rule. Given a mem-

4Note that this theory has all the function symbols of ΣR, not just the tuple constructors h, . . . , i. The extra symbols are treated asuninterpreted.

(6)

TRANSPUP

hx1, . . . , xni@−R∈ S^∗ R⁻¹∈ T(S) S:=S,hxn, . . . , x1i@−R⁻¹

TRANSPDOWN

hx1, . . . , xni@−R⁻¹∈ S^∗ S:=S,hxn, . . . , x1i@−R PRODUP

hx1, . . . , xmi@−R1∈ S^∗ hy1, . . . , yni@−R2∈ S^∗ R1∗R2∈ T(S) S:=S,hx1, . . . , xm, y1, . . . , yni@−R1∗R2

PRODDOWN

hx1, . . . , xm, y1, . . . , yni@−R1∗R2∈ S^∗ ar(R1) =m S:=S,hx1, . . . , xmi@−R1,hy1, . . . , yni@−R2

JOINUP

hx1, . . . , xm, zi@−R1,hz, y1, . . . , yni@−R2∈ S^∗ m+n >0 R1onR2∈ T(S) S:=S,hx1, . . . , xm, y1, . . . , yni@−R1onR2

JOINDOWN

hx1, . . . , xm, y1, . . . , yni@−R1noR2∈ S^∗ ar(R1) =m+ 1 S:=S,hx1, . . . , xm, zi@−R1,hz, y1, . . . , yni@−R2

TCLOSUPI

hx1, x2i@−R∈ S^∗ R⁺∈ T(S) S:=S,hx1, x2i@−R⁺

TCLOSUPII

hx1, x2i@−R⁺,hx2, x3i@−R⁺∈ S^∗ S :=S,hx1, x3i@−R⁺ TCLOSDOWN

hx1, x2i@−R⁺∈ S^∗

S:=S,hx1, x2i@−R k S:=S,hx1, zi@−R,hz, x2i@−R k S:=S,hx1, z1i@−R,hz1, z2i@−R⁺,hz2, x2i@−R, z16≈z2

Fig. 3: Basic relational derivation rules. Lettersz, z1, z2denote fresh variables.

bership constraintx@−s, the downward rules infer either additional membership constraints over the immediate subterms ofs, or an equality in the case wheresis{y}. For example, rule INTERDOWN, infers the constraintsx@−sandx@−tifS^∗contains the constraintx@−sut. The upward rules handle the case where some setsoccurs inS, and infer membership constraints of the formx @−sbased on other constraints from S. Rule SETDISEQintroduces a witness for a disequality between two setssandtby using a fresh variablezto assert that there is an element that is insbut nott, or intbut not ins. There are two rules for derivingunsatfrom trivially unsatisfiable constraints inS: membership constraints of the formx@− ∅(EMPTYUNSAT) and disequalities of the formt6≈t(EQUNSAT).

We supplement the set-specific rules with an additional set of rules forT_R-constraints, given in Figure 3. From the membership of a tuple in the transpose of a relation R, rule TRANSPDOWNconcludes that the reverse of the tuple is inR. Conversely, rule TRANSPUPensures that the reverse of a tuple is in the transpose of a relationRif the tuple is inR andR⁻¹ occurs inS. From the constraint that a tupletbelongs to the join of two relationsR1 andR2 with aritiesm andnrespectively, rule JOINDOWN

infers that R1 contains a tuplet1 whose last element (named using a fresh Skolem

(7)

variablez) is the first element of a tuple t₂ in R₂, wheret is the join oft₁ andt₂. The JOINUPrule computes the join of pairs of tuples explicitly asserted to belong to a relationR₁and a relationR₂, respectively, provided thatR₁ on R₂ is a term inS.

The PRODDOWNand PRODUPrules are defined similarly for the product of relations.

The rules TCLOSUPI and TCLOSUPII compute members of the transitive closure ofRbased on the (currently asserted) members ofR. When it can be inferreed that a tuplehx1, x2ibelongs to the transitive closure of a binary relationR, TCLOSDOWN

can produce three alternative conclusions. In reachability terms, the first conclusion considers the case thatx2is directly reachable fromx1in the graph induced byR, the second thatx2is reachable fromx1in two steps, and the third that it is reachable in more steps. Note that the third case may lead to additional applications of TCLOSDOWN, possibly indefinitely, if the other constraints inS(implicitly) entail thathx1, x2idoes not in fact belong toR.

Example 1. LetS ={hx, yi@−R⁻¹, R ≈S,hy, xi 6@−S}. By rule TRANSPDOWN, we can derive a constraint hy, xi @− R, leading to a new S: {hx, yi @− R⁻¹, R ≈ S,hy, xi 6@−S,hy, xi@−R}. Then,hy, xi@−Ris both equal and disequal totrueinS^∗. Thus, we can deriveunsatby EQUNSAT, and conclude thatSisTR-unsatisfiable. ut Example 2. LetS be {hxi @− R,hyi @− R, R∗R ≈ SuT,hy, xi 6@− T}. By rule PRODUP, we derive constraintshx, yi@−R∗R,hy, xi@−R∗R,hx, xi@−R∗Rand hy, yi@−R∗R. By set reasoning rule INTERDOWN, we derive another four constraints hx, yi @−S,hy, xi @−S,hx, yi @−T, andhy, xi@− T, leading to a contradiction with hy, xi 6@−T. Thus, we can deriveunsatby rule EQUNSAT. ut Example 3. LetSbe{hx, yi@−R,hy, zi@−R,hx, zi 6@−R⁺}. By rule TCLOSUPI, we derive two new constraintshx, yi@−R⁺andhy, zi@−R⁺. Then, we can derive another constraint hx, zi @− R⁺, by rule TCLOSUPII, in contradiction with hx, zi 6@− R⁺.

Thus, we can deriveunsatby rule EQUNSAT. ut

Example 4. LetSbe{hx, yi@−R⁺,hx, yi 6@−R}. By rule TCLOSDOWN, we construct a derivation tree with three child branches, which add to S the sets {hx, yi @− R}, {hx, zi @−R,hz, yi @−R}, and{hx, z₁i@− R,hz₁, z₂i@− R⁺,hz₂, yi@− R, z₁ 6≈ z₂} respectively, wherez1 andz2 are fresh variables. By rule EQUNSAT, we can derive unsatin the first branch. Since no rules apply to the second branch, we can conclude,

as we will see, thatSisTR-satisfiable. ut

4 Calculus Correctness

In this section, we formalize the correctness properties satisfied by our calculus. These include refutation and model soundness in general and termination over a fragment of our language of constraints.The rules of the calculus define a notion ofderivation trees.

These are possibly infinite trees whose nodes are states where the children of each non- leaf node are the result of applying one of the derivation rules of the calculus to that node. A finite branch of a derivation tree isclosed if it ends withunsat; it issaturated if no rules apply to its leaf. A derivation tree isclosed if all of its branches are closed.

(8)

Lemma 1. For all applications of a rule from the calculus, the parent stateS isT_R- satisfiable only if there exists a childS⁰that isT_R-satisfiable.

Proof. We show this holds for each rule of the calculus. First, we remark that by construction,S^∗is entailed bySinTR. For the rules EMPTYUNSATand EQUNSATwith conclusionunsat, we have thatS is clearlyTR-unsatisfiable, sinceS^∗contains a constraint that isTR-unsatisfiable.

For the rules UNIONDOWNand DIFFUP, assumingSis satisfied by some interpre- tationI, that interpretation also satisfies one of the child branches. For rule SETDISEQ, ifSisT_R-satisfiable ands6≈t∈ S^∗, then all models ofSare such that∃x.x6@−u∧x@− v for{u, v} ={s, t}, and thus one of the child branches is T_R-satisfiable. Similarly, for rule TCLOSDOWN, ifhx₁, x₂i@−R⁺∈ S^∗, then ifSis satisfied by some interpre- tationI, that interpretation must satisfy eitherhx₁, x₂i @−Rorhx₁, y₁i,hy₁, y₂i,. . ., hy_n−1, yni,hyn, x2i ∈ Rfor somey1, . . . , yn wheren ≥1andy^I_i 6= y_j^I fori 6=j.

We have thatIsatisfies the first conclusion of TCLOSDOWNin the former case, the second conclusion of TCLOSDOWNin the latter case wheren= 1andzÎ =yÎ₁, and the third conclusion of TCLOSDOWNin the latter case where n > 1,z₁Î = yÎ₁ and z₂Î=y_nÎ. Thus, at least one child branch isTR-satisfiable.

All other rules of the calculus have a single conclusion of the form S⁰ = S ∪ {ϕ1, . . . , ϕn}. For each of these rules, it can be shown thatS |=T_R ∃zϕ1∧. . .∧ϕn, wherez are the fresh variables introduced by the rule application. For example, for INTERDOWN, we have thatx@−sut∈ S^∗. We have thatS⁰ =S ∪ {x@−s, x@−t}.

Sincex@−sut∈ S^∗, we have thatS |=_T_Rx@−sut, and henceS |=_T_R x@−s, x@−t as well. ThusS⁰andSare equivalent for all rule applications with a single conclusion,

and the statement holds for these rules as well. ut

Proposition 1 (Refutation Soundness).If there is a closed derivation tree with root nodeS, thenSisTR-unsatisfiable.

Proof. Assume there exists a closed derivation tree with root nodeS. Since all leaves of this tree are unsat, then for each node S⁰ of this derivation tree, it follows from Lemma 1 by induction on the structure of the derivation tree thatS⁰isT_R-unsatisfiable.

Thus,SisTR-unsatisfiable and the proposition holds. ut Proposition 2 (Model Soundness).LetSbe the leaf of a saturated branch in a derivation tree with root nodeS0. There is a modelIofTRthat satisfiesS0where:

1. for allS∈Vars(S)of set sort,S^I= x^I

x@−S∈ S^∗ , and 2. for all otherx, y∈Vars(S),x^I=y^Iif and only ifx≈y∈ S^∗.

Proof. LetSbe a the leaf of a saturated branch in a derivation tree with root nodeS₀. LetIbe a model ofTRwith the above properties.

We argue thatI is a model ofS. We first show an intermediate fact involving the interpretation of set terms inI. For any set terms, we define:

Elements(s) ={x^I|x@−s∈ S^∗} We show by structural induction that for all setss∈ T(S):

Elements(s) =s^I (1)

(9)

We show this for each case ofs.

Case 1 (sis a variable).In this case,Elements(s) = s^I by the first part of our construction ofI.

Case 2 (sis[ ]).Since EMPTYUNSATdoes not apply toS, there are no constraints of the formx@−sinS^∗, thusElements(s) =∅=s^I.

Case 3 (sis[x]).Since SINGLEUPdoes not apply to S, there must be a constraint of the form x @− s inS^∗, thus sÎ ⊆ Elements(s). For all constraints of the form y @−s∈ S^∗, since SINGLEDOWNdoes not apply toS, we have thaty ≈x∈ S, and thusyÎ=xÎ. Thus,Elements(s)⊆sÎ, and thussÎ = Elements(s).

Case 4 (siss⁻¹₁ ).By the induction hypothesis we have thatElements(s₁) =sÎ₁, and sÎ = s⁻¹₁ Î = Elements(s₁)⁻¹. Since TRANSPUP does not apply to S, for each hx₁, . . . , x_ni @− s₁ ∈ S^∗, we have that hx_n, . . . , x₁i @− s⁻¹₁ ∈ S. Thus, for each hx₁, . . . , x_niÎ ∈ Elements(s₁), we have that(hx₁, . . . , x_niÎ)⁻¹ ∈ Elements(s⁻¹₁ ), and thus we have thatElements(s1)⁻¹ ⊆ Elements(s). Since TRANSPDOWNdoes not apply, for eachhx1, . . . , xni @−s⁻¹₁ ∈ S^∗, we have thathxn, . . . , x1i@− s1 ∈ S.

Thus, we have thatElements(s) ⊆Elements(s1)⁻¹, thuss^I = Elements(s1)⁻¹ = Elements(s).

Case 5 (siss⁺₁).By the induction hypothesis we have thatElements(s1) = sÎ₁, and sÎ =s⁺₁Î = Elements(s1)⁺. Since TCLOSUPI does not apply toS, we have that for eachhx1, x₂i @− s₁ ∈ S^∗, we have thathx1, x₂i @− s⁺₁ ∈ S. Thus, we have that Elements(s₁)⊆Elements(s). Since TCLOSUPII does not apply toS, we have that Elements(s)⁺= Elements(s), and hence by the previous subset relation we have that Elements(s₁)⁺⊆Elements(s).

To showElements(s)⊆Elements(s₁)⁺, we assign a measurem(hx₁, x₂i)to each tuplehx₁, x₂ifor whichhx₁, x₂i@−s⁺₁ ∈ S^∗. For each such tuple, since TCLOSDOWN

does not apply toS, we are in one of three cases. If S is in a branch beneath either the first or second conclusion of TCLOSDOWNapplied tohx1, x2i @− s⁺₁ ∈ S^∗, we saym(hx1, x2i) = 0. Otherwise, ifS is in a branch beneath the third conclusion, we have thathz1, z2i @− s⁺₁ ∈ S for some variables z1, z2, and we say m(hx1, x2i) = m(hz1, z2i)+1. Notice thatm(hx1, x2i)is finite for allhx1, x2isinceSis a finite set. By induction onm(hx1, x2i), we show thathx1, x2iÎ∈Elements(s1)⁺. Ifm(hx1, x2i) = 0, then eitherhx1, x2i@− s1or for somezbothhx1, zi@− s1andhz, x2i @−s1are in S. In both cases, we have thathx1, x2iÎ∈Elements(s1)⁺. Ifm(hx1, x2i)>0, then it must be the case thathx1, z1i@−s1,hz1, z2i@−s⁺₁, andhz2, x2i@−s1are inSfor some z1, z2. Sincem(hz1, z2i) =m(hx1, x2i)−1, by the induction hypothesis, we have that hz1, z₂iÎ ∈ Elements(s₁)⁺. The other two constraints inSimply thathx1, z₁iÎ and hz2, x₂iÎ are inElements(s₁)and hencehx1, x₂iÎ ∈Elements(s₁)⁺. Thus, we have thatElements(s)⊆Elements(s₁)⁺. Thus,sÎ = Elements(s₁)⁺= Elements(s).

Case 6 (siss₁⊕s₂, where⊕ ∈ {u,t,\,on,∗}).For all other operators, by the induction hypothesis, we have that Elements(s1) = sÎ₁ andElements(s2) = sÎ₂. We have that sÎ = sÎ₁ ⊕sÎ₂ = Elements(s1)⊕Elements(s2). For each operator ⊕,

(10)

the upward rule of the calculus for⊕ensures thatElements(s₁)⊕Elements(s₂) ⊆ Elements(s)and the downward rule of the calculus for⊕ensures thatElements(s)⊆ Elements(s₁)⊕Elements(s₂). For example, fort, since UNIONUPdoes not apply toS, we have that for eachx @− si inS^∗wherei ∈ {1,2}, it must be the case that x@−s1ts2is inSas well, and thusElements(s1)tElements(s2)⊆Elements(s).

Since UNIONDOWNdoes not apply toS, we have that for eachx@−s1ts2, it must be the case thatx@−sifor somei∈ {1,2}, and thusElements(s)⊆Elements(s1)t Elements(s2). Similar reasoning showsElements(s1)⊕Elements(s2) = Elements(s) for the other operators, and thuss^I= Elements(s).

Thus,Elements(s) = s^I for all termssof set sort in T(S). We now show that all constraints inSare satisfied byI.

(Dis)equalities between non-set variables)SinceS^∗ is a superset ofS, all equalities between non-set variablesx≈ y inS are satisfied due to the second part of our construction ofI. Since the EQUNSATdoes not apply toS, all disequalitiesx 6≈ y between non-set variables inS are satisfied due to the second part of our construction ofIas well.

(Dis)equalities between tuples)If hx1, . . . , xni ≈ hy1, . . . , yni ∈ S, then x1 ≈ y1, . . . , xn ≈ yn ∈ S^∗ by the first part of our construction of S^∗. By assumption, x1, . . . , xn, y1, . . . , yn are variables whose sort is not a tuple or set. Thus for the same reasoning as above, we have thatxÎ₁ =y₁Î, . . . , xÎ_n =yÎ_n and hencehx1, . . . , xniÎ = hy1, . . . , yniÎ. By assumption,Scontains no disequalities between tuples.

(Dis)equalities between set variables)For eachs₁ ≈s₂ ∈ S wheres₁ands₂are set terms, by the third part of our construction ofS^∗, we have thatx@−s₁ ∈ S^∗if and only ifx @− s₂ ∈ S^∗ for all termsx. Hence,Elements(s₁) = Elements(s₂). Since Elements(s) =sÎ for all set termssinT(S), we have thatsÎ₁ = Elements(s₁) = Elements(s₂) = sÎ₂, and hence I |= s₁ ≈ s₂. To show all disequalities s₁ 6≈ s₂ between set terms in S are satisfied, note that since rule SETDISEQ does not apply toS, we have z @− s1 ∈ S andz 6@− s2 ∈ S for some z, or vice versa. Since rule EQUNSATdoes not apply, we have thatElements(s1)6= Elements(s2),sÎ₁ 6=sÎ₂ and henceI |=s16≈s2.

Set memberships)For all x @− s ∈ S, we have thatx @− s ∈ S^∗ and thus xÎ ∈ Elements(s)by construction. SinceElements(s) = sÎ for all termssof set sort in T(S), we have that xÎ ∈ sÎ, and henceI |= x @− s. For allx 6@− s ∈ S, since EQUNSATdoes not apply toS, we have thatx@−s6∈ S^∗and hencexÎ 6∈Elements(s).

SinceElements(s) =sÎfor all termssof set sort inT(S), we have thatxÎ 6∈sÎ, and henceI |=x6@−s.

Thus,I satisfies all constraints inS. SinceS is a superset ofS0, we have thatI

satisfiesS0as well. ut

Our calculus is terminating for a sublanguage of constraints involving only unary and binary relations and excluding transitive closure, product, or equality between relations. While this sublanguage, defined in Figure 4, is quite restricted, it is useful in reductions of description logics to relational logic, which we discuss in Section 5.2.

Proposition 3 (Termination).IfSis a finite set of constraints generated by the grammar in Figure 4, then all derivation trees with root nodeSare finite.

(11)

(element) e :=x

(unary relation) u:=x|[ ]|u1tu2|u1uu2 |[hei]|bonu (binary relation) b :=x|[ ]|b1tb2|b1ub2|[he1, e2i]|b⁻¹ (constraint) ϕ:=e1≈e2| hei@−u| he1, e2i@−b| ¬ϕ1

Fig. 4: A restricted fragment ofTR-constraints. Letterxdenotes variables.

Proof. Assume thatSis a finite set containing only constraintsϕfrom the grammar in Figure 4. First, we construct the following mapping from relation terms to tuple terms.

LetD_u(resp.D_b) be a mapping from unary (resp. binary) relation terms to sets of unary (resp. binary) tuple terms defined as the least solution to the following set of constraints, where the e’s, the u’s and the b’s are implicitly universally quantified metavariables ranging over terms respectively of element, unary relation and binary relation sort:

hei ∈Du(u) if hei@−u∈ S

hei ∈Du(u1) if hei ∈Du(u2)andu1∈ T(u2) hze,b,ui ∈Du(u) if hei ∈Du(bonu)

he1, e₂i ∈D_b(b) if he1, e₂i@−b∈ S

he1, e₂i ∈D_b(b₁) if hei, e_ji ∈D_b(b₂)for{i, j}={1,2}andb₁∈ T(b₂) he, ze,b,ui ∈D_b(b) if hei ∈D_u(bonu)

whereze,b,u denotes a unique fresh variable for each value ofe,b, andu. We require only one such variable for each triple(e, b, u)since our redundancy criteria for rule applications ensures that JOINDOWNcannot be applied more than once for the same premisehei ∈Du(bonu). Intuitively,Dumaps each relation termuinSto an overap- proximation of the set of unary tuplesheifor which our calculus can infer the constraint hei@−uusing downward rules only, and similarly for the binary caseD_b. The domain of D_uand that ofD_bcontain only relation terms occurring inS, and thus are finite.

All sets in the ranges ofD_uandD_bare also finite. To show this, we argue that only a finite number of fresh variablesz_e,b,u are introduced by this construction. We define a measuredepthon element terms such thatdepth(e) = 0for alle∈ T(S), and depth(ze,b,u) = 1 +depth(e). For all variablesze,b,uin the range ofDuandDb, we have thatbonu∈ T(S), and ifeis a variable of the formze⁰,b⁰,u⁰, thenbonuis either a subterm ofb⁰oru⁰. Thus, the depth of all element terms in the range ofDuandDb

is finite. Since there are finitely many element terms inT(S), and finitely many terms of the formbonuinT(S), there are finitely many variables of the formze,b,uand thus finitely many element terms occur in tuple terms in the range ofDuandDb. Therefore, there are finitely many tuple terms in sets in the ranges ofDuandDb.

Now, letU_u(resp.U_b) be a mapping from unary (resp. binary) relation terms to sets of unary (resp. binary) tuple terms, constructed to be the least solution to the following set of constraints (where again the e’s, the u’s and the b’s are implicitly universally

(12)

quantified metavariables as above):

hei ∈U_u(u) ifhei ∈D_u(u)

hei ∈U_u(u₁) ifhei ∈U_u(u₂), u₂∈ T(u₁)andu₁∈ T(S) hei ∈Uu(u) ifu= [e]andu∈ T(S)

he1i ∈Uu(u1) ifu1=b1onu2andhe1, e2i ∈Ub(b1) he1, e2i ∈Ub(b) ifhe1, e2i ∈Db(b)

he1, e2i ∈Ub(b1) ifhei, eji ∈Ub(b2)for{i, j}={1,2}, b2∈ T(b1), b1∈ T(S) he1, e2i ∈Ub(b) ifb= [he1, e2i]andb∈ T(S)

Similar to the previous construction,Uumaps each relation termuinS to an over- approximation of the set of unary tuplesheifor which our calculus can infer the constraint hei @− uusing both downward and upward rules, and similarly for the binary caseU_b. By construction, since the domains ofD_uandD_bare subsets ofT(S), the domains ofU_uandU_bare also subsets ofT(S), and thus are finite, hence their respective rangesR_uandR_bare finite too. Each set inR_uorR_bis finite as well, since the tuples in these ranges are built from element termsethat occur in the range ofD_uandD_bor in singleton sets of the form[hei],[he, e⁰i]or[he⁰, ei]inT(S).

Now letSbbe the following set of constraints:

Sb={(¬)hei@−u| hei ∈Uu(u), u∈ T(S)} ∪ {(¬)he1, e2i@−b| he1, e2i ∈Ub(b), b∈ T(S)} ∪ {(¬)e1≈e2|e1, e2∈ T(Ru∪Rb)} ∪

{he1i ≈ he2i |e1, e2∈ T(Ru∪Rb)} ∪

{he1, e2i ≈ he3, e4i |e1, e2, e3, e4∈ T(Ru∪Rb)}

From the arguments above we can conclude thatSbis finite. By construction,S ⊆S.b One can show by structural induction on derivation trees that all descendants ofS in a derivation tree are also subsets ofSb. Since the size of a state strictly grows with each rule application not derivingunsat, we can conclude that no derivation tree can be grown indefinitely. Hence all derivation trees with rootSare finite. ut By Proposition 1, 2, and 3, we have thatanyrule application strategy for the calculus is a decision procedure for finite sets of constraints in the language generated by the grammar from Figure 4.

5 Applications of T

_R

Our main motivation for adding native support for relations in an SMT solver is that it enables more natural mappings from other logical formalisms ultimately based on relations. This opens the possibility of leveraging the power and flexibility of SMT to reason about problems expressed in those formalisms. We discuss here two potential applications: reasoning about Alloy specifications and reasoning about OWL ontologies.

It should be clear to the knowledgeable reader though that the set of potential applications is much larger, encompassing description logics in general as well as various modal logics—via an encoding of their accessibility relation.

(13)

5.1 Alloy specifications

Alloy is a formal specification language based on relational logic which is widely used for modeling structurally-rich problems [12]. Alloy specifications, calledmodelsin the Alloy literature, are built from relations and relational algebra operations in addition to the usual logical connectives and quantifiers. One can also specify expected properties of a specification as formulas, calledassertionsin Alloy, that should be entailed by the specification.

The analysis of Alloy specifications can be performed automatically by a tool called the Alloy Analyzer which uses as its reasoning engine Kodkod, a SAT-based finite model finder [17]. This requires the user to impose a (concrete, artificial) finite upper bound on the size of the domains of each relation, limiting the analyzer’s ability to determine that a specification is consistent or has a given property. In the first case, the user has to manually increase the bounds until the Alloy Analyzer is able to find a satisfying interpretation for the specification; in the second case, until it can find a counter-example for the property. As a consequence, the analyzer cannot be used to prove that(i)a specification is inconsistent or(ii)it does have a certain property.

In contrast, thanks to its new theory solver for relational constraints based on the calculus described earlier, CVC4 is now able in many cases to do(i) and(ii)automatically, with no artificial upper bounds on domain sizes. Also, because of its own finite model finding capabilities [15], it can find minimal satisfying interpretations for consistent specifications or minimal counter-examples for properties without the need of user-provided artificial upper bounds on domain sizes. Finally, since its relational theory solver is fully integrated with its theory solvers for other theories (such as lin- ear arithmetic, strings, arrays, and so on),CVC4can natively support mixed constraints using relations over its various built-in types, something that is possible in the Alloy Analyzer only in rather limited form.⁵ To evaluate CVC4’s capabilities in solving Al- loy problems, we have defined a translation from Alloy specifications to semantically equivalent SMT formulas that leverages our theory of relations. The translation focuses on Alloy’s kernel language since non-kernel features can be rewritten to the kernel language by the Alloy Analyzer itself. We sketch our translation below.⁶

In Alloy, asignatureis a set of uninterpreted atomic elements, calledatoms. Sig- natures are defined with a syntax that is reminiscent of classes in object-oriented languages. Arelation (of arityn) is a set ofn-tuples of atoms and is declared as afieldof some signatureS, which acts as the domain of the elements in the first component. Mul- tiplicity constraints on signatures and fields can be added with keywords such assome, no,lone, andone, which specify that a signature is non-empty, empty, has cardinality at most one, and is a singleton, respectively. Other keywords specify that a relation is one-to-one, one-to-many, and so on. One or more signatures can be declared to be subsets of another signature withextendsorin. Withextends, all specified signatures are additionally assumed to be mutually disjoint.

5The Alloy Analyzer currently has built-in support for bounded integers. Any other data types need to be axiomatized in the specification.

6The translation is sound only if all Alloy signatures are assumed to be finite. A full account of the translation and a proof of its soundness are beyond the scope of this paper.

(14)

In the translation, we introduce an uninterpreted sortAtom for Alloy atoms. An Alloy signaturesig Sis translated as a constant⁷Sof sortRel₁(Atom), that is, a set of unary tuples. A fieldf:S₁->· · · -> S_nof a signatureSis translated as a constant fof sortReln+1(Atom, . . . ,Atom)together with the additional constraintfvS∗S1∗

· · · ∗Sn to ensure that the components off’s tuples are from the intended signatures.

The signature hierarchy is encoded using subset constraints. For example, the Alloy constraintsig S₁,. . ., S_n extends Sis translated as the set of constraints{S1 v S, . . . ,Sn vS} ∪ {SiuSj ≈[ ] |1 ≤i < j ≤n}.IfSabove is also declared to be abstract(a notion similar to abstract classes in object-oriented languages), the additional constraintS1t · · · tSn ≈Sis added to enforce that. Similarly, signature declarations of the formsig S₁,· · ·, S_n in S, are translated just as{S1 v S, . . . ,Sn v S}.

Multiplicity constraints are translated as quantified formulas.

Since our theory supports all constructs and operators in Alloy’s kernel language, Alloy expressions and formulas, which can include quantifiers ranging over atoms, can be more or less directly translated to their counterparts inCVC4’s language. It is worth mentioning that our translation supports Alloy’s set comprehension construct, by in- troducing a fresh relational constant for the set and adding definitional axioms for it.

In addition, we partially support Alloy cardinality constraints of the form#(r)op n whereris a relation term,op ∈ {^<,>,=},n ∈ N, and ^#is the cardinality operator, by encoding them as subset constraints. For example, the Alloy constraint#(S) < 3 on a signatureSis translated toS v[hk1i,hk2i]whereSis the corresponding unary relation andk₁andk₂are two fresh constants of sortAtom.

5.2 OWL DL Ontologies

OWL is an ontology language whose current version, OWL 2, was adopted as a standard Semantic Web language by the W3 consortium. It includes a sublanguage, called OWL DL, that corresponds to the expressive, yet decidable, description logicSHOIN(D)[2].

We have defined an initial, partial translation fromSHOIN(D)to SMT formulas that again leverages our theory of relations.

A mapping from salientSHOIN(D)constructs to their SMT counterparts is illus- trated in Figure 5. The figure shows only relations whose elements do not belong to the so-calledconcrete domain(s)Dof SHOIN(D).⁸ As with the Alloy translation, we use the single sortAtomfor all elements of non-concrete domains. The set comprehension notation is used here for brevity: for a set comprehension term of the form[x|ϕ] wherexhasn-tuple sort, we introduce a fresh set constantSaccompanied by the defining axiom ∀x:Tup_n(Atom) (x @− S ⇔ ϕ). The constantUnivin the figure denotes the universal unary relation overAtom. Since this constant is currently not built-in as a symbol ofT_R, it is accompanied by the defining axiom∀a:Atomhai@−Univ. The ex- pressiondist(a1, . . . , an)states thata1, . . . , anare pairwise different. We observe how the translation is immediate for most constructs, with the notable exception of universal and number restrictions, which require the use of complex quantified formulas.

7Free constants have the same effect as free variables for satisfiability purposes.

8Some of those domains in OWL correspond to built-in sorts inCVC4. A full translation from OWL concrete domains toCVC4 built-in sorts is beyond the scope of this work.

(15)

DL CVC4

individual name:a a:Atom

atomic conceptC, roleR C:Rel1(Atom), R:Rel2(Atom,Atom) intersectionCuD, unionCtD CuD, CtD

inverse roleR⁻, complement¬C R⁻¹, Univ\C top concept>, bottom concept⊥ Univ, [ ] existential restriction∃R.C RnoC

universal restriction∀R.C [x|x@−Univ∧[x]onRvC]

at-least restriction≥nR.C [x|x@−Univ∧ ∃a1, . . . , an:Atom([x]onR)u Cw[ha1i, . . . ,hani]∧dist(a1, . . . , an) ] at-most restriction≤nR.C [x|x@−Univ∧ ∃a1, . . . , an:Atom([x]onR)u

C)v[ha1i, . . . ,hani]∧[ha1i, . . . ,hani]vC] local reflexivity∃R.Self [hx, yi | hx, yi@−R∧x≈y]

nominal{a} [hai]

concept, role assertionC(a), R(a,b) a@−C, ha,bi@−R individual (dis)equalitya≈b, a6≈b a≈b, a6≈b concept, role inclusionCvD, RvS CvD, RvS concept, role equiv.C≡D, R≡S C≈D, R≈S complex role inclusionR1◦R2vS R1onR2vS role disjointnessDisjoint(R,S) RuS≈[ ]

Fig. 5: A mapping from DL language toΣR-constraints.

6 Evaluation

To evaluate our theory solver forT_RinCVC4we implemented translators from Alloy and from OWL based on the translations sketched in the previous section. This section presents an initial evaluation on a selection of Alloy and OWL benchmarks.⁹

6.1 Experimental Evaluation on Alloy Models

We considered two sets of Alloy benchmarks; the first consists of 40 examples from the Alloy distribution and from a formal methods course taught by one of the authors;

the second were used in [10] to evaluate AlloyPE. All benchmarks consist of an Alloy model together with a single property. We evaluated two configurations ofCVC4. The first, denotedCVC4, enables full native support for relational operators via the calculus from Section 3. The second, denotedCVC4+AX, instead encodes all relational operators as uninterpreted functions and supplements the translation of benchmarks with additional axioms that specify their semantics. To compareCVC4with other tools, we also evaluated the Alloy Analyzer, version 4.2, downloaded from the Alloy website, and

9Detailed results and all benchmarks are available athttp://cvc4.cs.stanford.edu/papers/

CADE2017-relations/.

(16)

Alloy Analyzer CVC4 CVC4+AX AlloyPE Problem Res. Time/Scope Res. Time Res. Time Res. Time

academia 0 sat 0.60/3 sat 1.55 - to unk 84.76

academia 1 sat 0.53/2 sat 1.93 - to - to

academia 2 sat 0.45/2 sat 0.49 - to unk 0.15

social 1 sat 0.52/3 sat 1.20 - to n/a -

cf 0 sat 0.47/3 sat 0.51 - to n/a -

cf 1 sat 0.49/3 sat 0.78 - to n/a -

javatypes sat 0.50/3 sat 0.42 - to uns 2.35

set sat 0.45/2 sat 0.46 - to unk 0.92

loc int sat 0.57/1 sat 2.82 - to n/a -

genealogy sat 0.64/6 sat 89.20 - to n/a -

number 1 sat 0.81/2 sat 8.65 - to n/a -

railway sat 0.67/4 sat 156.45 - to n/a -

academia 3 b-uns 162.17/63 uns 0.49 uns 1.05 uns 0.28

academia 4 b-uns 246.92/162 uns 0.43 uns 0.54 uns 0.13

family 1 b-uns 146.62/68 uns 0.41 uns 0.44 uns 0.15

family 2 b-uns 279.77/30 uns 1.02 uns 48.78 uns 0.23

social 2 b-uns 256.98/56 uns 0.66 - to n/a -

social 3 b-uns 191.45/57 uns 0.49 uns 35.91 n/a -

social 4 b-uns 171.26/64 uns 0.46 uns 18.13 n/a -

birthday b-uns 156.08/53 uns 0.45 uns 0.61 uns 0.13

library b-uns 259.54/119 uns 0.42 uns 0.40 uns 1.11

lights b-uns 228.89/122 uns 32.69 - to n/a -

INSLabel b-uns 198.53/8 uns 1.46 - to n/a -

farmers 1 sat 1.04/8 - to - to n/a -

views sat 9.91/9 - to - to n/a -

Fig. 6: Evaluation on Alloy benchmarks.

El Ghazi et al.’s AlloyPE tool (kindly provided to us directly by its authors) using the SMT solver Z3 version 4.5.1 as a backend. All experiments were performed with a 300 second timeout on a machine with a 2.9 GHz Intel Core i7 CPU with 8 GB of memory.

Figures 6 and 7 show the results from running the Alloy Analyzer,CVC4and Al- loyPE on the two sets of Alloy benchmarks. We omit results for 13 of the benchmarks from the first set that no system solved. The second and third columns show the results of running the Alloy Analyzer. To evaluate the Alloy Analyzer on these benchmarks, we considered bounded scopes in an incremental fashion. Using a script, we set an initial upper bound, orscope, of 1 for the cardinality of all signatures in the problem, and kept increasing it by 1 if the Alloy Analyzer found the problem unsatisfiable (b-uns) in the current scope—meaning that it was not able todisprovethe property in the problem.

We terminated on time out (to), or when the analyzer was able to disprove the property (sat) or ran out of memory for a scope. We report the scope size for benchmarks where the tool returned an answer. The fourth and fifth columns show the results from

CVC4when invoked in finite model finding mode on the translated SMT problem. The last two columns are the results from AlloyPE, wheren/aindicates that AlloyPE failed due either to the presence of unsupported Alloy constructs in the input problem or to internal errors during solving or translation.

As shown in the table, our approach is overall slower than the Alloy Analyzer for satisfiable benchmarks. For the unsatisfiable ones, CVC4 returns an answer within a reasonable time limit for most of the benchmarks and has advantages over the state of the art. It is important to note, however, that anunsatanswer fromCVC4indicates that