HAL Id: cel-01203517
https://hal.archives-ouvertes.fr/cel-01203517
Submitted on 23 Sep 2015
Programming and Proving: Practice with FoCaLiZe
François Pessaux
To cite this version:
François Pessaux. Programming and Proving: Practice with FoCaLiZe. Doctoral. Rennes, France. 2014, pp.44. ⟨cel-01203517⟩
Lecture at École Jeunes Chercheurs en Programmation (EJCP) 2014
François Pessaux, September 23, 2015
Contents
1 Introduction
  1.1 Why Proving?
  1.2 Proofs, Formal Proofs, Proofs of What?
  1.3 Scope of the Lecture
  1.4 Plan of the Lecture
  1.5 Notations
2 Starting by the Beginning: First-Order Logic
  2.1 A First Tautology
  2.2 Yet Another Tautology
  2.3 Je ne veux pas travailler. . . (Pink Martini)
3 Basic Programming in FoCaLiZe
4 Structuring the Development by Encapsulation: Species and Collection
  4.1 Components of a species
  4.2 Our Case of Study: Association Maps
  4.3 A First Attempt: Building our First Species (Poor Structuring)
  4.4 Final Encapsulation: Turning the Species into a Collection
5 Adding Proofs in Species
  5.1 Simple Proofs
  5.2 Proofs Getting Harder
6 More Abstraction: Parameterization
  6.1 A Second Attempt: Adding (not Enough) Collection Parameters
    6.1.1 Abstracting the Maps Keys/Values
    6.1.2 Adapting the Species
    6.1.3 Re-Introducing a First Proof
    6.1.4 The Proof that Kills
  6.2 A Third Attempt: Inheritance and Correct Modeling
    6.2.2 Adapting the Species AssocMap
    6.2.3 Re-Introducing Proofs
7 When to Prove What?
  7.1 When to prove?
    7.1.1 Expressing Specifications
  7.2 When Should we do the Proofs?
    7.2.1 How the Compiler may Help?
8 Advanced Proofs
  8.1 Proofs by Cases
  8.2 “Manual” Induction
    8.2.1 Simple Example
    8.2.2 Back to Association Maps
This lecture has been prepared using FoCaLiZe version 0.8.0 “++”, i.e. the snapshot of the GIT repository of March 2014. The material related to this lecture will be available online at:
http://perso.ensta-paristech.fr/~pessaux/ejcp-2014/
Additionally, resources related to FoCaLiZe can be consulted on its web site at http://focalize.inria.fr. Issues and bugs (related to FoCaLiZe or Zenon) can be reported via the dedicated bug tracker at http://focalize.inria.fr/bugzilla/.
This document also draws inspiration from several documents and presentations written by the FoCaLiZe team and others. I want to thank them even if I cannot cite all of them.
About the author: François Pessaux
ENSTA ParisTech - U2IS
828 boulevard des Maréchaux, 91120 Palaiseau France
Email: francois.pessaux@ensta-paristech•fr
1 Introduction
Any decent developer should wonder, after writing a piece of code, whether this program “works”. However, even if this seems trivial, one may be surprised by the number of people who do not even ask the question. When it is asked, in most cases the programmer gets convinced (and possibly convinces his boss or customer) that the code “works” on the basis of nebulous intuitions, themselves based on his (also nebulous) understanding of what “works” means (we will come back to this point later). And debugging is still an important part of the cost of software development.
1.1 Why Proving?
Nowadays, software engineering is faced with systems of increasing complexity, with ever stricter safety and/or security requirements. The problems differ according to the domain: military, medical, transport, energy, finance, telecoms, etc. But the objective is the same: any software component embedded in those systems has to “work properly”. This goes through compliance with standards [10, 9, 4], which ask for the demonstration that the software is designed, developed and tested according to their prescriptions. Among the recommended practices, formal methods are starting to be advocated. Formal methods encompass a number of techniques: abstract interpretation, model checking, other static analyses, etc., and formal proofs. The latter are the topic of the present lecture.
For high safety or security levels, the demonstration is mandatory and must be done with formal methods. Moreover, this demonstration is examined by third-party experts during what is called the assessment process. Aside from this “basic industrial” point of view (which remains crucial since it aims at preventing nuclear plants from exploding, planes from landing under the ground or banks from dividing their capital by zero), the requirement of a true demonstration is shared by numerous scientists, who want to demonstrate that their program indeed computes exactly what they wish, that a function terminates, that an inference system is sound and complete, etc.
Even if the concerns may differ in size and complexity, the question is the same: ensuring, with the highest possible certainty, that a piece of software behaves “correctly”.
1.2 Proofs, Formal Proofs, Proofs of What?
One can sometimes hear “I proved my program”. What does it mean? Basically nothing. First, one can only prove that a program satisfies a given property: a proof is not an absolute essence of a program. This point is crucial since it means that to prove that a program “works”, one needs to state its expected properties, and ultimately to prove that the program satisfies them.
Now, let’s consider a program computing the absolute value of the difference between its arguments: abs_diff(x, y) = if x > y then x − y else y − x. Now, I will “prove” it. . . I prove the property ∀x, x = x. Right, since = is an equivalence relation, it is symmetric, transitive, and (what we need) reflexive: QED. What did I prove? I proved a property for sure, but it is unrelated to our abs_diff function: it does not demonstrate that our function really computes what we expect!
From these two points, we understand that “proving” that a program “works” implies proving properties that are related to the expected behavior of the program. Such a set of properties is the specification of the software. In other words, one has to demonstrate that the software implementation fulfills the software specification. The specification describes in detail the requirements which must be satisfied by the developed system. Hence it is a document issued (normally) prior to the development itself.
Returning to our example, we could have proved a more interesting fact: the result of abs_diff is never negative. But how do we state “is never negative”? One needs a language to express such a property, and one also needs to “process” this language to prove the property. Ensuring that proofs are mathematically rigorous requires a logical language and a logic to work with properties. For instance, one can express the previous property by ∀x, y ∈ N, abs_diff(x, y) ≥ 0.
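To make the statement concrete, here is a minimal sketch in Python (used only for illustration; it is not part of FoCaLiZe, and the name never_negative_below is hypothetical). It transcribes abs_diff and evaluates the property on a finite range of natural numbers. This is exhaustive testing on that range, not a proof: it says nothing about the values outside the range.

```python
def abs_diff(x: int, y: int) -> int:
    """Absolute value of the difference, transcribed from the text."""
    return x - y if x > y else y - x


def never_negative_below(bound: int) -> bool:
    """Check abs_diff(x, y) >= 0 for every pair of naturals below `bound`.

    This is exhaustive testing on a finite range, not a proof.
    """
    return all(abs_diff(x, y) >= 0
               for x in range(bound)
               for y in range(bound))


if __name__ == "__main__":
    print(never_negative_below(100))
```

The universal quantifier of the logical statement is exactly what the finite loop cannot cover, which is why a logic and a proof are needed.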
The question of extracting relevant properties from the specification and expressing them in a given logic is not trivial, and is not the topic of this lecture. It is not rare that specifications are provided in natural language, hence with ambiguities, incomplete, sometimes even containing inconsistencies. Moreover, some parts of the specification cannot be expressed, either because the chosen logical language does not fit them or because they are non-functional properties (e.g. “the program must run in less than 10ms and in at most 10 Kb of stack” or “it must induce a power consumption lower than 10 mW”).
Now that it is clear that proving programs requires a specification and a logical framework, there are still several ways to carry out proofs. The first one is by “hand”, in an informal style. Considering our previous example of the absolute value of a difference, we can demonstrate that it is never negative as follows.
Let’s have the function abs_diff(x, y) = if x > y then x − y else y − x. We want to prove that ∀x, y ∈ N, abs_diff(x, y) ≥ 0.
There are two cases. If x > y then one computes x − y and “obviously”, since x is greater than y, the result is positive. Otherwise, x is not greater than y, hence it is lower or equal, hence y is greater than or equal to x, and “obviously” y − x is positive (or null). Hence in both cases the result is positive or null. QED.
In this style, the reader must convince himself, by his own reading, that the proof is correct. He must himself infer that all the cases have been taken into account. In particular, it is implicit that the if then else induces two cases. Moreover, the “obviously” used twice refers to a well-known property of natural numbers but is totally informal: in fact it is due to the definition of subtraction on natural numbers and can (must?) be demonstrated. In more complex proofs, the number of hypotheses and intermediate steps can quickly increase, making understanding and verification considerably riskier. In such a situation, how is it possible to be confident in the correctness of the proof, hence in the reliability of the related software? On the other hand, if the program is modified, the proof has to be modified accordingly and read again to be verified again (with the same risks).
An answer to these problems is to write proofs that can be mechanically checked by a computer: formal proofs. Unfortunately (or not?) the proof still has to be done by a human. Unfortunately, this requires more work and more explicit details, since a machine has to be able to check the proof. But . . . at least verification can be done automatically and without risk of error. Fortunately, recent progress in automated theorem provers makes it possible to automatically discharge a significant number of proof obligation steps. Nonetheless, in the current state of the art, “fully proving” industrial-size programs is still out of reach for several reasons:
• huge number of lines of code,
• non-functional properties impossible to express in logic,
• complex programming constructs difficult to logically model (mutability, pointers, objects, etc.),
• . . .
However, it is already possible to address smaller parts of the software, especially the critical ones, possibly mixing usage of other formal methods, in order to increase the confidence one can have in the software.
1.3 Scope of the Lecture
In this lecture we will introduce FoCaLiZe, a language in which one writes specifications, programs, and proofs that the latter fulfill the former. The user may only be interested in working on specifications, checking for absence of inconsistencies or for completeness with regard to a set of specification requirements (i.e. properties). In this case, he will only work at the specification level, without providing an effective implementation.
FoCaLiZe provides a logical language to express requirements as first-order formulae depending on identifiers present in the context; a strongly typed, pure functional language with powerful data types, polymorphism and pattern-matching; object-oriented and modular features; and a proof language whose proofs are discharged by Zenon [3, 1], which may use unfolding of functions defined by pattern-matching.
A FoCaLiZe development is a source file containing properties, definitions and proofs. It must be compiled by focalizec which generates two outputs. The executable view of the development is an OCaml[7] source file. The logical view is a Coq[5] source file containing the complete model of the program: definitions (i.e. functions) and properties accompanied by their proofs. Proofs are obtained by invocations of the Zenon automated theorem prover which returns Coq terms that are embedded in the global generated Coq model. This model is then submitted to Coq whose role is simply to check for its consistency. In other words, Coq is an assessor of the whole development.
FoCaLiZe aims, as far as possible, at providing facilities to develop using a “programming-like” language, with high-level constructs for structuring, incremental development and modularity. Moreover, its role is to hide the underlying logical target language from the user, allowing him to write proofs without mastering the specificities of a particular proof assistant. This also makes it possible to change the underlying target languages without requiring the user to change his programs and proofs. Finally, FoCaLiZe provides automatic documentation generation through various outputs (inheritance graph, dependency graph, interfaces, user documentation in HTML or LaTeX) extracted from the program and the comments it contains. The general compilation flow can be represented as follows:
[Figure: compilation flow. Stages: Parse, Abstract Syntax Tree, Inheritance and Dependencies (well-formed hierarchy), Code Generation. Files: source file file.fcl; generated OCaml source file file.ml; generated Coq source file file.v; file.zv (Zenon); file.fcd (documentation tools: LaTeX, HTML, dependency graphs, inheritance graph).]
Unlike some other formal tools and environments such as [5], [6], FoCaLiZe cannot be used in an interactive mode. The user writes his program and compiles it. The proofs he wrote are sent to Zenon, which may succeed in finding what was delegated to it. In this case, compilation is a success and the resulting OCaml code is compiled, leading to an executable that can be run. Otherwise, Zenon reports that a proof was not found and the user has to modify his source code to fix it before compiling again. Proof goals are not directed by the tool: they are explicitly stated . . . and proved by the user.
1.4 Plan of the Lecture
The remainder of this document is organized as follows. Section 2 introduces the basics of proofs in FoCaLiZe on first-order properties, showing both fully automated and step-by-step methods. Section 3 presents the usage of FoCaLiZe as a standard functional programming language in which one can write proofs about functions, without dealing with the structuring, parameterization and abstraction features of the language. Section 4 shows the basic FoCaLiZe concepts used to structure a development, presenting a partially satisfactory architecture and abstraction, but introducing only a few concepts and no proofs. Proofs are re-introduced in Section 5 to take advantage of the structuring features. In Section 6 we come back to the unsatisfactory model introduced in Section 4 to show, in two steps, how to use parameterization to ensure modularity and reusability, going on with proofs and seeing the impact on their shapes. Having seen how to write specifications, programs and proofs incrementally, we briefly give in Section 7 some hints of good practice to minimize the amount of proof to write, with clues on what to prove and at which level of the development. Finally, Section 8 explains how to “manually” decompose hard proofs so that they fit case analysis and, more generally, induction.
1.5 Notations
In the rest of this lecture, pieces of FoCaLiZe code will be presented in frames, as in this example:
use "basics" ;;
species Sample =
When introduced in the running text, FoCaLiZe keywords will be typeset in a special font like property. Terms representing specific concepts of FoCaLiZe are introduced using an emphasized font, for example collection. Finally, commands and file names are in bold font, for example focalizec fo_logic.fcl.
2 Starting by the Beginning: First-Order Logic
Before entering the “proof of programs”, we introduce the basics of formal proofs on simple first-order logic. This will be the occasion to recall basic notions and to introduce FoCaLiZe’s logical and basic proof language. We will first examine how to “manually” write proofs using Zenon, then see that the latter can indeed work for us, performing whole parts of the basic inference steps by itself.
It must be clearly understood that Zenon is a theorem prover, not a proof checker. Even so, it will not automatically demonstrate any property by itself. However, combined with the FoCaLiZe proof language, it automates the tedious combinations of “sub-lemmas” that one usually “thinks intuitively feasible”.
2.1 A First Tautology
The FoCaLiZe logical language provides the usual typed first-order formulae, extended with the expressions of usual functional languages à la ML. As we will see later, these formulae can refer to identifiers bound in the context, i.e. names of functions and properties. The table below shows the usual logical connectors and their syntax in FoCaLiZe.
Semantics                    Connector   FoCaLiZe syntax
Conjunction                  ∧           /\
Disjunction                  ∨           \/
Implication                  ⇒           ->
Equivalence                  ⇔           <->
Negation                     ¬           ~
Universal quantification     ∀           all
Existential quantification   ∃           ex
We first start without using complex structures of the language and state the theorem: ∀a, b : boolean, a ⇒ b ⇒ a. In a file ex_implications.fcl we write the following source code:
Listing 1: ex_implications.fcl
open "basics" ;;
theorem implications : all a b : bool, a -> (b -> a)
proof = assumed ;;
This program defines a theorem with its statement and its proof, currently assumed, i.e. not done. In other words, this is an axiom. As we will see later, it is very common and handy to mark steps of a proof as assumed, to check whether these steps make the global proof feasible, and then to incrementally replace each assumed by an effective proof. Since we are in a typed world, note the type information labeling the (universally) quantified variables a and b. The directive open imports the definitions of the file basics.fcl of the standard library, so that they need not be explicitly qualified (for instance, one writes bool instead of basics#bool). This file can be compiled by invoking the focalizec compiler with the file as argument:
focalizec ex_implications.fcl
resulting in a few messages with no errors:
$ focalizec ex_implications.fcl
Invoking ocamlc...
>> ocamlc -I /usr/local/lib/focalize -c ex_implications.ml
Invoking zvtov...
>> zvtov -zenon zenon -new ex_implications.zv
Invoking coqc...
>> coqc -I /usr/local/lib/focalize -I /usr/local/lib/zenon ex_implications.v
$
Two source files have been generated: ex_implications.ml and ex_implications.v. The first one is empty since theorems do not lead to executable (OCaml) code, while the second contains a few lines of Coq code . . . ending with a “suspicious” apply coq_builtins.magic_prove. (which is normal since we admitted the proof). As stated in the output messages, both generated sources have been compiled by their respective compilers (ocamlc and coqc).
It is now time to write a real proof. Following the usual techniques of natural deduction, such a proof is done by inserting in the context the two hypotheses introduced by the implications, then applying exactly the hypothesis a. In other words, this proof is represented by the following derivation, where hypotheses (the context) stand on the left of ⊢ and goals on its right:
    a, b ⊢ a
    ------------------ (⇒-INTRO)
    a ⊢ b ⇒ a
    ------------------ (⇒-INTRO)
    ⊢ a ⇒ b ⇒ a
Hence, we can mimic such a proof in the FoCaLiZe Proof Language modifying our program as follows:
Listing 2: ex_implications.fcl (2)
 1 open "basics" ;;
 2
 3 theorem implications : all a b : bool, a -> (b -> a)
 4 proof =
 5   <1>1 assume a : bool, b : bool,
 6        hypothesis h1 : a,
 7        prove b -> a
 8        <2>1 hypothesis h2 : b,
 9             prove a
10             by hypothesis h1
11        <2>2 qed
12             by step <2>1
13   <1>2 conclude
14   (* or: qed conclude
15      or: qed by step <1>1 *) ;;
A compound proof is made of steps, each identified by a label (e.g. <1>1). A step introduces hypotheses, then a goal following the keyword prove: this is the proposition asserted and proved by the step. After the statement comes the proof of the step. This is where either you ask Zenon to do the proof from facts (hints) you give it, or you decide to split the proof into “sub-steps”, on each of which Zenon will eventually be called, and which will serve (still with Zenon) as lemmas to prove the current goal.
Here, the steps <1>1 and <1>2 are at level 1 and form the compound proof of the top-level theorem. Step <1>1 also has a compound proof (whose goal is b -> a), made of steps <2>1 and <2>2. The latter are at level 2 (one more than the level of their enclosing step).
At the end of a step’s proof, this step becomes available as a fact for the next steps of this proof and deeper levels’ subgoals. In our example, step <2>1 is available in the proof of <2>2, and <1>1 is available in the proof of <1>2. Note that <2>1 is not available in the proof of <1>2 since <1>2 is located at a strictly lower nesting level than <2>1.
Let’s dissect this proof. In line 5, we introduce the two variables a and b in the context. Since we will refer to them, this can be seen as the way to bind them. Line 6 performs exactly the lower ⇒-intro of our derivation above, moving a into the context as a hypothesis named h1. Hence, our goal is now to prove b -> a, as stated in line 7.
For now, we do not inspect the proof any deeper and remain at the same nesting level. Under the hypothesis a, the step <1>1 proves b → a, which is exactly the global goal of our theorem. Hence, to end the proof, we just need an ending step (a qed step), <1>2, saying “use step <1>1”. This is what is written in line 13, with the alternative forms shown in the comment of lines 14 and 15. We invoke the statement conclude, which tells Zenon to use as facts “all the available proof steps in scope”. Here this is equivalent to writing qed by step <1>1. Note that with conclude the qed is optional, since this statement implicitly marks the end of the proof of the current goal.
We now return to the proof of the step <1>1 that we left aside. At this point, we had to prove b → a under the hypothesis h1: a. We apply the same process, mimicking the upper ⇒-intro of our derivation above: b becomes a hypothesis named h2 in line 8, leaving the goal a stated in line 9. This goal is exactly our hypothesis h1. Hence we use the fact by hypothesis h1: Zenon is asked to find a proof of the current goal using the hints it is provided with.
From the proof of a, i.e. the step <2>1, we can conclude the enclosing goal (prove b -> a). This is done by the qed step, whose aim is to close the enclosing proof by means of the provided facts (here the intermediate lemma stated by the step <2>1). Our proof is now complete. We can compile our source file again with no error.
Note that the user is responsible for giving Zenon the facts it needs to find a proof. Zenon combines them in accordance with logical rules, but if material is missing for the proof, it will never succeed.
2.2 Yet Another Tautology
In the previous example, we did all the work of the proof by hand, splitting implications until the goal to prove was exactly one of the hypotheses. We will now discover a little of Zenon’s magic by proving the statement ∀a, b : boolean, (a ∧ b) ⇒ (a ∨ b). The tree of this proof is very simple:
    a ∧ b ⊢ a ∧ b
    ---------------------- (∧-ELIM)
    a ∧ b ⊢ b
    ---------------------- (∨-INTRO)
    a ∧ b ⊢ a ∨ b
    ---------------------- (⇒-INTRO)
    ⊢ a ∧ b ⇒ a ∨ b
where the implication is pushed as a hypothesis by the ⇒-intro, then we split the disjunction by the ∨-intro, choosing to prove b, and finally use the stronger hypothesis through a ∧-elim. We mimic this proof tree in a source file and_or.fcl:
Listing 3: and_or.fcl
open "basics" ;;

theorem and_or : all a b : bool, (a /\ b) -> (a \/ b)
proof =
  (* Sketch: assume a /\ b, then prove b as trivial consequence of a /\ b. *)
  <1>1 assume a : bool, b : bool,
       hypothesis h1: a /\ b,
       prove a \/ b
       <2>1 prove b by hypothesis h1
       <2>2 qed by step <2>1
  <1>2 conclude (* or, qed conclude, or even qed by step <1>1 *) ;;
As previously, the step <1>1 introduces the bound variables and the left-hand side of the implication in the context, leaving a ∨ b to prove under the hypothesis h1 stating that we have a ∧ b. What is interesting is that the proof of the step <2>1 is done using the hypothesis h1, which is not exactly the desired goal: we did not make the use of ∧-elim explicit, we rely on Zenon to find the proof by itself. Would this be the beginning of its magic?
2.3 Je ne veux pas travailler. . . (Pink Martini)
It was quite instructive to elaborate a proof manually from alpha to omega . . . but it was a bit boring! We have a theorem prover to help with proofs, couldn’t it help us more? Obviously yes (which is reassuring since, for tautologies, automated decision procedures are well known)! Such formulae can be proved automatically by Zenon, which takes care of the “administrative” steps of basic logic. In our example, replacing each compound proof by a simple conclude step is sufficient:
Listing 4: tautos.fcl
open "basics" ;;
theorem implications : all a b : bool, a -> (b -> a)
proof = conclude ;;
theorem and_or : all a b : bool, (a /\ b) -> (a \/ b)
proof = conclude ;;
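For propositional formulae such as these two, the simplest decision procedure is truth-table enumeration. The following Python sketch illustrates it (Python is used only for illustration; this is not how Zenon proceeds internally, and the helper names implies and is_tautology are hypothetical).

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication p => q."""
    return (not p) or q


def is_tautology(formula) -> bool:
    """True if `formula(a, b)` holds for all four boolean valuations."""
    return all(formula(a, b) for a, b in product((False, True), repeat=2))


def implications(a: bool, b: bool) -> bool:
    # Statement of the theorem `implications`: a -> (b -> a)
    return implies(a, implies(b, a))


def and_or(a: bool, b: bool) -> bool:
    # Statement of the theorem `and_or`: (a /\ b) -> (a \/ b)
    return implies(a and b, a or b)


if __name__ == "__main__":
    print(is_tautology(implications), is_tautology(and_or))
```

Enumeration works here because there are only four valuations; it does not extend to quantification over infinite types, where a prover and explicit facts are needed.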
3 Basic Programming in FoCaLiZe
Since the aim is to write programs and make proofs on these pieces of code, it is time to have a look at the programming language provided by FoCaLiZe. As previously introduced, the core language is a functional language à la ML with:
• basic types, like int, bool, string, . . .
• type constructors: products, sums and records, possibly recursive,
• let-definitions, pattern-matching, if-then-else.
Definitions contained in a compilation unit (a “.fcl” file) can be used from other compilation units, either transparently by using the open directive seen in Section 2.1, or by explicitly prefixing their name with the name of their hosting compilation unit followed by a # (hash character).
In a file loose_sets.fcl, let’s define a quick and dirty structure to represent sets, in other words, something list-like (lists themselves are native in FoCaLiZe, with the OCaml notations :: and []).
Listing 5: loose_sets.fcl
open "basics" ;;

type set_t ('a) =
  | Empty
  | Elem ('a, set_t ('a))
;;
let is_empty (s) = s = Empty ;;
let add (x, s) = Elem (x, s) ;;
let rec belongs (x, s) =
match s with
| Empty -> false
| Elem (e, s) -> if e = x then true else belongs (x, s)
termination proof = structural s
;;
In this program, we first define a polymorphic recursive sum type with two value constructors: Empty with no argument, and Elem parametrized by two arguments, one being an element of the set, the other being the remaining elements of the set (i.e. itself a set). Obviously this representation is naive, since in an efficient representation an element should not appear several times.
Then we define three functions:
• is_empty: it takes a set and returns the boolean value telling if this set is empty,
• add: it takes an element, a set and returns the set extended with this element (no verification of multiple times the same element),
• belongs: it takes an element, a set and returns a boolean value telling if the element appears in the set. This function is defined by pattern-matching on values of the type set_t.
We can already notice that the third function is recursive. To ensure logical consistency, its termination has to be demonstrated. Termination proofs won’t be addressed in this lecture, but we can see that structural recursion is simply stated by the structural keyword followed by the name of the strictly decreasing argument of the function.
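For readers more used to mainstream languages, the same data structure and functions can be sketched in Python. This is only an illustrative transcription under stated assumptions: the constructor Empty is encoded as None, Elem as a named tuple with hypothetical field names head and tail, and nothing here is checked for termination as FoCaLiZe requires.

```python
from typing import NamedTuple, Optional


class Elem(NamedTuple):
    """Constructor Elem: one element plus the remaining set."""
    head: object
    tail: Optional["Elem"]


Empty = None  # assumption: None stands for the constructor Empty


def is_empty(s: Optional[Elem]) -> bool:
    return s is Empty


def add(x, s: Optional[Elem]) -> Elem:
    # No check for duplicates, as in the FoCaLiZe version.
    return Elem(x, s)


def belongs(x, s: Optional[Elem]) -> bool:
    # Structural recursion: each call receives the strictly smaller tail.
    if s is Empty:
        return False
    return True if s.head == x else belongs(x, s.tail)


if __name__ == "__main__":
    s = add(1, add(2, Empty))
    print(belongs(2, s), belongs(3, s))
```

The recursive call on s.tail mirrors the argument named after structural in the FoCaLiZe definition: the set passed down is always a strict sub-term of the one received.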
In another source file, int_loose_sets.fcl, we can define some functions using the definitions of our previous source file and prove a few theorems on sets of integers. In the following source, we intentionally did not open the compilation unit loose_sets, in order to show explicit qualification of names with the #-notation. However, to allow access to this unit, we still need at least a use directive.
Listing 6: int_loose_sets.fcl
open "basics" ;;
use "loose_sets" ;;

let add_except_0 (x : int, s) =
  if x = 0 then s else loose_sets#add (x, s)
;;

theorem zero_not_added:
  all x: int, all s : loose_sets#set_t (int),
  (loose_sets#is_empty (s) /\ x = 0) ->
    loose_sets#is_empty (add_except_0 (x, s))
proof = by definition of add_except_0 ;;

theorem zero_not_added_weaker:
  all s : loose_sets#set_t (int),
  loose_sets#is_empty (s) ->
    loose_sets#is_empty (add_except_0 (0, s))
proof = by property zero_not_added ;;
The proof of the theorem zero_not_added is simple, not even made of several steps, and shows a new Zenon fact: definition of. It tells Zenon to try to find a proof using the body of a definition, unfolding it. Indeed, to prove that if x = 0 then the set returned by add_except_0 is unchanged, we need to look at the body of this function. Note that we do not even need to consider the function loose_sets#is_empty, even though it appears in the statement of the theorem. Indeed, the proof does not depend on what this function does: the only important point is that if x = 0, add_except_0 returns the same set as it was given.
Once this theorem is proved, it may serve as a lemma for proving other ones. This is what is done in the proof of the second theorem, zero_not_added_weaker. The latter is in fact a simple instance of the former, where x is directly instantiated by 0. We see a new Zenon fact: property, which tells Zenon to use the type of a definition which, in the case of theorems, by the Curry-Howard isomorphism, is a logical statement. Hence, Zenon will use the property (previously proved in this case) to prove the current one.
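The role of unfolding can be illustrated by evaluating the statement of zero_not_added on a few values. The Python sketch below is only an illustration under the same encoding assumptions as before (Empty as None, Elem as a named tuple); the function zero_not_added here is a hypothetical boolean transcription of the theorem statement, and testing it on samples is not a proof.

```python
from typing import NamedTuple, Optional


class Elem(NamedTuple):
    head: int
    tail: Optional["Elem"]


Empty = None  # assumption: None stands for the constructor Empty


def is_empty(s: Optional[Elem]) -> bool:
    return s is Empty


def add(x: int, s: Optional[Elem]) -> Elem:
    return Elem(x, s)


def add_except_0(x: int, s: Optional[Elem]) -> Optional[Elem]:
    # When x == 0 the very same set is returned: this is the only fact
    # about the body that the proof of zero_not_added relies on.
    return s if x == 0 else add(x, s)


def zero_not_added(x: int, s: Optional[Elem]) -> bool:
    """Statement of the theorem, evaluated on one pair (x, s)."""
    hypothesis = is_empty(s) and x == 0
    return (not hypothesis) or is_empty(add_except_0(x, s))


if __name__ == "__main__":
    samples = [Empty, Elem(1, Empty)]
    print(all(zero_not_added(x, s) for x in range(-3, 4) for s in samples))
```

Fixing x to 0 in zero_not_added gives the statement of zero_not_added_weaker, which is why the first theorem suffices as a fact for the second.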
We can compile our program and see that Zenon completes the proofs:
$ focalizec int_loose_sets.fcl
Invoking ocamlc...
>> ocamlc -I /usr/local/lib/focalize -c int_loose_sets.ml
Invoking zvtov...
>> zvtov -zenon zenon -new int_loose_sets.zv
Invoking coqc...
>> coqc -I /usr/local/lib/focalize -I /usr/local/lib/zenon int_loose_sets.v
$
In order to end this first contact with “programming in FoCaLiZe like in ML and making basic proofs”, we write in a file basic_induct.fcl a theorem stating that if we add an element to a set, then it belongs to the resulting set.
Listing 7: basic_induct.fcl
open "basics" ;;
open "loose_sets" ;;
theorem added_forcibly_belongs: all x : int, all s : set_t (int), belongs (x, add (x, s))
proof = by definition of add, belongs type set_t ;;
This proof has to be done using the definition of belongs, by cases on the values of the type set_t. Proof by cases is a weak particular case of proof by induction and is triggered in Zenon by a new (and last) fact: type. Such a fact tells Zenon to invoke the induction principle of the related type to solve goals.
Later, in Section 8.2, we will come back in more depth to proofs by induction, to understand exactly what shape goals must have for Zenon to succeed with induction, and how to proceed when automatic proving fails.
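As an informal cross-check of added_forcibly_belongs, the statement can be evaluated on every small set over a small universe. The Python sketch below does so under the same encoding assumptions as before (Empty as None, Elem as a named tuple); the helpers build and small_sets are hypothetical. It only covers finitely many sets, whereas the proof by type covers all values of set_t.

```python
from itertools import product
from typing import NamedTuple, Optional


class Elem(NamedTuple):
    head: int
    tail: Optional["Elem"]


Empty = None  # assumption: None stands for the constructor Empty


def add(x: int, s: Optional[Elem]) -> Elem:
    return Elem(x, s)


def belongs(x: int, s: Optional[Elem]) -> bool:
    if s is Empty:
        return False
    return True if s.head == x else belongs(x, s.tail)


def build(items) -> Optional[Elem]:
    """Build the set whose elements are `items`, first item outermost."""
    s = Empty
    for item in reversed(items):
        s = add(item, s)
    return s


def small_sets(universe, max_len: int):
    """Enumerate every set of at most `max_len` elements of `universe`."""
    for n in range(max_len + 1):
        for items in product(universe, repeat=n):
            yield build(items)


def added_forcibly_belongs(x: int, s: Optional[Elem]) -> bool:
    return belongs(x, add(x, s))


if __name__ == "__main__":
    print(all(added_forcibly_belongs(x, s)
              for x in range(3)
              for s in small_sets(range(3), 3)))
```

Whatever s is, add (x, s) is built with Elem, so belongs takes its second branch and finds x at the head: this is the case reasoning the proof relies on.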
4 Structuring the Development by Encapsulation: Species and Collection
In the previous sections, we developed programs “at toplevel”, i.e. with all the definitions lying flat and dealing directly with implementation.
In a more realistic development, the usual practice is rather to express specifications and to go step by step (in an incremental approach) towards design and implementation, while proving that such an implementation meets its specification or design requirements.
Moreover, to reduce coupling between software components, modularity and encapsulation are widely used, grouping data structures and the operations manipulating them inside structures forming some kind of Abstract Data Type. The structuring constructs of FoCaLiZe allow this encapsulation, preventing users of a software component from directly accessing the internals of a data structure. Users are hence forced to manipulate the data structure via the provided functions, which prevents them from breaking the invariants this data structure may require.
4.1 Components of a species
FoCaLiZe also follows this programming approach and provides a basic building block: the species. It can be viewed as a record grouping “things” related to the same concept. Like in most modular design systems (e.g. object-oriented languages, algebraic abstract types), the idea is to group a data structure with the operations that operate on it. Since in FoCaLiZe we do not only address data types and operations, these “things” also comprise the declaration (or specification) of the operations, their stated properties (which represent the requirements for the operations), and the proofs of these properties. Species are the nodes of the FoCaLiZe hierarchy. Without spoiling the coming sections, species will be combined by inheritance to incrementally build more complex ones.
A species is a sequence of methods (or fields), each one ended by a semicolon (“;”). Hence, a basic species looks like:
species Name =
  meth1 ;
  meth2 ;
end ;;
Species names are always capitalized. Like any toplevel definition, a species ends with a double semicolon (“;;”). A species can contain several kinds of methods:
• The representation, giving the data representation of the entities manipulated by the species. It is a type defined by a type expression. The representation definition may be deferred, which means that the structure of the embedded data-type does not need to be known at this point but each species has one unique representation.
• Declarations, which are made of the keyword signature followed by a name and a type. They announce a method to be defined later: the type of the method is given but the implementation is still omitted. Once a method is declared, it can be used in the text following the declaration, in particular in the definition of other methods. In effect, the type provided by the signature allows the compiler to check that the method is used consistently, in all contexts, with a type compatible with the declared one. Furthermore, the inheritance, late-binding and collection mechanisms introduced in Section 6.2.1 ensure that the definition of the method is known when the method is effectively invoked. Hence, signatures serve specification purposes since they do not bring any implementation information.
• Definitions, which are made of the keyword let, followed by a name, an optional type, and an expression. They serve to introduce constants or functions, i.e. computational operations. Mutually recursive definitions are introduced by let rec. Hence, definitions serve implementation purposes.
A variant of definition is the logical let, which binds an identifier to a logical expression. For example, “being even” is a proposition that is not always true, hence it is not a theorem. However, we can use it to build more complex logical statements, which may be part of theorem statements. Hence a logical let makes it possible to create parameterized logical expressions, which may hold or not (unlike a property or theorem, which has to be proved to hold).
• Property statements, which are composed of the keyword property followed by a name and a first-order formula. Properties serve to express requirements, i.e. they serve specification purposes. Formulae can use the names of methods known within the species’ context, especially to state requirements on these methods. The proofs of properties are delayed. Despite this delay, as for signatures, properties can be used to prove theorems.
• Theorems (theorem), made of a name, a statement and a proof, are properties packed with the formal proof that their statement holds in the context of the species. As said above, the proof of a theorem may use properties available in the context of the species, even though these are not yet proved.
• Proofs, which are introduced by the keyword proof of followed by the name of a previously introduced property. They are intended to be merged with their related property to form a theorem.
4.2 Our Case of Study: Association Maps
We want to write a program providing association maps. As usual, such a data-structure allows us to bind a key to a value and to later retrieve the value bound to a given key in the map. To restrict the scope of this lecture, we won’t address removing bindings, sorting and so on. Hence, an association map is a structure providing at least three operations:
• empty : map, the initial map containing no binding,
• add (k, v, m) : key -> value -> map -> map, the function adding the binding (k ↦ v) to the map m (hiding any previously existing binding of k),
• find (k, m) : key -> map -> “value or error”, the function looking for and returning the value bound to the key k in the map m if it finds it, otherwise signaling an error. Since exceptions do not exist in FoCaLiZe, we will encode the result of find in a type “option”:
type option_t ('a) =
| None (* No available value. *)
| Some ('a) (* Value available, argument of the constructor. *)
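To make the intended behaviour of these three operations concrete before encoding them in FoCaLiZe, here is a minimal sketch in Python (not FoCaLiZe). It assumes integer keys and string values; the names Node, EMPTY, add and find are chosen for illustration only, and Python's None plays the role of both the empty list and the None constructor of option_t.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """One binding (key, value) followed by the rest of the map."""
    key: int
    value: str
    rest: Optional["Node"]


# Assumption of this sketch: the empty map is represented by None.
EMPTY: Optional[Node] = None


def add(k: int, v: str, m: Optional[Node]) -> Optional[Node]:
    """Return a new map where k is bound to v, hiding any older binding of k."""
    return Node(k, v, m)


def find(k: int, m: Optional[Node]) -> Optional[str]:
    """Return the value bound to k, or None when k has no binding."""
    while m is not None:
        if m.key == k:
            return m.value
        m = m.rest
    return None


m = add(5, "cinq", add(3, "three", add(5, "five", EMPTY)))
print(find(5, m))  # cinq: the most recent binding of 5 hides the older one
print(find(3, m))  # three
print(find(7, m))  # None: no binding for 7
```

The older binding of 5 is still stored in the list but can no longer be reached through find, which is what “hiding” means for add.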
4.3 A First Attempt: Building our First Species (Poor Structuring)
Instead of laying out all the functions that manage maps at toplevel, we will group them inside a species. However, species cannot contain type definitions (except the representation) and we need to define the “option” type and the type used to record bindings in a map. Hence, we need to put these 2 type definitions outside the species.
For the sake of simplicity, the recording structure of a map will be a list whose type we define explicitly ourselves (we could use the built-in type of lists but this would hide some subtleties in the coming proofs, since the theorem prover Zenon internally knows this type). Because we are lazy, we decide that keys will be integers and values will be strings. From this we can start writing our source code:
Listing 8: (Beginning of) assoc_map1.fcl
open "basics" ;;
(* Structure recording bindings of a map: a hand-made basic list. *)
type int_str_list_t =
| Nil
| Node (int, string, int_str_list_t) (* Keys: ints, values: strings. *)
;;
(* Return value of the lookup function: nothing or something. *)
type option_t ('a) =
| None
| Some ('a)
;;
It is now time to implement the species providing the association maps between integers and strings. Since we decided that its underlying data-structure is a list of type int_str_list_t, we can define its representation. Next we define the empty map as the empty list, the add function as the “cons” of the binding (k, v) at the head of the list representing the current map, and finally the lookup function as a recursive search along the list, returning None if no binding is found for the key k in the map m.
Listing 9: (Continuation of) assoc_map1.fcl
species AssocMap =
representation = int_str_list_t ;
let empty : Self = Nil ; (* Empty association map: no bindings. *)
(* Addition to the map m of the value v bound to the key k. *)
let add (k: int, v: string, m : Self) : Self = Node (k, v, m) ;
(* Lookup the value bound to the key k in the map m. *)
let rec find (k: int, m: Self) =
match m with
| Nil -> None
| Node (kcur, v, q) -> if kcur = k then Some (v) else find (k, q)
termination proof = structural m ;
end ;;
It is worth noting that we explicitly annotated the type of each m with Self to ensure that the inferred types contain Self instead of the type expression of the representation. In effect, once the species is submitted to the encapsulation process that makes it an Abstract Data-Type, the structure of the representation will be hidden; hence the methods empty, add and find must only reveal that they work on the abstracted type and not on its real structure.
We can compile this draft program by invoking the focalizec command, which should raise no errors:
$ focalizec assoc_map1.fcl
Invoking ocamlc...
>> ocamlc -I /usr/local/lib/focalize -c assoc_map1.ml
Invoking zvtov...
>> zvtov -zenon zenon -new assoc_map1.zv
Invoking coqc...
>> coqc -I /usr/local/lib/focalize -I /usr/local/lib/zenon assoc_map1.v
$
4.4 Final Encapsulation: Turning the Species into a Collection
At this point, we have done no proof; we have only encapsulated all the methods dealing with our association maps into a grouping structure, the species. Before going further with proving, we can try our implementation and turn our species into the expected Abstract Data-Type: let’s create a collection.
A collection can be seen as an “instance” of a species where definitions become opaque: only their types remain visible. For theorems (not yet written for our species), only their statements will be visible. A collection is the only form of executable, proved code. This has two major consequences:
• Since code must be executable, all the declared methods (signature, not yet used in our example) must be defined, even the representation.
• Since code must be proved, all the stated properties (property, also not yet used in our example) must be given a proof, i.e. turned into a theorem (theorem).
A species complying with these two conditions is said to be complete. In our sample code everything is defined, hence we can create a collection and add some toplevel expressions to test our software component. We also add a function printing values of type option_t (string), build a new map containing the binding (5, "five"), and print the results of looking up the values bound to the keys 5 and 3. The complete source code is available in code/assoc_map1_partial_test.fcl.
Listing 10: (Continuation of) assoc_map1.fcl
collection MyMap =
implement AssocMap ;
end ;;
(* Printer of value of type option (string). *)
let print_string_option (v) =
match v with
| None -> print_string ("Not found\n")
| Some (s) ->
let _a = print_string ("Found value: ") in
let _b = print_string (s) in
print_newline (())
;;
let m = MyMap!add (5, "five", MyMap!empty) ;;
print_string_option (MyMap!find (5, m)) ;;
print_string_option (MyMap!find (3, m)) ;;
We then can compile our program using the focalizec command, without error.
$ focalizec assoc_map1_partial_test.fcl
Invoking ocamlc...
>> ocamlc -I /usr/local/lib/focalize -c assoc_map1_partial_test.ml
Invoking zvtov...
>> zvtov -zenon zenon -new assoc_map1_partial_test.zv
Invoking coqc...
>> coqc -I /usr/local/lib/focalize -I /usr/local/lib/zenon assoc_map1_partial_test.v
print_string_option (MyMap.find 5 m)
: basics.unit__t
print_string_option (MyMap.find 3 m)
: basics.unit__t
We can see that Coq verified the generated code and displayed the types of the trailing toplevel definitions. By default, the generated OCaml source code is compiled as an object file (to possibly be linked with other software components, either coming from FoCaLiZe or manually written in OCaml). To get a running executable, we need to link the OCaml object file with a few object files of FoCaLiZe’s standard library (the ones corresponding to the libraries open-ed or use-ed in the FoCaLiZe program).
$ ocamlc -I /usr/local/lib/focalize ml_builtins.cmo basics.cmo assoc_map1_partial_test.cmo
$
And finally, we can run the whole thing:
$ ./a.out
Found value: five
Not found
$
with the expected result. To clearly see that our software component (the collection) is now encapsulated, i.e. that it is a kind of Abstract Data-Type and hence can only be manipulated via the functions it provides, we can try to build a map “by hand”, using its internal representation outside the species. We just change our creation of the map to use the Nil constructor instead of the empty method (whose definition is nevertheless exactly Nil).
Listing 11: Broken program
...
$ focalizec assoc_map1_partial_test.fcl
File "assoc_map1_partial_test.fcl", line 42, characters 8-34:
Error: Types assoc_map1_partial_test#int_str_list_t and assoc_map1_partial_test#MyMap are not compatible.
$
The FoCaLiZe compiler hence complains that the type int_str_list_t, to which Nil belongs, is not compatible with the (now abstracted) representation of our collection MyMap: encapsulation prevents us from seeing the internals of our data-type.
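As a loose analogy only, the idea of an opaque representation can be sketched in Python. Everything in this sketch is an assumption made for illustration: the class MyMap, its private token _key and the tuple-of-pairs representation are not FoCaLiZe mechanisms. Moreover, Python rejects the hand-built map at run time and only by convention, whereas FoCaLiZe rejects the program statically, at compile time.

```python
class MyMap:
    """Opaque handle on a map: the bindings are only reachable via methods."""

    # Private token used to guard the constructor (illustrative trick).
    _key = object()

    def __init__(self, token, bindings):
        if token is not MyMap._key:
            raise TypeError("a MyMap can only be built with empty() and add()")
        self._bindings = bindings  # tuple of (key, value) pairs, newest first

    @classmethod
    def empty(cls):
        return cls(cls._key, ())

    def add(self, k, v):
        return MyMap(MyMap._key, ((k, v),) + self._bindings)

    def find(self, k):
        for key, value in self._bindings:
            if key == k:
                return value
        return None


m = MyMap.empty().add(5, "five")
print(m.find(5))  # five

try:
    MyMap(None, ())  # attempt to build a map "by hand" from its representation
except TypeError as err:
    print("rejected:", err)
```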
5 Adding Proofs in Species
Until now, our association maps have had no proved, nor even stated, properties. It is now time to add some.
5.1 Simple Proofs
We start with two properties that will be proved from simple facts, to illustrate that the shape of the proofs introduced in Section 3 is not impacted by the structuring added by species.
Let’s first demonstrate that finding the value bound to a key just inserted in a map never fails. In other words, calling find with a key k on a map built by adding a binding of k to any value never returns None. We add this method after the definition of find.
(* Add makes find a success. *)
theorem find_added_not_fails: all k : int, all v : string, all m1 m2 : Self,
m2 = add (k, v, m1) -> ~ (find (k, m2) = None)
proof = ????
We now need to give Zenon facts from which to find a proof. Clearly, the theorem is a consequence of how add and find are implemented: the former adds a binding at the head of the list and the latter destructs this head. Hence the proof needs definition of add, find. In addition, Zenon needs to know the constructors of the type int_str_list_t to destruct it, and those of option_t since they also appear in find’s body. Hence the proof also needs type int_str_list_t, option_t. The complete theorem is then:
(* Add makes find a success. *)
theorem find_added_not_fails: all k : int, all v : string, all m1 m2 : Self,
m2 = add (k, v, m1) -> ~ (find (k, m2) = None)
proof = by definition of add, find type int_str_list_t, option_t ;
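A test is of course not a proof, but before asking Zenon to prove a statement it can be useful to check that the statement is not obviously false. The following Python sketch (not FoCaLiZe) enumerates all small maps over a tiny domain and checks the statement of find_added_not_fails on each of them. The tuple-of-pairs representation, the helper all_maps and the domains KEYS and VALUES are assumptions of this sketch.

```python
from itertools import product


def add(k, v, m):
    # A map is a tuple of (key, value) pairs, newest binding first.
    return ((k, v),) + m


def find(k, m):
    for key, value in m:
        if key == k:
            return value
    return None


def all_maps(keys, values, max_len):
    """Enumerate every map with at most max_len bindings."""
    bindings = list(product(keys, values))
    for n in range(max_len + 1):
        yield from product(bindings, repeat=n)


KEYS, VALUES = (0, 1, 2), ("a", "b")


def find_added_not_fails_holds():
    """Check: for all k, v, m1, find (k, add (k, v, m1)) is not None."""
    return all(
        find(k, add(k, v, m1)) is not None
        for m1 in all_maps(KEYS, VALUES, 3)
        for k in KEYS
        for v in VALUES
    )


print(find_added_not_fails_holds())  # True
```

Such a bounded check only covers the enumerated cases; the Zenon proof is what establishes the property for all keys, values and maps.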
Let’s add another theorem, stating that looking up two identical keys returns the same result.
(* Same key -> same value. *)
theorem find_same_key_same_value: all k1 k2: int, all m : Self, k1 = k2 -> find (k1, m) = find (k2, m)
proof = conclude ;
Fortunately, Zenon natively knows the equality =, its reflexivity, transitivity and symmetry properties... and especially the fact that any function is a morphism for this relation. Hence, Zenon is able to determine by itself that calling the same function with the “same” arguments leads to the same result. The proof is thus reduced to conclude, telling Zenon to work on its own with its internal laws, without any other information.
Compiling the obtained FoCaLiZe code is achieved as usual and invokes Zenon, which works a bit, finds proofs, and compiles the generated OCaml and Coq files without errors.
Unlike with our initial program, we have now proved that some properties hold with respect to our implementation of the computational methods.
5.2 Proofs Getting Harder
We now try a new property, reflecting the specification of a decent lookup function: a bound value is found for a key if and only if either this key is the first one of the map or it is found in the remaining bindings of the map. We now state the theorem and try a simple proof. This proof is surely a consequence of the definitions of add, find and of the types int_str_list_t, option_t.
theorem find_spec: all m : Self, all s k : int, all v : string,
(find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
proof = by definition of add, find type int_str_list_t, option_t ;
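Before spending effort on the proof, one can again sanity-check the statement of find_spec on a bounded domain. The Python sketch below (not FoCaLiZe) searches for counterexamples to the equivalence among all maps with at most three bindings. The representation, the helper names and the domains are assumptions of this sketch; since values are never None here, comparing the result of find with v stands for the FoCaLiZe equation find (...) = Some (v).

```python
from itertools import product


def add(k, v, m):
    # A map is a tuple of (key, value) pairs, newest binding first.
    return ((k, v),) + m


def find(k, m):
    for key, value in m:
        if key == k:
            return value
    return None


def all_maps(keys, values, max_len):
    """Enumerate every map with at most max_len bindings."""
    bindings = list(product(keys, values))
    for n in range(max_len + 1):
        yield from product(bindings, repeat=n)


KEYS, VALUES = (0, 1, 2), ("a", "b")


def find_spec_counterexamples():
    """Return all (m, s, k, v) where the two sides of find_spec differ."""
    return [
        (m, s, k, v)
        for m in all_maps(KEYS, VALUES, 3)
        for s in KEYS
        for k in KEYS
        for v in VALUES
        if ((find(s, m) == v) or (s == k)) != (find(s, add(k, v, m)) == v)
    ]


print(len(find_spec_counterexamples()))  # 0
```

Finding no counterexample does not prove the theorem, but it suggests that a failing proof attempt is due to the proof search rather than to a wrong statement.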
We now compile the program.
Attention: we still have an issue in Zenon when it generates some Coq proofs as a λ-term: the printer is missing a case:
Zenon error: uncaught exception File "coqterm.ml", line 335, characters 6-12: Assertion failed
File "assoc_map1.fcl", line 33, characters 10-66:
This has to be fixed in a coming release. The workaround is to ask it to generate the proof as a Coq script, using the extra option -zvtovopt -script when invoking the FoCaLiZe compiler (in the remainder of this lecture, we will use this option pretty often):
$ focalizec -zvtovopt -script assoc_map1.fcl
Invoking ocamlc...
>> ocamlc -I /usr/local/lib/focalize -c assoc_map1.ml
Invoking zvtov...
>> zvtov -zenon zenon -new -script assoc_map1.zv 42
*****######-. *****######-. *****######-.
And bad luck, Zenon goes on searching for a while, a long while
*****######-. *****######-. *****######-.
This simply means that the proof is too complex and must be manually split into intermediate steps!
We will now proceed incrementally: we split the current goal into several subgoals and mark them assumed (i.e. we don’t prove them), then we add a qed step using the intermediate steps (and possibly other facts) to see if, with such a decomposition, Zenon now succeeds. If so, we then have to prove each assumed intermediate step using the same method. If Zenon doesn’t find any proof, this means that our split is not fine enough or that we forgot some facts. In this case, it is not useful to try proving the intermediate assumed steps since they will not be enough: let’s not do some job for nothing.
The first step is to introduce the hypotheses of the theorem in the context and to state the remaining goal to prove. We then end with a qed step (conclude) that will trivially succeed since we only introduced hypotheses and left the body of the formula assumed. However, this is a pretty mechanical way of proceeding, so let’s apply it.
theorem find_spec: all m : Self, all s k : int, all v : string,
(find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
proof =
<1>1 assume m : Self,
assume s k : int,
assume v : string,
prove (find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
assumed
<1>e conclude ;
We can compile the program and hopefully get a success: Zenon finds a proof and both OCaml and Coq generated code are accepted.
$ focalizec -zvtovopt -script assoc_map1.fcl
Invoking ocamlc...
>> ocamlc -I /usr/local/lib/focalize -c assoc_map1.ml
Invoking zvtov...
>> zvtov -zenon zenon -new -script assoc_map1.zv
Invoking coqc...
>> coqc -I /usr/local/lib/focalize -I /usr/local/lib/zenon assoc_map1.v
$
Let’s now find intermediate steps to prove the goal of step <1>1. We have to prove an equivalence, this means that we have two intermediate goals, each one being one side (left/right) of the equivalence under the hypothesis of the other side. Hence, we write these two intermediate steps, mark them assumed and ensure, by a qed step using these two intermediate steps that Zenon succeeds:
theorem find_spec: all m : Self, all s k : int, all v : string,
(find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
proof =
<1>1 assume m : Self,
assume s k : int,
assume v : string,
prove (find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
<2>1 hypothesis H1: find (s, m) = Some (v) \/ s = k,
prove find (s, add (k, v, m)) = Some (v)
assumed
<2>2 hypothesis H2: find (s, add (k, v, m)) = Some (v),
prove find (s, m) = Some (v) \/ s = k
assumed
<2>e qed by step <2>1, <2>2
<1>e conclude ;
And . . .
$ focalizec -zvtovopt -script assoc_map1.fcl
Invoking ocamlc...
>> ocamlc -I /usr/local/lib/focalize -c assoc_map1.ml
Invoking zvtov...
>> zvtov -zenon zenon -new -script assoc_map1.zv
Invoking coqc...
>> coqc -I /usr/local/lib/focalize -I /usr/local/lib/zenon assoc_map1.v
$
. . . it works! We refined our proof into two steps that are simpler and that suffice to prove the above goal. It now remains to address the effective proof of these assumed steps. Let’s start with the first one and try to prove the goal of <2>1. We can hope to be lucky by telling Zenon that it should find a proof using the definitions of find, add, the hypothesis H1 and the type int_str_list_t:
theorem find_spec: all m : Self, all s k : int, all v : string,
(find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
proof =
<1>1 assume m : Self,
assume s k : int,
assume v : string,
prove (find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
<2>1 hypothesis H1: find (s, m) = Some (v) \/ s = k,
prove find (s, add (k, v, m)) = Some (v)
by definition of add, find type int_str_list_t hypothesis H1
<2>2 hypothesis H2: find (s, add (k, v, m)) = Some (v),
prove find (s, m) = Some (v) \/ s = k
assumed
<2>e qed by step <2>1, <2>2
<1>e conclude ;
Unfortunately, compiling the program shows that Zenon is unable to find a proof, searching for a (too) long time: we have to split again! How can we prove that find (s, add (k, v, m)) = Some (v), knowing by H1 that find (s, m) = Some (v) \/ s = k? Maybe we can first prove that add (k, v, m) = Node (k, v, m); from this, using the definition of find, the hypothesis H1 and the type int_str_list_t (since the proof must make a case analysis on it in the body of find), the proof should be feasible. Hence, we introduce two new steps: the first one, <3>1, still left assumed, and the concluding one, <3>e, trying to conclude the goal.
theorem find_spec: all m : Self, all s k : int, all v : string,
(find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
proof =
<1>1 assume m : Self,
assume s k : int,
assume v : string,
prove (find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
<2>1 hypothesis H1: find (s, m) = Some (v) \/ s = k,
prove find (s, add (k, v, m)) = Some (v)
<3>1 prove add (k, v, m) = Node (k, v, m)
assumed
<3>e qed by step <3>1 definition of find
hypothesis H1 type int_str_list_t
<2>2 hypothesis H2: find (s, add (k, v, m)) = Some (v),
prove find (s, m) = Some (v) \/ s = k
assumed
<2>e qed by step <2>1, <2>2
<1>e conclude ;
We can compile the program and see that Zenon indeed found the proof of <3>e, which concludes the surrounding goal of <2>1! Again, we have split a proof “too complex for Zenon” into simpler steps allowing it to find a proof. It now remains to provide an effective proof for the step <3>1, and we can reasonably guess that it can be done using the definition of add and the type int_str_list_t. In effect, it suffices to unfold the definition of add and to know that Node is a constructor of the type int_str_list_t:
theorem find_spec: all m : Self, all s k : int, all v : string,
(find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
proof =
<1>1 assume m : Self,
assume s k : int,
assume v : string,
prove (find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
<2>1 hypothesis H1: find (s, m) = Some (v) \/ s = k,
prove find (s, add (k, v, m)) = Some (v)
<3>1 prove add (k, v, m) = Node (k, v, m)
by definition of add type int_str_list_t
<3>e qed by step <3>1 definition of find
hypothesis H1 type int_str_list_t
<2>2 hypothesis H2: find (s, add (k, v, m)) = Some (v),
prove find (s, m) = Some (v) \/ s = k
assumed
<2>e qed by step <2>1, <2>2
<1>e conclude ;
We can again compile our program and see that it is a success. It only remains to apply the same methodology to prove the step <2>2 that we left aside, trying some hints (by definition, type, step, etc.) and introducing intermediate steps if Zenon fails to find a proof. The path is exactly similar to what we did until now, and in our case the proof of <2>2 will in fact be the same as for <2>1:
theorem find_spec: all m : Self, all s k : int, all v : string,
(find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
proof =
<1>1 assume m : Self,
assume s k : int,
assume v : string,
prove (find (s, m) = Some (v) \/ s = k) <-> find (s, add (k, v, m)) = Some (v)
<2>1 hypothesis H1: find (s, m) = Some (v) \/ s = k,
prove find (s, add (k, v, m)) = Some (v)
<3>1 prove add (k, v, m) = Node (k, v, m)
by definition of add type int_str_list_t
<3>e qed by step <3>1 definition of find
hypothesis H1 type int_str_list_t
<2>2 hypothesis H2: find (s, add (k, v, m)) = Some (v),
prove find (s, m) = Some (v) \/ s = k
<3>1 prove add (k, v, m) = Node (k, v, m)
by definition of add type int_str_list_t
<3>e qed by step <3>1 definition of find
hypothesis H2 type int_str_list_t
<2>e qed by step <2>1, <2>2
<1>e conclude ;
6 More Abstraction: Parameterization
In the previous example, we chose to implement association maps between integers and strings. However, such a structure is naturally polymorphic: it behaves the same way whatever the key and value types are. The only thing one needs is to be able to test the equality between keys: a kind of “equal” function on the data-type representing the keys. Moreover, using basic types does not provide any properties on them (except some low-level toplevel theorems of the FoCaLiZe standard library that we won’t investigate here). Species and collections are the only way to get Abstract Data-Types with (proved) properties. Hence, to get more abstract, less specialized association maps, we want to parameterize them by collection parameters representing the data-types of keys and values. We will thus achieve a kind of polymorphism of species, a way to make a species depend on other ones, relying on their functions and properties.
6.1 A Second Attempt: Adding (not Enough) Collection Parameters
As introduced, an association map depends on the data-type of its keys and of its values. It has hence two parameters. We said we just need to be able to test the equality of keys to implement the function find. However, probably we will need to also compare values for other functions or to state some theorems about maps. So, we can decide that both keys and values will be “at least” of the same data-type, i.e. a collection whose signature is a “subclass” of this same data-type.
6.1.1 Abstracting the Maps Keys/Values
Let’s define the corresponding species and call it Comparable. It has to contain an eq signature taking two parameters of type Self and returning a boolean. We do not need it to be defined yet. Next, what are the basic characteristics of an equality function? It has to be an equivalence relation, hence must be reflexive, symmetric and transitive. Hence these must be declared as properties of this species.
Listing 12: (Beginning of) assoc_map2.fcl
open "basics" ;;
species Comparable =
signature eq : Self -> Self -> bool ;
property eq_reflexive: all x : Self, eq (x, x) ;
property eq_symmetric: all x y : Self, eq (x, y) -> eq (y, x) ;
property eq_transitive:
all x y z : Self, eq (x, y) -> eq (y, z) -> eq (x, z) ;
end ;;
Note that this species is not yet complete, hence we can’t yet make a collection from it as we did in 4.4. But this is not our concern now. We will simply use it as an interface for the coming collection parameters, which will ultimately be instantiated by effective parameters issued from collections (i.e. with everything defined) implementing the interface of Comparable.
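To get an intuition of what the three properties of Comparable demand from a concrete eq, here is a small Python sketch (not FoCaLiZe) that tests them over a finite sample of values. The function is_equivalence and the two candidate relations same_parity and close_enough are hypothetical names introduced for this illustration. A sample-based check can only refute the properties, whereas a FoCaLiZe collection implementing Comparable has to prove them.

```python
from itertools import product
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def is_equivalence(eq: Callable[[T, T], bool], samples: Iterable[T]) -> bool:
    """Check reflexivity, symmetry and transitivity of eq on the samples."""
    xs = list(samples)
    reflexive = all(eq(x, x) for x in xs)
    symmetric = all(eq(y, x) for x, y in product(xs, xs) if eq(x, y))
    transitive = all(
        eq(x, z)
        for x, y, z in product(xs, xs, xs)
        if eq(x, y) and eq(y, z)
    )
    return reflexive and symmetric and transitive


def same_parity(x: int, y: int) -> bool:
    return x % 2 == y % 2


def close_enough(x: int, y: int) -> bool:
    # Reflexive and symmetric, but not transitive: 0 ~ 1 and 1 ~ 2, yet not 0 ~ 2.
    return abs(x - y) <= 1


print(is_equivalence(same_parity, range(6)))   # True
print(is_equivalence(close_enough, range(6)))  # False
```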
6.1.2 Adapting the Species
We can start our species AssocMap, still keeping the type option_t and making our previous type of lists now polymorphic (definition of pair_list_t instead of the previous int_str_list_t).
Listing 13: (Continuation of) assoc_map2.fcl
(* Structure recording bindings of a map: a hand-made basic list. *)
type pair_list_t ('a, 'b) =
| Nil
| Node ('a, 'b, pair_list_t ('a, 'b))
;;
(* Return value of the lookup function: nothing or something. *)
type option_t ('a) =
| None
| Some ('a)
;;
species AssocMap (Key is Comparable, Value is Comparable) =
representation = pair_list_t (Key, Value) ;
let empty : Self = Nil ;
let add (k, v, m : Self) : Self = Node (k, v, m) ;
let rec find (k, m: Self) =
match m with
| Nil -> None
| Node (kcur, v, q) -> if Key!eq (kcur, k) then Some (v) else find (k, q)
termination proof = structural m ;
end ;;
The important change in the method find is that we replaced the = operator by the eq method provided by the collection parameter Key.
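The effect of this change can be sketched in Python (not FoCaLiZe) by passing the equality test explicitly to find, which plays the role of the method eq of the collection parameter Key. The function eq_ignore_case and the tuple-of-pairs representation are assumptions of this sketch.

```python
from typing import Callable, Optional, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def add(k: K, v: V, m: Tuple[Tuple[K, V], ...]) -> Tuple[Tuple[K, V], ...]:
    # A map is a tuple of (key, value) pairs, newest binding first.
    return ((k, v),) + m


def find(eq: Callable[[K, K], bool], k: K,
         m: Tuple[Tuple[K, V], ...]) -> Optional[V]:
    """Lookup where key comparison is delegated to the supplied eq."""
    for kcur, v in m:
        if eq(kcur, k):
            return v
    return None


def eq_ignore_case(a: str, b: str) -> bool:
    # Hypothetical key equality: strings compared up to letter case.
    return a.lower() == b.lower()


m = add("Five", 5, ())
print(find(eq_ignore_case, "FIVE", m))          # 5
print(find(lambda a, b: a == b, "FIVE", m))     # None
```

The same map gives different lookup results depending on the equality supplied for keys, which is why properties of the map can now only be proved from the properties stated about eq.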
6.1.3 Re-Introducing a First Proof
This program compiles perfectly but does not include the proofs we wrote in its previous version. We will now re-introduce them and check whether differences occur. We start with the theorem find_added_not_fails and leave both the statement and the proof unchanged, up to the types of keys and values and the name of the type of our lists:
Listing 14: (Continuation of) assoc_map2.fcl
(* Add makes find a success. *)
theorem find_added_not_fails: all k : Key, all v : Value, all m1 m2 : Self,
m2 = add (k, v, m1) -> ~ (find (k, m2) = None)
proof = by definition of add, find type pair_list_t, option_t ;
We now compile the program:
$ focalizec assoc_map2.fcl
Invoking ocamlc...
>> ocamlc -I /usr/local/lib/focalize -c assoc_map2.ml
Invoking zvtov...
>> zvtov -zenon zenon -new assoc_map2.zv
>> zvtov -zenon zenon -new -script assoc_map2.zv 42
##***********-. ##***********-. ##***********-.
and $#!@, the proof does not pass anymore: Zenon does not stop searching! The question is: what did we change? The answer is “the definition of find”, in which we now use Key!eq instead of =. Indeed, it is not so surprising that the proof fails since, in it, we didn’t state anything about the method Key!eq, which appears in the body of find and decides whether a key is equal to another. Furthermore, this method is only declared, so how could Zenon guess? The fact that the k appearing in add (k, v, m1) and in find (k, m2) is the same is clear in the statement of the theorem by syntactic equality; however, Zenon cannot guess that it also means that Key!eq (k, k) holds! To solve this, we need to use the properties we stated for Key!eq, i.e. that it is an equivalence relation, hence symmetric, transitive and reflexive.
For this proof, the reflexivity will be sufficient. How did we determine this? The answer is “by guessing what Zenon can encounter as subgoals during its search”. This is a bit informal and surely tricky, but basically, in such a case, because we clearly understand that we need to tell Zenon that eq is an equality-like function, we first add the three properties as facts for the proof. If the proof succeeds, we try removing some of them to check if Zenon really needs them. Hence, we complete our old proof by just adding the fact property Key!eq_reflexive:
Listing 15: (Continuation of) assoc_map2.fcl
(* Add makes find a success. *)
theorem find_added_not_fails: all k : Key, all v : Value, all m1 m2 : Self,
m2 = add (k, v, m1) -> ~ (find (k, m2) = None)
proof = by definition of add, find type pair_list_t, option_t
property Key!eq_reflexive ;
$ focalizec assoc_map2.fcl
Invoking ocamlc...
>> ocamlc -I /usr/local/lib/focalize -c assoc_map2.ml
Invoking zvtov...
>> zvtov -zenon zenon -new assoc_map2.zv
>> zvtov -zenon zenon -new -script assoc_map2.zv
>> zvtov -zenon zenon -new -script assoc_map2.zv
Invoking coqc...
>> coqc -I /usr/local/lib/focalize -I /usr/local/lib/zenon assoc_map2.v
$
and it finally works!
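The need for reflexivity can be observed on a concrete, non-reflexive equality. In the Python sketch below (not FoCaLiZe), keys are floats compared with ==, for which NaN is not equal to itself: the statement of find_added_not_fails is then simply false for the key NaN. The names float_eq, add and find are assumptions of this sketch.

```python
import math


def add(k, v, m):
    # A map is a tuple of (key, value) pairs, newest binding first.
    return ((k, v),) + m


def find(eq, k, m):
    """Lookup where key comparison is delegated to the supplied eq."""
    for kcur, v in m:
        if eq(kcur, k):
            return v
    return None


def float_eq(a: float, b: float) -> bool:
    # Not reflexive: NaN compares unequal to itself.
    return a == b


k = math.nan
print(float_eq(k, k))                            # False
print(find(float_eq, k, add(k, "v", ())))        # None: the added key is not found
print(find(float_eq, 1.0, add(1.0, "v", ())))    # v
```

No prover could derive the theorem without the fact Key!eq_reflexive, since there are implementations of eq for which it does not hold.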
6.1.4 The Proof that Kills
Since we adapted our species very easily, with minor changes due to the gain of abstraction provided by parameterization, we want now to go on and continue the adaptation of our initial version of association maps implementation. We will re-introduce the second theorem, find_same_key_same_value. Obviously, the first thing is to hope we are lucky and that it can be directly copy-pasted modulo the use of the method Key!eq in place of =, and (by intuition) the use of its three properties (of equivalence relation) instead of a simple conclude. . .
Listing 16: (Continuation of) assoc_map2.fcl
(* Same key -> same value. *)
theorem find_same_key_same_value: all k1 k2: Key, all m : Self,
Key!eq (k1, k2) -> find (k1, m) = find (k2, m)
proof = by property Key!eq_reflexive, Key!eq_transitive, Key!eq_symmetric ;
Let’s compile:
$ focalizec -zvtovopt -script assoc_map2.fcl
Invoking ocamlc...
>> ocamlc -I /usr/local/lib/focalize -c assoc_map2.ml
Invoking zvtov...
>> zvtov -zenon zenon -new -script assoc_map2.zv
File "assoc_map2.fcl", line 48, characters 10-75:
Zenon error: exhausted search space without finding a proof
### proof failed
$
. . . and bad luck again, Zenon fails, telling us that it tried all the possible combinations using the given facts without finding any proof. We are missing something, but what? To understand, we need to try splitting the proof as usual. Hence we make two steps: <1>1, introducing the hypotheses in the context and leaving the remainder of the goal assumed, and <1>e, simply concluding by the previous step.
Listing 17: (Continuation of) assoc_map2.fcl
theorem find_same_key_same_value: all k1 k2: Key, all m : Self,
Key!eq (k1, k2) -> find (k1, m) = find (k2, m)
proof =
<1>1 assume k1 k2: Key,
assume m : Self,
hypothesis H: Key!eq (k1, k2),
prove find (k1, m) = find (k2, m)
assumed
<1>e conclude ;
This obviously works since we only lifted the hypotheses, leaving the core of our property assumed. We now try to add intermediate steps. We know by the hypothesis H that “k1 equals k2”, so if we prove that k1 = k2 this should be fine because “obviously applying the same function twice with equal arguments leads to the same result” (cf. the remark in 5.1)! So, let’s write down this proof:
Listing 18: (Continuation of) assoc_map2.fcl
theorem find_same_key_same_value: all k1 k2: Key, all m : Self,
Key!eq (k1, k2) -> find (k1, m) = find (k2, m)
proof =
<1>1 assume k1 k2: Key,
assume m : Self,
hypothesis H: Key!eq (k1, k2),
prove find (k1, m) = find (k2, m)
<2>1 prove k1 = k2
assumed
<2>e qed by step <2>1
<1>e conclude ;
We can now compile our program and see that it is accepted. It only remains now to prove our last goal, the one of <2>1, left assumed. The argument is that since by hypothesis H we have Key!eq (k1, k2), and Key!eq is our “equality test function”, an equivalence relation, i.e. symmetric, transitive and reflexive, maybe it would be sufficient . . .
Listing 19: (End of) assoc_map2.fcl
theorem find_same_key_same_value: all k1 k2: Key, all m : Self,
Key!eq (k1, k2) -> find (k1, m) = find (k2, m)
proof =
<1>1 assume k1 k2: Key,
assume m : Self,
hypothesis H: Key!eq (k1, k2),
prove find (k1, m) = find (k2, m)
<2>1 prove k1 = k2
by property Key!eq_reflexive, Key!eq_transitive, Key!eq_symmetric
hypothesis H
<2>e qed by step <2>1
<1>e conclude ;
Trying to compile the program. . .
$ focalizec -zvtovopt -script assoc_map2.fcl
Invoking ocamlc...