10.3 An Overview of Coq's Architecture

The implementation of Coq is based on eight logical modules. By ``module'' we mean here a logical piece of code having a conceptual unity, that may concern several Ocaml files. By the sake of organization, all the Ocaml files concerning a logical module are grouped altogether into the same sub-directory. The eight modules are:

1.	The logical framework	(directory `src/generic`)
2.	The language of constructions	(directory `src/constr`)
3.	The type-checker	(directory `src/typing`)
4.	The proof engine	(directory `src/proofs`)
5.	The language of basic tactics	(directory `src/tactics`)
6.	The vernacular interpreter	(directory `src/env`)
7.	The parser and the pretty-printer	(directory `src/parsing`)
8.	The standard library	(directory `src/lib`)

The following sections briefly present each of the modules above. This presentation is not intended to be a complete description of Coq's implementation, but rather a guideline to be read before taking a look at the sources. For each of the modules, we also present some of its most important functions, which are sufficient to implement a large class of tactics.

10.3.1 The Logical Framework

At the very heart of Coqthere is a generic untyped language for expressing abstractions, applications and global constants. This language is used as a meta-language for expressing the terms of the Calculus of Inductive Constructions. General operations on terms like collecting the free variables of an expression, substituting a term for a free variable, etc, are expressed in this language.

The meta-language 'op term of terms has seven main constructors:

( VAR id), a reference to a global identifier called id;
( Rel n), a bound variable, whose binder is the nth binder up in the term;
DLAM (x,t), a deBruijn's binder on the term t;
DLAMV (x,vt), a deBruijn's binder on all the terms of the vector vt;
( DOP0 op), a unary operator op;
DOP2 (op,t₁,t₂), the application of a binary operator op to the terms t₁ and t₂;
DOPN (op,vt), the application of an n-ary operator op to the vector of terms vt.

In this meta-language, bound variables are represented using the so-called deBrujin's indexes. In this representation, an occurrence of a bound variable is denoted by an integer, meaning the number of binders that must be traversed to reach its own binder(Actually, ( Rel n) means that (n-1) binders have to be traversed, since indexes are represented by strictly positive integers.). On the other hand, constants are referred by its name, as usual. For example, if A is a variable of the current section, then the lambda abstraction [x:A]x of the Calculus of Constructions is represented in the meta-language by the term:

(DOP2 (Lambda,(Var A),DLAM (x,(Rel 1)))

In this term, Lambda is a binary operator. Its first argument correspond to the type A of the bound variable, while the second is a body of the abstraction, where x is bound. The name x is just kept to pretty-print the occurrences of the bound variable.

The following functions perform some of the most frequent operations on the terms of the meta-language:

: val Generic.subst1 : 'op term -> 'op term -> 'op term.
( subst1 t₁ t₂) substitutes t₁ for (Rel 1) in t₂.
: val Generic.occur_var : identifier -> 'op term -> bool.
Returns true when the given identifier appears in the term, and false otherwise.
: val Generic.eq_term : 'op term -> 'op term -> bool.
Implements a-equality for terms.
: val Generic.dependent : 'op term -> 'op term -> bool.
Returns true if the first term is a sub-term of the second.

Identifiers, names and sections paths.

Three different kinds of names are used in the meta-language. They are all defined in the Ocaml file Names.

Identifiers.

The simplest kind of names are identifiers. An identifier is a string possibly indexed by an integer. They are used to represent names that are not unique, like for example the name of a variable in the scope of a section. The following operations can be used for handling identifiers:

: val Names.make_ident : string -> int -> identifier.
The value ( make_ident x i) creates the identifier x_i. If i=-1, then the identifier has is created with no index at all.
: val Names.repr_ident : identifier -> string * int.
The inverse operation of make_ident: it yields the string and the index of the identifier.
: val Names.lift_ident : identifier -> identifier.
Increases the index of the identifier by one.
: val Names.next_ident_away : identifier -> identifier list -> identifier.

Generates a new identifier with the same root string than the given one, but with a new index, different from all the indexes of a given list of identifiers.
: val Names.id_of_string : string -> identifier.
Creates an identifier from a string.
: val Names.string_of_id : identifier -> string.
The inverse operation: transforms an identifier into a string

Names.

A name is either an identifier or the special name Anonymous. Names are used as arguments of binders, in order to pretty print bound variables. The following operations can be used for handling names:

: val Names.Name: identifier -> Name.
Constructs a name from an identifier.
: val Names.Anonymous : Name.
Constructs a special, anonymous identifier, like the variable abstracted in the term [_:A]0.
: val Names.next_name_away_with_default : string->name->identifier list->identifier.

If the name is not anonymous, then this function generates a new identifier different from all the ones in a given list. Otherwise, it generates an identifier from the given string.

Section paths.

A section-path is a global name to refer to an object without ambiguity. It can be seen as a sort of filename, where open sections play the role of directories. Each section path is formed by three components: a directory (the list of open sections); a basename (the identifier for the object); and a kind (either CCI for the terms of the Calculus of Constructions, FW for the the terms of F_w, or OBJ for other objects). For example, the name of the following constant:

     Section A.
     Section B.
     Section C.
     Definition zero := O.

is internally represented by the section path:

#A#B#C

dirpath

#zero

basename

.cci

kind

When one of the sections is closed, a new constant is created with an updated section-path,a nd the old one is no longer reachable. In our example, after closing the section C, the new section-path for the constant zero becomes:

#A#B#zero.cci

The following operations can be used to handle section paths:

: val Names.string_of_path : section_path -> string.
Transforms the section path into a string.
: val Names.path_of_string : string -> section_path.
Parses a string an returs the corresponding section path.
: val Names.basename : section_path -> identifier.
Provides the basename of a section path
: val Names.dirpath : section_path -> string list.
Provides the directory of a section path
: val Names.kind_of_path : section_path -> path_kind.
Provides the kind of a section path

Signatures

A signature is a mapping associating different informations to identifiers (for example, its type, its definition, etc). The following operations could be useful for working with signatures:

: val Names.ids_of_sign : 'a signature -> identifier list.
Gets the list of identifiers of the signature.
: val Names.vals_of_sign : 'a signature -> 'a list.
Gets the list of values associated to the identifiers of the signature.
: val Names.lookup_glob1 : identifier -> 'a signature -> (identifier * 'a).

Gets the value associated to a given identifier of the signature.

10.3.2 The Terms of the Calculus of Constructions

The language of the Calculus of Inductive Constructions described in Chapter 4 is implemented on the top of the logical framework, instantiating the parameter op of the meta-language with a particular set of operators. In the implementation this language is called constr, the language of constructions.

Building Constructions

The user does not need to know the choices made to represent constr in the meta-language. They are abstracted away by the following constructor functions:

: val Term.mkRel : int -> constr.
( mkRel n) represents deBrujin's index n.
: val Term.mkVar : identifier -> constr.
( mkVar id) represents a global identifier named id, like a variable inside the scope of a section, or a hypothesis in a proof.
: val Term.mkExistential : constr.
mkExistential represents an implicit sub-term, like the question marks in the term (pair ? ? O true).
: val Term.mkProp : constr.
mkProp represents the sort Prop.
: val Term.mkSet : constr.
mkSet represents the sort Set.
: val Term.mkType : Impuniv.universe -> constr.
( mkType u) represents the term Type(u). The universe u is represented as a section path indexed by an integer.
: val Term.mkConst : section_path -> constr array -> constr.
( mkConst c v) represents a constant whose name is c. The body of the constant is stored in a global table, accessible through the name of the constant. The array of terms v corresponds to the variables of the environment appearing in the body of the constant when it was defined. For instance, a constant defined in the section Foo containing the variable A, and whose body is [x:Prop� Prop](x A) is represented inside the scope of the section by ( mkConst #foo#f.cci [| mkVAR A |]). Once the section is closed, the constant is represented by the term ( mkConst #f.cci [| |]), and its body becomes [A:Prop][x:Prop� Prop](x A).
: val Term.mkMutInd : section_path -> int -> constr array ->constr.
( mkMutInd c i) represents the ith type (starting from zero) of the block of mutually dependent (co)inductive types, whose first type is c. Similarly to the case of constants, the array of terms represents the current environment of the (co)inductive type. The definition of the type (its arity, its constructors, whether it is inductive or co-inductive, etc.) is stored in a global hash table, accessible through the name of the type.
: val Term.mkMutConstruct : section_path -> int -> int -> constr array ->constr.

( mkMutConstruct c i j) represents the jth constructor of the ith type of the block of mutually dependent (co)inductive types whose first type is c. The array of terms represents the current environment of the (co)inductive type.
: val Term.mkCast : constr -> constr -> constr.
( mkCast t T) represents the annotated term t::T in Coq's syntax.
: val Term.mkProd : name ->constr ->constr -> constr.
( mkProd x A B) represents the product (x:A)B. The free ocurrences of x in B are represented by deBrujin's indexes.
: val Term.mkNamedProd : identifier -> constr -> constr -> constr.
( produit x A B) represents the product (x:A)B, but the bound occurrences of x in B are denoted by the identifier ( mkVar x). The function automatically changes each ocurrencs of this identifier into the corresponding deBrujin's index.
: val Term.mkArrow : constr -> constr -> constr.
( arrow A B) represents the type (A� B).
: val Term.mkLambda : name -> constr -> constr -> constr.
( mkLambda x A b) represents the lambda abstraction [x:A]b. The free ocurrences of x in B are represented by deBrujin's indexes.
: val Term.mkNamedLambda : identifier -> constr -> constr -> constr.
( lambda x A b) represents the lambda abstraction [x:A]b, but the bound occurrences of x in B are denoted by the identifier ( mkVar x).
: val Term.mkAppLA : constr array -> constr.
( mkAppLA t [|t₁... t_n|]) represents the application (t t₁ ... t_n).
: val Term.mkMutCaseA : case_info -> constr ->constr ->constr array -> constr.

( mkMutCaseA r P m [|f₁... f_n|]) represents the term <P>Cases m of f₁... f_n end. The first argument r is either None or Some (c,i), where the pair (c,i) refers to the inductive type that m belongs to.
: val Term.mkFix : int array->int->constr array->name list->constr array->constr.

( mkFix [|k₁... k_n |] i [|A₁... A_n|] [|f₁... f_n|] [|t₁... t_n|]) represents the term Fix f_i{f₁/k₁:A₁:=t₁ ... f_n/k_n:A_n:=t_n}
: val Term.mkCoFix : int -> constr array -> name list -> constr array -> constr.

( mkCoFix i [|A₁... A_n|] [|f₁... f_n|] [|t₁... t_n|]) represents the term CoFix f_i{f₁:A₁:=t₁ ... f_n:A_n:=t_n}. There are no decreasing indexes in this case.

Decomposing Constructions

Each of the construction functions above has its corresponding (partial) destruction function, whose name is obtained changing the prefix mk by dest. In addition to these functions, a concrete datatype kindOfTerm can be used to do pattern matching on terms without dealing with their internal representation in the meta-language. This concrete dataype is described in the Ocaml file term.mli. The following function transforms a construction into an element of type kindOfTerm:

: val Term.kind_of_term : constr -> kindOfTerm.
Destructs a term of the language constr, yielding the direct components of the term. Hence, in order to do pattern matching on an object c of constr, it is sufficient to do pattern matching on the value ( kind_of_term c).

Part of the information associated to the constants is stored in global tables. The following functions give access to such information:

: val Termenv.constant_value : constr -> constr.
If the term denotes a constant, projects the body of a constant
: Termenv.constant_type : constr -> constr.
If the term denotes a constant, projects the type of the constant
: val mind_arity : constr -> constr.
If the term denotes an inductive type, projects its arity (i.e., the type of the inductive type).
: val Termenv.mis_is_finite : mind_specif -> bool.
Determines whether a recursive type is inductive or co-inductive.
: val Termenv.mind_nparams : constr -> int.
If the term denotes an inductive type, projects the number of its general parameters.
: val Termenv.mind_is_recursive : constr -> bool.
If the term denotes an inductive type, determines if the type has at least one recursive constructor.
: val Termenv.mind_recargs : constr -> recarg list array array.
If the term denotes an inductive type, returns an array v such that the nth element of v.(i).(j) is Mrec if the nth argument of the jth constructor of the ith type is recursive, and Norec if it is not..

10.3.3 The Type Checker

The third logical module is the type checker. It concentrates two main tasks concerning the language of constructions.

On one hand, it contains the type inference and type-checking functions. The type inference function takes a term a and a signature G, and yields a term A such that G |- a:A. The type-checking function takes two terms a and A and a signature G, and determines whether or not G |- a:A.

On the other hand, this module is in charge of the compilation of Coq's abstract syntax trees into the language constr of constructions. This compilation seeks to eliminate all the ambiguities contained in Coq's abstract syntax, restoring the information necessary to type-check it. It concerns at least the following steps:

Compiling the pattern-matching expressions containing constructor patterns, wild-cards, etc, into terms that only use the primitive Case described in Chapter 4
Restoring type coercions and synthesizing the implicit arguments (the one denoted by question marks in Coq syntax: cf section 2.7).
Transforming the named bound variables into deBrujin's indexes.
Classifying the global names into the different classes of constants (defined constants, constructors, inductive types, etc).

10.3.4 The Proof Engine

The fourth stage of Coq's implementation is the proof engine: the interactive machine for constructing proofs. The aim of the proof engine is to construct a top-down derivation or proof tree, by the application of tactics. A proof tree has the following general structure:

�
�
�

G_i₁

?_i₁ : G_i₁

(tac₁)

�
�
�

G_{i_m}

?_{i_m} : G_{i_m}

(tac_n)

(tac)

Each node of the tree is called a goal. A goal is a record type containing the following three fields:

the conclusion G to be proven;
a typing signature G for the free variables in G;
if the goal is an internal node of the proof tree, the definition t(?₁,...?_n) of an existential variable (i.e. a possible undefined constant) ? of type G in terms of the existential variables of the children sub-goals. If the node is a leaf, the existential variable maybe still undefined.

Once all the existential variables have been defined the derivation is completed, and a construction can be generated from the proof tree, replacing each of the existential variables by its definition. This is exactly what happens when one of the commands Qed, Save or Defined is invoked (cf. Section 6.1.2). The saved theorem becomes a defined constant, whose body is the proof object generated.

Important:

Before being added to the context, the proof object is type-checked, in order to verify that it is actually an object of the expected type G. Hence, the correctness of the proof actually does not depend on the tactics applied to generate it or the machinery of the proof engine, but only on the type-checker. In other words, extending the system with a potentially bugged new tactic never endangers the consistency of the system.

What is a Tactic?

From an operational point of view, the current state of the proof engine is given by the mapping emap from existential variables into goals, plus a pointer to one of the leaf goals g. Such a pointer indicates where the proof tree will be refined by the application of a tactic. A tactic is a function from the current state (g,emap) of the proof engine into a pair (l,val). The first component of this pair is the list of children sub-goals g₁,... g_n of g to be yielded by the tactic. The second one is a validation function. Once the proof trees p₁,... p_n for g₁,... g_n have been completed, this validation function must yield a proof tree (val p₁,... p_n) deriving g.

Tactics can be classified into primitive ones and defined ones. Primitive tactics correspond to the five basic operations of the proof engine:

Introducing a universally quantified variable into the local context of the goal.
Defining an undefined existential variable
Changing the conclusion of the goal for another --definitionally equal-- term.
Changing the type of a variable in the local context for another definitionally equal term.
Erasing a variable from the local context.

Defined tactics are tactics constructed by combining these primitive operations. Defined tactics are registered in a hash table, so that they can be introduced dynamically. In order to define such a tactic table, it is necessary to fix what a possible argument of a tactic may be. The type tactic_arg of the possible arguments for tactics is a union type including:

quoted strings;
integers;
identifiers;
lists of identifiers;
plain terms, represented by its abstract syntax tree;
well-typed terms, represented by a construction;
a substitution for bound variables, like the substitution in the tactic
Apply t with x:=t₁... x_n:=t_n, (cf. Section 7.3.4);
a reduction expression, denoting the reduction strategy to be followed.

Therefore, for each function tac:a � tactic implementing a defined tactic, an associated dynamic tactic tacargs_tac: tactic_arg list � tactic calling tac must be written. The aim of the auxiliary function tacargs_tac is to inject the arguments of the tactic tac into the type of possible arguments for a tactic.

The following function can be used for registering and calling a defined tactic:

: val Tacmach.add_tactic : string -> (tactic_arg list ->tactic) -> unit.

Registers a dynamic tactic with the given string as access index.
: val Tacinterp.vernac_tactic : string*tactic_arg list -> tactic.
Interprets a defined tactic given by its entry in the tactics table with a particular list of possible arguments.
: val Tacinterp.vernac_interp : CoqAst.t -> tactic.
Interprets a tactic expression formed combining Coq's tactics and tacticals, and described by its abstract syntax tree.

When programming a new tactic that calls an already defined tactic tac, we have the choice between using the Ocaml function implementing tac, or calling the tactic interpreter with the name and arguments for interpreting tac. In the first case, a tactic call will left the trace of the whole implementation of tac in the proof tree. In the second, the implementation of tac will be hidden, and only an invocation of tac will be recalled (cf. the example of Section 10.5. The following combinators can be used to hide the implementation of a tactic:

type 'a hiding_combinator = string -> ('a -> tactic) -> ('a -> tactic)
val Tacmach.hide_atomic_tactic  : string -> tactic -> tactic
val Tacmach.hide_constr_tactic  : constr          hiding_combinator
val Tacmach.hide_constrl_tactic : (constr list)   hiding_combinator
val Tacmach.hide_numarg_tactic  : int             hiding_combinator
val Tacmach.hide_ident_tactic   : identifier      hiding_combinator
val Tacmach.hide_identl_tactic  : identifier      hiding_combinator
val Tacmach.hide_string_tactic  : string          hiding_combinator
val Tacmach.hide_bindl_tactic   : substitution    hiding_combinator
val Tacmach.hide_cbindl_tactic  : 
          (constr * substitution) hiding_combinator

These functions first register the tactic by a side effect, and then yield a function calling the interpreter with the registered name and the right injection into the type of possible arguments.

10.3.5 Tactics and Tacticals Provided by Coq

The fifth logical module is the library of tacticals and basic tactics provided by Coq. This library is distributed into the directories tactics and src/tactics. The former contains those basic tactics that make use of the types contained in the basic state of Coq. For example, inversion or rewriting tactics are in the directory tactics, since they make use of the propositional equality type. Those tactics which are independent from the context --like for example Cut, Intros, etc-- are defined in the directory src/tactics. This latter directory also contains some useful tools for programming new tactics, referred in Section 10.4.

In practice, it is very unusual that the list of sub-goals and the validation function of the tactic must be explicitly constructed by the user. In most of the cases, the implementation of a new tactic consists in supplying the appropriate arguments to the basic tactics and tacticals.

Basic Tactics

The file Tactics contain the implementation of the basic tactics provided by Coq. The following tactics are some of the most used ones:

val Tactics.intro           : tactic
val Tactics.assumption      : tactic
val Tactics.clear           : identifier list -> tactic
val Tactics.apply           : constr -> constr substitution -> tactic
val Tactics.one_constructor : int -> constr substitution -> tactic
val Tactics.simplest_elim   : constr -> tactic
val Tactics.elimType        : constr -> tactic
val Tactics.simplest_case   : constr -> tactic
val Tactics.caseType        : constr -> tactic
val Tactics.cut             : constr -> tactic
val Tactics.reduce          : redexpr -> tactic
val Tactics.exact           : constr -> tactic
val Auto.auto               : int option -> tactic
val Auto.trivial            : tactic

The functions hiding the implementation of these tactics are defined in the module Hiddentac. Their names are prefixed by ``h_''.

Tacticals

The following tacticals can be used to combine already existing tactics:

: val Tacticals.tclIDTAC : tactic.
The identity tactic: it leaves the goal as it is.
: val Tacticals.tclORELSE : tactic -> tactic -> tactic.
Tries the first tactic and in case of failure applies the second one.
: val Tacticals.tclTHEN : tactic -> tactic -> tactic.
Applies the first tactic and then the second one to each generated subgoal.
: val Tacticals.tclTHENS : tactic -> tactic list -> tactic.
Applies a tactic, and then applies each tactic of the tactic list to the corresponding generated subgoal.
: val Tacticals.tclTHENL : tactic -> tactic -> tactic.
Applies the first tactic, and then applies the second one to the last generated subgoal.
: val Tacticals.tclREPEAT : tactic -> tactic.
If the given tactic succeeds in producing a subgoal, then it is recursively applied to each generated subgoal, and so on until it fails.
: val Tacticals.tclFIRST : tactic list -> tactic.
Tries the tactics of the given list one by one, until one of them succeeds.
: val Tacticals.tclTRY : tactic -> tactic.
Tries the given tactic and in case of failure applies the tclIDTAC tactical to the original goal.
: val Tacticals.tclDO : int -> tactic -> tactic.
Applies the tactic a given number of times.
: val Tacticals.tclFAIL : tactic.
The always failing tactic: it raises a UserError exception.
: val Tacticals.tclPROGRESS : tactic -> tactic.
Applies the given tactic to the current goal and fails if the tactic leaves the goal unchanged
: val Tacticals.tclNTH_HYP : int -> (constr -> tactic) -> tactic.
Applies a tactic to the nth hypothesis of the local context. The last hypothesis introduced correspond to the integer 1.
: val Tacticals.tclLAST_HYP : (constr -> tactic) -> tactic.
Applies a tactic to the last hypothesis introduced.

10.3.6 The Vernacular Interpreter

The sixth logical module of the implementation corresponds to the interpreter of the vernacular phrases of Coq. These phrases may be expressions from the Gallina language (definitions), general directives (setting commands) or tactics to be applied by the proof engine.

10.3.7 The Parser and the Pretty-Printer

The last logical module is the parser and pretty printer of Coq, which is the interface between the vernacular interpreter and the user. They translate the chains of characters entered at the input into abstract syntax trees, and vice versa. Abstract syntax trees are represented by labeled n-ary trees, and its type is called CoqAst.t. For instance, the abstract syntax tree associated to the term [x:A]x is:

Node ((0,6), "LAMBDA", [ Nvar ((3, 4),"A"); Slam ((0,6), Some "x", Nvar ((5,6),"x"))])

The numbers correspond to locations, used to point to some input line and character positions in the error messages. As it was already explained in Section 10.3.3, this term is then translated into a construction term in order to be typed.

The parser of Coq is implemented using Camlp4. The lexer and the data used by Camlp4 to generate the parser lay in the directory src/parsing. This directory also contains Coq's pretty-printer. The printing rules lay in the directory src/syntax. The different entries of the grammar are described in the module Pcoq.Entry. Let us present here two important functions of this logical module:

: val Pcoq.parse_string : 'a Grammar.Entry.e -> string -> 'a.
Parses a given string, trying to recognize a phrase corresponding to some entry in the grammar. If it succeeds, it yields a value associated to the grammar entry. For example, applied to the entry Command.command, this function parses a term of Coq's language, and yields a value of type CoqAst.t.
: val gentermpr : path_kind -> constr assumptions -> constr -> std_ppcmds.

Pretty-prints a well-typed term of certain kind (cf. Section 10.3.1) under its context of typing assumption.
: val gentacpr : CoqAst.t -> std_ppcmds.
Pretty-prints a given abstract syntax tree representing a tactic expression.

10.3.8 The General Library

In addition to the ones laying in the standard library of Ocaml, several useful modules about lists, arrays, sets, mappings, balanced trees, and other frequently used data structures can be found in the directory lib. Before writing a new one, check if it is not already there!

Retrieved by Memoweb from http://pauillac.inria.fr/coq/doc/node.2.1.2.html at 8/10/98