10.3 An Overview of Coq's Architecture
10.3 An Overview of Coq's Architecture
The implementation of Coq is based on eight logical
modules. By ``module'' we mean here a logical piece of code having a
conceptual unity, that may concern several Ocaml files. By the sake of
organization, all the Ocaml files concerning a logical module are
grouped altogether into the same sub-directory. The eight modules
are:
1. |
The logical framework |
(directory src/generic) |
2. |
The language of constructions |
(directory src/constr) |
3. |
The type-checker |
(directory src/typing) |
4. |
The proof engine |
(directory src/proofs) |
5. |
The language of basic tactics |
(directory src/tactics) |
6. |
The vernacular interpreter |
(directory src/env) |
7. |
The parser and the pretty-printer |
(directory src/parsing) |
8. |
The standard library |
(directory src/lib) |
The following sections briefly present each of the modules above.
This presentation is not intended to be a complete description of Coq's
implementation, but rather a guideline to be read before taking a look
at the sources. For each of the modules, we also present some of its
most important functions, which are sufficient to implement a large
class of tactics.
10.3.1 The Logical Framework
At the very heart of Coqthere is a generic untyped language for
expressing abstractions, applications and global constants. This
language is used as a meta-language for expressing the terms of the
Calculus of Inductive Constructions. General operations on terms like
collecting the free variables of an expression, substituting a term for
a free variable, etc, are expressed in this language.
The meta-language 'op term of terms has seven main
constructors:
-
( VAR id), a reference to a global identifier called id;
- ( Rel n), a bound variable, whose binder is the nth
binder up in the term;
- DLAM (x,t), a deBruijn's binder on the term t;
- DLAMV (x,vt), a deBruijn's binder on all the terms of
the vector vt;
- ( DOP0 op), a unary operator op;
- DOP2 (op,t1,t2), the application of a binary
operator op to the terms t1 and t2;
- DOPN (op,vt), the application of an n-ary operator op to the
vector of terms vt.
In this meta-language, bound variables are represented using the
so-called deBrujin's indexes. In this representation, an occurrence of
a bound variable is denoted by an integer, meaning the number of
binders that must be traversed to reach its own
binder(Actually, ( Rel n) means that (n-1) binders
have to be traversed, since indexes are represented by strictly
positive integers.). On the other hand, constants are referred by its
name, as usual. For example, if A is a variable of the current
section, then the lambda abstraction [x:A]x of the Calculus of
Constructions is represented in the meta-language by the term:
(DOP2 (Lambda,(Var A),DLAM (x,(Rel 1)))
In this term, Lambda is a binary operator. Its first argument
correspond to the type A of the bound variable, while the second is
a body of the abstraction, where x is bound. The name x is just kept
to pretty-print the occurrences of the bound variable.
The following functions perform some of the most frequent operations
on the terms of the meta-language:
-
- val Generic.subst1 : 'op term -> 'op term -> 'op term.
( subst1 t1 t2) substitutes t1 for
(Rel 1) in t2.
- val Generic.occur_var : identifier -> 'op term -> bool.
Returns true when the given identifier appears in the term,
and false otherwise.
- val Generic.eq_term : 'op term -> 'op term -> bool.
Implements a-equality for terms.
- val Generic.dependent : 'op term -> 'op term -> bool.
Returns true if the first term is a sub-term of the second.
Identifiers, names and sections paths.
Three different kinds of names are used in the meta-language. They are
all defined in the Ocaml file Names.
Identifiers.
The simplest kind of names are
identifiers. An identifier is a string possibly indexed by an
integer. They are used to represent names that are not unique, like
for example the name of a variable in the scope of a section. The
following operations can be used for handling identifiers:
-
- val Names.make_ident : string -> int -> identifier.
The value ( make_ident x i) creates the
identifier xi. If i=-1, then the identifier has
is created with no index at all.
- val Names.repr_ident : identifier -> string * int.
The inverse operation of make_ident:
it yields the string and the index of the identifier.
- val Names.lift_ident : identifier -> identifier.
Increases the index of the identifier by one.
- val Names.next_ident_away :
identifier -> identifier list -> identifier.
Generates a new identifier with the same root string than the
given one, but with a new index, different from all the indexes of
a given list of identifiers.
- val Names.id_of_string : string ->
identifier.
Creates an identifier from a string.
- val Names.string_of_id : identifier -> string.
The inverse operation: transforms an identifier into a string
Names.
A name is either an identifier or the
special name Anonymous. Names are used as arguments of
binders, in order to pretty print bound variables.
The following operations can be used for handling names:
-
- val Names.Name: identifier -> Name.
Constructs a name from an identifier.
- val Names.Anonymous : Name.
Constructs a special, anonymous identifier, like the variable abstracted
in the term [_:A]0.
- val
Names.next_name_away_with_default :
string->name->identifier list->identifier.
If the name is not anonymous, then this function generates a new
identifier different from all the ones in a given list. Otherwise, it
generates an identifier from the given string.
Section paths.
A section-path is a global name to refer to an object without
ambiguity. It can be seen as a sort of filename, where open sections
play the role of directories. Each section path is formed by three
components: a directory (the list of open sections); a
basename (the identifier for the object); and a kind
(either CCI for the terms of the Calculus of Constructions, FW for the
the terms of Fw, or OBJ for other objects). For example, the
name of the following constant:
Section A.
Section B.
Section C.
Definition zero := O.
is internally represented by the section path:
When one of the sections is closed, a new constant is created with an
updated section-path,a nd the old one is no longer reachable. In our
example, after closing the section C, the new section-path
for the constant zero becomes:
#A#B#zero.cci
The following operations can be used to handle section paths:
-
- val Names.string_of_path : section_path -> string.
Transforms the section path into a string.
- val Names.path_of_string : string -> section_path.
Parses a string an returs the corresponding section path.
- val Names.basename : section_path -> identifier.
Provides the basename of a section path
- val Names.dirpath : section_path -> string list.
Provides the directory of a section path
- val Names.kind_of_path : section_path -> path_kind.
Provides the kind of a section path
Signatures
A signature is a mapping associating different informations
to identifiers (for example, its type, its definition, etc). The
following operations could be useful for working with signatures:
-
- val Names.ids_of_sign : 'a signature -> identifier list.
Gets the list of identifiers of the signature.
- val Names.vals_of_sign : 'a signature -> 'a list.
Gets the list of values associated to the identifiers of the signature.
- val Names.lookup_glob1 :
identifier -> 'a signature -> (identifier *
'a).
Gets the value associated to a given identifier of the signature.
10.3.2 The Terms of the Calculus of Constructions
The language of the Calculus of Inductive Constructions described in
Chapter 4 is implemented on the top of the logical framework,
instantiating the parameter op of the meta-language with a
particular set of operators. In the implementation this language is
called constr, the language of constructions.
Building Constructions
The user does not need to know the choices made to represent
constr in the meta-language. They are abstracted away by the
following constructor functions:
-
- val Term.mkRel : int -> constr.
( mkRel n) represents deBrujin's index n.
- val Term.mkVar : identifier -> constr.
( mkVar id)
represents a global identifier named id, like a variable
inside the scope of a section, or a hypothesis in a proof.
- val Term.mkExistential : constr.
mkExistential represents an implicit sub-term, like the question
marks in the term (pair ? ? O true).
- val Term.mkProp : constr.
mkProp represents the sort Prop.
- val Term.mkSet : constr.
mkSet represents the sort Set.
- val Term.mkType : Impuniv.universe -> constr.
( mkType u) represents the term
Type(u). The universe u is represented as a
section path indexed by an integer.
- val Term.mkConst : section_path -> constr array -> constr.
( mkConst c v) represents a constant whose name is
c. The body of the constant is stored in a global table,
accessible through the name of the constant. The array of terms
v corresponds to the variables of the environment appearing in
the body of the constant when it was defined. For instance, a
constant defined in the section Foo containing the
variable A, and whose body is [x:Prop® Prop](x A) is
represented inside the scope of the section by
( mkConst #foo#f.cci [| mkVAR A
|]). Once the section is closed, the constant is represented by
the term ( mkConst #f.cci [| |]), and its body
becomes [A:Prop][x:Prop® Prop](x A).
- val Term.mkMutInd : section_path -> int -> constr array ->constr.
( mkMutInd c i) represents the ith type
(starting from zero) of the block of mutually dependent
(co)inductive types, whose first type is c. Similarly to the
case of constants, the array of terms represents the current
environment of the (co)inductive type. The definition of the type
(its arity, its constructors, whether it is inductive or co-inductive, etc.)
is stored in a global hash table, accessible through the name of
the type.
- val Term.mkMutConstruct :
section_path -> int -> int -> constr array
->constr.
( mkMutConstruct c i j) represents the
jth constructor of the ith type of the block of mutually
dependent (co)inductive types whose first type is c. The array
of terms represents the current environment of the (co)inductive
type.
- val Term.mkCast : constr -> constr -> constr.
( mkCast t T) represents the annotated term t::T in
Coq's syntax.
- val Term.mkProd : name ->constr ->constr -> constr.
( mkProd x A B) represents the product (x:A)B.
The free ocurrences of x in B are represented by deBrujin's
indexes.
- val Term.mkNamedProd : identifier -> constr -> constr -> constr.
( produit x A B) represents the product (x:A)B,
but the bound occurrences of x in B are denoted by
the identifier ( mkVar x). The function automatically
changes each ocurrencs of this identifier into the corresponding
deBrujin's index.
- val Term.mkArrow : constr -> constr -> constr.
( arrow A B) represents the type (A® B).
- val Term.mkLambda : name -> constr -> constr -> constr.
( mkLambda x A b) represents the lambda abstraction
[x:A]b. The free ocurrences of x in B are represented by deBrujin's
indexes.
- val Term.mkNamedLambda : identifier -> constr -> constr -> constr.
( lambda x A b) represents the lambda abstraction
[x:A]b, but the bound occurrences of x in B are denoted by
the identifier ( mkVar x).
- val Term.mkAppLA : constr array -> constr.
( mkAppLA t [|t1... tn|]) represents the application
(t t1 ... tn).
- val Term.mkMutCaseA :
case_info -> constr ->constr
->constr array -> constr.
( mkMutCaseA r P m [|f1... fn|])
represents the term <P>Cases m of f1... fn end. The first argument
r is either None or Some (c,i), where the
pair (c,i) refers to the inductive type that m belongs to.
- val Term.mkFix :
int array->int->constr array->name
list->constr array->constr.
( mkFix [|k1... kn |] i [|A1...
An|] [|f1... fn|] [|t1... tn|]) represents the term
Fix fi{f1/k1:A1:=t1 ... fn/kn:An:=tn}
- val Term.mkCoFix :
int -> constr array -> name list ->
constr array -> constr.
( mkCoFix i [|A1...
An|] [|f1... fn|] [|t1... tn|]) represents the term
CoFix fi{f1:A1:=t1 ... fn:An:=tn}. There are no
decreasing indexes in this case.
Decomposing Constructions
Each of the construction functions above has its corresponding
(partial) destruction function, whose name is obtained changing the
prefix mk by dest. In addition to these functions, a
concrete datatype kindOfTerm can be used to do pattern
matching on terms without dealing with their internal representation
in the meta-language. This concrete dataype is described in the Ocaml
file term.mli. The following function transforms a construction
into an element of type kindOfTerm:
-
- val Term.kind_of_term : constr -> kindOfTerm.
Destructs a term of the language constr,
yielding the direct components of the term. Hence, in order to do
pattern matching on an object c of constr, it is sufficient
to do pattern matching on the value ( kind_of_term c).
Part of the information associated to the constants is stored in
global tables. The following functions give access to such
information:
-
- val Termenv.constant_value : constr -> constr.
If the term denotes a constant, projects the body of a constant
- Termenv.constant_type : constr -> constr.
If the term denotes a constant, projects the type of the constant
- val mind_arity : constr -> constr.
If the term denotes an inductive type, projects its arity (i.e.,
the type of the inductive type).
- val Termenv.mis_is_finite : mind_specif -> bool.
Determines whether a recursive type is inductive or co-inductive.
- val Termenv.mind_nparams : constr -> int.
If the term denotes an inductive type, projects the number of
its general parameters.
- val Termenv.mind_is_recursive : constr -> bool.
If the term denotes an inductive type,
determines if the type has at least one recursive constructor.
- val Termenv.mind_recargs : constr -> recarg list array array.
If the term denotes an inductive type, returns an array v such
that the nth element of v.(i).(j) is
Mrec if the nth argument of the jth constructor of
the ith type is recursive, and Norec if it is not..
10.3.3 The Type Checker
The third logical module is the type checker. It concentrates two main
tasks concerning the language of constructions.
On one hand, it contains the type inference and type-checking
functions. The type inference function takes a term
a and a signature G, and yields a term A such that
G |- a:A. The type-checking function takes two terms a
and A and a signature G, and determines whether or not
G |- a:A.
On the other hand, this module is in charge of the compilation of
Coq's abstract syntax trees into the language constr of
constructions. This compilation seeks to eliminate all the ambiguities
contained in Coq's abstract syntax, restoring the information
necessary to type-check it. It concerns at least the following steps:
-
Compiling the pattern-matching expressions containing
constructor patterns, wild-cards, etc, into terms that only
use the primitive Case described in Chapter 4
- Restoring type coercions and synthesizing the implicit arguments
(the one denoted by question marks in
Coq syntax: cf section 2.7).
- Transforming the named bound variables into deBrujin's indexes.
- Classifying the global names into the different classes of
constants (defined constants, constructors, inductive types, etc).
10.3.4 The Proof Engine
The fourth stage of Coq's implementation is the proof engine:
the interactive machine for constructing proofs. The aim of the proof
engine is to construct a top-down derivation or proof tree,
by the application of tactics. A proof tree has the following
general structure:
Each node of the tree is called a goal. A goal
is a record type containing the following three fields:
-
the conclusion G to be proven;
- a typing signature G for the free variables in G;
- if the goal is an internal node of the proof tree, the
definition t(?1,...?n) of an existential variable
(i.e. a possible undefined constant) ? of type G in terms of the
existential variables of the children sub-goals. If the node is a
leaf, the existential variable maybe still undefined.
Once all the existential variables have been defined the derivation is
completed, and a construction can be generated from the proof tree,
replacing each of the existential variables by its definition. This
is exactly what happens when one of the commands
Qed, Save or Defined is invoked
(cf. Section 6.1.2). The saved theorem becomes a defined constant,
whose body is the proof object generated.
Important:
Before being added to the
context, the proof object is type-checked, in order to verify that it is
actually an object of the expected type G. Hence, the correctness
of the proof actually does not depend on the tactics applied to
generate it or the machinery of the proof engine, but only on the
type-checker. In other words, extending the system with a potentially
bugged new tactic never endangers the consistency of the system.
What is a Tactic?
From an operational point of view, the current state of the proof
engine is given by the mapping emap from existential variables into
goals, plus a pointer to one of the leaf goals g. Such a pointer
indicates where the proof tree will be refined by the application of a
tactic. A tactic is a function from the current state
(g,emap) of the proof engine into a pair (l,val). The first
component of this pair is the list of children sub-goals g1,...
gn of g to be yielded by the tactic. The second one is a
validation function. Once the proof trees p1,...
pn for g1,... gn have been completed, this validation
function must yield a proof tree (val p1,... pn) deriving
g.
Tactics can be classified into primitive ones and
defined ones. Primitive tactics correspond to the five basic
operations of the proof engine:
-
Introducing a universally quantified variable into the local
context of the goal.
- Defining an undefined existential variable
- Changing the conclusion of the goal for another
--definitionally equal-- term.
- Changing the type of a variable in the local context for another
definitionally equal term.
- Erasing a variable from the local context.
Defined tactics are tactics constructed by combining these
primitive operations. Defined tactics are registered in a hash table,
so that they can be introduced dynamically. In order to define such a
tactic table, it is necessary to fix what a possible argument
of a tactic may be. The type tactic_arg of the possible
arguments for tactics is a union type including:
-
quoted strings;
- integers;
- identifiers;
- lists of identifiers;
- plain terms, represented by its abstract syntax tree;
- well-typed terms, represented by a construction;
- a substitution for bound variables, like the
substitution in the tactic
Apply t with x:=t1...
xn:=tn, (cf. Section 7.3.4);
- a reduction expression, denoting the reduction strategy to be
followed.
Therefore, for each function tac:a ® tactic implementing a
defined tactic, an associated dynamic tactic tacargs_tac:
tactic_arg list ® tactic calling tac must be
written. The aim of the auxiliary function tacargs_tac is to inject
the arguments of the tactic tac into the type of possible arguments
for a tactic.
The following function can be used for registering and calling a
defined tactic:
-
- val Tacmach.add_tactic :
string -> (tactic_arg list ->tactic) -> unit.
Registers a dynamic tactic with the given string as access index.
- val Tacinterp.vernac_tactic : string*tactic_arg list -> tactic.
Interprets a defined tactic given by its entry in the
tactics table with a particular list of possible arguments.
- val Tacinterp.vernac_interp : CoqAst.t -> tactic.
Interprets a tactic expression formed combining Coq's tactics and
tacticals, and described by its abstract syntax tree.
When programming a new tactic that calls an already defined tactic
tac, we have the choice between using the Ocaml function
implementing tac, or calling the tactic interpreter with the name
and arguments for interpreting tac. In the first case, a tactic call
will left the trace of the whole implementation of tac in the proof
tree. In the second, the implementation of tac will be hidden, and
only an invocation of tac will be recalled (cf. the example of
Section 10.5. The following combinators can be used
to hide the implementation of a tactic:
type 'a hiding_combinator = string -> ('a -> tactic) -> ('a -> tactic)
val Tacmach.hide_atomic_tactic : string -> tactic -> tactic
val Tacmach.hide_constr_tactic : constr hiding_combinator
val Tacmach.hide_constrl_tactic : (constr list) hiding_combinator
val Tacmach.hide_numarg_tactic : int hiding_combinator
val Tacmach.hide_ident_tactic : identifier hiding_combinator
val Tacmach.hide_identl_tactic : identifier hiding_combinator
val Tacmach.hide_string_tactic : string hiding_combinator
val Tacmach.hide_bindl_tactic : substitution hiding_combinator
val Tacmach.hide_cbindl_tactic :
(constr * substitution) hiding_combinator
These functions first register the tactic by a side effect, and then
yield a function calling the interpreter with the registered name and
the right injection into the type of possible arguments.
10.3.5 Tactics and Tacticals Provided by Coq
The fifth logical module is the library of tacticals and basic tactics
provided by Coq. This library is distributed into the directories
tactics and src/tactics. The former contains those
basic tactics that make use of the types contained in the basic state
of Coq. For example, inversion or rewriting tactics are in the
directory tactics, since they make use of the propositional
equality type. Those tactics which are independent from the context
--like for example Cut, Intros, etc-- are defined in
the directory src/tactics. This latter directory also
contains some useful tools for programming new tactics, referred in
Section 10.4.
In practice, it is very unusual that the list of sub-goals and the
validation function of the tactic must be explicitly constructed by
the user. In most of the cases, the implementation of a new tactic
consists in supplying the appropriate arguments to the basic tactics
and tacticals.
Basic Tactics
The file Tactics contain the implementation of the basic
tactics provided by Coq. The following tactics are some of the most
used ones:
val Tactics.intro : tactic
val Tactics.assumption : tactic
val Tactics.clear : identifier list -> tactic
val Tactics.apply : constr -> constr substitution -> tactic
val Tactics.one_constructor : int -> constr substitution -> tactic
val Tactics.simplest_elim : constr -> tactic
val Tactics.elimType : constr -> tactic
val Tactics.simplest_case : constr -> tactic
val Tactics.caseType : constr -> tactic
val Tactics.cut : constr -> tactic
val Tactics.reduce : redexpr -> tactic
val Tactics.exact : constr -> tactic
val Auto.auto : int option -> tactic
val Auto.trivial : tactic
The functions hiding the implementation of these tactics are defined
in the module Hiddentac. Their names are prefixed by ``h_''.
Tacticals
The following tacticals can be used to combine already existing
tactics:
-
- val Tacticals.tclIDTAC : tactic.
The identity tactic: it leaves the goal as it is.
- val Tacticals.tclORELSE : tactic -> tactic -> tactic.
Tries the first tactic and in case of failure applies the second one.
- val Tacticals.tclTHEN : tactic -> tactic -> tactic.
Applies the first tactic and then the second one to each generated subgoal.
- val Tacticals.tclTHENS : tactic -> tactic list -> tactic.
Applies a tactic, and then applies each tactic of the tactic list to the
corresponding generated subgoal.
- val Tacticals.tclTHENL : tactic -> tactic -> tactic.
Applies the first tactic, and then applies the second one to the last
generated subgoal.
- val Tacticals.tclREPEAT : tactic -> tactic.
If the given tactic succeeds in producing a subgoal, then it
is recursively applied to each generated subgoal,
and so on until it fails.
- val Tacticals.tclFIRST : tactic list -> tactic.
Tries the tactics of the given list one by one, until one of them
succeeds.
- val Tacticals.tclTRY : tactic -> tactic.
Tries the given tactic and in case of failure applies the tclIDTAC tactical to the original goal.
- val Tacticals.tclDO : int -> tactic -> tactic.
Applies the tactic a given number of times.
- val Tacticals.tclFAIL : tactic.
The always failing tactic: it raises a UserError exception.
- val Tacticals.tclPROGRESS : tactic -> tactic.
Applies the given tactic to the current goal and fails if the
tactic leaves the goal unchanged
- val Tacticals.tclNTH_HYP : int -> (constr -> tactic) -> tactic.
Applies a tactic to the nth hypothesis of the local context.
The last hypothesis introduced correspond to the integer 1.
- val Tacticals.tclLAST_HYP : (constr -> tactic) -> tactic.
Applies a tactic to the last hypothesis introduced.
10.3.6 The Vernacular Interpreter
The sixth logical module of the implementation corresponds to the
interpreter of the vernacular phrases of Coq. These phrases may be
expressions from the Gallina language (definitions), general
directives (setting commands) or tactics to be applied by the proof
engine.
10.3.7 The Parser and the Pretty-Printer
The last logical module is the parser and pretty printer of Coq,
which is the interface between the vernacular interpreter and the
user. They translate the chains of characters entered at the input
into abstract syntax trees, and vice versa. Abstract syntax trees are
represented by labeled n-ary trees, and its type is called
CoqAst.t. For instance, the abstract syntax tree associated
to the term [x:A]x is:
Node |
((0,6), "LAMBDA", [ |
Nvar |
((3, 4),"A"); |
Slam |
((0,6), Some "x", |
Nvar |
((5,6),"x"))]) |
The numbers correspond to locations, used to point to some
input line and character positions in the error messages. As it was
already explained in Section 10.3.3, this term is then
translated into a construction term in order to be typed.
The parser of Coq is implemented using Camlp4. The lexer and the data
used by Camlp4 to generate the parser lay in the directory
src/parsing. This directory also contains Coq's
pretty-printer. The printing rules lay in the directory
src/syntax. The different entries of the grammar are
described in the module Pcoq.Entry. Let us present here two
important functions of this logical module:
-
- val Pcoq.parse_string : 'a Grammar.Entry.e -> string -> 'a.
Parses a given string, trying to recognize a phrase
corresponding to some entry in the grammar. If it succeeds,
it yields a value associated to the grammar entry. For example,
applied to the entry Command.command, this function
parses a term of Coq's language, and yields a value of type
CoqAst.t.
- val gentermpr :
path_kind -> constr assumptions -> constr -> std_ppcmds.
Pretty-prints a well-typed term of certain kind (cf. Section
10.3.1) under its context of typing assumption.
- val gentacpr : CoqAst.t -> std_ppcmds.
Pretty-prints a given abstract syntax tree representing a tactic
expression.
10.3.8 The General Library
In addition to the ones laying in the standard library of Ocaml,
several useful modules about lists, arrays, sets, mappings, balanced
trees, and other frequently used data structures can be found in the
directory lib. Before writing a new one, check if it is not
already there!
Retrieved by Memoweb from http://pauillac.inria.fr/coq/doc/node.2.1.2.html at 8/10/98