9.3 Writing your own pretty printing rules

There is a mechanism for extending the vernacular's printer by adding, in the interactive toplevel, new printing rules. The printing rules are stored into a table and will be recovered at the moment of the printing by the vernacular's printer.

The user can print new constants, tactics and vernacular phrases with his desired syntax. The printing rules for new constants should be written after the definition of the constants. The rules should be outside a section if the user wants them to be exported.

The printing rules corresponding to the heart of the system (primitive tactics, commands and the vernacular language) are defined, respectively, in the files PPTactic.v and PPCommand.v (in the directory src/syntax). These files are automatically loaded in the initial state. The user is not expected to modify these files unless he dislikes the way primitive things are printed, in which case he will have to compile the system after doing the modifications.

When the system fails to find a suitable printing rule, a tag #GENTERMappears in the message.

In the following we give some examples showing how to write the printing rules for the non-terminal and terminal symbols of a grammar. We will test them frequently by inspecting the error messages. Then, we give the grammar of printing rules and a description of its semantics.

9.3.1 The Printing Rules

The printing of non terminals

The printing is the inverse process of parsing. While a grammar rule maps an input stream of characters into an AST, a printing rule maps an AST into an output stream of printing orders. So given a certain grammar rule, the printing rule is generally obtained by inverting the grammar rule.

Like grammar rules, it is possible to define several rules at the same time. The exact syntax for complex rules is described in 9.3.2. A simple printing rule is of the form:

Syntax universe level precedence : name [ pattern ] -> [ printing-orders ].

where :

universe is an identifier denoting the universe of the AST to be printed. There is a correspondence between the universe of the grammar rule used to generate the AST and the one of the printing rule:

Univ. Grammar Univ. Printing

tactic tactic

command constr

The vernac universe has no equivalent in pretty-printing since vernac phrases are never printed by the system. Error messages are reported by re-displaying what the user entered.
precedence is positive integer indicating the precedence of the rule. In general the precedence for tactics is 0. The universe of commands is implicitly stratified by the hierarchy of the parsing rules. We have non terminals command0, command1, ..., command10. The idea is that objects parsed with the non terminal command_i have precedence i. In most of the cases we fix the precedence of the printing rules for commands to be the same number of the non terminal with which it is parsed.

A precedence may also be a triple of integers. The triples are ordered in lexicographic order, and the level n is equal to [n 0 0].
name is the name of the printing rule. A rule is identified by both its universe and name, if there are two rules with both the same name and universe, then the last one overrides the former.
pattern is a pattern that matches the AST to be printed. The syntax of patterns is very similar to the grammar for ASTs. A description of their syntax is given in section 9.1.
printing-orders is the sequence of orders indicating the concrete layout of the printer.

Example 1: Syntax for user-defined tactics.

The first usage of the Syntax command might be the printing order for a user-defined tactic:

Coq < Declare ML Module "eqdecide".

Coq < Syntax tactic level 0:
Coq < ComparePP [(Compare $com1 $com2)] ->
Coq < ["Compare" [1 2] $com1 [1 2] $com2].

If such a printing rule is not given, a disgraceful #GENTERM will appear when typing Show Script or Save. For a tactic macro defined by a Tactic Definition command, a printing rule is automatically generated so the user don't have to write one.

Example 2: Defining the syntax for new constants.

Let's define the constant Xor in Coq:

Coq < Definition Xor := [A,B:Prop] A/\~B \/ ~A/\B.

Given this definition, we want to use the syntax of A X B to denote (Xor A B). To do that we give the grammar rule:

Coq < Grammar command command7 :=
Coq < Xor [ command6($c1) "X" command7($c2) ] -> [<<(Xor $c1 $c2)>>].

Note that the operator is associative to the right. Now True X False is well parsed:

Coq < Goal True X False.
1 subgoal

  ============================
   (Xor True False)

To have it well printed we extend the printer:

Coq < Syntax constr level 7:
Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ $t1:L " X " $t2:E ].

and now we have the desired syntax:

Coq < Show.
1 subgoal

  ============================
   True X False

Let's comment the rule:

constr is the universe of the printing rule.
7 is the rule's precedence and it is the same one than the parsing production (command7).
Pxor is the name of the printing rule.
<<(Xor $t1 $t2)>> is the pattern of the AST to be printed. Between << >> we are allowed to use the syntax of command instead of syntax of ASTs. Metavariables may occur in the pattern but preceded by $.
$t1:L " X " $t2:E are the printing orders, it tells to print the value of $t1 then the symbol X and then the value of $t2.

The L in the little box $t1:L indicates not to put parentheses around the value of $t1 if its precedence is less than the rule's one. An E instead of the L would mean not to put parentheses around the value of $t1 if its the precedence is less or equal than the rule's one.

The associativity of the operator can be expressed in the following way:

$t1:L " X " $t2:E associates the operator to the right.

$t1:E " X " $t2:L associates to the left.

$t1:L " X " $t2:L is non-associative.

Note that while grammar rules are related by the name of non-terminals (such as command6 and command7) printing rules are isolated. The Pxor rule tells how to print an Xor expression but not how to print its subterms. The printer looks up recursively the rules for the values of $t1 and $t2. The selection of the printing rules is strictly determined by the structure of the AST to be printed.

This could have been defined with the Infix command.

Example 3: Forcing to parenthesize a new syntactic construction

You can force to parenthesize a new syntactic construction by fixing the precedence of its printing rule to a number greater than 9. For example a possible printing rule for the Xor connector in the prefix notation would be:

Coq < Syntax constr level 10:
Coq < ex_imp [<<(Xor $t1 $t2)>>] -> [ "X " $t1:L " " $t2:L ].

No explicit parentheses are contained in the rule, nevertheless, when using the connector, the parentheses are automatically written:

Coq < Show.
1 subgoal

  ============================
   (X True False)

A precedence higher than 9 ensures that the AST value will be parenthesized by default in either the empty context or if it occurs in a context where the instructions are of the form $t:L or $t:E.

Example 4: Dealing with list patterns in the syntax rules

The following productions extend the parser to recognize a tactic called MyIntros that receives a list of identifiers as argument as the primitive Intros tactic does:

Coq < Grammar tactic simple_tactic :=
Coq < my_intros [ "MyIntros" ne_identarg_list($idl) ] ->
Coq < [(MyIntrosWith ($LIST $idl))].

To define the printing rule for MyIntros it is necessary to define the printing rule for the non terminal ne_identarg_list. In grammar productions the dependency between the non terminals is explicit. This is not the case for printing rules, where the dependency between the rules is determined by the structure of the pattern. So, the way to make explicit the relation between printing rules is by adding structure to the patterns.

Coq < Syntax tactic level 0:
Coq < myintroswith [(MyIntrosWith ($LIST $L))] ->
Coq < [ "MyIntros " (NEIDENTARGLIST ($LIST $L)) ].

This rule says to print the string MyIntros and then to print the value of
(NEIDENTARGLIST ($LIST $L)).

Coq < Syntax tactic level 0:
Coq < ne_identarg_list_cons [(NEIDENTARGLIST $id ($LIST $l))]
Coq < -> [ $id " " (NEIDENTARGLIST ($LIST $l)) ]
Coq < | ne_identarg_list_single [(NEIDENTARGLIST $id)] -> [ $id ].
Warning: overriding syntax rule tactic:ne_identarg_list_cons.

The first rule says how to print a non-empty list, while the second one says how to print the list with exactly one element. Note that the pattern structure of the binding in the first rule ensures its use in a recursive way.

Like the order of grammar productions, the order of printing rules does matter. In case of two rules whose patterns superpose each other the last rule is always chosen. In the example, if the last two rules were written in the inverse order the printing will not work, because only the rule ne_identarg_list_cons would be recursively retrieved and there is no rule for the empty list. Other possibilities would have been to write a rule for the empty list instead of the ne_identarg_list_single rule,

Coq < Syntax tactic level 0:
Coq < ne_identarg_list_nil [(NEIDENTARGLIST)] -> [ ].

This rule indicates to do nothing in case of the empty list. In this case there is no superposition between patterns (no critical pairs) and the order is not relevant. But a useless space would be printed after the last identifier.

Example 5: Defining constants with arbitrary number of arguments

Sometimes the constants we define may have an arbitrary number of arguments, the typical case are polymorphic functions. Let's consider for example the functional composition operator. The following rule extends the parser:

Coq < Definition explicit_comp := [A,B,C:Set][f:A->B][g:B->C][a:A](g (f a)).

Coq < Grammar command command6 :=
Coq < expl_comp [command5($c1) "o" command6($c2) ] ->
Coq < [<<(explicit_comp ? ? ? $c1 $c2)>>].

Our first idea is to write the printing rule just by ``inverting'' the production:

Coq < Syntax constr level 6:
Coq < expl_comp [<<(explicit_comp ? ? ? $f $g)>>] -> [ $f:L "o" $g:L ].

This rule is not correct: ? is an ordinary AST (indeed, it is the AST (XTRA "ISEVAR")), and does not behave as the ``wildcard'' pattern $_. Here is a correct version of this rule:

Coq < Syntax constr level 6:
Coq < expl_comp [<<(explicit_comp $_ $_ $_ $f $g)>>] -> [ $f:L "o" $g:L ].

Let's test the printing rule:

Coq < Definition Id := [A:Set][x:A]x.
Id is defined

Coq < Check (Id nat) o (Id nat).
(Id nat)o(Id nat)
: nat->nat

Coq < Check ((Id nat)o(Id nat) O).
(explicit_comp nat nat nat (Id nat) (Id nat) O)
: nat

In the first case the rule was used, while in the second one the system failed to match the pattern of the rule with the AST of ((Id nat)o(Id nat) O). Internally the AST of this term is the same as the AST of the term (explicit_comp nat nat nat (Id nat) (Id nat) O). When the system retrieves our rule it tries to match an application of six arguments with an application of five arguments (the AST of (explicit_comp $_ $_ $_ $f $g)). Then, the matching fails and the term is printed using the rule for application.

Note that the idea of adding a new rule for explicit_comp for the case of six arguments does not solve the problem, because of the polymorphism, we can always build a term with one argument more. The rules for application deal with the problem of having an arbitrary number of arguments by using list patterns. Let's see these rules:

Coq < Syntax constr level 10:
Coq < app [(APPLIST $H ($LIST $T))] ->
Coq < [ [<hov 0> $H:E (APPTAIL ($LIST $T)):E ] ]
Coq <
Coq < | apptailcons [(APPTAIL $H ($LIST $T))] ->
Coq < [ [1 1] $H:L (APPTAIL ($LIST $T)):E ]
Coq < | apptailnil [(APPTAIL)] -> [ ].

The first rule prints the operator of the application, and the second prints the list of its arguments. Then, one solution to our problem is to specialize the first rule of the application to the cases where the operator is explicit_comp and the list pattern has at least five arguments:

Coq < Syntax constr level 10:
Coq < expl_comp
Coq < [(APPLIST <<explicit_comp>> $_ $_ $_ $f $g ($LIST $l))]
Coq < -> [ [<hov 0> $f:L "o" $g:L (APPTAIL ($LIST $l)) ] ].

Now we can see that this rule works for any application of the operator:

Coq < Check ((Id nat) o (Id nat) O).
((Id nat)o(Id nat) O)
: nat

Coq < Check ((Id nat->nat) o (Id nat->nat) [x:nat]x O).
((Id nat->nat)o(Id nat->nat) [x:nat]x O)
: nat

In the examples presented by now, the rules have no information about how to deal with indentation, break points and spaces, the printer will write everything in the same line without spaces. To indicate the concrete layout of the patterns, there's a simple language of printing instructions that will be described in the following section.

The printing of terminals

The user is not expected to write the printing rules for terminals, this is done automatically. Primitive printing is done for identifiers, strings, paths, numbers. For example :

Coq < Grammar vernac vernac :=
Coq < mycd [ "MyCd" prim:string($dir) "." ] -> [(MYCD $dir)].

Coq < Syntax vernac level 0:
Coq < mycd [(MYCD $dir)] -> [ "MyCd " $dir ].

There is no more need to encapsulate the $dir meta-variable with the $PRIM or the $STR operator as in the version 6.1. However, the pattern (MYCD ($STR $dir)) would be safer, because the rule would not be selected to print an ill-formed AST. The name of default primitive printer is the Objective Caml function print_token. If the user wants a particular terminal to be printed by another printer, he may specify it in the right part of the rule. Example:

Coq < Syntax tactic level 0 :
Coq < do_pp [(DO ($NUM $num) $tactic)]
Coq < -> [ "Do " $num:"my_printer" [1 1] $tactic ].

The printer my_printer must have been installed as shown below.

Primitive printers

Writing and installing primitive pretty-printers requires to have the sources of the system like writing tactics.

A primitive pretty-printer is a Caml function of type

     Esyntax.std_printer -> CoqAst.t -> Pp.std_ppcmds

The first argument is the global printer, it can be called for example by the specific printer to print subterms. The second argument is the AST to print, and the result is a stream of printing orders like :

'sTR"string" to print the string string
'bRK num1 num2 that has the same semantics than [ num1 num2 ] in the print rules.
'sPC to leave a blank space
'iNT n to print the integer n
...

There is also commands to make boxes (h or hv, described in file src/lib/util/pp.mli). Once the printer is written, it must be registered by the command :

     Esyntax.Ppprim.add ("name",my_printer);;

Then, in the toplevel, after having loaded the right Objective Caml module, it can be used in the right hand side of printing orders using the syntax $truc:"name".

The real name and the registered name of a pretty-printer does not need to be the same. However, it can be nice and simple to give the same name.

9.3.2 Syntax for pretty printing rules

This section describes the syntax for printing rules. The metalanguage conventions are the same as those specified for the definition of the pattern's syntax in section 9.1. The grammar of printing rules is the following:

printing-rule	::=	`Syntax` ident level `;` ... `;` level
level	::=	`level` precedence `:` rule `\|` ... `\|` rule
precedence	::=	integer \| `[` integer integer integer `]`
rule	::=	ident `[` pattern `] -> [` [printing-order ... printing-order] `]`
printing-order	::=	`FNL`
	\|	string
	\|	`[` integer integer `]`
	\|	`[` box [printing-order ... printing-order] `]`
	\|	ast [`:` prim-printer] [`:` paren-rel]
box	::=	`<` box-type integer `>`
box-type	::=	`hov` \| `hv` \| `v` \| `h`
paren-rel	::=	`L` \| `E`
prim-printer	::=	string
pattern	::=	ast

As already stated, the order of rules in a given level is relevant (the last ones override the previous ones).

Pretty grammar structures

The basic structure is the printing order sequence. Each order has a printing effect and they are sequentially executed. The orders can be:

printing orders
printing boxes

Printing orders

Printing orders can be of the form:

"string" prints the string.
FNL force a new line.

$t:paren-rel or $t:prim-printer:paren-rel

ast is used to build an AST in current context. The printer looks up the adequate printing rule and applies recursively this method. The optional field prim-printer is a string with the name primitive pretty-printer to call (The name is not the name of the Objective Caml function, but the name given to Esyntax.Ppprim.add). Recursion of the printing is determined by the pattern's structure. paren-rel is the following:

`L`	if t 's precedence is less than the rule's one, then no parentheses
	around t are written.
`E`	if t 's precedence is less or equal than the rule's one then no parentheses
	around t are written.
none	never write parentheses around t.

Printing boxes

The concept of formatting boxes is used to describe the concrete layout of patterns: a box may contain many objects which are orders or subboxes sequences separated by breakpoints; the box wraps around them an imaginary rectangle.

Box types

The type of boxes specifies the way the components of the box will be displayed and may be:
- h : to catenate objects horizontally.
- v : to catenate objects vertically.
- hv : to catenate objects as with an ``h box'' but an automatic vertical folding is applied when the horizontal composition does not fit into the width of the associated output device.
- hov : to catenate objects horizontally but if the horizontal composition does not fit, a vertical composition will be applied, trying to catenate horizontally as many objects as possible.
The type of the box can be followed by a n offset value, which is the offset added to the current indentation when breaking lines inside the box.
Boxes syntax

A box is described by a sequence surrounded by [ ]. The first element of the sequence is the box type: this type surrounded by the symbols < > is one of the words hov, hv, v, v followed by an offset. The default offset is 0 and the default box type is h.
Breakpoints

In order to specify where the pretty-printer is allowed to break, one of the following break-points may be used:
- [0 0] is a simple break-point, if the line is not broken here, no space is included (``Cut'').
- [1 0] if the line is not broken then a space is printed (``Spc'').
- [i j] if the line is broken, the value j is added to the current indentation for the following line; otherwise i blank spaces are inserted (``Brk'').
Examples : It is interesting to test printing rules on ``small'' and ``large'' expressions in order to see how the break of lines and indentation are managed. Let's define two constants and make a Print of them to test the rules. Here are some examples of rules for our constant Xor:

Coq < Definition A := True X True.

Coq < Definition B := True X True X True X True X True X True X True
Coq < X True X True X True X True X True X True.

Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ $t1:L " X " $t2:E ].

This rule prints everything in the same line exceeding the line's width.

Coq < Print B.
B =
True X True X True X True X True X True X True X True X True X True X Tru
e X True X True
     : Prop

Let's add some break-points in order to force the printer to break the line before the operator:

Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ $t1:L [0 1] " X " $t2:E ].

Coq < Print B.
B = True X True X True X True X True X True X True X True X True X True
  X True X True X True
     : Prop

The line was correctly broken but there is no indentation at all. To deal with indentation we use a printing box:

Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] ->
Coq < [ [<hov 0> $t1:L [0 1] " X " $t2:E ] ].

With this rule the printing of A is correct, an the printing of B is indented.

Coq < Print B.
B =
True
  X True
      X True
          X True
              X True
                  X True
                      X True X True X True X True X True X True X True
     : Prop

If we had chosen the mode v instead of hov :

Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ [<v 0> $t1:L [0 1] " X " $t2:E ] ].

We would have obtained a vertical presentation:

Coq < Print A.
A = True
      X True
     : Prop

The difference between the presentation obtained with the hv and hov type box is not evident at first glance. Just for clarification purposes let's compare the result of this silly rule using an hv and a hov box type:

Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] ->
Coq < [ [<hv 0> "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
Coq < [0 0] "----------------------------------------"
Coq < [0 0] "ZZZZZZZZZZZZZZZZ" ] ].

Coq < Print A.
A =
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
----------------------------------------
ZZZZZZZZZZZZZZZZ
     : Prop

Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] ->
Coq < [ [<hov 0> "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
Coq < [0 0] "----------------------------------------"
Coq < [0 0] "ZZZZZZZZZZZZZZZZ" ] ].

Coq < Print A.
A =
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
----------------------------------------ZZZZZZZZZZZZZZZZ
     : Prop

In the first case, as the three strings to be printed do not fit in the line's width, a vertical presentation is applied. In the second case, a vertical presentation is applied, but as the last two strings fit in the line's width, they are printed in the same line.

9.3.3 Debugging the printing rules

By now, there is almost no semantic check of printing rules in the system. To find out where the problem is, there are two possibilities: to analyze the rules looking for the most common errors or to work in the toplevel tracing the ml code of the printer. When the system can't find the proper rule to print an Ast, it prints #GENTERM ast. If you added no printing rule, it's probably a bug and you can send it to the Coq team.

Most common errors

Here are some considerations that may help to get rid of simple errors:

make sure that the rule you want to use is not defined in previously closed section.
make sure that all non-terminals of your grammar have their corresponding printing rules.
make sure that the set of printing rules for a certain non terminal covers all the space of AST values for that non terminal.
the order of the rules is important. If there are two rules whose patterns superpose (they have common instances) then it is always the most recent rule that will be retrieved.
if there are two rules with the same name and universe the last one overrides the first one. The system always warns you about redefinition of rules.

Tracing the Objective Caml code of the printer

Some of the conditions presented above are not easy to verify when dealing with many rules. In that case tracing the code helps to understand what is happening. The printers are in the file src/typing/printer. There you will find the functions:

gencompr : the printer of commands
gentacpr : the printer of tactics

These printers are defined in terms of a general printer genprint (this function is located in src/parsing/esyntax.ml) and by instantiating it with the adequate parameters. genprint waits for: the universe to which this AST belongs (tactic, constr), a default printer, the precedence of the AST inherited from the caller rule and the AST to print. genprint looks for a rule whose pattern matches the AST, and executes in order the printing orders associated to this rule. Subterms are printed by recursively calling the generic printer. If no rule matches the AST, the default printer is used.

An AST of a universe may have subterms that belong to another universe. For instance, let v be the AST of the tactic expression MyTactic O. The function gentacpr is called to print v. This function instantiates the general printer genprint with the universe tactic. Note that v has a subterm c corresponding to the AST of O (c belongs to the universe constr). genprint will try recursively to print all subterms of v as belonging to the same universe of v. If this is not possible, because the subterm belongs to another universe, then the default printer that was given as argument to genprint is applied. The default printer is responsible for changing the universe in a proper way calling the suitable printer for c.

Technical Remark. In the file PPTactic.v, there are some rules that do not arise from the inversion of a parsing rule. They are strongly related to the way the printing is implemented.

Coq < Syntax tactic level 8:
Coq < constr [(COMMAND $c)] -> [ (PPUNI$COMMAND $c):E ].

As an AST of tactic may have subterms that are commands, these rules allow the printer of tactic to change the universe. The PPUNI$COMMAND is a special identifier used for this purpose. They are used in the code of the default printer that gentacpr gives to genprint.

Retrieved by Memoweb from http://pauillac.inria.fr/coq/doc/node.2.0.2.html at 8/10/98

Univ. Grammar	Univ. Printing
tactic	tactic
command	constr