9.3 Writing your own pretty printing rules

9.3 Writing your own pretty printing rules

There is a mechanism for extending the vernacular's printer by adding, in the interactive toplevel, new printing rules. The printing rules are stored into a table and will be recovered at the moment of the printing by the vernacular's printer.

The user can print new constants, tactics and vernacular phrases with his desired syntax. The printing rules for new constants should be written after the definition of the constants. The rules should be outside a section if the user wants them to be exported.

The printing rules corresponding to the heart of the system (primitive tactics, commands and the vernacular language) are defined, respectively, in the files PPTactic.v and PPCommand.v (in the directory src/syntax). These files are automatically loaded in the initial state. The user is not expected to modify these files unless he dislikes the way primitive things are printed, in which case he will have to compile the system after doing the modifications.

When the system fails to find a suitable printing rule, a tag #GENTERMappears in the message.

In the following we give some examples showing how to write the printing rules for the non-terminal and terminal symbols of a grammar. We will test them frequently by inspecting the error messages. Then, we give the grammar of printing rules and a description of its semantics.

9.3.1 The Printing Rules

The printing of non terminals

The printing is the inverse process of parsing. While a grammar rule maps an input stream of characters into an AST, a printing rule maps an AST into an output stream of printing orders. So given a certain grammar rule, the printing rule is generally obtained by inverting the grammar rule.

Like grammar rules, it is possible to define several rules at the same time. The exact syntax for complex rules is described in 9.3.2. A simple printing rule is of the form:

Syntax  universe  level  precedence : name  [ pattern ] -> [ printing-orders ].


where :


Example 1: Syntax for user-defined tactics.

The first usage of the Syntax command might be the printing order for a user-defined tactic:

Coq < Declare ML Module "eqdecide".

Coq < Syntax tactic level 0:
Coq < ComparePP [(Compare $com1 $com2)] ->
Coq < ["Compare" [1 2] $com1 [1 2] $com2].


If such a printing rule is not given, a disgraceful #GENTERM will appear when typing Show Script or Save. For a tactic macro defined by a Tactic Definition command, a printing rule is automatically generated so the user don't have to write one.


Example 2: Defining the syntax for new constants.

Let's define the constant Xor in Coq:

Coq < Definition Xor := [A,B:Prop] A/\~B \/ ~A/\B.


Given this definition, we want to use the syntax of A X B to denote (Xor A B). To do that we give the grammar rule:

Coq < Grammar command command7 :=
Coq < Xor [ command6($c1) "X" command7($c2) ] -> [<<(Xor $c1 $c2)>>].


Note that the operator is associative to the right. Now True X False is well parsed:

Coq < Goal True X False.
1 subgoal
  
  ============================
   (Xor True False)


To have it well printed we extend the printer:

Coq < Syntax constr level 7:
Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ $t1:L " X " $t2:E ].


and now we have the desired syntax:

Coq < Show.
1 subgoal
  
  ============================
   True X False


Let's comment the rule:

Note that while grammar rules are related by the name of non-terminals (such as command6 and command7) printing rules are isolated. The Pxor rule tells how to print an Xor expression but not how to print its subterms. The printer looks up recursively the rules for the values of $t1 and $t2. The selection of the printing rules is strictly determined by the structure of the AST to be printed.

This could have been defined with the Infix command.


Example 3: Forcing to parenthesize a new syntactic construction

You can force to parenthesize a new syntactic construction by fixing the precedence of its printing rule to a number greater than 9. For example a possible printing rule for the Xor connector in the prefix notation would be:

Coq < Syntax constr level 10:
Coq < ex_imp [<<(Xor $t1 $t2)>>] -> [ "X " $t1:L " " $t2:L ].


No explicit parentheses are contained in the rule, nevertheless, when using the connector, the parentheses are automatically written:

Coq < Show.
1 subgoal
  
  ============================
   (X True False)


A precedence higher than 9 ensures that the AST value will be parenthesized by default in either the empty context or if it occurs in a context where the instructions are of the form $t:L or $t:E.


Example 4: Dealing with list patterns in the syntax rules

The following productions extend the parser to recognize a tactic called MyIntros that receives a list of identifiers as argument as the primitive Intros tactic does:

Coq < Grammar tactic simple_tactic :=
Coq < my_intros [ "MyIntros" ne_identarg_list($idl) ] ->
Coq < [(MyIntrosWith ($LIST $idl))].


To define the printing rule for MyIntros it is necessary to define the printing rule for the non terminal ne_identarg_list. In grammar productions the dependency between the non terminals is explicit. This is not the case for printing rules, where the dependency between the rules is determined by the structure of the pattern. So, the way to make explicit the relation between printing rules is by adding structure to the patterns.

Coq < Syntax tactic level 0:
Coq < myintroswith [(MyIntrosWith ($LIST $L))] ->
Coq < [ "MyIntros " (NEIDENTARGLIST ($LIST $L)) ].


This rule says to print the string MyIntros and then to print the value of
(NEIDENTARGLIST ($LIST $L)).

Coq < Syntax tactic level 0:
Coq < ne_identarg_list_cons [(NEIDENTARGLIST $id ($LIST $l))]
Coq < -> [ $id " " (NEIDENTARGLIST ($LIST $l)) ]
Coq < | ne_identarg_list_single [(NEIDENTARGLIST $id)] -> [ $id ].
Warning: overriding syntax rule tactic:ne_identarg_list_cons.


The first rule says how to print a non-empty list, while the second one says how to print the list with exactly one element. Note that the pattern structure of the binding in the first rule ensures its use in a recursive way.

Like the order of grammar productions, the order of printing rules does matter. In case of two rules whose patterns superpose each other the last rule is always chosen. In the example, if the last two rules were written in the inverse order the printing will not work, because only the rule ne_identarg_list_cons would be recursively retrieved and there is no rule for the empty list. Other possibilities would have been to write a rule for the empty list instead of the ne_identarg_list_single rule,

Coq < Syntax tactic level 0:
Coq < ne_identarg_list_nil [(NEIDENTARGLIST)] -> [ ].


This rule indicates to do nothing in case of the empty list. In this case there is no superposition between patterns (no critical pairs) and the order is not relevant. But a useless space would be printed after the last identifier.


Example 5: Defining constants with arbitrary number of arguments

Sometimes the constants we define may have an arbitrary number of arguments, the typical case are polymorphic functions. Let's consider for example the functional composition operator. The following rule extends the parser:

Coq < Definition explicit_comp := [A,B,C:Set][f:A->B][g:B->C][a:A](g (f a)).

Coq < Grammar command command6 :=
Coq < expl_comp [command5($c1) "o" command6($c2) ] ->
Coq < [<<(explicit_comp ? ? ? $c1 $c2)>>].


Our first idea is to write the printing rule just by ``inverting'' the production:

Coq < Syntax constr level 6:
Coq < expl_comp [<<(explicit_comp ? ? ? $f $g)>>] -> [ $f:L "o" $g:L ].


This rule is not correct: ? is an ordinary AST (indeed, it is the AST (XTRA "ISEVAR")), and does not behave as the ``wildcard'' pattern $_. Here is a correct version of this rule:

Coq < Syntax constr level 6:
Coq < expl_comp [<<(explicit_comp $_ $_ $_ $f $g)>>] -> [ $f:L "o" $g:L ].


Let's test the printing rule:

Coq < Definition Id := [A:Set][x:A]x.
Id is defined

Coq < Check (Id nat) o (Id nat).
(Id nat)o(Id nat)
     : nat->nat

Coq < Check ((Id nat)o(Id nat) O).
(explicit_comp nat nat nat (Id nat) (Id nat) O)
     : nat


In the first case the rule was used, while in the second one the system failed to match the pattern of the rule with the AST of ((Id nat)o(Id nat) O). Internally the AST of this term is the same as the AST of the term (explicit_comp nat nat nat (Id nat) (Id nat) O). When the system retrieves our rule it tries to match an application of six arguments with an application of five arguments (the AST of (explicit_comp $_ $_ $_ $f $g)). Then, the matching fails and the term is printed using the rule for application.

Note that the idea of adding a new rule for explicit_comp for the case of six arguments does not solve the problem, because of the polymorphism, we can always build a term with one argument more. The rules for application deal with the problem of having an arbitrary number of arguments by using list patterns. Let's see these rules:

Coq < Syntax constr level 10:
Coq < app [(APPLIST $H ($LIST $T))] ->
Coq < [ [<hov 0> $H:E (APPTAIL ($LIST $T)):E ] ]
Coq <
Coq < | apptailcons [(APPTAIL $H ($LIST $T))] ->
Coq < [ [1 1] $H:L (APPTAIL ($LIST $T)):E ]
Coq < | apptailnil [(APPTAIL)] -> [ ].


The first rule prints the operator of the application, and the second prints the list of its arguments. Then, one solution to our problem is to specialize the first rule of the application to the cases where the operator is explicit_comp and the list pattern has at least five arguments:

Coq < Syntax constr level 10:
Coq < expl_comp
Coq < [(APPLIST <<explicit_comp>> $_ $_ $_ $f $g ($LIST $l))]
Coq < -> [ [<hov 0> $f:L "o" $g:L (APPTAIL ($LIST $l)) ] ].


Now we can see that this rule works for any application of the operator:

Coq < Check ((Id nat) o (Id nat) O).
((Id nat)o(Id nat) O)
     : nat

Coq < Check ((Id nat->nat) o (Id nat->nat) [x:nat]x O).
((Id nat->nat)o(Id nat->nat) [x:nat]x O)
     : nat


In the examples presented by now, the rules have no information about how to deal with indentation, break points and spaces, the printer will write everything in the same line without spaces. To indicate the concrete layout of the patterns, there's a simple language of printing instructions that will be described in the following section.

The printing of terminals

The user is not expected to write the printing rules for terminals, this is done automatically. Primitive printing is done for identifiers, strings, paths, numbers. For example :

Coq < Grammar vernac vernac :=
Coq < mycd [ "MyCd" prim:string($dir) "." ] -> [(MYCD $dir)].

Coq < Syntax vernac level 0:
Coq < mycd [(MYCD $dir)] -> [ "MyCd " $dir ].


There is no more need to encapsulate the $dir meta-variable with the $PRIM or the $STR operator as in the version 6.1. However, the pattern (MYCD ($STR $dir)) would be safer, because the rule would not be selected to print an ill-formed AST. The name of default primitive printer is the Objective Caml function print_token. If the user wants a particular terminal to be printed by another printer, he may specify it in the right part of the rule. Example:

Coq < Syntax tactic level 0 :
Coq < do_pp [(DO ($NUM $num) $tactic)]
Coq < -> [ "Do " $num:"my_printer" [1 1] $tactic ].


The printer my_printer must have been installed as shown below.

Primitive printers

Writing and installing primitive pretty-printers requires to have the sources of the system like writing tactics.

A primitive pretty-printer is a Caml function of type
     Esyntax.std_printer -> CoqAst.t -> Pp.std_ppcmds
The first argument is the global printer, it can be called for example by the specific printer to print subterms. The second argument is the AST to print, and the result is a stream of printing orders like :



There is also commands to make boxes (h or hv, described in file src/lib/util/pp.mli). Once the printer is written, it must be registered by the command :
     Esyntax.Ppprim.add ("name",my_printer);;
Then, in the toplevel, after having loaded the right Objective Caml module, it can be used in the right hand side of printing orders using the syntax $truc:"name".

The real name and the registered name of a pretty-printer does not need to be the same. However, it can be nice and simple to give the same name.

9.3.2 Syntax for pretty printing rules

This section describes the syntax for printing rules. The metalanguage conventions are the same as those specified for the definition of the pattern's syntax in section 9.1. The grammar of printing rules is the following:

printing-rule ::= Syntax ident  level ; ... ; level
level ::= level precedence :  rule | ... | rule
precedence ::= integer  |  [ integer integer integer ]
rule ::= ident [ pattern ] -> [  [printing-order ... printing-order] ]
printing-order ::= FNL
  | string
  | [ integer integer ]
  | [ box [printing-order ... printing-order] ]
  | ast [: prim-printer]  [: paren-rel]
box ::= < box-type integer >
box-type ::=  hov | hv | v | h
paren-rel ::= L | E
prim-printer ::= string
pattern ::= ast


As already stated, the order of rules in a given level is relevant (the last ones override the previous ones).

Pretty grammar structures

The basic structure is the printing order sequence. Each order has a printing effect and they are sequentially executed. The orders can be:
Printing orders
Printing orders can be of the form:
Printing boxes
The concept of formatting boxes is used to describe the concrete layout of patterns: a box may contain many objects which are orders or subboxes sequences separated by breakpoints; the box wraps around them an imaginary rectangle.
  1. Box types

    The type of boxes specifies the way the components of the box will be displayed and may be:
    • h : to catenate objects horizontally.
    • v : to catenate objects vertically.
    • hv : to catenate objects as with an ``h box'' but an automatic vertical folding is applied when the horizontal composition does not fit into the width of the associated output device.

    • hov : to catenate objects horizontally but if the horizontal composition does not fit, a vertical composition will be applied, trying to catenate horizontally as many objects as possible.


    The type of the box can be followed by a n offset value, which is the offset added to the current indentation when breaking lines inside the box.

  2. Boxes syntax

    A box is described by a sequence surrounded by [ ]. The first element of the sequence is the box type: this type surrounded by the symbols < > is one of the words hov, hv, v, v followed by an offset. The default offset is 0 and the default box type is h.

  3. Breakpoints

    In order to specify where the pretty-printer is allowed to break, one of the following break-points may be used:

    • [0 0] is a simple break-point, if the line is not broken here, no space is included (``Cut'').
    • [1 0] if the line is not broken then a space is printed (``Spc'').
    • [i j] if the line is broken, the value j is added to the current indentation for the following line; otherwise i blank spaces are inserted (``Brk'').


    Examples : It is interesting to test printing rules on ``small'' and ``large'' expressions in order to see how the break of lines and indentation are managed. Let's define two constants and make a Print of them to test the rules. Here are some examples of rules for our constant Xor:

    Coq < Definition A := True X True.

    Coq < Definition B := True X True X True X True X True X True X True
    Coq < X True X True X True X True X True X True.
    Coq < Syntax constr level 6:
    Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ $t1:L " X " $t2:E ].


    This rule prints everything in the same line exceeding the line's width.

    Coq < Print B.
    B = 
    True X True X True X True X True X True X True X True X True X True X Tru
    e X True X True
         : Prop


    Let's add some break-points in order to force the printer to break the line before the operator:

    Coq < Syntax constr level 6:
    Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ $t1:L [0 1] " X " $t2:E ].


    Coq < Print B.
    B = True X True X True X True X True X True X True X True X True X True
      X True X True X True
         : Prop


    The line was correctly broken but there is no indentation at all. To deal with indentation we use a printing box:

    Coq < Syntax constr level 6:
    Coq < Pxor [<<(Xor $t1 $t2)>>] ->
    Coq < [ [<hov 0> $t1:L [0 1] " X " $t2:E ] ].


    With this rule the printing of A is correct, an the printing of B is indented.

    Coq < Print B.
    B = 
    True
      X True
          X True
              X True
                  X True
                      X True
                          X True X True X True X True X True X True X True
         : Prop


    If we had chosen the mode v instead of hov :

    Coq < Syntax constr level 6:
    Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ [<v 0> $t1:L [0 1] " X " $t2:E ] ].


    We would have obtained a vertical presentation:

    Coq < Print A.
    A = True
          X True
         : Prop


    The difference between the presentation obtained with the hv and hov type box is not evident at first glance. Just for clarification purposes let's compare the result of this silly rule using an hv and a hov box type:

    Coq < Syntax constr level 6:
    Coq < Pxor [<<(Xor $t1 $t2)>>] ->
    Coq < [ [<hv 0> "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
    Coq < [0 0] "----------------------------------------"
    Coq < [0 0] "ZZZZZZZZZZZZZZZZ" ] ].
    Coq < Print A.
    A = 
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    ----------------------------------------
    ZZZZZZZZZZZZZZZZ
         : Prop
    Coq < Syntax constr level 6:
    Coq < Pxor [<<(Xor $t1 $t2)>>] ->
    Coq < [ [<hov 0> "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
    Coq < [0 0] "----------------------------------------"
    Coq < [0 0] "ZZZZZZZZZZZZZZZZ" ] ].
    Coq < Print A.
    A = 
    XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    ----------------------------------------ZZZZZZZZZZZZZZZZ
         : Prop


    In the first case, as the three strings to be printed do not fit in the line's width, a vertical presentation is applied. In the second case, a vertical presentation is applied, but as the last two strings fit in the line's width, they are printed in the same line.

9.3.3 Debugging the printing rules

By now, there is almost no semantic check of printing rules in the system. To find out where the problem is, there are two possibilities: to analyze the rules looking for the most common errors or to work in the toplevel tracing the ml code of the printer. When the system can't find the proper rule to print an Ast, it prints #GENTERM ast. If you added no printing rule, it's probably a bug and you can send it to the Coq team.

Most common errors

Here are some considerations that may help to get rid of simple errors:

Tracing the Objective Caml code of the printer

Some of the conditions presented above are not easy to verify when dealing with many rules. In that case tracing the code helps to understand what is happening. The printers are in the file src/typing/printer. There you will find the functions:



These printers are defined in terms of a general printer genprint (this function is located in src/parsing/esyntax.ml) and by instantiating it with the adequate parameters. genprint waits for: the universe to which this AST belongs (tactic, constr), a default printer, the precedence of the AST inherited from the caller rule and the AST to print. genprint looks for a rule whose pattern matches the AST, and executes in order the printing orders associated to this rule. Subterms are printed by recursively calling the generic printer. If no rule matches the AST, the default printer is used.

An AST of a universe may have subterms that belong to another universe. For instance, let v be the AST of the tactic expression MyTactic O. The function gentacpr is called to print v. This function instantiates the general printer genprint with the universe tactic. Note that v has a subterm c corresponding to the AST of O (c belongs to the universe constr). genprint will try recursively to print all subterms of v as belonging to the same universe of v. If this is not possible, because the subterm belongs to another universe, then the default printer that was given as argument to genprint is applied. The default printer is responsible for changing the universe in a proper way calling the suitable printer for c.


Technical Remark. In the file PPTactic.v, there are some rules that do not arise from the inversion of a parsing rule. They are strongly related to the way the printing is implemented.

Coq < Syntax tactic level 8:
Coq < constr [(COMMAND $c)] -> [ (PPUNI$COMMAND $c):E ].


As an AST of tactic may have subterms that are commands, these rules allow the printer of tactic to change the universe. The PPUNI$COMMAND is a special identifier used for this purpose. They are used in the code of the default printer that gentacpr gives to genprint.



Retrieved by Memoweb from http://pauillac.inria.fr/coq/doc/node.2.0.2.html at 8/10/98