9.3 Writing your own pretty printing rules
9.3 Writing your own pretty printing rules
There is a mechanism for extending the
vernacular's printer by adding, in the interactive
toplevel, new printing rules. The printing rules are stored into a
table and will be recovered at the moment of the printing by the
vernacular's printer.
The user can print new constants, tactics and vernacular phrases
with his desired syntax. The printing rules
for new constants should be written after the definition of the
constants. The rules should be
outside a section if the user wants them to be exported.
The printing rules corresponding to the heart of the system (primitive
tactics, commands and the vernacular language) are defined,
respectively, in the files PPTactic.v and PPCommand.v
(in the directory src/syntax). These files are automatically
loaded in the initial state. The user is not expected to modify these
files unless he dislikes the way primitive things are printed, in
which case he will have to compile the system after doing the
modifications.
When the system fails to find a suitable printing rule, a tag
#GENTERM
appears in the message.
In the following we give some examples showing how to write the
printing rules for the non-terminal and terminal symbols of a
grammar. We will test them frequently by inspecting the error
messages. Then, we give the grammar of printing rules and a
description of its semantics.
9.3.1 The Printing Rules
The printing of non terminals
The printing is the inverse process of parsing. While a grammar rule
maps an input stream of characters
into an AST, a printing
rule maps an AST into an output stream of printing orders.
So given a certain grammar rule, the printing rule
is generally obtained by inverting the grammar rule.
Like grammar rules, it is possible to define several rules at the same
time. The exact syntax for complex rules is described
in 9.3.2. A simple printing rule is of the form:
Syntax
universe
level
precedence :
name
[
pattern ] -> [
printing-orders ].
where :
-
universe is an identifier denoting the universe of the AST to
be printed. There is a correspondence between the universe of the
grammar rule used to generate the AST and the one of the printing
rule:
Univ. Grammar |
Univ. Printing |
tactic |
tactic |
command |
constr |
The vernac universe has no equivalent in pretty-printing since
vernac phrases are never printed by the system. Error messages are
reported by re-displaying what the user entered.
- precedence is positive integer indicating the precedence
of the rule. In general the precedence for tactics is 0. The
universe of commands is implicitly stratified by the hierarchy of
the parsing rules. We have non terminals command0,
command1, ..., command10.
The idea is that objects parsed with the non terminal
commandi have precedence i. In most of the cases we fix the
precedence of the printing rules for commands to be the same number
of the non terminal with which it is parsed.
A precedence may also be a triple of integers. The triples are
ordered in lexicographic order, and the level n is equal to [n 0 0].
- name is the name of the
printing rule. A rule is identified by both its universe and name,
if there are two rules with both the same name and universe, then
the last one overrides the former.
- pattern is a pattern that matches the AST to be
printed. The syntax of patterns is very similar to the grammar for ASTs.
A description of their syntax is given in section
9.1.
- printing-orders is the sequence of orders indicating the
concrete layout of the printer.
Example 1: Syntax for user-defined tactics.
The first usage of the Syntax command might be the printing
order for a user-defined tactic:
Coq < Declare ML Module "eqdecide".
Coq < Syntax tactic level 0:
Coq < ComparePP [(Compare $com1 $com2)] ->
Coq < ["Compare" [1 2] $com1 [1 2] $com2].
If such a printing rule is not given, a disgraceful #GENTERM
will appear when typing Show Script or Save. For
a tactic macro defined by a Tactic Definition command, a
printing rule is automatically generated so the user don't have to
write one.
Example 2: Defining the syntax for new constants.
Let's define the constant Xor
in Coq:
Coq < Definition Xor := [A,B:Prop] A/\~B \/ ~A/\B.
Given this definition, we want to use the syntax of A X B
to denote (Xor A B)
. To do that we give the grammar rule:
Coq < Grammar command command7 :=
Coq < Xor [ command6($c1) "X" command7($c2) ] -> [<<(Xor $c1 $c2)>>].
Note that the operator is associative to the right.
Now True X False
is well parsed:
Coq < Goal True X False.
1 subgoal
============================
(Xor True False)
To have it well printed we extend the printer:
Coq < Syntax constr level 7:
Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ $t1:L " X " $t2:E ].
and now we have the desired syntax:
Coq < Show.
1 subgoal
============================
True X False
Let's comment the rule:
-
constr
is the universe of the printing rule.
7
is the rule's precedence and it is the same one than the
parsing production (command7).
Pxor
is the name of the printing rule.
<<(Xor $t1 $t2)>>
is the pattern of the AST to be
printed. Between << >>
we are allowed to use the syntax of
command instead of syntax of ASTs.
Metavariables may occur in the pattern but preceded by
$
.
$t1:L " X " $t2:E
are the printing
orders, it tells to print the value of $t1
then the symbol
X
and then the value of $t2
.
The L
in the little box $t1:L
indicates not
to put parentheses around the value of $t1
if its precedence
is less than the rule's one. An E
instead of the
L
would mean not to put parentheses around the value of
$t1
if its the precedence is less or equal than the
rule's one.
The associativity of the operator can be expressed in the following
way:
$t1:L " X " $t2:E
associates the operator to the right.
$t1:E " X " $t2:L
associates to the left.
$t1:L " X " $t2:L
is non-associative.
Note that while grammar rules are related by the name of non-terminals
(such as command6 and command7) printing rules are
isolated. The Pxor rule tells how to print an Xor
expression but not how to print its subterms. The printer looks up
recursively the rules for the values of $t1
and $t2
. The
selection of the printing rules is strictly determined by the
structure of the AST to be printed.
This could have been defined with the Infix command.
Example 3: Forcing to parenthesize a new syntactic construction
You can force to parenthesize a new syntactic construction by fixing
the precedence of its printing rule to a number greater than 9. For
example a possible printing rule for the Xor connector in the prefix
notation would be:
Coq < Syntax constr level 10:
Coq < ex_imp [<<(Xor $t1 $t2)>>] -> [ "X " $t1:L " " $t2:L ].
No explicit parentheses are contained in the rule, nevertheless, when
using the connector, the parentheses are automatically written:
Coq < Show.
1 subgoal
============================
(X True False)
A precedence higher than 9 ensures that the AST value will be
parenthesized by default in either the empty context or if it occurs
in a context where the instructions are of the form
$t:L
or $t:E
.
Example 4: Dealing with list patterns in the syntax rules
The following productions extend the parser to recognize a
tactic called MyIntros
that receives a list of identifiers as
argument as the primitive Intros
tactic does:
Coq < Grammar tactic simple_tactic :=
Coq < my_intros [ "MyIntros" ne_identarg_list($idl) ] ->
Coq < [(MyIntrosWith ($LIST $idl))].
To define the printing rule for MyIntros
it is necessary to
define the printing rule for the non terminal ne_identarg_list
.
In grammar productions the dependency between the non terminals is
explicit. This is not the case for printing rules, where the
dependency between the rules is determined by the structure of the
pattern. So, the way to make explicit the relation between printing
rules is by adding structure to the patterns.
Coq < Syntax tactic level 0:
Coq < myintroswith [(MyIntrosWith ($LIST $L))] ->
Coq < [ "MyIntros " (NEIDENTARGLIST ($LIST $L)) ].
This rule says to print the string MyIntros
and then to print
the value of
(NEIDENTARGLIST ($LIST $L))
.
Coq < Syntax tactic level 0:
Coq < ne_identarg_list_cons [(NEIDENTARGLIST $id ($LIST $l))]
Coq < -> [ $id " " (NEIDENTARGLIST ($LIST $l)) ]
Coq < | ne_identarg_list_single [(NEIDENTARGLIST $id)] -> [ $id ].
Warning: overriding syntax rule tactic:ne_identarg_list_cons.
The first rule says how to print a non-empty list, while the second
one says how to print the list with exactly one element. Note that the
pattern structure of the binding in the first rule ensures its use in
a recursive way.
Like the order of grammar productions, the order of printing rules
does matter. In case of two rules whose patterns superpose
each other the last rule is always chosen. In the example, if
the last two rules were written in the inverse order the printing will
not work, because only the rule ne_identarg_list_cons would
be recursively retrieved and there is no rule for the empty list.
Other possibilities would have been to write a rule for the empty list
instead of the ne_identarg_list_single rule,
Coq < Syntax tactic level 0:
Coq < ne_identarg_list_nil [(NEIDENTARGLIST)] -> [ ].
This rule indicates to do nothing in case of the empty list. In this
case there is no superposition between patterns (no critical pairs)
and the order is not relevant. But a useless space would be printed
after the last identifier.
Example 5: Defining constants with arbitrary number of arguments
Sometimes the constants we define may have an arbitrary number of
arguments, the typical case are polymorphic functions. Let's consider
for example the functional composition operator. The following rule
extends the parser:
Coq < Definition explicit_comp := [A,B,C:Set][f:A->B][g:B->C][a:A](g (f a)).
Coq < Grammar command command6 :=
Coq < expl_comp [command5($c1) "o" command6($c2) ] ->
Coq < [<<(explicit_comp ? ? ? $c1 $c2)>>].
Our first idea is to write the printing rule just by ``inverting'' the
production:
Coq < Syntax constr level 6:
Coq < expl_comp [<<(explicit_comp ? ? ? $f $g)>>] -> [ $f:L "o" $g:L ].
This rule is not correct: ?
is an ordinary AST (indeed, it is
the AST (XTRA "ISEVAR")
), and does not behave as the
``wildcard'' pattern $_
. Here is a correct version of this
rule:
Coq < Syntax constr level 6:
Coq < expl_comp [<<(explicit_comp $_ $_ $_ $f $g)>>] -> [ $f:L "o" $g:L ].
Let's test the printing rule:
Coq < Definition Id := [A:Set][x:A]x.
Id is defined
Coq < Check (Id nat) o (Id nat).
(Id nat)o(Id nat)
: nat->nat
Coq < Check ((Id nat)o(Id nat) O).
(explicit_comp nat nat nat (Id nat) (Id nat) O)
: nat
In the first case the rule was used, while in the second one the
system failed to match the pattern of the rule with the AST of
((Id nat)o(Id nat) O)
.
Internally the AST of this term is the same as the AST of the
term (explicit_comp nat nat nat (Id nat) (Id nat) O)
.
When the system retrieves our rule it tries to match an application of
six arguments with an application of five arguments (the AST of
(explicit_comp $_ $_ $_ $f $g)
). Then, the matching fails and
the term is printed using the rule for application.
Note that the idea of adding a new rule for explicit_comp
for
the case of six arguments does not solve the problem, because of the
polymorphism, we can always build a term with one argument more. The
rules for application deal with the problem of having an arbitrary
number of arguments by using list patterns. Let's see these rules:
Coq < Syntax constr level 10:
Coq < app [(APPLIST $H ($LIST $T))] ->
Coq < [ [<hov 0> $H:E (APPTAIL ($LIST $T)):E ] ]
Coq <
Coq < | apptailcons [(APPTAIL $H ($LIST $T))] ->
Coq < [ [1 1] $H:L (APPTAIL ($LIST $T)):E ]
Coq < | apptailnil [(APPTAIL)] -> [ ].
The first rule prints the operator of the application, and the second
prints the list of its arguments. Then, one solution to our problem is
to specialize the first rule of the application to the cases where the
operator is explicit_comp
and the list pattern has at
least five arguments:
Coq < Syntax constr level 10:
Coq < expl_comp
Coq < [(APPLIST <<explicit_comp>> $_ $_ $_ $f $g ($LIST $l))]
Coq < -> [ [<hov 0> $f:L "o" $g:L (APPTAIL ($LIST $l)) ] ].
Now we can see that this rule works for any application of the
operator:
Coq < Check ((Id nat) o (Id nat) O).
((Id nat)o(Id nat) O)
: nat
Coq < Check ((Id nat->nat) o (Id nat->nat) [x:nat]x O).
((Id nat->nat)o(Id nat->nat) [x:nat]x O)
: nat
In the examples presented by now, the rules have no information about
how to deal with indentation, break points and spaces, the printer
will write everything in the same line without spaces. To indicate the
concrete layout of the patterns, there's a simple language of printing
instructions that will be described in the following section.
The printing of terminals
The user is not expected to write the printing rules for terminals,
this is done automatically. Primitive printing is done for
identifiers, strings, paths, numbers. For example :
Coq < Grammar vernac vernac :=
Coq < mycd [ "MyCd" prim:string($dir) "." ] -> [(MYCD $dir)].
Coq < Syntax vernac level 0:
Coq < mycd [(MYCD $dir)] -> [ "MyCd " $dir ].
There is no more need to encapsulate the $dir
meta-variable
with the $PRIM
or the $STR
operator as in the version
6.1. However, the pattern (MYCD ($STR $dir))
would be safer,
because the rule would not be selected to print an ill-formed AST. The
name of default primitive printer is the Objective Caml function print_token. If the user wants a particular terminal to be printed
by another printer, he may specify it in the right part of the
rule. Example:
Coq < Syntax tactic level 0 :
Coq < do_pp [(DO ($NUM $num) $tactic)]
Coq < -> [ "Do " $num:"my_printer" [1 1] $tactic ].
The printer my_printer must have been installed as shown
below.
Primitive printers
Writing and installing primitive pretty-printers requires to have the
sources of the system like writing tactics.
A primitive pretty-printer is a Caml function of type
Esyntax.std_printer -> CoqAst.t -> Pp.std_ppcmds
The first
argument is the global printer, it can be called for example by the
specific printer to print subterms. The second argument is the AST to
print, and the result is a stream of printing orders like :
-
'sTR"
string"
to print the string
string
'bRK
num1 num2 that has the same semantics than
[
num1 num2 ]
in the print rules.
'sPC
to leave a blank space
'iNT
n to print the integer n
- ...
There is also commands to make boxes (h
or hv
, described
in file src/lib/util/pp.mli). Once the printer is written, it
must be registered by the command :
Esyntax.Ppprim.add ("name",my_printer);;
Then, in the toplevel, after having loaded the right Objective Caml module,
it can be used in the right hand side of printing orders using the
syntax $truc:"name"
.
The real name and the registered name of a pretty-printer does not
need to be the same. However, it can be nice and simple to give the
same name.
9.3.2 Syntax for pretty printing rules
This section describes the syntax for printing rules. The
metalanguage conventions are the same as those specified for the
definition of the pattern's syntax in section 9.1.
The grammar of printing rules is the following:
printing-rule |
::= |
Syntax ident level ; ... ; level |
level |
::= |
level precedence :
rule | ... | rule |
precedence |
::= |
integer | [ integer integer integer ] |
rule |
::= |
ident [ pattern ] -> [ [printing-order ... printing-order] ] |
printing-order |
::= |
FNL |
|
| |
string |
|
| |
[ integer integer ] |
|
| |
[ box [printing-order ... printing-order] ] |
|
| |
ast [: prim-printer] [: paren-rel] |
box |
::= |
< box-type integer > |
box-type |
::= |
hov | hv | v | h |
paren-rel |
::= |
L | E |
prim-printer |
::= |
string |
pattern |
::= |
ast |
As already stated, the order of rules in a given level is relevant
(the last ones override the previous ones).
Pretty grammar structures
The basic structure is the printing order sequence. Each order has a
printing effect and they are sequentially executed. The orders can
be:
-
printing orders
- printing boxes
Printing orders
Printing orders can be of the form:
-
"
string"
prints the string.
FNL
force a new line.
- $t:paren-rel or
$t:prim-printer:paren-rel
ast is used to build an AST in current context. The printer
looks up the adequate printing rule and applies recursively this
method. The optional field prim-printer is a string with the
name primitive pretty-printer to call (The name is not the name of
the Objective Caml function, but the name given to Esyntax.Ppprim.add). Recursion of the printing is determined by
the pattern's structure. paren-rel is the following:
L |
if t 's precedence is less than the rule's one, then no
parentheses |
|
around t are written. |
E |
if t 's precedence is less or equal than the rule's one
then no parentheses |
|
around t are written. |
none |
never write parentheses around t. |
Printing boxes
The concept of formatting boxes is used to describe the concrete
layout of patterns: a box may contain many objects which are orders or
subboxes sequences separated by breakpoints; the box wraps around
them an imaginary rectangle.
-
Box types
The type of boxes specifies the way the components of the box will
be displayed and may be:
-
h
: to catenate objects horizontally.
v
: to catenate objects vertically.
hv
: to catenate objects as with an ``h box'' but an
automatic vertical folding is applied when the horizontal
composition does not fit into the width of the associated output
device.
hov
: to catenate objects horizontally but if the
horizontal composition does not fit, a vertical composition will be
applied, trying to catenate horizontally as many objects as possible.
The type of the box can be followed by a n offset value, which
is the offset added to the current indentation when breaking lines
inside the box.
- Boxes syntax
A box is described by a sequence surrounded by [ ]
. The first
element of the sequence is the box type: this type surrounded by the
symbols < >
is one of the words hov
, hv
,
v
, v
followed by an offset. The default offset is 0
and the default box type is h
.
- Breakpoints
In order to specify where the pretty-printer is allowed to break,
one of the following break-points may be used:
-
[0 0]
is a simple break-point, if the line is not broken
here, no space is included (``Cut'').
[1 0]
if the line is not broken then a space is printed
(``Spc'').
[i j]
if the line is broken, the value j is added to the
current indentation for the following line; otherwise i blank spaces
are inserted (``Brk'').
Examples :
It is interesting to test printing rules on ``small'' and ``large''
expressions in order to see how the break of lines and indentation are
managed. Let's define two constants and make a Print
of them to
test the rules. Here are some examples of rules for our constant
Xor
:
Coq < Definition A := True X True.
Coq < Definition B := True X True X True X True X True X True X True
Coq < X True X True X True X True X True X True.
Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ $t1:L " X " $t2:E ].
This rule prints everything in the same line exceeding the line's
width.
Coq < Print B.
B =
True X True X True X True X True X True X True X True X True X True X Tru
e X True X True
: Prop
Let's add some break-points in order to force the printer to break the
line before the operator:
Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ $t1:L [0 1] " X " $t2:E ].
Coq < Print B.
B = True X True X True X True X True X True X True X True X True X True
X True X True X True
: Prop
The line was correctly broken but there is no indentation at all. To
deal with indentation we use a printing box:
Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] ->
Coq < [ [<hov 0> $t1:L [0 1] " X " $t2:E ] ].
With this rule the printing of A is correct, an the printing of B is
indented.
Coq < Print B.
B =
True
X True
X True
X True
X True
X True
X True X True X True X True X True X True X True
: Prop
If we had chosen the mode v
instead of hov
:
Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] -> [ [<v 0> $t1:L [0 1] " X " $t2:E ] ].
We would have obtained a vertical presentation:
Coq < Print A.
A = True
X True
: Prop
The difference between the presentation obtained with the hv
and hov
type box is not evident at first glance. Just for
clarification purposes let's compare the result of this silly rule
using an hv
and a hov
box type:
Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] ->
Coq < [ [<hv 0> "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
Coq < [0 0] "----------------------------------------"
Coq < [0 0] "ZZZZZZZZZZZZZZZZ" ] ].
Coq < Print A.
A =
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
----------------------------------------
ZZZZZZZZZZZZZZZZ
: Prop
Coq < Syntax constr level 6:
Coq < Pxor [<<(Xor $t1 $t2)>>] ->
Coq < [ [<hov 0> "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
Coq < [0 0] "----------------------------------------"
Coq < [0 0] "ZZZZZZZZZZZZZZZZ" ] ].
Coq < Print A.
A =
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
----------------------------------------ZZZZZZZZZZZZZZZZ
: Prop
In the first case, as the three strings to be printed do not fit in
the line's width, a vertical presentation is applied. In the second
case, a vertical presentation is applied, but as the last two strings
fit in the line's width, they are printed in the same line.
9.3.3 Debugging the printing rules
By now, there is almost no semantic check of printing rules in the
system. To find out where the problem is, there are two
possibilities: to analyze the rules looking for the most common
errors or to work in the toplevel tracing the ml code of the
printer.
When the system can't find the proper rule to print an Ast, it prints
#GENTERM
ast. If you added no printing rule,
it's probably a bug and you can send it to the Coq team.
Most common errors
Here are some considerations that may help to get rid of simple
errors:
-
make sure that the rule you want to use is not defined in
previously closed section.
- make sure that all non-terminals of your grammar have
their corresponding printing rules.
- make sure that the set of printing rules for a certain non
terminal covers all the space of AST values for that non terminal.
- the order of the rules is important. If there are two rules
whose patterns superpose (they have common instances) then it is
always the most recent rule that will be retrieved.
- if there are two rules with the same name and universe the last
one overrides the first one. The system always warns you about
redefinition of rules.
Tracing the Objective Caml code of the printer
Some of the conditions presented above are not easy to verify when
dealing with many rules. In that case tracing the code helps to
understand what is happening. The printers are in the file src/typing/printer. There you will find the functions:
-
gencompr : the printer of commands
- gentacpr : the printer of tactics
These printers are defined in terms of a general printer genprint (this function is located in src/parsing/esyntax.ml)
and by instantiating it with the adequate parameters. genprint
waits for: the universe to which this AST belongs (tactic, constr), a default printer, the precedence of the AST inherited from
the caller rule and the AST to print. genprint looks for a rule
whose pattern matches the AST, and executes in order the printing
orders associated to this rule. Subterms are printed by recursively
calling the generic printer. If no rule matches the AST, the default
printer is used.
An AST of a universe may have subterms that belong to another
universe. For instance, let v be the AST of the tactic
expression MyTactic O
. The function gentacpr is called
to print v. This function instantiates the general printer genprint with the universe tactic. Note that v has a subterm
c corresponding to the AST of O
(c belongs to the universe
constr). genprint will try recursively to print all
subterms of v as belonging to the same universe of v. If this is
not possible, because the subterm belongs to another universe, then
the default printer that was given as argument to genprint is
applied. The default printer is responsible for changing the universe
in a proper way calling the suitable printer for c.
Technical Remark. In the file
PPTactic.v
, there are some rules that do not arise from the
inversion of a parsing rule. They are strongly related to the way the
printing is implemented.
Coq < Syntax tactic level 8:
Coq < constr [(COMMAND $c)] -> [ (PPUNI$COMMAND $c):E ].
As an AST of tactic may have subterms that are commands, these rules
allow the printer of tactic to change the universe. The
PPUNI$COMMAND
is a special identifier used for this
purpose. They are used in the code of the default printer that gentacpr gives to genprint.
Retrieved by Memoweb from http://pauillac.inria.fr/coq/doc/node.2.0.2.html at 8/10/98