9.1 Abstract syntax trees (AST)

9.1 Abstract syntax trees (AST)

The AST expressions are conceptually divided into two classes: constructive expressions (those that can be used in parsing rules) and destructive expressions (those that can be used in pretty printing rules). In the following we give the concrete syntax of expressions and some examples of their usage.

The BNF grammar ast in Fig. 9.1 defines the syntax of both constructive and destructive expressions. The lexical conventions are the same as in section 1.1. Let us first describe the features common to constructive and destructive expressions.


ast ::= meta (metavariable)
  | ident (variable)
  | integer (positive integer)
  | string (quoted string)
  | path (section-path)
  | { ident } (identifier)
  | [ name ] ast (abstraction)
  | ( ident  [ast ... ast] ) (application node)
  | ( special-tok  meta ) (special-operator)
  | ' ast (quoted ast)
  | quotation  


meta
::= $ident  
path ::= #ident ... #ident . kind  
kind ::= cci  | fw  |  obj  
name ::= <>  |  ident   |  meta  


special-tok
::= $LIST | $VAR | $NUM |  $STR | $PATH | $ID  


quotation
::= <<  meta-command  >>  
  | <:command:<  meta-command  >>  
  | <:vernac:<  meta-vernac  >>  
  | <:tactic:<  meta-tactic  >>  


Figure 9.1:Syntax of AST expressions

Atomic AST

An atomic AST can be either a variable, a natural number, a quoted string, a section path or an identifier. They are the basic components of an AST.

Metavariable

As we will see later, metavariables may denote an AST or an AST list. So, we introduce two types of variables: Ast and List. The type of variables is checked statically: an expression refering to undefined metavariables, or using metavariables with an unappropriate type, will be rejected.

Metavariables are used to perform substitutions in constructive expressions: they are replaced by their value in a given environment. They are also involved in the pattern matching operation: metavariables in destructive patterns create new bindings in the environment.

Application node

Note that the AST syntax is rather general, since application nodes may be labelled by an arbitrary identifier (but not a metavariable), and operators have no fixed arity. This enables the extensibility of the system.

Nevertheless there are some application nodes that have some special meaning for the system. They are build on (special-tok meta), and cannot be confused with regular nodes since special-tok begins with a $. There is a description of these nodes below.

Abstraction

The equality on AST's is the a-conversion, i.e. two AST's are equal if only they are the same up to renaming of bound variables (thus, [x]x is equal to [y]y). This makes the difference between variables and identifiers clear: the former may be bound by abstractions, whereas identifiers cannot be bound. To illustrate this, [x]{x} is equal to [y]{x}, but not to [y]{y}.

The binding structure of AST is used to represent the binders in the terms of Coq: the product (x:$A)$B is mapped to the AST (PROD $A [x]$B), whereas the non dependent product $A->$B is mapped to (PROD $A [<>]$B) ([<>]t is an anonymous abstraction).

Metavariables can appear in abstractions. In that case, the value of the metavariable must be a variable (or a list of variables). If not, a run-time error is raised.

Quoted AST

The 't construction means that the AST t should not be interpreted at all. The main effect is that metavariables occurring in it cannot be substituted or considered as binding in patterns.

Quotations

The non terminal symbols meta-command, meta-vernac and meta-tactic stand, respectively, for the syntax of commands, vernacular phrases and tactics. The prefix meta- is just to emphasize that the expression may refer to metavariables.

Indeed, if the AST to generate corresponds to a term that already has a syntax, one can call a grammar to parse it and to return the AST result. For instance, <<(eq ? $f $g)>> denotes the AST which is the application (in the sense of CIC) of the constant eq to three arguments. It is coded as an AST node labelled APPLIST with four arguments.

This term is parsable by command command grammar. This grammar is invoked on this term to generate an AST by putting the term between ``<<'' and ``>>''.

We can also invoke the initial grammars of several other predefined entries (see section 9.2.1 for a description of these grammars).




Warning: One cannot invoke other grammars than those described.

Special operators in constructive expressions

The expressions ($LIST $x) injects the AST list variable $x in an AST position. For example, an application node is composed of an identifier followed by a list of AST's that are glued together. Each of these expressions must denote an AST. If we want to insert an AST list, one has to use the $LIST operator. Assume the variable $idl is bound to the list [x y z], the expression (Intros ($LIST $idl) a b c) will build the AST (Intros x y z a b c). Note that $LIST does not occur in the result.

Since we know the type of variables, the $LIST is not really necessary. We enforce this annotation to stress on the fact that the variable will be substituted by an arbitrary number of AST's.

The other special operators ($VAR, $NUM, $STR, $PATH and $ID) are forbidden.

Special operators in destructive expressions (AST patterns)

A pattern is an AST expression, in which some metavariables can appear. In a given environment a pattern matches any AST which is equal (w.r.t a-conversion) to the value of the pattern in an extension of the current environment. The result of the matching is precisely this extended environment. This definition allows non-linear patterns (i.e. patterns in which a variable occurs several times).

For instance, the pattern (PAIR $x $x) matches any AST which is a node labelled PAIR applied to two identical arguments, and binds this argument to $x. If $x was already bound, the arguments must also be equal to the current value of $x.

The ``wildcard pattern'' $_ is not a regular metavariable: it matches any term, but does not bind any variable. The pattern (PAIR $_ $_) matches any PAIR node applied to two arguments.

The $LIST operator still introduces list variables. Typically, when a metavariable appears as argument of an application, one has to say if it must match one argument (binding an AST variable), or all the arguments (binding a list variable). Let us consider the patterns (Intros $id) and (Intros ($LIST $idl)). The former matches nodes with exactly one argument, which is bound in the AST variable $id. On the other hand, the latter pattern matches any AST node labelled Intros, and it binds the list of its arguments to the list variable $idl. The $LIST pattern must be the last item of a list pattern, because it would make the pattern matching operation more complicated. The pattern (Intros ($LIST $idl) $lastid) is not accepted.

The other special operators allows checking what kind of leaf we are destructurating: For instance, the pattern (DO ($NUM $n) $tc) matches (DO 5 (Intro)), and creates the bindings ($n,5) and ($tc,(Intro)). The pattern matching would fail on (DO "5" (Intro)).



Retrieved by Memoweb from http://pauillac.inria.fr/coq/doc/node.2.0.0.html at 8/10/98