The IL Abstract Syntax Reference Manual

The abstract Imperative Language “IL”

IL is the language accepted as an input by Tapenade. IL is an abstract language: it doesn’t exist outside Tapenade, it has no concrete syntax, it only exists as an Abstract Syntax Tree. IL is supposed to be fed to Tapenade through a so-called “protocol”, which is in fact the pre-order serializing of all the operators in the IL syntax tree. Therefore this protocol transforms this tree into a stream of operator names, plus a few special labels added at places. IL is supposed to represent all possible programming constructs that are used in actual imperative programming languages. To feed an actual program, written in a given language, into Tapenade, the program must first be turned into IL and then this IL is fed into Tapenade.
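As a rough illustration of the pre-order serialization described above, the following Python sketch flattens a small tree into a stream of operator names. The `Node` class and the `serialize` function are hypothetical names introduced only for this example; the values carried by leaves and the special labels of the real protocol are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """One IL tree node: an operator name and its ordered sons (hypothetical class)."""
    op: str
    sons: List["Node"] = field(default_factory=list)


def serialize(node: Node) -> Iterator[str]:
    """Yield operator names in pre-order: the node first, then each son in order."""
    yield node.op
    for son in node.sons:
        yield from serialize(son)


# Tree for something like "x = y + 1": assign(ident, add(ident, intCst)).
tree = Node("assign", [Node("ident"), Node("add", [Node("ident"), Node("intCst")])])
stream = list(serialize(tree))
print(stream)
```

Because every operator is emitted before its sons, a reader that knows the arity of each operator can rebuild the tree from the stream alone.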

IL does not capture pre-processing directives. In particular, all pre-processing that is specific to one language or system must be resolved beforehand. There are a few other restrictions. In particular the IL machine assumes that:

  • All “cpp” or “fpp” pre-processing is done, i.e. there are no remaining #DEFINE, #IFDEF, #INCLUDE

  • All includes are expanded (possibly with an indication of the original include file to help source regeneration to re-insert include clauses)

  • The distinction between assigned GOTO and normal GOTO must already be made, resulting in the use of the “gotoLabelVar” and “goto” IL operators.

However, some normalizations, although desirable, are very difficult to do on the “sender” side, because this would require the sender to perform the same kind of analyses as the IL analyses. Therefore, some normalizations and pre-processing are left to IL, and the IL machine accepts non-normalized or ambiguous input in the following cases:

  • Fortran is ambiguous between function calls and array accesses. IL accepts indifferently operators “call” and “arrayAccess”, and disambiguates them according to declarations.

  • Fortran allows declarations to be decomposed into many steps. Therefore, IL accepts the same decomposition and builds the correct entry in the symbolTable progressively. For instance, one may declare the type and the DIMENSION separately.

  • Strictly speaking, when a tree is a “list of something”, and this list happens to be empty, the IL transmission protocol requires two tokens on the stream: First the operator for the list, and then the “EndOfList”. However in some cases, IL, and the analyses that follow, such as the FlowGraph analysis, accept that these two tokens be simply replaced by the “none” operator. This is still subject to discussion.
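The two accepted encodings of an empty list can be sketched with a small reader. This is only an illustration under stated assumptions: the arity table, the set of list operators, and the function names are hypothetical, the marker is spelled `endOfList`, and leaf values are ignored.

```python
# Assumed subset of operators, for illustration only.
FIXED_ARITY = {"module": 2, "ident": 0, "none": 0}
LIST_OPS = {"declarations"}


def read_tree(tokens, pos=0):
    """Rebuild one tree from a pre-order token stream; return (tree, next position)."""
    op = tokens[pos]
    pos += 1
    sons = []
    if op in LIST_OPS:
        # A list operator is followed by its elements, then the end marker.
        while tokens[pos] != "endOfList":
            son, pos = read_tree(tokens, pos)
            sons.append(son)
        return (op, sons), pos + 1
    for _ in range(FIXED_ARITY[op]):
        son, pos = read_tree(tokens, pos)
        sons.append(son)
    return (op, sons), pos


def read_module(tokens):
    """Read a module and normalize a 'none' second son into an empty list."""
    (op, (name, decls)), _ = read_tree(tokens)
    if decls == ("none", []):
        decls = ("declarations", [])
    return (op, [name, decls])


strict = ["module", "ident", "declarations", "endOfList"]
lenient = ["module", "ident", "none"]
print(read_module(strict) == read_module(lenient))
```

Both streams decode to the same tree once the reader knows that the second son of `module` is expected to be a list.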

AD Directives

AD Directives are passed as comments. Their contained string must start with $AD.
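A minimal sketch of the test implied by this rule, in Python. The function name is hypothetical, and the directive payload used in the example is a placeholder, not a directive documented here; whether leading blanks are tolerated before `$AD` is not stated, so the sketch does not strip them.

```python
def is_ad_directive(comment_text: str) -> bool:
    """Return True when the string contained in a comment starts with $AD."""
    return comment_text.startswith("$AD")


print(is_ad_directive("$AD SOME-DIRECTIVE"))   # placeholder payload
print(is_ad_directive("an ordinary comment"))
```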

From this specification to the “il.metal” file

This specification here is written in a somewhat simplified form, with respect to the actual “il.metal” source file. The reason for that is to improve legibility and understanding of the following syntactic rules. However, the actual rules may be deduced from the rules here, with the following conversion methods:

  • Syntax: Actual metal rules all end with a “;”. Rules describing lists use a special syntax, with “*” for possibly empty lists and “+” for “never empty” lists, followed by three dots. For example:

      labels -> Label * ...
      equivalence -> Expression + ...
    

    Leaves are actually defined with the “implemented as” keyword, for example in

      ident -> implemented as IDENTIFIER
      intCst -> implemented as INTEGER
      none -> implemented as SINGLETON
    

    We don’t see the difference between this last rule and:

      none ->
    
  • Lists: Actual metal rules for lists require more “demultiplication”. In addition to the operator for “this”, they must introduce a new operator for “list of this”. This operator’s name is always built by adding an “s” at the end. The same holds for the phylum name. Therefore, a rule such as

       module -> ident? Declaration*
    

    corresponds to the actual rules (see below for the transformation of the ident?):

      module -> OptIdent Declarations
      Declarations ::= declarations
      declarations -> Declaration * ...
    

    As a general rule, order in lists matters: elements of these lists should appear in the same order as they appear in the original file.

  • Optionals: A notation with a “?”, such as ident?, means that optionally, instead of an “ident” node, there may be “nothing”. This nothing is actually represented by the none atomic operator. In that case the phylum for “this or nothing” is built by adding “Opt” in front. For example again, the rule

       module -> ident? Declaration*
    

    corresponds to the real rules:

      module -> OptIdent Declarations
      OptIdent ::= Ident none
    
  • Small phyla: In metal, the right hand side of “->” rules must consist of phyla only. However, when a phylum contains only one operator, the simplified rules directly mention the operator. A one operator phylum bears the same name as the operator, capitalized. Therefore, the simplified rule:

       try -> blockStatement catch* blockStatement?
    

    corresponds to the real rules:

       try -> BlockStatement Catchs OptBlockStatement
       BlockStatement ::= blockStatement
       Catchs ::= catchs
       catchs -> Catch * ...
       Catch ::= catch
       OptBlockStatement ::= BlockStatement none
    
  • Idents: label, and operator are all replaced by ident in the real rules. We keep distinct names here to make their meaning clearer.

  • none: There is a special atomic operator none, used mostly for optionals (cf above).

       none -> implemented as SINGLETON
    
  • Directives: The extra operator directive is used to place user directives to the tool, close to some IL objects such as members of the Declaration or Statement phyla. We did not mention it in the simplified rules. For example, the real rule defining the phylum Unit is:

       Unit ::= program module Declaration blockStatement directives
    

    In the current state, the directive operator is actually added only to the phyla Unit and Declaration. The rule defining directive is:

        directive -> implemented as STRING
    
  • End markers: There are two additional atomic operators in the real rules, endOfList and endOfProtocol. These are not real operators, and therefore do not appear here. These are just markers used by the IL tree-passing protocol for marking the end of list sub-trees, and the end of the file transfer protocol.
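The conversion methods for lists, optionals, and small phyla can be sketched as a small rewriting function. This is an illustrative Python sketch, not the tool that produces “il.metal”: the function names are hypothetical, and the sketch ignores the ident renaming, directives, and end markers.

```python
def capitalize(name: str) -> str:
    """Build a phylum name from an operator name."""
    return name[0].upper() + name[1:]


def expand_rule(rule: str) -> list:
    """Expand one simplified '->' rule into metal-style rules (first item: main rule)."""
    head, rhs = [part.strip() for part in rule.split("->")]
    sons, extra = [], []

    def add(line: str) -> None:
        if line not in extra:          # keep each auxiliary rule only once
            extra.append(line)

    for item in rhs.split():
        suffix = item[-1] if item[-1] in "*?" else ""
        name = item.rstrip("*?")
        phylum = capitalize(name)
        if name[0].islower():
            # Small phylum: an operator used directly gets its own phylum.
            add(f"{phylum} ::= {name};")
        if suffix == "*":
            list_op = name[0].lower() + name[1:] + "s"
            add(f"{phylum}s ::= {list_op};")
            add(f"{list_op} -> {phylum} * ...;")
            sons.append(phylum + "s")
        elif suffix == "?":
            add(f"Opt{phylum} ::= {phylum} none;")
            sons.append("Opt" + phylum)
        else:
            sons.append(phylum)
    return [f"{head} -> {' '.join(sons)};"] + extra


for line in expand_rule("try -> blockStatement catch* blockStatement?"):
    print(line)
```

The sketch also emits `Ident ::= ident;` for `module -> ident? Declaration*`, a small-phylum rule that the shorter examples above leave implicit.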

Specification of top level of files

       file -> Unit*
       Unit ::= program module blockStatement Declaration
       Declaration ::=
            varDeclaration constDeclaration varDimDeclaration
            typeDeclaration common equivalence implicit function
            class external intrinsic data save

A file contains a – possibly empty – list of units (Fortran vocabulary), also called external declarations (C vocabulary). Objects declared at this level are usually considered as globals, i.e. their visibility scope is either the whole file, or the remainder of the file. One Unit may be the definition of a procedure, such as the main program (program), or a procedure or function (function), or an object-oriented class (class). One Unit may also be a plain piece of code (blockStatement), that will be executed a little like an unnamed main program. One Unit may also be a named bunch of Declaration’s (module), which can then be used by any procedure that imports this bunch. One Unit may also be the declaration of one or many (global) variables (varDeclaration) or constants (constDeclaration). One Unit may also be a global definition of a user data-type (typeDeclaration), a COMMON (common), or some EQUIVALENCE relation (equivalence). One Unit may also be an implicit typing rule (implicit), some variable remanence declaration (save), or some initialization (data). It may also declare names of external or predefined functions to be used later (external, intrinsic). Although we think it doesn’t make sense at this Unit level, we say a Declaration may also be a separate declaration of dimension (varDimDeclaration). This is because the phylum Declaration is also used in the declaration part of each subroutine.
question: So far, there is no specific functionDeclaration operator. It is subsumed by the varDeclaration, that may contain a functionDeclarator. But then the name “varDeclaration” is maybe poorly chosen, and the type modifiers in front of the varDeclaration would be confused with the modifiers that apply to a function.

Procedure definitions

       program -> modifier* ident? varDeclaration* blockStatement
       function -> modifier* TypeSpec? TypeSpec? ident
                   varDeclaration* blockStatement?

A procedure or method definition. program designates the main program.

  • modifier*: a list of keywords that modify the behavior or visibility of the method.

  • TypeSpec?: when the procedure returns a value, the type of the returned value. When this is actually a procedure that returns nothing, this tree must be operator void.

  • TypeSpec?: when the procedure is a method of a class, the class to which this procedure belongs.

  • ident: the name of the procedure or method. This name is optional for the main program.

  • varDeclaration*: the list of formal arguments. This may contain typed or untyped variable names (with or without initial values, meaning default values).

  • blockStatement: the body of the procedure or method.

The local declarations inside the blockStatement are much more permissive than the ones in varDeclaration*. For example, they may also contain type declarations or local subroutines...
note: In Fortran, the declarations in the blockStatement may as well declare the types of local variables and variables in the varDeclaration* formal parameters. In C, these declarations will be separated in two different scopes. If IL finds a declaration in the outer scope, for a name that is not a formal parameter, then it cannot complain because this is allowed in Fortran.

Modules

    module -> ident? Declaration*

The operator module represents a module (e.g. the FORTRAN module). This is a possibly named (with first field ident?) bunch of declarations (Declaration*), possibly including subroutine declarations. In Fortran, the module subsumes the old “blockdata”, that is similar, except that it may hold only variable, common declarations and initializations. Hence, we chose not to create a “blockdata” constructor. The “module” is also close to the C++ or Java class, but we shall create a special purpose “class” operator, because the presence of methods is a big semantic difference.

Object-oriented classes definition

    class -> modifier* ident parentClass* Declaration*

A class definition

  • modifier*: a list of keywords that modify the behavior of the class. Possible values: public, private, abstract

  • ident: the name of this class.

  • parentClass*: the specification of the parent classes, including various styles of parenthood such as C++ inheritance, Java “implements”…

  • Declaration*: the actual contents of the class (member fields, types, methods, sub-classes…)

    parentClass -> modifier* ident
    

A declaration of class inheritance from a parent class:

  • modifier*: a list of keywords that modify the parenthood relation. Possible values: public, private, virtual, implements

  • ident: the name of this parent class.

    constructor -> modifier* ident varDeclaration* BlockStatement
    

A definition of a class instance constructor:

  • modifier*: a list of keywords that modify the constructor method.

  • ident: the name of the constructor (Question: is it useful, since it is the name of the class itself?)

  • varDeclaration*: the formal arguments of this constructor

  • BlockStatement: the body of the constructor method.

    constructorCall -> TypeSpec ident Expressions
    

A call to a class instance constructor:

  • TypeSpec: the class

  • ident: the name of the constructor (Question: is it useful, since it is the name of the class itself?)

  • Expressions: the actual arguments passed to the constructor.

    destructorCall -> TypeSpec ident
    

A call to a class instance destructor:

  • TypeSpec: the class

  • ident: the name of the destructor (Question: is it useful, since it is built from the name of the class itself?)

    namespace -> ident? Declarations
    

Definition of a name space. All declarations inside must be accessed through the name of this name space.

  • ident?: the name of this name space.

  • Declarations: the body of this name space.

Variable declarations

    varDeclaration -> TypeSpec? Declarator*

A variable declaration node (varDeclaration) first specifies a type (TypeSpec?). This type is optional, for the varDeclaration’s that appear in the lists of formal parameters in all cases of Fortran and in one style of C. Then follows (Declarator*) a list of variables that have this declared type. This is because, in most languages, there may be a list of variables attached to the same type in a single declaration. Actually it may be more than just variables. For example in C, it may be an array, a pointer, a function... See the section on Declarators below. Some Declarator’s may have an attached value, denoted here with the operator initDeclarator, that may represent an initial value, or a default value in the case of formal parameters.

Other declarations

    implicit -> typedLetters*
    typedLetters -> TypeSpec Letters*
    Letters ::= letterRange letter
    letterRange -> letter letter

Operator implicit represents Fortran’s IMPLICIT declaration. It associates a default type TypeSpec for undeclared variables starting with the letters specified in Letters*. If typedLetters* is empty, this represents the IMPLICIT NONE declaration, that deactivates the implicit typing mechanism. Letters in Letters* may be a single letter (letter), or a pair (letterRange) that designates all letters in the interval.
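The expansion of letters and letter ranges into a typing table can be sketched as follows. The data layout is an assumption made for the example: a letterRange is modelled as a pair of letters, a TypeSpec as a plain string, and the function names are hypothetical.

```python
def expand_letters(letters):
    """Expand single letters and (first, last) ranges into a flat list of letters."""
    result = []
    for item in letters:
        if isinstance(item, tuple):            # letterRange -> letter letter
            first, last = item
            result.extend(chr(code) for code in range(ord(first), ord(last) + 1))
        else:                                  # single letter
            result.append(item)
    return result


def implicit_table(typed_letters):
    """Map each initial letter to its default type; an empty input means IMPLICIT NONE."""
    table = {}
    for type_spec, letters in typed_letters:
        for letter in expand_letters(letters):
            table[letter] = type_spec
    return table


# Models: IMPLICIT REAL (A-H, O-Z), INTEGER (I-N)
table = implicit_table([
    ("float", [("a", "h"), ("o", "z")]),
    ("integer", [("i", "n")]),
])
print(table["x"], table["k"])
```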

    typeDeclaration -> typeName TypeSpec

A typeDeclaration is a type declaration, i.e. it associates a typeName as a shorthand for a complete type specification or definition TypeSpec.

    constDeclaration -> TypeSpec? Declarator*

This is the declaration of variables that will hold a constant, not modifiable value, during all the execution. This declares the type of the constants, followed by a list of constant names, each possibly followed by its initializer expression. This corresponds to the declaration of PARAMETER variables in Fortran, or final in Java. This also exists in C++. In Fortran90, there are two ways:

    INTEGER I2, I3
    PARAMETER (I2 = 27, I3 = -5)

or

    INTEGER, PARAMETER :: I2 = 27, I3 = -5

    varDimDeclaration -> SimpleDeclarator*

This is the Fortran DIMENSION declaration, that declares the array dimensions of the declared variables, without specifying their base type. The base type must then be declared elsewhere, or come from an IMPLICIT rule. Therefore this rule is similar to the varDeclaration rule, but with no TypeSpec, and the declarator may not give initial values.

    common -> ident? VarDeclarator*

This is the Fortran COMMON declaration. It defines a named bunch (list) of global variables. The name of the bunch is the first son. If not present, this is the special “blank” common. The second son is a list of simple VarDeclarator’s, that may be simple variable names or array names with their dimensions. We think it cannot be an iterativeVariableRef here, because this represents a part of a given array, and arrays cannot be partly inside a common. At least, this is how it works in Fortran.

    equivalence -> Expression**

restriction: Expression** is a list of list of only operators arrayAccess or ident.

This is the Fortran EQUIVALENCE declaration. It contains a list of “atomic equivalence”. An “atomic equivalence” is itself a list of references to variables, and each variable reference in one “atomic equivalence” will share the same memory location.
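Since a variable may appear in several “atomic equivalences”, the sets of names that end up sharing storage can be computed by merging the lists that have a name in common. The sketch below does this with a small union-find in Python. It is a simplification under stated assumptions: every reference is treated as a whole variable name (offsets introduced by arrayAccess are ignored), and the function name is hypothetical.

```python
def storage_classes(equivalence):
    """Group names that share storage, given a list of atomic equivalences."""
    parent = {}

    def find(name):
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]   # path halving
            name = parent[name]
        return name

    for group in equivalence:
        root = find(group[0])
        for name in group[1:]:
            parent[find(name)] = root             # merge into the first name's class

    classes = {}
    for name in list(parent):
        classes.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in classes.values())


# Models: EQUIVALENCE (A, B), (B, C), (X, Y)
print(storage_classes([["A", "B"], ["B", "C"], ["X", "Y"]]))
```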

    intrinsic -> ident*
    external -> ident*

This declares the contained ident’s to be the names of subroutines, that are either intrinsic (predefined) routines, or external (library) routines, i.e. that are defined outside the program and are not intrinsic.

    data -> Expression* Expression*
    save -> Expression*

The data operator gives the initial value for some variables. It gives in the second son Expression* a list of constant expressions, whose value will be given to each variable in the first son, which is a list of Expression. The save operator declares that each variable designated by the list of Expression’s is remanent, i.e. keeps its value between successive calls to the enclosing subroutine.
restriction: These Expression in the first son are actually restricted to be simple references to variables (i.e. VariableRef), but may also include iterativeVariableRef operators.

Type specifications

    PrimitiveType ::= void boolean integer float complex character
    TypeSpec ::= PrimitiveType ident ParentClass modifiedType recordType unionType
                 enumType arrayType pointerType referenceType functionType
    void ->
    boolean ->
    integer ->
    float ->
    complex ->
    character ->
    modifiedType -> Modifier* TypeSpec
    recordType -> ident? Modifier* Declaration*
    unionType -> ident? VarDeclaration*
    enumType -> ident? Declarator*
    arrayType -> TypeSpec dimColon*
    pointerType -> TypeSpec
    referenceType -> TypeSpec
    functionType -> TypeSpec argDeclaration*

restriction: typeName of atomic value in char, integer, float, boolean, byte will (or may?) be recognized as primitive types. typeName’s with other values will be interpreted as previously defined user types.

    modifiedType -> modifier* TypeSpec

restriction: modifier* for modifiedType is a list of atomic modifier’s with value signed, unsigned, short, long, or double. The value may also be an integer constant that represents the Fortran-style type modifiers, such as the 16 in REAL*16.
restriction: modifier* for modifiedType is a list of atomic modifier’s with value const, volatile, register, auto, target, pointer, in, out, inout, optional, save, extern, private, or sequence.
note: there is no static any more. The C static will be translated to modifier save for a local var of a procedure, to modifier private otherwise.
question: Is the const modifier redundant with the constDeclaration operator? In that case, can we get rid of this constDeclaration operator? Same question with attribute save and operator save. Same for pointer, target.

    arrayType -> TypeSpec dimColon*
    dimColon -> Expression?  Expression?

This declares an array type, of base type given by first son TypeSpec, and a list of dimensions given by second son dimColon*. Each dimColon is a pair of beginning index and ending index. Both may be null. When the starting index is null, it means the Fortran default “1”. When the ending index is null, it means that the last index is not specified here, but is specified somewhere else. In particular, this will be used to represent the special Fortran star dimension (*), that specifies a dimension that is declared somewhere else, and that is not redeclared at the current point.
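The treatment of missing bounds can be illustrated with a short Python sketch. It assumes, for the example only, that a dimColon is modelled as a pair of integers in which a missing bound is `None`; the function name is hypothetical.

```python
def extent(dim_colon):
    """Return the number of elements along one dimension, or None when unknown."""
    lower, upper = dim_colon
    if upper is None:
        return None            # ending index specified somewhere else
    if lower is None:
        lower = 1              # Fortran default starting index
    return upper - lower + 1


# Models a declaration such as REAL A(10, 0:5, *)
dims = [(None, 10), (0, 5), (None, None)]
print([extent(d) for d in dims])
```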

    recordType -> ident? varDeclaration*
    unionType -> ident? varDeclaration*

These are the classical record and union types, as defined in C or in Fortran90. The first son is a temporary name for the type, used inside the second son to obtain recursive definitions. The second son is a list of typed field names, looking just like ordinary varDeclaration’s.

    enumType -> ident? Declarator*

restriction: Declarator* is a list of only name or initDeclarator for name = constant.
enumType is the C-like enumerated type, i.e. a list of possible values in Declarator*, that are atomic names. There may be a constant attached to force the internal integer representation of this value. The initial optional ident? is an identifier that may exist in C. But this is not the name of the enumerated type. The enumerated type is named by enclosing it into a typeDeclaration node.

    pointerType -> TypeSpec

This is the type: pointer to TypeSpec.

    functionType -> TypeSpec ArgDeclaration*
    ArgDeclaration ::= varDeclaration TypeSpec

This is the function type. First son is the type of the returned result, and second son is the list of the arguments’ types. The ArgDeclaration phylum represents the operators that can be found in the declarations of the arguments of a function, both here in the functionType type specification operator, and later in the functionDeclarator declarator operator. An ArgDeclaration can be a normal varDeclaration or, as happens in C, it can be a simple TypeSpec that represents the type of the argument, without specifying its name.

    void ->

This is the void type, used as the return type of subroutines that return nothing, i.e. procedures and not functions.

Declarators

    Declarator ::= SimpleDeclarator initDeclarator
    SimpleDeclarator ::=
                     VarDeclarator pointerDeclarator bitfieldDeclarator
                     functionDeclarator
    VarDeclarator ::= arrayDeclarator ident
    initDeclarator -> SimpleDeclarator Expression
    pointerDeclarator -> SimpleDeclarator
    bitfieldDeclarator -> ident Expression
    functionDeclarator -> SimpleDeclarator ArgDeclaration*
    arrayDeclarator -> SimpleDeclarator dimColon*

A Declarator is an operator that is used, in conjunction with a type specification (TypeSpec), to declare an object (variable, record field, etc...) that has this type. It can actually be an initDeclarator, in which case the declarator is immediately followed by an expression that gives an initial value for this object. Otherwise it is a SimpleDeclarator, which can be a pointerDeclarator, meaning that the SimpleDeclarator below it has the type “pointer to” the type (cf C). It can be a bitfieldDeclarator (comes from C) that declares a field in a structure that has a length in bits specified by the second son Expression. It can be a functionDeclarator, that contains the declarator for the “name” of the function, and the ArgDeclaration* for the arguments. Similarly, an arrayDeclarator starts with the declarator for the “name” of the array, followed by the dimColon* declarations of the dimensions.
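One way to read a declarator together with its TypeSpec is to walk the declarator from the outside in, wrapping the type at each step until the declared name is reached. The Python sketch below follows the C reading of declarators; the tuple encoding of trees and the function name are assumptions made for this example, and bitfieldDeclarator is left out.

```python
def resolve(type_spec, declarator):
    """Return (declared name, full type) for a TypeSpec and one Declarator tree."""
    op = declarator[0]
    if op == "ident":
        return declarator[1], type_spec
    if op == "initDeclarator":                    # the initial value does not change the type
        return resolve(type_spec, declarator[1])
    if op == "pointerDeclarator":
        return resolve(("pointerType", type_spec), declarator[1])
    if op == "arrayDeclarator":
        return resolve(("arrayType", type_spec, declarator[2]), declarator[1])
    if op == "functionDeclarator":
        return resolve(("functionType", type_spec, declarator[2]), declarator[1])
    raise ValueError(f"unsupported declarator: {op}")


three = [(None, 3)]                               # one dimension, modelled as a (lower, upper) pair
# C "int *a[3]": array of 3 pointers to int.
a = ("pointerDeclarator", ("arrayDeclarator", ("ident", "a"), three))
# C "int (*p)[3]": pointer to an array of 3 int.
p = ("arrayDeclarator", ("pointerDeclarator", ("ident", "p")), three)
print(resolve("integer", a))
print(resolve("integer", p))
```

The two examples differ only in the nesting order of the declarator operators, which is what distinguishes an array of pointers from a pointer to an array.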

Structured control statements

    Statement ::=
              blockStatement labelStatement if switch loop break
              continue return throw synchro try exit cycle stop goto
              gotoLabelVar assignLabelVar compGoto data Expression

Among the above constructs, we consider blockStatement, if, switch, loop, try, synchro, as “structured” control statements. They contrast with “unstructured” control statements, described in the next section, and “simple” statements, that just do something but do not affect the control flow. We shall identify these simple statements with expressions, that will be described later. This comes from the C style, where simple statements, such as assignment and calls, may be used as expressions, and on the other hand, any expression can be used as a statement, in which case its return value will just be discarded.

    blockStatement -> DeclarationOrStatement*
    DeclarationOrStatement ::= Declaration Statement

A statement may be a blockStatement, in which case there may be additional declarations (Declaration), that override previous declarations, for the duration of the blockStatement. To cope with the style of some imperative languages, such as Java, we allow declarations and statements to be interleaved in any manner (provided declarations come before uses?). Therefore we introduce this new phylum DeclarationOrStatement. Notice however that it may be good practice not to interleave declarations and statements. For example in Java, there may be conflicts between declarations in different cases of a switch statement. What semantics for that?

    if -> Expression Statement? Statement?

This is the standard “if-then-else” construct. In Fortran, it may have an additional label, but it seems this label has no semantic value, and therefore we don’t keep it.

    try -> blockStatement catch* blockStatement?
    catch -> varDeclaration blockStatement
    throw -> Expression

This is the exception mechanism. Exceptions are raised with the throw instruction. A blockStatement catches exceptions by being enclosed into a try statement. When an exception is raised inside the blockStatement of a try, it is matched against the varDeclaration of each of the following catch in catch*. If one matches, the corresponding blockStatement is executed. If none matches, then the final blockStatement? is executed if it is present; if it is none, the exception is propagated further.

    switch -> Expression switchCase*
    switchCase -> Expression* DeclarationOrStatement*
    break ->

This is the classical switch-case construct of C or Java. All cases (switchCase in switchCase*) are examined in sequence. Each switchCase represents a case, that is chosen when the value of the top Expression matches one of the expressions in list Expression*. As a special behavior, the case is chosen when its list of expressions is empty. This is the way the default operator must be translated into IL. Inside the DeclarationOrStatement* for each case, there may be calls to break, that just exit from the whole switch statement.
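The selection rule described above can be sketched in Python as follows. Each switchCase is modelled, for this example only, as a pair made of a list of already evaluated case values and an opaque body; the function name is hypothetical.

```python
def select_case(value, switch_cases):
    """Return the index of the first case that is chosen, or None if there is none."""
    for index, (case_values, _body) in enumerate(switch_cases):
        # An empty list of expressions stands for the default case.
        if not case_values or value in case_values:
            return index
    return None


cases = [
    ([1, 2], "body for 1 and 2"),
    ([5], "body for 5"),
    ([], "default body"),
]
print(select_case(5, cases), select_case(9, cases))
```

Because cases are examined in sequence, a case with an empty list placed before other cases would be chosen first in this sketch, so the translation of a default clause is assumed to come last.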

    synchro -> Expression blockStatement

This is the threads synchronization, as found in Java. The blockStatement is suspended until the objects returned by Expression can be locked. The lock is released at the end of the blockStatement critical section.
question: This specification of IL is currently incomplete with respect to tasking and threads parallelism. Therefore this synchro operator is here just to remind us that these mechanisms should be taken into account as soon as possible.

    loop -> ident? label? Control Statement?
    Control ::= do while until times forall for none

This loop construct represents all iterative control structures. The optional ident? and label? may be used for referencing this loop in exit’s and cycle’s. Then the Control son makes explicit the kind of iterative control used in this loop, and finally the Statement? (very rarely absent) defines what to do during each iteration.

    do -> ident Expression Expression Expression?

This is the header of the standard Fortran DO loop.

    forall -> ident Expression Expression Expression? Expression?

This is, for example, the header of the parallel Fortran90 forall.

    for -> varDeclaration? Expression? Expression?

restriction: the first optional varDeclaration? may be a variable declaration with an initialization, or a simple expression such as a simple assignment.
This is the header of the C for loop, that is more powerful than the Fortran DO loop.

    while -> Expression
    until -> Expression
    times -> Expression

These control headers repeat the loop while or until some Expression evaluates to true, or a certain number of times. The condition is evaluated before each iteration for a while, after each iteration for an until.
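The difference between the three headers can be made concrete with a small Python interpreter sketch. The function name is hypothetical, and expressions and bodies are modelled as Python callables purely for illustration.

```python
def run_loop(control, expression, body):
    """Run body under a while, until, or times control header."""
    if control == "while":
        while expression():        # tested before each iteration
            body()
    elif control == "until":
        while True:
            body()
            if expression():       # tested after each iteration
                break
    elif control == "times":
        for _ in range(expression()):
            body()
    else:
        raise ValueError(f"unknown control: {control}")


def count_iterations(control, expression):
    """Helper for the example: count how many times the body runs."""
    counter = []
    run_loop(control, expression, lambda: counter.append(1))
    return len(counter)


print(count_iterations("while", lambda: False))   # body never runs
print(count_iterations("until", lambda: True))    # body runs once
print(count_iterations("times", lambda: 3))       # body runs three times
```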

    cycle -> ident?
    exit -> ident?

These instructions jump to special places in the enclosing loop, or in the closest enclosing loop identified by ident? when present. cycle goes to the next iteration, exit just terminates the loop.

    return -> Expression?

This returns from the enclosing function, and returns the value of the optional Expression? to the calling place.

Unstructured control statements

    labelStatement -> label Statement?
    goto -> label
    gotoLabelVar -> ident label*
    assignLabelVar -> ident label
    compGoto -> label* Expression
    stop ->
    continue ->

question: maybe we should remove the continue statement and, instead, accept none as a Statement. Otherwise, we must accept a Statement? as second son of labelStatement. This would treat correctly the 100 CONTINUE instructions and 100 ENDDO as well.

Expressions or simple statements

    Expression ::=
               unary UnaryExpression binary BinaryExpression
               VariableRef address arrayTriplet AssignExpression
               ifExpression arrayInitializers stringCst intCst
               realCst substringAccess iterativeVariableRef
    UnaryExpression ::= minus not decr incr
    BinaryExpression ::=
               add sub mul div eq neq ge gt le lt and or xor power ior
               leftShift rightShift mod
    AssignExpression ::=
               andAssign assign divAssign iorAssign minusAssign
               modAssign plusAssign timesAssign xorAssign
               leftShiftAssign rightShiftAssign
    VariableRef ::=
               ident arrayAccess fieldAccess pointerAccess call

The phylum Expression contains all possible operators that build an expression. Some of these operators are regrouped into the sub-phyla UnaryExpression, BinaryExpression, AssignExpression, and VariableRef. We have chosen to create a separate operator for each unary, binary, and assignment operation. They are respectively in the phyla UnaryExpression, BinaryExpression, and AssignExpression. However, in the very rare case where a new unary or binary operator is needed, it can be defined in terms of the catch-all operators unary and binary. However, we think these operators should not be used, or only temporarily, until the new operation is assigned a new operator in Expression. The operators in phylum VariableRef represent the objects that have an allocated memory, i.e. that can be assigned a value, or return a pointer to them, or can be passed as an “out” argument of a procedure.
question: this distinction between Expression and VariableRef must be verified.
Lastly, we added the iterativeVariableRef operator into Expression, although we know that this corresponds rather to an enumeration of expressions. This is because some operators, such as I-O, or the operators save and data, may take a list of expressions in which some items are actually an iterative enumeration of expressions. This is a compromise that seems to work. The alternative would be to modify the expression* operator inside I-O statements.

    iterativeVariableRef -> Expression* do

An iterativeVariableRef is equivalent to the enumeration of a list of expressions, built by concatenating copies of the Expression* in first place, one copy for each value of the index in the do in second place.
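The enumeration can be sketched in Python as follows. For the sake of the example, expressions are modelled as text templates with a `{I}`-style placeholder for the index, and the bounds of the do are taken as already evaluated integers; the function name and this encoding are assumptions.

```python
def expand_iterative(expressions, index, first, last, step=1):
    """Concatenate one copy of the expression list per value of the do index."""
    stop = last + (1 if step > 0 else -1)     # make the final bound inclusive
    expanded = []
    for value in range(first, stop, step):
        for template in expressions:
            expanded.append(template.format(**{index: value}))
    return expanded


# Models a Fortran implied-do such as (A(I), B(I), I = 1, 3)
print(expand_iterative(["A({I})", "B({I})"], "I", 1, 3))
```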

    assign -> VariableRef Expression
    plusAssign -> VariableRef Expression
    minusAssign -> VariableRef Expression
    timesAssign -> VariableRef Expression
    divAssign -> VariableRef Expression
    modAssign -> VariableRef Expression
    leftShiftAssign -> VariableRef Expression
    rightShiftAssign -> VariableRef Expression
    xorAssign -> VariableRef Expression
    iorAssign -> VariableRef Expression
    andAssign -> VariableRef Expression

These are all the known assignment operators. We hope not to have forgotten any! Respectively, these correspond to the classical assignment, followed by the C assignments: +=, -=, *=, /=, %=, <<=, >>=, ^=, |=, &=
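The relation between the compound assignment operators and the binary operators of BinaryExpression can be sketched as a table and a rewriting function. This Python sketch is only illustrative: the tuple encoding and the names are hypothetical, and the rewriting duplicates the assigned reference, which is equivalent only when evaluating that reference has no side effect.

```python
# Each compound assignment operator and the binary operator it combines with.
COMPOUND_TO_BINARY = {
    "plusAssign": "add",
    "minusAssign": "sub",
    "timesAssign": "mul",
    "divAssign": "div",
    "modAssign": "mod",
    "leftShiftAssign": "leftShift",
    "rightShiftAssign": "rightShift",
    "xorAssign": "xor",
    "iorAssign": "ior",
    "andAssign": "and",
}


def desugar(node):
    """Rewrite (opAssign, ref, expr) as (assign, ref, (op, ref, expr))."""
    op, target, value = node
    if op == "assign":
        return node
    return ("assign", target, (COMPOUND_TO_BINARY[op], target, value))


# Models the C statement "x += 1"
print(desugar(("plusAssign", ("ident", "x"), ("intCst", 1))))
```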

    call -> VariableRef? VariableRef Expression*

A call to a procedure, function, subroutine, or method.

  • VariableRef?: (only for method calls) the object or class on which the method is called. More generally, this indicates where the called method must be found e.g. it can be a scopeAccess to restrict searching to a namespace or to a parent class.

  • VariableRef: the name of the called procedure (i.e. subroutine, function, method…). In more complex cases this child can be an expression that evaluates to this called procedure e.g. deref of a pointer to a function, or access into array of functions.

  • Expression*: the actual arguments of this call.

    scopeAccess -> scopeAccess ident?
    

A scope restriction in front of an identifier.

  • scopeAccess:

  • ident?:

    add -> Expression Expression 
    sub -> Expression Expression 
    mul -> Expression Expression 
    div -> Expression Expression 
    power -> Expression Expression 
    mod -> Expression Expression 
    eq -> Expression Expression 
    neq -> Expression Expression 
    ge -> Expression Expression
    gt -> Expression Expression
    le -> Expression Expression
    lt -> Expression Expression
    and -> Expression Expression
    or -> Expression Expression
    xor -> Expression Expression
    ior -> Expression Expression
    leftShift -> Expression Expression 
    rightShift -> Expression Expression 
    binary -> Expression operator Expression 
    minus -> Expression
    not -> Expression
    decr -> Expression
    incr -> Expression
    unary -> operator Expression
    

These are the classical unary and binary operations. We hope not to have forgotten any! The two operators named binary and unary, which take an explicit operator child, are catch-all operators for operations that were forgotten in the UnaryExpression and BinaryExpression phyla. They should not be used, or only very temporarily.
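Since the catch-all operators are meant to be temporary, a front end could rewrite them into the dedicated operators whenever one exists. The Python sketch below shows one way to do this. It rests on assumptions that the manual does not state: trees are encoded as tuples, and the operator leaf is assumed to carry the IL operator name (such as "add") rather than a source-language symbol. The name `normalize` is hypothetical.

```python
UNARY_OPERATORS = {"minus", "not", "decr", "incr"}
BINARY_OPERATORS = {
    "add", "sub", "mul", "div", "eq", "neq", "ge", "gt", "le", "lt",
    "and", "or", "xor", "power", "ior", "leftShift", "rightShift", "mod",
}


def normalize(node):
    """Replace catch-all binary/unary nodes by dedicated operators when known."""
    if not isinstance(node, tuple):
        return node  # leaf payload (string, number) or None
    children = tuple(normalize(child) for child in node[1:])
    if node[0] == "binary":
        left, (_, name), right = children
        if name in BINARY_OPERATORS:
            return (name, left, right)
    elif node[0] == "unary":
        (_, name), operand = children
        if name in UNARY_OPERATORS:
            return (name, operand)
    # Unknown operation: keep the node (possibly still a catch-all) as it is.
    return (node[0],) + children
```

Operations with no dedicated operator are left untouched, so the catch-all form remains available for them.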

    ifExpression -> Expression Expression Expression

This is the conditional expression, which exists for example in C (the ?: operator).

    arrayConstructor ->  Expression * ...

An expression that constructs an array.

  • Expression: each of the values to be put into the constructed array.

Dynamic memory

    allocate  -> Expression? Expressions TypeSpec? KeywordArg Expression

A call to dynamic memory allocation.

  • Expression?: Total size allocated, in bytes.

  • Expressions: Array dimensions.

  • TypeSpec?: Type of the allocated elements.

  • KeywordArg: (?) in Fortran, optional returned status “STAT=”.

  • Expression: (?) in C++, initialization information.

    fieldAccess  -> VariableRef VariableRef
    

Access to a field of a structured object. In particular, this covers access to a member of a class instance or to a static member of a class.

  • VariableRef: the structured object or the class.

  • VariableRef: the name of the field or member.

    address -> VariableRef
    arrayAccess -> VariableRef Expression*
    arrayInitializers -> Expression*
    arrayTriplet -> Expression? Expression? Expression?
    

The arrayTriplet operator represents the classical general array section, from an initial index to a terminal index, with a given stride. Each of the three children may be none: a missing initial index means 1, a missing terminal index means the last index, and a missing stride means 1. In particular, an arrayTriplet with none as its third child represents (subsumes) the old arraySection operator.
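The default rules for the three children can be illustrated with a short Python sketch that lists the indices selected by a triplet. It assumes a hypothetical tuple encoding, integer constant children, and a known last index for the dimension (passed in as `last_index`). The name `triplet_indices` is invented for this example.

```python
def triplet_indices(triplet, last_index):
    """List the indices selected by ("arrayTriplet", first, last, stride).

    Each child is ("intCst", n) or None; last_index is the dimension's last index.
    """
    _, first, last, stride = triplet
    lo = 1 if first is None else first[1]          # missing initial index means 1
    hi = last_index if last is None else last[1]   # missing terminal index means last
    step = 1 if stride is None else stride[1]      # missing stride means 1
    # Python ranges exclude their upper bound, so move it one step further.
    stop = hi + (1 if step > 0 else -1)
    return list(range(lo, stop, step))
```

For a dimension whose last index is 7, a triplet with only a stride of 2 selects indices 1, 3, 5, 7, and a triplet from 2 to 5 with no stride selects 2, 3, 4, 5.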

    pointerAccess -> Expression
    substringAccess -> Expression Expression? Expression?

Atomic nodes

    ident -> implemented as STRING
    label ::= ident
    typeName ::= ident
    operator ::= ident
    modifier ::= ident
    stringCst -> implemented as STRING
    letter -> implemented as STRING
    intCst -> implemented as INTEGER
    realCst -> implemented as REAL

Summary of the complete syntax

- Phyla:
    Unit ::=
         program module blockStatement Declaration
    Declaration ::=
         varDeclaration constDeclaration varDimDeclaration
         typeDeclaration common equivalence implicit function
         class external intrinsic data save
    ArgDeclaration ::=
         varDeclaration TypeSpec
    Declarator ::=
         SimpleDeclarator initDeclarator
    SimpleDeclarator ::=
         VarDeclarator pointerDeclarator bitfieldDeclarator
         functionDeclarator
    VarDeclarator ::=
         arrayDeclarator ident
    DeclarationOrStatement ::=
         Declaration Statement
    Statement ::=
         blockStatement labelStatement if switch loop break
         continue return throw synchro try exit cycle stop goto
         gotoLabelVar assignLabelVar compGoto data Expression
    Control ::=
         do while until times forall for none
    Expression ::=
         unary UnaryExpression binary BinaryExpression
         VariableRef address arrayTriplet AssignExpression
         ifExpression arrayInitializers stringCst intCst
         realCst substringAccess iterativeVariableRef
    UnaryExpression ::=
         minus not decr incr
    BinaryExpression ::=
         add sub mul div eq neq ge gt le lt and or xor power ior
         leftShift rightShift mod
    AssignExpression ::=
         andAssign assign divAssign iorAssign minusAssign
         modAssign plusAssign timesAssign xorAssign
         leftShiftAssign rightShiftAssign
    VariableRef ::=
         ident arrayAccess fieldAccess pointerAccess call
    Letters ::=
         letterRange letter

- Declarations:
    file -> Unit*
    module -> ident? Declaration*
    implicit -> typedLetters*
    typedLetters -> TypeSpec Letters*
    typeDeclaration -> typeName TypeSpec
    constDeclaration -> TypeSpec? Declarator*
    varDeclaration -> TypeSpec? Declarator*
    varDimDeclaration -> SimpleDeclarator*
    common -> ident? VarDeclarator*
    equivalence -> Expression**
    intrinsic -> ident*
    external -> ident*
    data -> Expression*
    save -> Expression*

- Type specifications:
    modifiedTypeName ->
    modifiedTypeSpec ->
    arrayType -> TypeSpec dimColon*
    dimColon -> Expression? Expression?
    recordType -> ident? varDeclaration*
    unionType -> ident? varDeclaration*
    enumType -> ident? Declarator*
    pointerType -> TypeSpec
    functionType -> TypeSpec ArgDeclaration*
    void ->
    
- Declarators:
    initDeclarator -> SimpleDeclarator Expression
    pointerDeclarator -> SimpleDeclarator
    bitfieldDeclarator -> ident Expression
    functionDeclarator -> SimpleDeclarator ArgDeclaration*
    arrayDeclarator -> SimpleDeclarator dimColon*
    
- Control statements:
    blockStatement -> DeclarationOrStatement*
    if -> Expression Statement? Statement?
    return -> Expression?
    try -> blockStatement catch* blockStatement?
    catch -> varDeclaration blockStatement
    throw -> Expression
    switch -> Expression switchCase*
    switchCase -> Expression* DeclarationOrStatement*
    break ->
    synchro -> Expression blockStatement
    loop -> ident? label? Control Statement?
    do -> ident Expression Expression Expression?
    forall -> ident Expression Expression Expression? Expression?
    for -> varDeclaration? Expression? Expression?
    while -> Expression
    until -> Expression
    times -> Expression
    cycle -> ident?
    exit -> ident?
    
- Expressions:
    iterativeVariableRef -> Expression* do
    assign -> VariableRef Expression
    plusAssign -> VariableRef Expression
    minusAssign -> VariableRef Expression
    timesAssign -> VariableRef Expression
    divAssign -> VariableRef Expression
    modAssign -> VariableRef Expression
    leftShiftAssign -> VariableRef Expression
    rightShiftAssign -> VariableRef Expression
    xorAssign -> VariableRef Expression
    iorAssign -> VariableRef Expression
    andAssign -> VariableRef Expression
    add -> Expression Expression
    sub -> Expression Expression
    mul -> Expression Expression
    div -> Expression Expression
    power -> Expression Expression
    mod -> Expression Expression
    eq -> Expression Expression
    neq -> Expression Expression
    ge -> Expression Expression
    gt -> Expression Expression
    le -> Expression Expression
    lt -> Expression Expression
    and -> Expression Expression
    or -> Expression Expression
    xor -> Expression Expression
    ior -> Expression Expression
    leftShift -> Expression Expression
    rightShift -> Expression Expression
    binary -> Expression operator Expression
    minus -> Expression
    not -> Expression
    decr -> Expression
    incr -> Expression
    unary -> operator Expression
    ifExpression -> Expression Expression Expression
    address -> VariableRef
    arrayAccess -> VariableRef Expression*
    arrayInitializers -> Expression*
    arrayTriplet -> Expression? Expression? Expression?
    pointerAccess -> Expression
    substringAccess -> Expression Expression? Expression?
    letterRange -> letter letter
    
- Leaves:
    ident -> implemented as STRING 
    label ::= ident
    typeName ::= ident
    operator ::= ident
    modifier ::= ident
    stringCst -> implemented as STRING
    letter -> implemented as STRING
    intCst -> implemented as INTEGER
    realCst -> implemented as REAL
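
The phyla of the summary can be read as sets of operators, where a phylum may itself contain other phyla. The Python sketch below encodes a few of them, copied from the list above, and checks whether an operator is accepted where a given phylum is expected. The names `PHYLA` and `belongs_to` are hypothetical, and only a subset of the phyla is included to keep the example short.

```python
PHYLA = {
    "Statement": [
        "blockStatement", "labelStatement", "if", "switch", "loop", "break",
        "continue", "return", "throw", "synchro", "try", "exit", "cycle",
        "stop", "goto", "gotoLabelVar", "assignLabelVar", "compGoto", "data",
        "Expression",
    ],
    "Expression": [
        "unary", "UnaryExpression", "binary", "BinaryExpression",
        "VariableRef", "address", "arrayTriplet", "AssignExpression",
        "ifExpression", "arrayInitializers", "stringCst", "intCst",
        "realCst", "substringAccess", "iterativeVariableRef",
    ],
    "UnaryExpression": ["minus", "not", "decr", "incr"],
    "BinaryExpression": [
        "add", "sub", "mul", "div", "eq", "neq", "ge", "gt", "le", "lt",
        "and", "or", "xor", "power", "ior", "leftShift", "rightShift", "mod",
    ],
    "AssignExpression": [
        "andAssign", "assign", "divAssign", "iorAssign", "minusAssign",
        "modAssign", "plusAssign", "timesAssign", "xorAssign",
        "leftShiftAssign", "rightShiftAssign",
    ],
    "VariableRef": ["ident", "arrayAccess", "fieldAccess", "pointerAccess", "call"],
}


def belongs_to(operator, phylum):
    """True if operator is a member of phylum, directly or through a sub-phylum."""
    for member in PHYLA.get(phylum, ()):
        if member == operator:
            return True
        if member in PHYLA and belongs_to(operator, member):
            return True
    return False
```

Because Statement contains Expression, `belongs_to("add", "Statement")` holds, whereas `belongs_to("loop", "Expression")` does not.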