The IL Abstract Syntax Reference Manual
=======================================

### The abstract Imperative Language "IL"

IL is the language accepted as input by Tapenade. IL is an abstract language: it doesn't exist outside Tapenade, it has **no** concrete syntax, it only exists as an Abstract Syntax Tree. IL is supposed to be fed to Tapenade through a so-called "protocol", which is in fact the pre-order serialization of all the operators in the IL syntax tree. This protocol therefore transforms the tree into a stream of operator names, plus a few special labels added in places.

IL is supposed to represent all possible programming constructs that are used in actual imperative programming languages. To feed an actual program, written in a given language, into Tapenade, the program must first be turned into IL, and then this IL is fed into Tapenade.

IL does not capture pre-processing directives. In particular, all pre-processing that is specific to one language or system must be resolved beforehand. There are a few other restrictions. In particular the IL machine assumes that:

- All "cpp" or "fpp" pre-processing is done, i.e. there are no remaining `#DEFINE`, `#IFDEF`, `#INCLUDE`
- All includes are expanded (possibly with an indication of the original include file, to help source regeneration re-insert include clauses)
- The distinction between assigned GOTO and normal GOTO must be made, resulting in the use of the "gotoLabelVar" and "goto" IL operators.

However, some normalizations, although desirable, are very difficult to do on the "sender" side, because this would require the sender to perform the same kind of analyses as the IL analyses. Therefore, some normalizations and pre-processing are left to IL, and the IL machine accepts non-normalized or ambiguous input in the following cases:

- Fortran is ambiguous between function calls and array accesses.
  IL accepts operators "call" and "arrayAccess" interchangeably, and disambiguates them according to declarations.
- Fortran allows declarations to be decomposed into many steps. Therefore, IL accepts the same decomposition and builds the correct entry in the symbolTable progressively. For instance, one may declare the type and the DIMENSION separately.
- Strictly speaking, when a tree is a "list of something", and this list happens to be empty, the IL transmission protocol requires two tokens on the stream: first the operator for the list, and then the "EndOfList". However in some cases, IL, and the analyses that follow, such as the FlowGraph analysis, accept that these two tokens be simply replaced by the "none" operator. This is still subject to discussion.

#### AD Directives

AD Directives are passed as comments. Their contained string must start with `$AD`.

### From this specification to the "il.metal" file

This specification is written in a somewhat simplified form, with respect to the actual "il.metal" source file. The reason is to improve legibility and understanding of the following syntactic rules. However, the actual rules may be deduced from the rules here, with the following conversion methods:

- **Syntax:** Actual metal rules all end with a ";". Rules describing lists use a special syntax, with "\*" for possibly empty lists and "+" for "never empty" lists, followed by three dots. For example:

      labels -> Label * ...
      equivalence -> Expression + ...

  Leaves are actually defined with the "implemented as" keyword, for example in

      ident -> implemented as IDENTIFIER
      intCst -> implemented as INTEGER
      none -> implemented as SINGLETON

  We don't see the difference between this last rule and:

      none ->

- **Lists:** Actual metal rules for lists require more "demultiplication". In addition to the operator for "this", they must introduce a new operator for "list of this". This operator's name is always built by adding an "s" at the end. Same for the phylum name.
  Therefore, a rule such as

      module -> ident? Declaration*

  corresponds to the actual rules (see below for the transformation of the `ident?`):

      module -> OptIdent Declarations
      Declarations ::= declarations
      declarations -> Declaration * ...

  As a general rule, order in lists **matters**: elements of these lists should appear in the same order as they appear in the original file.

- **Optionals:** A notation with a "?", such as `ident?`, means that optionally, instead of an "ident" node, there may be "nothing". This nothing is actually represented by the `none` atomic operator. In that case the phylum for "this or nothing" is built by adding "Opt" in front. For example again, the rule

      module -> ident? Declaration*

  corresponds to the real rules:

      module -> OptIdent Declarations
      OptIdent ::= Ident none

- **Small phyla:** In metal, the right-hand side of "->" rules must consist of phyla only. However, when a phylum contains only one operator, the simplified rules directly mention the operator. A one-operator phylum bears the same name as the operator, capitalized. Therefore, the simplified rule:

      try -> blockStatement catch* blockStatement?

  corresponds to the real rules:

      try -> BlockStatement Catchs OptBlockStatement
      BlockStatement ::= blockStatement
      Catchs ::= catchs
      catchs -> Catch * ...
      Catch ::= catch
      OptBlockStatement ::= BlockStatement none

- **Idents:** `label` and `operator` are replaced by `ident` in the real rules. We keep distinct names here to make their meaning clearer.
- **none:** There is a special atomic operator `none`, used mostly for optionals (*cf* above).

      none -> implemented as SINGLETON

- **Directives:** The extra operator `directive` is used to place user directives to the tool, close to some IL objects such as members of the `Declaration` or `Statement` phyla. We did not mention it in the simplified rules.
  For example, the real rule defining the phylum `Unit` is:

      Unit ::= program module Declaration blockStatement directives

  In the current state, the `directive` operator is actually added only to the phyla `Unit` and `Declaration`. The rule defining `directive` is:

      directive -> implemented as STRING

- **End markers:** There are two additional atomic operators in the real rules, `endOfList` and `endOfProtocol`. These are *not* real operators, and therefore do not appear here. They are just markers used by the IL tree-passing protocol for marking the end of list sub-trees, and the end of the file transfer protocol.

### Specification of top level of files

    file -> Unit*
    Unit ::= program module blockStatement Declaration
    Declaration ::= varDeclaration constDeclaration varDimDeclaration typeDeclaration common equivalence implicit function class external intrinsic data save

A file contains a (possibly empty) list of units (Fortran vocabulary), also called external declarations (C vocabulary). Objects declared at this level are usually considered as globals, i.e. their visibility scope is either the whole file, or the remainder of the file. One unit may be the definition of a procedure, such as the main program (`program`), or a procedure or function (`function`), or an object-oriented class (`class`). One unit may also be a plain piece of code (`blockStatement`), that will be executed a little like an unnamed main program. One unit may also be a named bunch of `Declaration`'s (`module`), which can then be used by any procedure that imports this bunch. One unit may also be the declaration of one or many (global) variables (`varDeclaration`) or constants (`constDeclaration`). One unit may also be a global definition of a user data-type (`typeDeclaration`), a `COMMON` (`common`), or some `EQUIVALENCE` relation (`equivalence`). One unit may also be an implicit typing rule (`implicit`), some variable remanence declaration (`save`), or some initialization (`data`).
It may also declare names of external or predefined functions to be used later (`external`, `intrinsic`). Although we think it doesn't make sense at this `Unit` level, we say a `Declaration` may also be a separate declaration of dimension (`varDimDeclaration`). This is because the phylum `Declaration` is also used in the declaration part of each subroutine.\
**question**: So far, there is no specific functionDeclaration operator. It is subsumed by the `varDeclaration`, that may contain a `functionDeclarator`. But then the name "`varDeclaration`" is maybe poorly chosen, and the type modifiers in front of the varDeclaration would be confused with the modifiers that apply to a function.

### Procedure definitions

    program -> modifier* ident? varDeclaration* blockStatement
    function -> modifier* TypeSpec? TypeSpec? ident varDeclaration* blockStatement?

A procedure or method definition. `program` designates the `main` program.

- `modifier*`: a list of keywords that modify the behavior or visibility of the method.
- `TypeSpec?`: when the procedure returns a value, the type of the returned value. When this is actually a procedure that returns nothing, this tree must be operator `void`.
- `TypeSpec?`: when the procedure is a method of a class, the class to which this procedure belongs.
- `ident`: the name of the procedure or method. This name is optional for the main program.
- `varDeclaration*`: the list of formal arguments. This may contain typed or untyped variable names (with or without initial values, meaning *default* values).
- `blockStatement`: the body of the procedure or method. The local declarations inside the `blockStatement` are much more permissive than the ones in `varDeclaration*`. For example, they may also contain type declarations or local subroutines...\
  **note:** In Fortran, the declarations in the `blockStatement` may as well declare the types of local variables and variables in the `varDeclaration*` formal parameters. In C, these declarations will be separated into two different scopes.
  If IL finds a declaration in the outer scope, for a name that is not a formal parameter, then it cannot complain because this is allowed in Fortran.

### Modules

    module -> ident? Declaration*

The operator `module` represents a module (e.g. the FORTRAN module). This is a possibly named (with first field `ident?`) bunch of declarations (`Declaration*`), possibly including subroutine declarations. In Fortran, the module subsumes the old "blockdata", which is similar, except that it may hold only variable declarations, common declarations, and initializations. Hence, we chose not to create a "blockdata" constructor. The "module" is also close to the C++ or Java class, but we shall create a special purpose "class" operator, because the presence of methods is a big semantic difference.

### Object-oriented classes definition

    class -> modifier* ident parentClass* Declaration*

A class definition:

- `modifier*`: a list of keywords that modify the behavior of the class. Possible values: `public`, `private`, `abstract`...
- `ident`: the name of this class.
- `parentClass*`: the specification of the parent classes, including various styles of parenthood such as C++ inheritance, Java "implements"...
- `Declaration*`: the actual contents of the class (member fields, types, methods, sub-classes...)

    parentClass -> modifier* ident

A declaration of class inheritance from a parent class:

- `modifier*`: a list of keywords that modify the parenthood relation. Possible values: `public`, `private`, `virtual`, `implements`...
- `ident`: the name of this parent class.

    constructor -> modifier* ident varDeclaration* BlockStatement

A definition of a class instance constructor:

- `modifier*`: a list of keywords that modify the constructor method.
- `ident`: the name of the constructor (Question: is it useful, since it is the name of the class itself?)
- `varDeclaration*`: the formal arguments of this constructor
- `BlockStatement`: the body of the constructor method.
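
To make the `constructor` rule and the pre-order "protocol" more concrete, here is a minimal Python sketch that builds one constructor tree and flattens it into a token stream. Everything about the encoding is an assumption made for illustration: the `Node` class, the helper names, the list operator names (`modifiers`, `varDeclarations`, `declarationOrStatements`, built by appending "s" as described earlier), and the choice to emit a leaf's string right after its operator name. The real protocol may differ on each of these points.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One IL operator. `value` is set only on leaves; `is_list` marks list operators."""
    op: str
    sons: List["Node"] = field(default_factory=list)
    value: Optional[str] = None
    is_list: bool = False


def ident(name: str) -> Node:
    """Hypothetical helper: an `ident` leaf carrying a string."""
    return Node("ident", value=name)


def lst(op: str, *items: Node) -> Node:
    """Hypothetical helper: a list operator such as `modifiers`."""
    return Node(op, list(items), is_list=True)


def serialize(node: Node) -> List[str]:
    """Pre-order walk: operator first, then its sons, left to right."""
    tokens = [node.op]
    if node.value is not None:
        tokens.append(node.value)      # assumed leaf encoding
    for son in node.sons:
        tokens += serialize(son)
    if node.is_list:
        tokens.append("endOfList")     # list sub-trees are closed by a marker
    return tokens


# constructor -> modifier* ident varDeclaration* BlockStatement
ctor = Node("constructor", [
    lst("modifiers", ident("public")),
    ident("Point"),
    lst("varDeclarations"),                                 # no formal arguments
    Node("blockStatement", [lst("declarationOrStatements")]),  # empty body
])

stream = serialize(ctor)
print(" ".join(stream))
```

Note that the empty lists still cost two tokens each (the list operator and `endOfList`), which is the situation the earlier discussion about replacing them with `none` refers to.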

    constructorCall -> TypeSpec ident Expressions

A call to a class instance constructor:

- `TypeSpec`: the class
- `ident`: the name of the constructor (Question: is it useful, since it is the name of the class itself?)
- `Expressions`: the actual arguments passed to the constructor.

    destructorCall -> TypeSpec ident

A call to a class instance destructor:

- `TypeSpec`: the class
- `ident`: the name of the destructor (Question: is it useful, since it is built from the name of the class itself?)

    namespace -> ident? Declarations

Definition of a name space. All declarations inside must be accessed through the name of this name space.

- `ident?`: the name of this name space.
- `Declarations`: the body of this name space.

### Variable declarations

    varDeclaration -> TypeSpec? Declarator*

A variable declaration node (`varDeclaration`) first specifies a type (`TypeSpec?`). This type is optional, for the `varDeclaration`'s that appear in the lists of formal parameters in all cases of Fortran and in one style of C. Then follows (`Declarator*`) a list of variables that have this declared type. This is because, in most languages, there may be a list of variables attached to the same type in a single declaration. Actually it may be more than just variables. For example in C, it may be an array, a pointer, a function... See the section on Declarators below. Some `Declarator`'s may have an attached value, denoted here with the operator `initDeclarator`, that may represent an initial value, or a default value in the case of formal parameters.

### Other declarations

    implicit -> typedLetters*
    typedLetters -> TypeSpec Letters*
    Letters ::= letterRange letter
    letterRange -> letter letter

Operator `implicit` represents Fortran's `IMPLICIT` declaration. It associates a default type `TypeSpec` for undeclared variables starting with the letters specified in `Letters*`. If `typedLetters*` is empty, this represents the `IMPLICIT NONE` declaration, that deactivates the implicit typing mechanism.
`Letters` in `Letters*` may be a single letter (`letter`), or a pair (`letterRange`) that designates all letters in the interval.

    typeDeclaration -> typeName TypeSpec

A `typeDeclaration` is a type declaration, i.e. it associates a `typeName` as a shorthand for a complete type specification or definition `TypeSpec`.

    constDeclaration -> TypeSpec? Declarator*

This is the declaration of variables that will hold a constant, non-modifiable value during the whole execution. This declares the type of the constants, followed by a list of constant names, each possibly followed by its initializer expression. This corresponds to the declaration of PARAMETER variables in Fortran, or `final` in Java. This also exists in C++. In Fortran90, there are two ways:

`INTEGER I2, I3`\
`PARAMETER (I2 = 27, I3 = -5)`

or\
`INTEGER, PARAMETER :: I2 = 27, I3 = -5`

    varDimDeclaration -> SimpleDeclarator*

This is the Fortran `DIMENSION` declaration, that declares the array dimensions of the declared variables, without specifying their base type. The base type must then be declared elsewhere, or come from an `IMPLICIT` rule. Therefore this rule is similar to the `varDeclaration` rule, but with no `TypeSpec`, and the declarator may not give initial values.

    common -> ident? VarDeclarator*

This is the Fortran `COMMON` declaration. It defines a named bunch (list) of global variables. The name of the bunch is the first son. If not present, this is the special "blank" common. The second son is a list of simple `VarDeclarator`'s, that may be simple variable names or array names with their dimensions. We think it cannot be an `iterativeVariableRef` here, because this represents a part of a given array, and arrays cannot be `partly` inside a common. At least, this is how it works in Fortran.

    equivalence -> Expression**

**restriction:** `Expression**` is a list of lists of only operators `arrayAccess` or `ident`.

This is the Fortran `EQUIVALENCE` declaration. It contains a list of "atomic equivalences".
An "atomic equivalence" is itself a list of references to variables, and each variable reference in one "atomic equivalence" will share the same memory location.

    intrinsic -> ident*
    external -> ident*

This declares the contained `ident`'s to be the names of subroutines, that are either intrinsic (predefined) routines, or external (library) routines, i.e. that are defined outside the program and are not intrinsic.

    data -> Expression* Expression*
    save -> Expression*

The `data` operator gives the initial value for some variables. It gives in the second son `Expression*` a list of constant expressions, whose value will be given to each variable in the first son, which is a list of `Expression`. The `save` operator declares that each variable designated by the list of `Expression`'s is remanent, i.e. keeps its value between successive calls to the enclosing subroutine.\
**restriction:** These `Expression` in the first son are actually restricted to be simple references to variables (i.e. `VariableRef`), but may also include `iterativeVariableRef` operators.

### Type specifications

    PrimitiveType ::= void boolean integer float complex character
    TypeSpec ::= PrimitiveType ident ParentClass modifiedType recordType unionType enumType arrayType pointerType referenceType functionType
    void ->
    boolean ->
    integer ->
    float ->
    complex ->
    character ->
    modifiedType -> Modifier* TypeSpec
    recordType -> ident? Modifier* Declaration*
    unionType -> ident? VarDeclaration*
    enumType -> ident? Declarator*
    arrayType -> TypeSpec dimColon*
    pointerType -> TypeSpec
    referenceType -> TypeSpec
    functionType -> TypeSpec argDeclaration*

**restriction:** `typeName` of atomic value in `char`, `integer`, `float`, `boolean`, `byte` will (or may?) be recognized as primitive types. `typeName`'s with other values will be interpreted as previously defined user types.
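
The restriction on `typeName` above amounts to a two-step lookup, which can be sketched as follows. This is only an illustration: the function name, the `user_types` dictionary, and the decision to raise an error for an unknown name are assumptions, since the text does not say what happens when a name is neither primitive nor previously defined.

```python
# Names listed in the restriction as (possibly) recognized primitive types.
PRIMITIVE_TYPE_NAMES = {"char", "integer", "float", "boolean", "byte"}


def resolve_type_name(name, user_types):
    """Classify a typeName as a primitive type or a previously defined user type.

    `user_types` is a hypothetical table mapping each name introduced by an
    earlier typeDeclaration to its type specification.
    """
    if name in PRIMITIVE_TYPE_NAMES:
        return ("primitive", name)
    if name in user_types:
        return ("user", user_types[name])
    # Assumption: an undefined name is reported as an error.
    raise KeyError(f"type name {name!r} is neither primitive nor defined")


user_types = {"vec3": "recordType"}
print(resolve_type_name("integer", user_types))
print(resolve_type_name("vec3", user_types))
```

Checking the primitive names first means a user type cannot shadow a primitive one in this sketch; whether IL allows such shadowing is not stated here.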

    modifiedType -> modifier* TypeSpec

**restriction:** `modifier*` for `modifiedType` is a list of atomic `modifier`'s with value `signed`, `unsigned`, `short`, `long`, or `double`. The value may also be an integer constant that represents the Fortran-style type modifiers, such as the `16` in `REAL*16`.\
**restriction:** `modifier*` for `modifiedType` is a list of atomic `modifier`'s with value `const`, `volatile`, `register`, `auto`, `target`, `pointer`, `in`, `out`, `inout`, `optional`, `save`, `extern`, `private`, or `sequence`.\
**note:** there is no `static` any more. The C `static` will be translated to modifier `save` for a local var of a procedure, to modifier `private` otherwise.\
**question:** Is the `const` modifier redundant with the `constDeclaration` operator? In that case, can we get rid of this `constDeclaration` operator? Same question with attribute `save` and operator `save`. Same for `pointer`, `target`.

    arrayType -> TypeSpec dimColon*
    dimColon -> Expression? Expression?

This declares an array type, of base type given by first son `TypeSpec`, and a list of dimensions given by second son `dimColon*`. Each `dimColon` is a pair of beginning index and ending index. Both may be `none`. When the starting index is `none`, it means the Fortran default "1". When the ending index is `none`, it means that the last index is not specified here, but is specified somewhere else. In particular, this will be used to represent the special Fortran star dimension (`*`), that specifies a dimension that is declared somewhere else, and that is not redeclared at the current point.

    recordType -> ident? varDeclaration*
    unionType -> ident? varDeclaration*

These are the classical record and union types, as defined in C or in Fortran90. The first son is a temporary name for the type, used inside the second son to obtain recursive definitions. The second son is a list of typed field names, looking just like ordinary `varDeclaration`'s.

    enumType -> ident? Declarator*

**restriction:** `Declarator*` is a list of only `name` or `initDeclarator` for `name = constant`.\
`enumType` is the C-like enumerated type, i.e. a list of possible values in `Declarator*`, that are atomic names. There may be a constant attached to force the internal integer representation of this value. The initial optional `ident?` is an identifier that may exist in C. But this is not the name of the enumerated type. The enumerated type is named by enclosing it into a `typeDeclaration` node.

    pointerType -> TypeSpec

This is the type: pointer to `TypeSpec`.

    functionType -> TypeSpec ArgDeclaration*
    ArgDeclaration ::= varDeclaration TypeSpec

This is the function type. First son is the type of the returned result, and second son is the list of the arguments' types. The `ArgDeclaration` phylum represents the operators that can be found in the declarations of the arguments of a function, both here in the `functionType` type specification operator, and later in the `functionDeclarator` declarator operator. An `ArgDeclaration` can be a normal `varDeclaration` or, as happens in C, it can be a simple `TypeSpec` that represents the type of the argument, without specifying its name.

    void ->

This is the void type, used as the return type of subroutines that return nothing, i.e. procedures and not functions.

### Declarators

    Declarator ::= SimpleDeclarator initDeclarator
    SimpleDeclarator ::= VarDeclarator pointerDeclarator bitfieldDeclarator functionDeclarator
    VarDeclarator ::= arrayDeclarator ident
    initDeclarator -> SimpleDeclarator Expression
    pointerDeclarator -> SimpleDeclarator
    bitfieldDeclarator -> ident Expression
    functionDeclarator -> SimpleDeclarator ArgDeclaration*
    arrayDeclarator -> SimpleDeclarator dimColon*

A `Declarator` is an operator that is used, in conjunction with a type specification (`TypeSpec`), to declare an object (variable, record field, etc...) that has this type.
It can actually be an `initDeclarator`, in which case the declarator is immediately followed by an expression that gives an initial value for this object. Otherwise it is a `SimpleDeclarator`, which can be a `pointerDeclarator`, meaning that the `SimpleDeclarator` underneath has the type "pointer to" the type (*cf* C). It can be a `bitfieldDeclarator` (which comes from C) that declares a field in a structure that has a length in bits specified by the second son `Expression`. It can be a `functionDeclarator`, that contains the declarator for the "name" of the function, and the `ArgDeclaration*` for the arguments. Similarly, an `arrayDeclarator` starts with the declarator for the "name" of the array, followed by the `dimColon*` declarations of the dimensions.

### Structured control statements

    Statement ::= blockStatement labelStatement if switch loop break continue return throw synchro try exit cycle stop goto gotoLabelVar assignLabelVar compGoto data Expression

Among the above constructs, we consider `blockStatement`, `if`, `switch`, `loop`, `try`, `synchro`, as "structured" control statements. They contrast with "unstructured" control statements, described in the next section, and "simple" statements, that just do something but do not affect the control flow. We shall identify these simple statements with expressions, that will be described later. This comes from the C style, where simple statements, such as assignment and calls, may be used as expressions, and on the other hand, any expression can be used as a statement, in which case its return value will just be discarded.

    blockStatement -> DeclarationOrStatement*
    DeclarationOrStatement ::= Declaration Statement

A statement may be a `blockStatement`, in which case there may be additional declarations (`Declaration`), that override previous declarations, for the duration of the `blockStatement`.
To cope with the style of some imperative languages, such as Java, we allow declarations and statements to be interleaved in any manner (provided declarations come before uses?). Therefore we introduce this new phylum `DeclarationOrStatement`. Notice however that it may be good practice not to interleave declarations and statements. For example in Java, there may be conflicts between declarations in different cases of a switch statement. What semantics for that?

    if -> Expression Statement? Statement?

This is the standard "if-then-else" construct. In Fortran, it may have an additional label, but it seems this label has no semantic value, and therefore we don't keep it.

    try -> blockStatement catch* blockStatement?
    catch -> varDeclaration blockStatement
    throw -> Expression

This is the exception mechanism. Exceptions are raised with the `throw` instruction. A `blockStatement` catches exceptions by being enclosed into a `try` statement. When an exception is raised inside the `blockStatement` of a `try`, it is matched against the `varDeclaration` of each of the following `catch` in `catch*`. If one matches, the corresponding `blockStatement` is executed. If none matches, then if the final `blockStatement?` is present, it is executed; otherwise it is `none` and the exception is propagated further.

    switch -> Expression switchCase*
    switchCase -> Expression* DeclarationOrStatement*
    break ->

This is the classical switch-case construct of C or Java. All cases (`switchCase` in `switchCase*`) are examined in sequence. Each `switchCase` represents a case, that is chosen when the value of the top `Expression` matches one of the expressions in list `Expression*`. As a special behavior, the case is chosen when its list of expressions is empty. This is the way the `default` operator must be translated into IL. Inside the `DeclarationOrStatement*` for each case, there may be calls to `break`, that just exit from the whole `switch` statement.
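
The case selection described for `switch` can be sketched in a few lines of Python. This is a literal reading of the paragraph above, under assumptions that should be checked against the real tool: each case is modeled as a hypothetical `(labels, body)` pair, the cases are scanned in order, an empty label list always matches, and the sketch therefore only behaves like C when the translated `default` case comes last. Fall-through between cases and the effect of `break` are not modeled.

```python
from typing import Any, Optional, Sequence, Tuple

# Hypothetical stand-in for a switchCase: (Expression* values, body placeholder).
SwitchCase = Tuple[Sequence[Any], str]


def select_case(value: Any, cases: Sequence[SwitchCase]) -> Optional[int]:
    """Return the index of the first case chosen for `value`, or None."""
    for index, (labels, _body) in enumerate(cases):
        # An empty label list is the IL translation of `default`.
        if len(labels) == 0 or value in labels:
            return index
    return None


cases = [
    ([1, 2], "small"),
    ([3], "three"),
    ([], "default"),
]
print(select_case(2, cases))   # index of the case labelled 1, 2
print(select_case(9, cases))   # index of the default case
```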

    synchro -> Expression blockStatement

This is the threads synchronization, as found in Java. The `blockStatement` is suspended until the objects returned by `Expression` can be locked. The lock is released at the end of the `blockStatement` critical section.\
**question:** This specification of IL is currently incomplete with respect to tasking and threads parallelism. Therefore this `synchro` operator is here just to remind us that these mechanisms should be taken into account as soon as possible.

    loop -> ident? label? Control Statement?
    Control ::= do while until times forall for none

This `loop` construct represents all iterative control structures. The optional `ident?` and `label?` may be used for referencing this loop in `exit`'s and `cycle`'s. Then the `Control` son specifies the kind of iterative control used in this loop, and finally the `Statement?` (very rarely optional) defines what to do during each iteration.

    do -> ident Expression Expression Expression?

This is the header of the standard Fortran `DO` loop.

    forall -> ident Expression Expression Expression? Expression?

This is, for example, the header of the parallel Fortran90 `forall`.

    for -> varDeclaration? Expression? Expression?

**restriction:** the first optional `varDeclaration?` may be a variable declaration with an initialization, or a simple expression such as a simple assignment.\
This is the header of the C `for` loop, that is more powerful than the Fortran `DO` loop.

    while -> Expression
    until -> Expression
    times -> Expression

These control headers repeat the loop while or until some `Expression` evaluates to `true`, or a certain number of times. The condition is evaluated *before* each iteration for a `while`, *after* each iteration for an `until`.

    cycle -> ident?
    exit -> ident?

These instructions jump to special places in the enclosing loop, or in the closest enclosing loop identified by `ident?` when present. `cycle` goes to the next iteration, `exit` just terminates the loop.
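
The difference between the `while`, `until`, and `times` control headers is only in when (or whether) a condition is tested. The following Python sketch illustrates that timing. It is not how Tapenade executes or analyses loops: the `run_loop` function, its string `kind` argument, and the use of Python callables for the condition and the body are all assumptions made for the example.

```python
def run_loop(kind, control, body):
    """Run `body` under one of three control kinds and return the iteration count.

    `control` is a zero-argument callable returning a bool for "while" and
    "until", and a non-negative integer for "times".
    """
    iterations = 0
    if kind == "while":
        while control():          # condition tested *before* each iteration
            body()
            iterations += 1
    elif kind == "until":
        while True:
            body()
            iterations += 1
            if control():         # condition tested *after* each iteration
                break
    elif kind == "times":
        for _ in range(control):  # fixed number of iterations
            body()
            iterations += 1
    else:
        raise ValueError(f"unsupported control kind: {kind!r}")
    return iterations


state = {"i": 5}


def step():
    state["i"] += 1


n_while = run_loop("while", lambda: state["i"] < 5, step)   # body never runs
n_until = run_loop("until", lambda: state["i"] >= 5, step)  # body runs once
n_times = run_loop("times", 3, step)
print(n_while, n_until, n_times)
```

Starting from a state where the condition already holds, the `while` loop performs no iteration while the `until` loop performs exactly one, which is the behavior the paragraph above describes.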

    return -> Expression?

This returns from the enclosing `function`, and returns the value of the optional `Expression?` to the calling place.

### Unstructured control statements

    labelStatement -> label Statement?
    goto -> label
    gotoLabelVar -> ident label*
    assignLabelVar -> ident label
    compGoto -> label* Expression
    stop ->
    continue ->

**question:** maybe we should remove the `continue` statement, and instead accept `none` as a `Statement`. Otherwise, we must accept a `Statement?` as second son of `labelStatement`. This would treat correctly the `100 CONTINUE` instructions and `100 ENDDO` as well.

### Expressions or simple statements

    Expression ::= unary UnaryExpression binary BinaryExpression VariableRef address arrayTriplet AssignExpression ifExpression arrayInitializers stringCst intCst realCst substringAccess iterativeVariableRef
    UnaryExpression ::= minus not decr incr
    BinaryExpression ::= add sub mul div eq neq ge gt le lt and or xor power ior leftShift rightShift mod
    AssignExpression ::= andAssign assign divAssign iorAssign minusAssign modAssign plusAssign timesAssign xorAssign leftShiftAssign rightShiftAssign
    VariableRef ::= ident arrayAccess fieldAccess pointerAccess call

The phylum `Expression` contains all possible operators that build an expression. Some of these operators are regrouped into the sub-phyla `UnaryExpression`, `BinaryExpression`, `AssignExpression`, and `VariableRef`. We have chosen to create a separate operator for each unary, binary, and assignment operation. They are respectively in the phyla `UnaryExpression`, `BinaryExpression`, and `AssignExpression`. However, in the very rare case where a new unary or binary operator is needed, it can be defined in terms of the catch-all operators `unary` and `binary`. We think these operators should not be used, or only temporarily, until the new operation is assigned a new operator in `Expression`. The operators in phylum `VariableRef` represent the objects that have an allocated memory, i.e.
that can be assigned a value, or return a pointer to them, or can be passed as an "out" argument of a procedure.\
**question:** this distinction between `Expression` and `VariableRef` must be verified.\
Lastly, we added the `iterativeVariableRef` operator into `Expression`, although we know that this corresponds rather to an enumeration of expressions. This is because some operators, such as I-O, or the operators `save` and `data`, take a list of expressions in which some items are actually an iterative enumeration of expressions. This is a compromise that seems to work. The alternative would be to modify the `expression*` operator inside I-O statements.

    iterativeVariableRef -> Expression* do

An `iterativeVariableRef` is equivalent to the enumeration of a list of expressions, built by concatenating copies of the `Expression*` in first place, one copy for each value of the index in the `do` in second place.

    assign -> VariableRef Expression
    plusAssign -> VariableRef Expression
    minusAssign -> VariableRef Expression
    timesAssign -> VariableRef Expression
    divAssign -> VariableRef Expression
    modAssign -> VariableRef Expression
    leftShiftAssign -> VariableRef Expression
    rightShiftAssign -> VariableRef Expression
    xorAssign -> VariableRef Expression
    iorAssign -> VariableRef Expression
    andAssign -> VariableRef Expression

These are all the known assignment operators. We hope not to have forgotten any! Respectively, these correspond to the classical assignment, followed by the C assignments: `+=`, `-=`, `*=`, `/=`, `%=`, `<<=`, `>>=`, `^=`, `|=`, `&=`

    call -> VariableRef? VariableRef Expression*

A call to a procedure, function, subroutine, or method.

- `VariableRef?`: (only for method calls) the object or class on which the method is called. More generally, this indicates where the called method must be found, e.g. it can be a `scopeAccess` to restrict searching to a namespace or to a parent class.
- `VariableRef`: the name of the called procedure (i.e.
  subroutine, function, method...). In more complex cases this child can be an expression that evaluates to this called procedure e.g. deref of a pointer to a function, or access into array of functions.
- `Expression*`: the actual arguments of this call.

    scopeAccess -> scopeAccess ident?

A scope restriction in front of an identifier.

- `scopeAccess`:
- `ident?`:

    add -> Expression Expression
    sub -> Expression Expression
    mul -> Expression Expression
    div -> Expression Expression
    power -> Expression Expression
    mod -> Expression Expression
    eq -> Expression Expression
    neq -> Expression Expression
    ge -> Expression Expression
    gt -> Expression Expression
    le -> Expression Expression
    lt -> Expression Expression
    and -> Expression Expression
    or -> Expression Expression
    xor -> Expression Expression
    ior -> Expression Expression
    leftShift -> Expression Expression
    rightShift -> Expression Expression
    binary -> Expression operator Expression
    minus -> Expression
    not -> Expression
    decr -> Expression
    incr -> Expression
    unary -> operator Expression

These are the classical unary and binary operations. We hope not to have forgotten any! These `binary` and `unary` operators are catch-all operators for operations that were forgotten in the above `UnaryExpression` and `BinaryExpression` phyla. These `binary` and `unary` operators should not be used, or very temporarily.

    ifExpression -> Expression Expression Expression

This is the conditional expression, that exists for example in C.

    arrayConstructor -> Expression * ...

An expression that constructs an array

- `Expression`: each of the values to be put into the constructed array.

### Dynamic memory

    allocate -> Expression? Expressions TypeSpec? KeywordArg Expression

A call to dynamic memory allocation

- `Expression?`: Total size allocated, in bytes.
- `Expressions`: Array dimensions.
- `TypeSpec?`: Type of the allocated elements.
- `KeywordArg`: (?) in Fortran, optional returned status "`STAT=`".
- `Expression`: (?)
in C++, initialization information.

<!-- -->

    fieldAccess -> VariableRef VariableRef

Access to a field of a structured object. In particular, access to a member of a class instance or to a static member of a class.

- `VariableRef`: the structured object or the class.
- `VariableRef`: the name of the field or member.

<!-- -->

    address -> VariableRef
    arrayAccess -> VariableRef Expression*
    arrayInitializers -> Expression*
    arrayTriplet -> Expression? Expression? Expression?

The operator `arrayTriplet` represents the classical general array section, from some index to some index, with a given stride. The initial index may be `none`, meaning 1; the terminal index may be `none`, meaning the last index; and the stride may be `none`, meaning 1. In particular, the `arrayTriplet` with `none` as third son represents (subsumes) the old operator `arraySection`.

    pointerAccess -> Expression
    substringAccess -> Expression Expression? Expression?

### Atomic nodes

    ident -> implemented as STRING
    label ::= ident
    typeName ::= ident
    operator ::= ident
    modifier ::= ident
    stringCst -> implemented as STRING
    letter -> implemented as STRING
    intCst -> implemented as INTEGER
    realCst -> implemented as REAL

### Summary of the complete syntax

```
- Phyla:
  Unit ::= program module blockStatement Declaration
  Declaration ::= varDeclaration constDeclaration varDimDeclaration typeDeclaration common equivalence implicit function class external intrinsic data save
  ArgDeclaration ::= varDeclaration TypeSpec
  Declarator ::= SimpleDeclarator initDeclarator
  SimpleDeclarator ::= VarDeclarator pointerDeclarator bitfieldDeclarator functionDeclarator
  VarDeclarator ::= arrayDeclarator ident
  DeclarationOrStatement ::= Declaration Statement
  Statement ::= blockStatement labelStatement if switch loop break continue return throw synchro try exit cycle stop goto gotoLabelVar assignLabelVar compGoto data Expression
  Control ::= do while until times forall for none
  Expression ::= unary UnaryExpression binary BinaryExpression VariableRef address arrayTriplet
      AssignExpression ifExpression arrayInitializers stringCst intCst realCst substringAccess iterativeVariableRef
  UnaryExpression ::= minus not decr incr
  BinaryExpression ::= add sub mul div eq neq ge gt le lt and or xor power ior leftShift rightShift mod
  AssignExpression ::= andAssign assign divAssign iorAssign minusAssign modAssign plusAssign timesAssign xorAssign leftShiftAssign rightShiftAssign
  VariableRef ::= ident arrayAccess fieldAccess pointerAccess call
  Letters ::= letterRange letter

- Declarations:
  file -> Unit*
  module -> ident? Declaration*
  implicit -> typedLetters*
  typedLetters -> TypeSpec Letters*
  typeDeclaration -> typeName TypeSpec
  constDeclaration -> TypeSpec? Declarator*
  varDeclaration -> TypeSpec? Declarator*
  varDimDeclaration -> SimpleDeclarator*
  common -> ident? VarDeclarator*
  equivalence -> Expression**
  intrinsic -> ident*
  external -> ident*
  data -> Expression*
  save -> Expression*

- Type specifications:
  modifiedTypeName ->
  modifiedTypeSpec ->
  arrayType -> TypeSpec dimColon*
  dimColon -> Expression? Expression?
  recordType -> ident? varDeclaration*
  unionType -> ident? varDeclaration*
  enumType -> ident? Declarator*
  pointerType -> TypeSpec
  functionType -> TypeSpec ArgDeclaration*
  void ->

- Declarators:
  initDeclarator -> SimpleDeclarator Expression
  pointerDeclarator -> SimpleDeclarator
  bitfieldDeclarator -> ident Expression
  functionDeclarator -> SimpleDeclarator ArgDeclaration*
  arrayDeclarator -> SimpleDeclarator dimColon*

- Control statements:
  blockStatement -> DeclarationOrStatement*
  if -> Expression Statement? Statement?
  return -> Expression?
  try -> blockStatement catch* blockStatement?
  catch -> varDeclaration blockStatement
  throw -> Expression
  switch -> Expression switchCase*
  switchCase -> Expression* DeclarationOrStatement*
  break ->
  synchro -> Expression blockStatement
  loop -> ident? label? Control Statement?
  do -> ident Expression Expression Expression?
  forall -> ident Expression Expression Expression? Expression?
  for -> varDeclaration? Expression? Expression?
  while -> Expression
  until -> Expression
  times -> Expression
  cycle -> ident?
  exit -> ident?

- Expressions:
  iterativeVariableRef -> Expression* do
  assign -> VariableRef Expression
  plusAssign -> VariableRef Expression
  minusAssign -> VariableRef Expression
  timesAssign -> VariableRef Expression
  divAssign -> VariableRef Expression
  modAssign -> VariableRef Expression
  leftShiftAssign -> VariableRef Expression
  rightShiftAssign -> VariableRef Expression
  xorAssign -> VariableRef Expression
  iorAssign -> VariableRef Expression
  andAssign -> VariableRef Expression
  add -> Expression Expression
  sub -> Expression Expression
  mul -> Expression Expression
  div -> Expression Expression
  power -> Expression Expression
  mod -> Expression Expression
  eq -> Expression Expression
  neq -> Expression Expression
  ge -> Expression Expression
  gt -> Expression Expression
  le -> Expression Expression
  lt -> Expression Expression
  and -> Expression Expression
  or -> Expression Expression
  xor -> Expression Expression
  ior -> Expression Expression
  leftShift -> Expression Expression
  rightShift -> Expression Expression
  binary -> Expression operator Expression
  minus -> Expression
  not -> Expression
  decr -> Expression
  incr -> Expression
  unary -> operator Expression
  ifExpression -> Expression Expression Expression
  address -> VariableRef
  arrayAccess -> VariableRef Expression*
  arrayInitializers -> Expression*
  arrayTriplet -> Expression? Expression? Expression?
  pointerAccess -> Expression
  substringAccess -> Expression Expression? Expression?
  letterRange -> letter letter

- Leaves:
  ident -> implemented as STRING
  label ::= ident
  typeName ::= ident
  operator ::= ident
  modifier ::= ident
  stringCst -> implemented as STRING
  letter -> implemented as STRING
  intCst -> implemented as INTEGER
  realCst -> implemented as REAL
```
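
The `iterativeVariableRef` rule listed above stands for the concatenation of copies of its `Expression*`, one copy per value of the index of its `do`. The following Python sketch illustrates that expansion on a Fortran-like implied do. It is only an illustration, not Tapenade code: the tuple encoding of IL trees, the helper names `substitute` and `expand_iterative`, the restriction to `intCst` bounds, the inclusive upper bound, and the reading of a `none` stride as 1 are all assumptions made for this example.

```python
# Hypothetical encoding (not Tapenade's): an IL tree is a tuple
# (operator, child, ...); an Expression* list is a Python list.

def substitute(tree, name, value):
    """Copy `tree`, replacing each ident named `name` by an intCst."""
    if isinstance(tree, list):
        return [substitute(child, name, value) for child in tree]
    if not isinstance(tree, tuple):
        return tree  # operator name or leaf payload
    if tree == ("ident", name):
        return ("intCst", value)
    return tuple(substitute(child, name, value) for child in tree)


def expand_iterative(tree):
    """Flatten an iterativeVariableRef into the expressions it enumerates."""
    op, exprs, do = tree
    assert op == "iterativeVariableRef" and do[0] == "do"
    _, index, first, last, stride = do
    # Assumption: bounds are intCst leaves; a `none` stride means 1.
    step = 1 if stride == ("none",) else stride[1]
    stop = last[1] + (1 if step > 0 else -1)  # make the last bound inclusive
    expanded = []
    for value in range(first[1], stop, step):
        expanded.extend(substitute(exprs, index[1], value))
    return expanded


# Fortran-like implied do: (A(i), B(i), i = 1, 3)
implied_do = (
    "iterativeVariableRef",
    [("arrayAccess", ("ident", "A"), [("ident", "i")]),
     ("arrayAccess", ("ident", "B"), [("ident", "i")])],
    ("do", ("ident", "i"), ("intCst", 1), ("intCst", 3), ("none",)),
)
expanded = expand_iterative(implied_do)
for expr in expanded:
    print(expr)
```

With this input, `expanded` holds six `arrayAccess` trees in the order A(1), B(1), A(2), B(2), A(3), B(3), which is the flat expression list that an I-O, `data` or `save` operator would otherwise have received directly.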