The IL Abstract Syntax Reference Manual
The abstract Imperative Language “IL”
IL is the input language of Tapenade. It is an abstract language: it does not exist outside Tapenade and has no concrete syntax; it exists only as an Abstract Syntax Tree. IL is fed to Tapenade through a so-called “protocol”, which is the pre-order serialization of all the operators in the IL syntax tree. The protocol thus turns the tree into a stream of operator names, plus a few special labels added in places. IL is meant to represent every programming construct used in actual imperative programming languages. To feed a program written in a given language into Tapenade, the program must first be translated into IL, and this IL is then fed into Tapenade.
IL does not capture pre-processing directives. In particular, all pre-processing that is specific to one language or system must be resolved beforehand. There are few other restrictions. In particular, the IL machine assumes that:
All “cpp” or “fpp” pre-processing is done, i.e. there are no remaining #DEFINE, #IFDEF, #INCLUDE
All includes are expanded (possibly with an indication of the original include file to help source regeneration to re-insert include clauses)
The distinction between assigned GOTO and normal GOTO must already be made, resulting in the use of the “gotoLabelVar” and “goto” IL operators.
However, some normalizations, although desirable, are very difficult to do on the “sender” side, because this would require the sender to perform the same kind of analyses as the IL analyses. Therefore, some normalizations and pre-processing are left to IL, and the IL machine accepts non-normalized or ambiguous input in the following cases:
Fortran is ambiguous between function calls and array accesses. IL accepts either operator, “call” or “arrayAccess”, and disambiguates them according to declarations.
Fortran allows declarations to be decomposed into many steps. Therefore, IL accepts the same decomposition and builds the correct entry in the symbolTable progressively. For instance, one may declare the type and the DIMENSION separately.
Strictly speaking, when a tree is a “list of something”, and this list happens to be empty, the IL transmission protocol requires two tokens on the stream: first the operator for the list, and then the “EndOfList”. However, in some cases IL and the analyses that follow, such as the FlowGraph analysis, accept that these two tokens be simply replaced by the “none” operator. This is still subject to discussion.
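The pre-order protocol described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Node class, its fields, and the way atomic values are placed on the stream are hypothetical, and the real token format is defined by Tapenade, not by this sketch. Only the operator names and the endOfList / endOfProtocol markers come from this manual.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical in-memory IL tree node (not Tapenade's own class)."""
    op: str                                    # operator name, e.g. "module"
    sons: list = field(default_factory=list)   # child nodes, in order
    value: object = None                       # payload of atomic operators
    is_list: bool = False                      # True for list operators

def serialize(node, out):
    """Append the pre-order token stream of `node` to `out` and return it."""
    out.append(node.op)
    if node.value is not None:
        out.append(node.value)        # assumed placement of the atom's value
    for son in node.sons:
        serialize(son, out)
    if node.is_list:
        out.append("endOfList")       # marker closing a variable-length list
    return out

# A module named "m" with an empty list of declarations.
tree = Node("module", [
    Node("ident", value="m"),
    Node("declarations", is_list=True),
])
tokens = serialize(tree, []) + ["endOfProtocol"]
print(tokens)
```

In this sketch the empty list is sent as the two tokens "declarations" and "endOfList", which is the strict form of the protocol; the relaxed form mentioned above would send the single operator none instead.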
AD Directives
AD Directives are passed as comments. Their contained string must start with $AD.
From this specification to the “il.metal” file
This specification is written in a somewhat simplified form with respect to the actual “il.metal” source file, to make the following syntactic rules easier to read and understand. However, the actual rules may be deduced from the rules here, with the following conversion methods:
Syntax: Actual metal rules all end with a “;”. Rules describing lists use a special syntax, with “*” for possibly empty lists and “+” for “never empty” lists, followed by three dots. For example:
labels -> Label * ...
equivalence -> Expression + ...
Leaves are actually defined with the “implemented as” keyword, for example in
ident -> implemented as IDENTIFIER
intCst -> implemented as INTEGER
none -> implemented as SINGLETON
We don’t see the difference between this last rule and:
none ->
Lists: Actual metal rules for lists require more “demultiplication”. In addition to the operator for “this”, they must introduce a new operator for “list of this”. This operator’s name is always built by adding an “s” at the end. The same holds for the phylum name. Therefore, a rule such as
module -> ident? Declaration*
corresponds to the actual rules (see below for the transformation of the ident?):
module -> OptIdent Declarations
Declarations ::= declarations
declarations -> Declaration * ...
As a general rule, order in lists matters: elements of these lists should appear in the same order as they appear in the original file.
Optionals: A notation with a “?”, such as ident?, means that optionally, instead of an “ident” node, there may be “nothing”. This nothing is actually represented by the none atomic operator. In that case the phylum for “this or nothing” is built by adding “Opt” in front. For example again, the rule
module -> ident? Declaration*
corresponds to the real rules:
module -> OptIdent Declarations
OptIdent ::= Ident none
Small phyla: In metal, the right hand side of “->” rules must consist of phyla only. However, when a phylum contains only one operator, the simplified rules directly mention the operator. A one operator phylum bears the same name as the operator, capitalized. Therefore, the simplified rule:
try -> blockStatement catch* blockStatement?
corresponds to the real rules:
try -> BlockStatement Catchs OptBlockStatement
BlockStatement ::= blockStatement
Catchs ::= catchs
catchs -> Catch * ...
Catch ::= catch
OptBlockStatement ::= BlockStatement none
Idents: label and operator are replaced by ident in the real rules. We keep distinct names here to make their meaning clearer.
none: There is a special atomic operator none, used mostly for optionals (cf. above).
none -> implemented as SINGLETON
Directives: The extra operator directive is used to place user directives to the tool, close to some IL objects such as members of the Declaration or Statement phyla. We did not mention it in the simplified rules. For example, the real rule defining the phylum Unit is:
Unit ::= program module Declaration blockStatement directives
In the current state, the directive operator is actually added only to the phyla Unit and Declaration. The rule defining directive is:
directive -> implemented as STRING
End markers: There are two additional atomic operators in the real rules, endOfList and endOfProtocol. These are not real operators, and therefore do not appear here. They are just markers used by the IL tree-passing protocol to mark the end of list sub-trees and the end of the file transfer protocol.
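The Lists, Optionals and Small phyla conventions above are mechanical enough to be sketched as a small Python function. This is an illustration only: the function expand and its string-based rule format are hypothetical, and it covers just those three conventions (no directives, no “+” lists, no trailing “;”).

```python
def cap(name):
    """Capitalize the first letter: operator name -> phylum name."""
    return name[0].upper() + name[1:]

def expand(operator, sons):
    """Expand one simplified rule into a list of metal-style rules.

    `sons` is the simplified right-hand side, e.g. ["ident?", "Declaration*"].
    The first returned rule is the operator rule; the others are the
    auxiliary phylum and list rules it needs.
    """
    rhs, extra = [], []

    def add(rule):
        if rule not in extra:          # keep each auxiliary rule once
            extra.append(rule)

    for son in sons:
        base = son.rstrip("*?")
        suffix = son[len(base):]
        phylum = cap(base)
        if base[0].islower():          # an operator used directly: small phylum
            add(f"{phylum} ::= {base}")
        if suffix == "*":              # list: add an "s" to operator and phylum
            list_op = base[0].lower() + base[1:] + "s"
            add(f"{phylum}s ::= {list_op}")
            add(f"{list_op} -> {phylum} * ...")
            rhs.append(phylum + "s")
        elif suffix == "?":            # optional: "Opt" prefix, alternative none
            add(f"Opt{phylum} ::= {phylum} none")
            rhs.append("Opt" + phylum)
        else:
            rhs.append(phylum)
    return [f"{operator} -> {' '.join(rhs)}"] + extra

for rule in expand("try", ["blockStatement", "catch*", "blockStatement?"]):
    print(rule)
```

The auxiliary rules come out in a different order than in the hand-written example, and the sketch also emits a small-phylum rule such as “Ident ::= ident” that the manual leaves implicit.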
Specification of top level of files
file -> Unit*
Unit ::= program module blockStatement Declaration
Declaration ::=
varDeclaration constDeclaration varDimDeclaration
typeDeclaration common equivalence implicit function
class external intrinsic data save
A file contains a (possibly empty) list of units (Fortran vocabulary), also called external declarations (C vocabulary). Objects declared at this level are usually considered as globals, i.e. their visibility scope is either the whole file or the remainder of the file. One unit may be the definition of a procedure, such as the main program (program), or a procedure or function (function), or an object-oriented class (class). One unit may also be a plain piece of code (blockStatement), that will be executed a little like an unnamed main program. One unit may also be a named bunch of Declaration’s (module), which can then be used by any procedure that imports this bunch. One unit may also be the declaration of one or many (global) variables (varDeclaration) or constants (constDeclaration). One unit may also be a global definition of a user data-type (typeDeclaration), a COMMON (common), or some EQUIVALENCE relation (equivalence). One unit may also be an implicit typing rule (implicit), some variable remanence declaration (save), or some initialization (data). It may also declare names of external or predefined functions to be used later (external, intrinsic).
Although we think it doesn’t make sense at this Unit level, we say a Declaration may also be a separate declaration of dimension (varDimDeclaration). This is because the phylum Declaration is also used in the declaration part of each subroutine.
question: So far, there is no specific functionDeclaration operator. It is subsumed by the varDeclaration, that may contain a functionDeclarator. But then the name “varDeclaration” is maybe poorly chosen, and the type modifiers in front of the varDeclaration would be confused with the modifiers that apply to a function.
Procedure definitions
program -> modifier* ident? varDeclaration* blockStatement
function -> modifier* TypeSpec? TypeSpec? ident
varDeclaration* blockStatement?
A procedure or method definition. program designates the main program.
modifier*: a list of keywords that modify the behavior or visibility of the method.
TypeSpec?: when the procedure returns a value, the type of the returned value. When this is actually a procedure, this tree must be the operator void.
TypeSpec?: when the procedure is a method of a class, the class to which this procedure belongs.
ident: the name of the procedure or method. This name is optional for the main program.
varDeclaration*: the list of formal arguments. This may contain typed or untyped variable names (with or without initial values, meaning default values).
blockStatement: the body of the procedure or method.
The local declarations inside the blockStatement are much more permissive than the ones in varDeclaration*. For example, they may also contain type declarations or local subroutines...
note: In Fortran, the declarations in the blockStatement may as well declare the types of local variables and of variables in the varDeclaration* formal parameters. In C, these declarations will be separated into two different scopes. If IL finds a declaration in the outer scope, for a name that is not a formal parameter, then it cannot complain because this is allowed in Fortran.
Modules
module -> ident? Declaration*
The operator module represents a module (e.g. the Fortran module). This is a possibly named (with first field ident?) bunch of declarations (Declaration*), possibly including subroutine declarations. In Fortran, the module subsumes the old “blockdata”, which is similar, except that it may hold only variable declarations, common declarations and initializations. Hence, we chose not to create a “blockdata” constructor. The “module” is also close to the C++ or Java class, but we shall create a special purpose “class” operator, because the presence of methods is a big semantic difference.
Object-oriented classes definition
class -> modifier* ident parentClass* Declaration*
A class definition:
modifier*: a list of keywords that modify the behavior of the class. Possible values: public, private, abstract…
ident: the name of this class.
parentClass*: the specification of the parent classes, including various styles of parenthood such as C++ inheritance, Java “implements”…
Declaration*: the actual contents of the class (member fields, types, methods, sub-classes…)
parentClass -> modifier* ident
A declaration of class inheritance from a parent class:
modifier*: a list of keywords that modify the parenthood relation. Possible values: public, private, virtual, implements…
ident: the name of this parent class.
constructor -> modifier* ident varDeclaration* BlockStatement
A definition of a class instance constructor:
modifier*: a list of keywords that modify the constructor method.
ident: the name of the constructor (Question: is it useful, since it is the name of the class itself?)
varDeclaration*: the formal arguments of this constructor.
BlockStatement: the body of the constructor method.
constructorCall -> TypeSpec ident Expressions
A call to a class instance constructor:
TypeSpec: the class.
ident: the name of the constructor (Question: is it useful, since it is the name of the class itself?)
Expressions: the actual arguments passed to the constructor.
destructorCall -> TypeSpec ident
A call to a class instance destructor:
TypeSpec: the class.
ident: the name of the destructor (Question: is it useful, since it is built from the name of the class itself?)
namespace -> ident? Declarations
Definition of a name space. All declarations inside must be accessed through the name of this name space.
ident?: the name of this name space.
Declarations: the body of this name space.
Variable declarations
varDeclaration -> TypeSpec? Declarator*
A variable declaration node (varDeclaration) first specifies a type (TypeSpec?). This type is optional, for the varDeclaration’s that appear in the lists of formal parameters, in all cases in Fortran and in one style of C. Then follows (Declarator*) a list of variables that have this declared type. This is because, in most languages, there may be a list of variables attached to the same type in a single declaration. Actually it may be more than just variables. For example in C, it may be an array, a pointer, a function... See the section on Declarators below. Some Declarator’s may have an attached value, denoted here with the operator initDeclarator, that may represent an initial value, or a default value in the case of formal parameters.
Other declarations
implicit -> typedLetters*
typedLetters -> TypeSpec Letters*
Letters ::= letterRange letter
letterRange -> letter letter
Operator implicit represents Fortran’s IMPLICIT declaration. It associates a default type TypeSpec with undeclared variables starting with the letters specified in Letters*. If typedLetters* is empty, this represents the IMPLICIT NONE declaration, which deactivates the implicit typing mechanism. Letters in Letters* may be a single letter (letter), or a pair (letterRange) that designates all letters in the interval.
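As an illustration of how an implicit node can be interpreted, the following Python sketch builds a letter-to-type table from a list shaped like typedLetters*. The data encoding (pairs of a type name and a list of letters or letter pairs) and the function name implicit_table are assumptions made for the example, not part of IL.

```python
def implicit_table(typed_letters):
    """Map each first letter to its implicit type.

    `typed_letters` mirrors typedLetters*: a list of (type_spec, letters)
    pairs, where each item of `letters` is either a single letter (letter)
    or a (first, last) pair (letterRange).
    """
    table = {}
    for type_spec, letters in typed_letters:
        for item in letters:
            first, last = (item, item) if isinstance(item, str) else item
            for code in range(ord(first), ord(last) + 1):
                table[chr(code)] = type_spec
    return table

# IMPLICIT INTEGER (I-N), REAL (A-H, O-Z)
table = implicit_table([
    ("integer", [("i", "n")]),
    ("float", [("a", "h"), ("o", "z")]),
])
print(table["k"], table["x"])
```

An empty typedLetters* list yields an empty table, which matches the IMPLICIT NONE reading: no letter has a default type.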
typeDeclaration -> typeName TypeSpec
A typeDeclaration is a type declaration, i.e. it associates a typeName as a shorthand for a complete type specification or definition TypeSpec.
constDeclaration -> TypeSpec? Declarator*
This is the declaration of variables that hold a constant, non-modifiable value during the whole execution. This declares the type of the constants, followed by a list of constant names, each possibly followed by its initializer expression. This corresponds to the declaration of PARAMETER variables in Fortran, or final in Java. This also exists in C++. In Fortran90, there are two ways:
INTEGER I2, I3
PARAMETER (I2 = 27, I3 = -5)
or
INTEGER, PARAMETER :: I2 = 27, I3 = -5
varDimDeclaration -> SimpleDeclarator*
This is the Fortran DIMENSION declaration, that declares the array dimensions of the declared variables, without specifying their base type. The base type must then be declared elsewhere, or come from an IMPLICIT rule. Therefore this rule is similar to the varDeclaration rule, but with no TypeSpec, and the declarator may not give initial values.
common -> ident? VarDeclarator*
This is the Fortran COMMON declaration. It defines a named bunch (list) of global variables. The name of the bunch is the first son. If not present, this is the special “blank” common. The second son is a list of simple VarDeclarator’s, that may be simple variable names or array names with their dimensions. We think it cannot be an iterativeVariableRef here, because this represents a part of a given array, and arrays cannot be partly inside a common. At least, this is how it works in Fortran.
equivalence -> Expression**
restriction: Expression** is a list of lists containing only the operators arrayAccess or ident.
This is the Fortran EQUIVALENCE declaration. It contains a list of “atomic equivalences”. An “atomic equivalence” is itself a list of references to variables, and all the variable references in one “atomic equivalence” share the same memory location.
intrinsic -> ident*
external -> ident*
This declares the contained ident’s to be the names of subroutines, that are either intrinsic (predefined) routines, or external (library) routines, i.e. that are defined outside the program and are not intrinsic.
data -> Expression* Expression*
save -> Expression*
The data operator gives the initial value for some variables. It gives in the second son Expression* a list of constant expressions, whose value will be given to each variable in the first son, which is a list of Expression. The save operator declares that each variable designated by the list of Expression’s is remanent, i.e. keeps its value between successive calls to the enclosing subroutine.
restriction: These Expression in the first son are actually restricted to be simple references to variables (i.e. VariableRef), but may also include iterativeVariableRef operators.
Type specifications
PrimitiveType ::= void boolean integer float complex character
TypeSpec ::= PrimitiveType ident ParentClass modifiedType recordType unionType
enumType arrayType pointerType referenceType functionType
void ->
boolean ->
integer ->
float ->
complex ->
character ->
modifiedType -> Modifier* TypeSpec
recordType -> ident? Modifier* Declaration*
unionType -> ident? VarDeclaration*
enumType -> ident? Declarator*
arrayType -> TypeSpec dimColon*
pointerType -> TypeSpec
referenceType -> TypeSpec
functionType -> TypeSpec argDeclaration*
restriction: typeName of atomic value in char, integer, float, boolean, byte will (or may?) be recognized as primitive types. typeName’s with other values will be interpreted as previously defined user types.
modifiedType -> modifier* TypeSpec
restriction: modifier* for modifiedType is a list of atomic modifier’s with value signed, unsigned, short, long, or double. The value may also be an integer constant that represents the Fortran-style type modifiers, such as the 16 in REAL*16.
restriction: modifier* for modifiedType is a list of atomic modifier’s with value const, volatile, register, auto, target, pointer, in, out, inout, optional, save, extern, private, or sequence.
note: there is no static any more. The C static will be translated to modifier save for a local variable of a procedure, and to modifier private otherwise.
question: Is the const modifier redundant with the constDeclaration operator? In that case, can we get rid of this constDeclaration operator? Same question with attribute save and operator save. Same for pointer, target.
arrayType -> TypeSpec dimColon*
dimColon -> Expression? Expression?
This declares an array type, of base type given by first son TypeSpec, and a list of dimensions given by second son dimColon*. Each dimColon is a pair of beginning index and ending index. Both may be null. When the starting index is null, it means the Fortran default “1”. When the ending index is null, it means that the last index is not specified here, but is specified somewhere else. In particular, this will be used to represent the special Fortran star dimension (*), that specifies a dimension that is declared somewhere else, and that is not redeclared at the current point.
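The defaulting rules for dimColon bounds can be sketched as follows. The sketch assumes, for simplicity, that bounds are already-evaluated integers (in IL they are Expression trees) and uses Python's None for a missing bound; the function name extent is hypothetical.

```python
def extent(dim_colon):
    """Return the number of elements along one dimension, or None if unknown.

    `dim_colon` is a (lower, upper) pair; None stands for a missing bound.
    """
    lower, upper = dim_colon
    if upper is None:      # last index specified elsewhere, e.g. Fortran "*"
        return None
    if lower is None:      # missing starting index: Fortran default 1
        lower = 1
    return upper - lower + 1

# Dimensions of a Fortran array declared as A(10, 0:4, *)
dims = [(None, 10), (0, 4), (None, None)]
print([extent(d) for d in dims])
```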
recordType -> ident? varDeclaration*
unionType -> ident? varDeclaration*
These are the classical record and union types, as defined in C or in Fortran90. The first son is a temporary name for the type, used inside the second son to obtain recursive definitions. The second son is a list of typed field names, looking just like ordinary varDeclaration’s.
enumType -> ident? Declarator*
restriction: Declarator* is a list of only name or initDeclarator for name = constant.
enumType is the C-like enumerated type, i.e. a list of possible values in Declarator*, that are atomic names. There may be a constant attached to force the internal integer representation of this value. The initial optional ident? is an identifier that may exist in C. But this is not the name of the enumerated type. The enumerated type is named by enclosing it into a typeDeclaration node.
pointerType -> TypeSpec
This is the type: pointer to TypeSpec.
functionType -> TypeSpec ArgDeclaration*
ArgDeclaration ::= varDeclaration TypeSpec
This is the function type. First son is the type of the returned result, and second son is the list of the arguments’ types. The ArgDeclaration phylum represents the operators that can be found in the declarations of the arguments of a function, both here in the functionType type specification operator, and later in the functionDeclarator declarator operator. An ArgDeclaration can be a normal varDeclaration or, as happens in C, it can be a simple TypeSpec that represents the type of the argument, without specifying its name.
void ->
This is the void type, used as the return type of subroutines that return nothing, i.e. procedures and not functions.
Declarators
Declarator ::= SimpleDeclarator initDeclarator
SimpleDeclarator ::=
VarDeclarator pointerDeclarator bitfieldDeclarator
functionDeclarator
VarDeclarator ::= arrayDeclarator ident
initDeclarator -> SimpleDeclarator Expression
pointerDeclarator -> SimpleDeclarator
bitfieldDeclarator -> ident Expression
functionDeclarator -> SimpleDeclarator ArgDeclaration*
arrayDeclarator -> SimpleDeclarator dimColon*
A Declarator is an operator that is used, in conjunction with a type specification (TypeSpec), to declare an object (variable, record field, etc...) that has this type. It can actually be an initDeclarator, in which case the declarator is immediately followed by an expression that gives an initial value for this object. Otherwise it is a SimpleDeclarator, which can be a pointerDeclarator, meaning that the SimpleDeclarator under it has the type “pointer to” the type (cf. C). It can be a bitfieldDeclarator (which comes from C) that declares a field in a structure that has a length in bits specified by the second son Expression. It can be a functionDeclarator, that contains the declarator for the “name” of the function, and the ArgDeclaration* for the arguments. Similarly, it can be an arrayDeclarator, starting with the declarator for the “name” of the array, followed by the dimColon* declarations of the dimensions.
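To make the interplay between a TypeSpec and a Declarator concrete, here is a Python sketch that computes the name, full type, and initial value declared by one Declarator. It assumes C-like declarator semantics and a hypothetical encoding of trees as nested tuples (operator, son1, son2, ...) with identifiers as plain strings; bitfieldDeclarator is left out.

```python
def declared(declarator, type_spec):
    """Return (name, type, init) for one Declarator applied to a TypeSpec."""
    if isinstance(declarator, str):                  # ident: the declared name
        return declarator, type_spec, None
    op = declarator[0]
    if op == "initDeclarator":                       # SimpleDeclarator Expression
        name, typ, _ = declared(declarator[1], type_spec)
        return name, typ, declarator[2]
    if op == "pointerDeclarator":                    # wrap the type, then recurse
        return declared(declarator[1], ("pointerType", type_spec))
    if op == "arrayDeclarator":
        return declared(declarator[1],
                        ("arrayType", type_spec, declarator[2]))
    if op == "functionDeclarator":
        return declared(declarator[1],
                        ("functionType", type_spec, declarator[2]))
    raise ValueError(f"declarator not handled in this sketch: {op}")

# C: int *p = 0;
print(declared(("initDeclarator", ("pointerDeclarator", "p"), ("intCst", 0)),
               "integer"))
# C: int *a[3];  (an array of 3 pointers to int)
print(declared(("pointerDeclarator", ("arrayDeclarator", "a", [(None, 3)])),
               "integer"))
```

Each declarator operator wraps the type built so far and passes it down to the inner declarator, so the innermost ident ends up with the complete type.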
Structured control statements
Statement ::=
blockStatement labelStatement if switch loop break
continue return throw synchro try exit cycle stop goto
gotoLabelVar assignLabelVar compGoto data Expression
Among the above constructs, we consider blockStatement, if, switch, loop, try, synchro, as “structured” control statements. They contrast with “unstructured” control statements, described in the next section, and “simple” statements, that just do something but do not affect the control flow. We shall identify these simple statements with expressions, that will be described later. This comes from the C style, where simple statements, such as assignments and calls, may be used as expressions, and on the other hand, any expression can be used as a statement, in which case its return value will just be discarded.
blockStatement -> DeclarationOrStatement*
DeclarationOrStatement ::= Declaration Statement
A statement may be a blockStatement, in which case there may be additional declarations (Declaration), that override previous declarations, for the duration of the blockStatement. To cope with the style of some imperative languages, such as Java, we allow declarations and statements to be interleaved in any manner (provided declarations come before uses?). Therefore we introduce this new phylum DeclarationOrStatement. Notice however that it may be good practice not to interleave declarations and statements. For example in Java, there may be conflicts between declarations in different cases of a switch statement. What semantics for that?
if -> Expression Statement? Statement?
This is the standard “if-then-else” construct. In Fortran, it may have an additional label, but it seems this label has no semantic value, and therefore we don’t keep it.
try -> blockStatement catch* blockStatement?
catch -> varDeclaration blockStatement
throw -> Expression
This is the exception mechanism. Exceptions are raised with the throw instruction. A blockStatement catches exceptions by being enclosed into a try statement. When an exception is raised inside the blockStatement of a try, it is matched against the varDeclaration of each of the following catch in catch*. If one matches, the corresponding blockStatement is executed. If none matches, then the final blockStatement?, if present, is executed; otherwise it is none and the exception is propagated further.
switch -> Expression switchCase*
switchCase -> Expression* DeclarationOrStatement*
break ->
This is the classical switch-case construct of C or Java. All cases (switchCase in switchCase*) are examined in sequence. Each switchCase represents a case, that is chosen when the value of the top Expression matches one of the expressions in list Expression*. As a special behavior, the case is chosen when its list of expressions is empty. This is the way the default operator must be translated into IL. Inside the DeclarationOrStatement* for each case, there may be break statements, that just exit from the whole switch statement.
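The case selection rule can be sketched in Python as below. The sketch follows the wording of this section literally (cases examined in sequence, an empty expression list always chosen) and assumes case expressions are already evaluated; the function name select_case and the (labels, body) encoding are hypothetical.

```python
def select_case(value, switch_cases):
    """Return the index of the first switchCase chosen for `value`, or None.

    Each case is a (labels, body) pair; `labels` holds the evaluated case
    expressions, and an empty list plays the role of `default`.
    """
    for index, (labels, _body) in enumerate(switch_cases):
        if not labels or value in labels:
            return index
    return None

cases = [([1, 2], "first body"), ([3], "second body"), ([], "default body")]
print(select_case(3, cases), select_case(9, cases))
```

Under this sequential reading, a case with an empty list shadows every case that follows it, so the sketch only matches the C behavior when the default case is placed last.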
synchro -> Expression blockStatement
This is the threads synchronization, as found in Java. The blockStatement is suspended until the objects returned by Expression can be locked. The lock is released at the end of the blockStatement critical section.
question: This specification of IL is currently incomplete with respect to tasking and threads parallelism. Therefore this synchro operator is here just to remind us that these mechanisms should be taken into account as soon as possible.
loop -> ident? label? Control Statement?
Control ::= do while until times forall for none
This loop construct represents all iterative control structures. The optional ident? and label? may be used for referencing this loop in exit’s and cycle’s. Then the Control son specifies the kind of iterative control used in this loop, and finally the Statement? (very rarely absent) defines what to do during each iteration.
do -> ident Expression Expression Expression?
This is the header of the standard Fortran DO loop.
forall -> ident Expression Expression Expression? Expression?
This is, for example, the header of the parallel Fortran90 forall.
for -> varDeclaration? Expression? Expression?
restriction: the first optional varDeclaration? may be a variable declaration with an initialization, or a simple expression such as a simple assignment.
This is the header of the C for loop, that is more powerful than the Fortran DO loop.
while -> Expression
until -> Expression
times -> Expression
These control headers repeat the loop while or until some Expression evaluates to true, or a certain number of times. The condition is evaluated before each iteration for a while, and after each iteration for an until.
cycle -> ident?
exit -> ident?
These instructions jump to special places in the enclosing loop, or in the closest enclosing loop identified by ident? when present. cycle goes to the next iteration, exit just terminates the loop.
return -> Expression?
This returns from the enclosing function, and returns the value of the optional Expression? to the calling place.
Unstructured control statements
labelStatement -> label Statement?
goto -> label
gotoLabelVar -> ident label*
assignLabelVar -> ident label
compGoto -> label* Expression
stop ->
continue ->
question: Maybe we should remove the continue statement and instead accept none as a Statement. Otherwise, we must accept a Statement? as second son of labelStatement. This would treat correctly the 100 CONTINUE instructions and 100 ENDDO as well.
Expressions or simple statements
Expression ::=
unary UnaryExpression binary BinaryExpression
VariableRef address arrayTriplet AssignExpression
ifExpression arrayInitializers stringCst intCst
realCst substringAccess iterativeVariableRef
UnaryExpression ::= minus not decr incr
BinaryExpression ::=
add sub mul div eq neq ge gt le lt and or xor power ior
leftShift rightShift mod
AssignExpression ::=
andAssign assign divAssign iorAssign minusAssign
modAssign plusAssign timesAssign xorAssign
leftShiftAssign rightShiftAssign
VariableRef ::=
ident arrayAccess fieldAccess pointerAccess call
The phylum Expression contains all possible operators that build an expression. Some of these operators are grouped into the sub-phyla UnaryExpression, BinaryExpression, AssignExpression, and VariableRef. We have chosen to create a separate operator for each unary, binary, and assignment operation. They are respectively in the phyla UnaryExpression, BinaryExpression, and AssignExpression. However, in the very rare case where a new unary or binary operator is needed, it can be defined in terms of the catch-all operators unary and binary. We think these operators should not be used, or only temporarily, until the new operation is assigned a new operator in Expression. The operators in phylum VariableRef represent the objects that have an allocated memory, i.e. that can be assigned a value, or return a pointer to them, or can be passed as an “out” argument of a procedure.
question: this distinction between Expression and VariableRef must be verified.
Lastly, we added the iterativeVariableRef operator into Expression, although we know that this corresponds rather to an enumeration of expressions. This is because some operators, such as I-O, or the operators save and data, may take a list of expressions in which some items are actually an iterative enumeration of expressions. This is a compromise that seems to work. The alternative would be to modify the expression* operator inside I-O statements.
iterativeVariableRef -> Expression* do
An iterativeVariableRef is equivalent to the enumeration of a list of expressions, built by concatenating copies of the Expression* in first place, one copy for each value of the index in the do in second place.
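The enumeration described above can be sketched in Python. To keep the example short, expressions are represented as functions of the index value rather than as Expression trees, and the do bounds are constant integers; these simplifications and the name expand_iterative are assumptions of the sketch.

```python
def expand_iterative(expressions, do):
    """Enumerate the expressions of an iterativeVariableRef.

    `expressions` stands for the first son Expression*; `do` mirrors the
    do operator as (index_name, first, last, stride), with None for a
    missing stride.
    """
    _name, first, last, stride = do
    stride = 1 if stride is None else stride
    if stride == 0:
        raise ValueError("stride must not be zero")
    result = []
    value = first
    while (value <= last) if stride > 0 else (value >= last):
        # one copy of the whole expression list per index value
        result.extend(expr(value) for expr in expressions)
        value += stride
    return result

# Fortran implied-do: (A(I), B(I), I = 1, 3)
refs = expand_iterative([lambda i: f"A({i})", lambda i: f"B({i})"],
                        ("I", 1, 3, None))
print(refs)
```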
assign -> VariableRef Expression
plusAssign -> VariableRef Expression
minusAssign -> VariableRef Expression
timesAssign -> VariableRef Expression
divAssign -> VariableRef Expression
modAssign -> VariableRef Expression
leftShiftAssign -> VariableRef Expression
rightShiftAssign -> VariableRef Expression
xorAssign -> VariableRef Expression
iorAssign -> VariableRef Expression
andAssign -> VariableRef Expression
These are all the known assignment operators. We hope not to have forgotten any! Respectively, these correspond to the classical assignment, followed by the C assignments: +=, -=, *=, /=, %=, <<=, >>=, ^=, |=, &=
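A consumer of IL that only wants to deal with plain assign may rewrite the compound assignments, as in the Python sketch below. The table pairs each compound operator with a binary operator by name only, which is an assumption (for instance that and is the operator matching &=), and trees are encoded as hypothetical nested tuples.

```python
# Compound assignment operator -> the binary operator it abbreviates
# (pairing by name; an assumption of this sketch).
COMPOUND = {
    "plusAssign": "add", "minusAssign": "sub", "timesAssign": "mul",
    "divAssign": "div", "modAssign": "mod",
    "leftShiftAssign": "leftShift", "rightShiftAssign": "rightShift",
    "xorAssign": "xor", "iorAssign": "ior", "andAssign": "and",
}

def desugar(tree):
    """Rewrite (xxxAssign, lhs, rhs) into (assign, lhs, (binop, lhs, rhs))."""
    op, lhs, rhs = tree
    if op in COMPOUND:
        return ("assign", lhs, (COMPOUND[op], lhs, rhs))
    return tree

# C: x += 1
print(desugar(("plusAssign", "x", ("intCst", 1))))
```

The rewrite duplicates the left-hand side, so it preserves the meaning only when evaluating that VariableRef has no side effects; in C the left operand of a compound assignment is evaluated once.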
call -> VariableRef? VariableRef Expression*
A call to a procedure, function, subroutine, or method.
VariableRef?: (only for method calls) the object or class on which the method is called. More generally, this indicates where the called method must be found, e.g. it can be a scopeAccess to restrict searching to a namespace or to a parent class.
VariableRef: the name of the called procedure (i.e. subroutine, function, method…). In more complex cases this child can be an expression that evaluates to this called procedure, e.g. deref of a pointer to a function, or access into an array of functions.
Expression*: the actual arguments of this call.
scopeAccess -> scopeAccess ident?
A scope restriction in front of an identifier.
scopeAccess:
ident?:
add -> Expression Expression
sub -> Expression Expression
mul -> Expression Expression
div -> Expression Expression
power -> Expression Expression
mod -> Expression Expression
eq -> Expression Expression
neq -> Expression Expression
ge -> Expression Expression
gt -> Expression Expression
le -> Expression Expression
lt -> Expression Expression
and -> Expression Expression
or -> Expression Expression
xor -> Expression Expression
ior -> Expression Expression
leftShift -> Expression Expression
rightShift -> Expression Expression
binary -> Expression operator Expression
minus -> Expression
not -> Expression
decr -> Expression
incr -> Expression
unary -> operator Expression
These are the classical unary and binary operations. We hope not to have forgotten any! These binary and unary operators are catch-all operators for operations that were forgotten in the above UnaryExpression and BinaryExpression phyla. These binary and unary operators should not be used, or very temporarily.
ifExpression -> Expression Expression Expression
This is the conditional expression, that exists for example in C.
arrayConstructor -> Expression * ...
An expression that constructs an array:
Expression: each of the values to be put into the constructed array.
Dynamic memory
allocate -> Expression? Expressions TypeSpec? KeywordArg Expression
A call to dynamic memory allocation.
Expression?
: Total size allocated, in bytes.
Expressions
: Array dimensions.
TypeSpec?
: Type of the allocated elements.
KeywordArg
: (?) in Fortran, optional returned status “STAT=”.
Expression
: (?) in C++, initialization information.
fieldAccess -> VariableRef VariableRef
Access to a field of a structured object. In particular, this covers access to a member of a class instance or to a static member of a class.
VariableRef
: the structured object or the class.
VariableRef
: the name of the field or member.
address -> VariableRef
arrayAccess -> VariableRef Expression*
arrayInitializers -> Expression*
arrayTriplet -> Expression? Expression? Expression?
The operator arrayTriplet represents the classical general array section, from some index to some index, with a given stride. The initial index may be none, meaning 1; the terminal index may be none, meaning the last index; and the stride may be none, meaning 1. In particular, the arrayTriplet with none as third child represents (subsumes) the old operator arraySection.
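The defaults described above can be illustrated with a short Python sketch that expands a triplet into the list of indices it selects. This is an illustration only, not Tapenade code: the function name triplet_indices and its extent parameter are hypothetical, Python's None stands in for the IL none operator, and the dimension is assumed to be indexed from 1 to extent.

```python
from typing import List, Optional


def triplet_indices(first: Optional[int], last: Optional[int],
                    stride: Optional[int], extent: int) -> List[int]:
    """Expand an arrayTriplet over a dimension indexed 1..extent.

    None mirrors the IL `none` child and triggers the default value.
    """
    lo = 1 if first is None else first        # none initial index means 1
    hi = extent if last is None else last     # none terminal index means last
    step = 1 if stride is None else stride    # none stride means 1
    if step == 0:
        raise ValueError("stride must be non-zero")
    # range() excludes its stop value, so step one past the terminal index.
    stop = hi + 1 if step > 0 else hi - 1
    return list(range(lo, stop, step))


print(triplet_indices(None, None, None, 5))   # whole dimension, like A(:)
print(triplet_indices(2, 10, 3, 10))          # like A(2:10:3)
```

With all three children set to none, the triplet selects the whole dimension, which is the case formerly covered by arraySection with default bounds.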
pointerAccess -> Expression
substringAccess -> Expression Expression? Expression?
Atomic nodes¶
ident -> implemented as STRING
label ::= ident
typeName ::= ident
operator ::= ident
modifier ::= ident
stringCst -> implemented as STRING
letter -> implemented as STRING
intCst -> implemented as INTEGER
realCst -> implemented as REAL
Summary of the complete syntax¶
- Phyla:
Unit ::=
program module blockStatement Declaration
Declaration ::=
varDeclaration constDeclaration varDimDeclaration
typeDeclaration common equivalence implicit function
class external intrinsic data save
ArgDeclaration ::=
varDeclaration TypeSpec
Declarator ::=
SimpleDeclarator initDeclarator
SimpleDeclarator ::=
VarDeclarator pointerDeclarator bitfieldDeclarator
functionDeclarator
VarDeclarator ::=
arrayDeclarator ident
DeclarationOrStatement ::=
Declaration Statement
Statement ::=
blockStatement labelStatement if switch loop break
continue return throw synchro try exit cycle stop goto
gotoLabelVar assignLabelVar compGoto data Expression
Control ::=
do while until times forall for none
Expression ::=
unary UnaryExpression binary BinaryExpression
VariableRef address arrayTriplet AssignExpression
ifExpression arrayInitializers stringCst intCst
realCst substringAccess iterativeVariableRef
UnaryExpression ::=
minus not decr incr
BinaryExpression ::=
add sub mul div eq neq ge gt le lt and or xor power ior
leftShift rightShift mod
AssignExpression ::=
andAssign assign divAssign iorAssign minusAssign
modAssign plusAssign timesAssign xorAssign
leftShiftAssign rightShiftAssign
VariableRef ::=
ident arrayAccess fieldAccess pointerAccess call
Letters ::=
letterRange letter
- Declarations:
file -> Unit*
module -> ident? Declaration*
implicit -> typedLetters*
typedLetters -> TypeSpec Letters*
typeDeclaration -> typeName TypeSpec
constDeclaration -> TypeSpec? Declarator*
varDeclaration -> TypeSpec? Declarator*
varDimDeclaration -> SimpleDeclarator*
common -> ident? VarDeclarator*
equivalence -> Expression**
intrinsic -> ident*
external -> ident*
data -> Expression*
save -> Expression*
- Type specifications:
modifiedTypeName ->
modifiedTypeSpec ->
arrayType -> TypeSpec dimColon*
dimColon -> Expression? Expression?
recordType -> ident? varDeclaration*
unionType -> ident? varDeclaration*
enumType -> ident? Declarator*
pointerType -> TypeSpec
functionType -> TypeSpec ArgDeclaration*
void ->
- Declarators:
initDeclarator -> SimpleDeclarator Expression
pointerDeclarator -> SimpleDeclarator
bitfieldDeclarator -> ident Expression
functionDeclarator -> SimpleDeclarator ArgDeclaration*
arrayDeclarator -> SimpleDeclarator dimColon*
- Control statements:
blockStatement -> DeclarationOrStatement*
if -> Expression Statement? Statement?
return -> Expression?
try -> blockStatement catch* blockStatement?
catch -> varDeclaration blockStatement
throw -> Expression
switch -> Expression switchCase*
switchCase -> Expression* DeclarationOrStatement*
break ->
synchro -> Expression blockStatement
loop -> ident? label? Control Statement?
do -> ident Expression Expression Expression?
forall -> ident Expression Expression Expression? Expression?
for -> varDeclaration? Expression? Expression?
while -> Expression
until -> Expression
times -> Expression
cycle -> ident?
exit -> ident?
- Expressions:
iterativeVariableRef -> Expression* do
assign -> VariableRef Expression
plusAssign -> VariableRef Expression
minusAssign -> VariableRef Expression
timesAssign -> VariableRef Expression
divAssign -> VariableRef Expression
modAssign -> VariableRef Expression
leftShiftAssign -> VariableRef Expression
rightShiftAssign -> VariableRef Expression
xorAssign -> VariableRef Expression
iorAssign -> VariableRef Expression
andAssign -> VariableRef Expression
add -> Expression Expression
sub -> Expression Expression
mul -> Expression Expression
div -> Expression Expression
power -> Expression Expression
mod -> Expression Expression
eq -> Expression Expression
neq -> Expression Expression
ge -> Expression Expression
gt -> Expression Expression
le -> Expression Expression
lt -> Expression Expression
and -> Expression Expression
or -> Expression Expression
xor -> Expression Expression
ior -> Expression Expression
leftShift -> Expression Expression
rightShift -> Expression Expression
binary -> Expression operator Expression
minus -> Expression
not -> Expression
decr -> Expression
incr -> Expression
unary -> operator Expression
ifExpression -> Expression Expression Expression
address -> VariableRef
arrayAccess -> VariableRef Expression*
arrayInitializers -> Expression*
arrayTriplet -> Expression? Expression? Expression?
pointerAccess -> Expression
substringAccess -> Expression Expression? Expression?
letterRange -> letter letter
- Leaves:
ident -> implemented as STRING
label ::= ident
typeName ::= ident
operator ::= ident
modifier ::= ident
stringCst -> implemented as STRING
letter -> implemented as STRING
intCst -> implemented as INTEGER
realCst -> implemented as REAL
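Since phyla may contain other phyla (Expression contains VariableRef, which in turn contains call), checking whether an operator is allowed at a given child position requires expanding them recursively. The following Python sketch shows one possible way to do that for a subset of the phyla listed above. The PHYLA table, operators_of, and belongs_to are hypothetical names introduced for this illustration and do not describe how Tapenade itself stores the grammar.

```python
from typing import Dict, List, Set

# Subset of the phyla from the summary, copied as written there.
PHYLA: Dict[str, List[str]] = {
    "Expression": [
        "unary", "UnaryExpression", "binary", "BinaryExpression",
        "VariableRef", "address", "arrayTriplet", "AssignExpression",
        "ifExpression", "arrayInitializers", "stringCst", "intCst",
        "realCst", "substringAccess", "iterativeVariableRef",
    ],
    "UnaryExpression": ["minus", "not", "decr", "incr"],
    "BinaryExpression": [
        "add", "sub", "mul", "div", "eq", "neq", "ge", "gt", "le", "lt",
        "and", "or", "xor", "power", "ior", "leftShift", "rightShift", "mod",
    ],
    "AssignExpression": [
        "andAssign", "assign", "divAssign", "iorAssign", "minusAssign",
        "modAssign", "plusAssign", "timesAssign", "xorAssign",
        "leftShiftAssign", "rightShiftAssign",
    ],
    "VariableRef": ["ident", "arrayAccess", "fieldAccess",
                    "pointerAccess", "call"],
}


def operators_of(phylum: str) -> Set[str]:
    """Return every operator of a phylum, expanding nested phyla."""
    result: Set[str] = set()
    for member in PHYLA[phylum]:
        if member in PHYLA:                 # member is itself a phylum
            result |= operators_of(member)
        else:                               # member is a plain operator
            result.add(member)
    return result


def belongs_to(op: str, phylum: str) -> bool:
    """True if operator `op` may appear where `phylum` is expected."""
    return op in operators_of(phylum)


print(belongs_to("call", "Expression"))     # call is reached via VariableRef
print(belongs_to("do", "Expression"))       # do is a Control, not listed here
```

The recursion terminates because this subset of the phyla has no cycles; a table that did contain cycles would need a visited set.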