Creation of the differentiated IR - ****
=================================================================================
* Files: [Javadoc](../../../../../../../build/fr/inria/tapenade/differentiation/package-index.html)
`BlockDifferentiator
CallGraphDifferentiator
DiffAssignmentNode
DiffConstants
DifferentiationEnv
DynMemoryDifferentiator
ExpressionDifferentiator
FlowGraphDifferentiator
NewBlockGraph
NewBlockGraphArrow
NewBlockGraphNode
ProcedureCallDifferentiator
SplitForSave
UnitCreationContext
UnitDiffInfo
VarRefDifferentiator`
```
CallGraphDifferentiator.run()
--> CallGraphDifferentiator.differentiateDownFrom()
--> FlowGraphDifferentiator.differentiateProcedure()
--> FlowGraphDifferentiator.differentiateProcedureInMode()
--> FlowGraphDifferentiator.precomputeDifferentiatedBlocks()
--> BlockDifferentiator.differentiateBlock()
--> BlockDifferentiator.differentiateInstructionDeclaration()
--> BlockDifferentiator.buildDiffInstructionsOfAssignAllocate()
--> BlockDifferentiator.buildDiffInstructionsOfDeallocate()
--> ProcedureCallDifferentiator.buildDiffInstructionsOfMPICall()
--> ProcedureCallDifferentiator.buildDiffInstructions()
--> BlockDifferentiator.buildDiffInstructionsOfPlainAssignment()
--> BlockDifferentiator.buildDiffInstructionsOfPlainDeclaration()
--> BlockDifferentiator.buildDiffInstructionsOfIncrement()
--> BlockDifferentiator.buildDiffInstructionsOfReturn()
--> BlockDifferentiator.buildDiffInstructionsOfIORead()
--> BlockDifferentiator.buildDiffInstructionsOfOther()
--> FlowGraphDifferentiator.differentiateFlowGraphLevel()
--> FlowGraphDifferentiator.differentiateEntryBlockLevel()
--> FlowGraphDifferentiator.differentiatePlainBlockLevel()
--> FlowGraphDifferentiator.differentiateCkpPiece()
--> FlowGraphDifferentiator.differentiateNoDiffPiece()
--> FlowGraphDifferentiator.differentiateBinomialLoop()
--> FlowGraphDifferentiator.differentiateIILoop()
--> FlowGraphDifferentiator.differentiateNaturalLoop()
--> FlowGraphDifferentiator.differentiateFixedPointLoop()
--> FlowGraphDifferentiator.differentiateStructuredLoop()
```
At the level of the complete Call-Graph
---------------------------------------
* Files `CallGraphDifferentiator`
The top level of Differentiation `CallGraphDifferentiator.run()`
operates at the Call-Graph level. It first creates a new
`DifferentiationEnv`, that gathers a `CallGraphDifferentiator`, a
`FlowGraphDifferentiator`, a `BlockDifferentiator`, a
`VarRefDifferentiator`, i.e. the tools to differentiate the nested
levels of constructs in a program. The `DifferentiationEnv` also keeps
correspondences between original and differentiated objects. After
creation of the `DifferentiationEnv`, `differentiateDownFrom()` is
called, that does the main part of the work.
`differentiateDownFrom()` Consists of successive \"`STEP`'s\", each one
running on the complete Call-Graph:
- one preliminary step runs for each `Unit`, and inside for each
`ActivityPattern` existing for this `Unit`: the activity is computed
for the elements of `COMMON`'s in `Unit`. Then the activity may be
augmented to produce a differentiated code that is maybe marginally
less efficient but less error-prone for the end-user, by forcing
more initialization of derivatives.
- the differentiated Call-Graph `diffCallGraph` is created and
initialized
- then another preliminary step runs for each `Unit`, and inside for
each `ActivityPattern` existing for this `Unit`: the activity of the
`Unit`'s formal arguments is computed and stored into the
`UnitDiffInfo` of this `Unit`.
- then another preliminary step (`takeCopyUnitDecisions`) decides, for
each (source, aka \"primal\") `Unit`, whether the differentiated
code must contain a copy of this `Unit`. When there is a risk of
ambiguity, the name of the copied `Unit` is often suffixed with
`_nodiff`.
- then another preliminary step runs for each `Unit`, and inside for
each `ActivityPattern` existing for this `Unit`: In principally
prepares info about pass-by-value arguents.
- then the `diffCallGraph` is populated with all its `Unit`'s,
correctly contained in one another when needed, but apart from that
empty: no flow graph, no `SymbolTable`'s. *Packages* (Translation
units (files), `MODULE`'s) are created first. Then procedures are
created: the *copied* procedure first if needed, then as many
differentiated procedures are there are `ActivityPattern`'s. Diff
procedures are also created for `INTRINSIC`'s that need a
derivative.
- then the `CallArrow`'s are placed between the newly created
`Unit`'s. Note: the `CallArrow`'s between differentiated `Unit`'s
will be placed later.
- then the differentiated `SymbolTable`'s are built. Root-most
`SymbolTable`'s are built first, starting with the root
`SymbolTable` of `diffCallGraph`. After this step the differentiated
`SymbolTable`'s are essentially copies. The differentiated symbols
will be added later.
- then the type (`FunctionTypeSpec`) of differentiated Units is built,
as well as the differentiated interface declarations.
- then the differentiated `Unit`'s are each filled with their
differentiated Flow-Graphs. This can be seen as the main part, and
it calls the deeper level of differentiation:
`FlowGraphDifferentiator`.
- then a few adminstrative work is done about the parts of the
differentiated `CallGraph` that are actually copies of parts of the
source `CallGraph`: these copies may have received a different name
(e.g. `XXX_NODIFF` instead of `XXX`). The copies may also refer to
copied elements that now have a different name. This step updates
these copied names everywhere, taking care of interface names,
module names, allocation commands .... This step is a little messy
as bits have been added progressively.
- finally, all new names (for new, differentiated objects) are fixed
(e.g. going from tentative `#V_D#` to final `V_D`), taking care of
conflicts with existing names. For instance `#V_D#` can be turned
into final `V0_D` if there is already a `V_D` that shares its scope.
[Notes on copies of source code inside the generated code](README-copysource.md)
At the level of one particular `Unit`'s Flow-Graph
--------------------------------------------------
* Files `FlowGraphDifferentiator`
Each time the `CallGraphDifferentiator` must differentiate the
Flow-Graph contents of a `Unit`, this is subcontracted to the
`FlowGraphDifferentiator`
Hint: to gain insight on how the `FlowGraphDifferentiator` works and how
differentiated `Block`'s are created and connected together, one may run
Tapenade with the command-line option
`-traceDifferentiation `, that prints a long log of this
complex operation.
The main entry point is
`FlowGraphDifferentiator.differentiateProcedure`, that fills each
differentiated version of the current `Unit` with its own
(differentiated) Flow-Graph. The source Flow-Graph is used to build a
tree of `FlowGraphLevel`'s, which are nested control structures, mostly
about nested loops (remember that Tapenade's Flow-Graphs are flat
graphs). Each `FlowGraphLevel` receives a kind based on its structure
and on the differentiation directives present in the source.
`FlowGraphDifferentiator` has a separate differentiation method for each
kind. Kinds can be:
- `PLAIN_BLOCK`: essentially for an individual `Block`
- `ENTRY_BLOCK`: for the `Unit`'s `EntryBlock`
- `DO_LOOP`: for a clean `DO` loop or a C clean `for` loop
- `WHILE_LOOP`: for a while loop.
- `TIMES_LOOP`: for a loop running a fixed number of times.
- `NATURAL_LOOP`: for a loop not clean enough to fit in any of the
above kinds (extra entries or exits).
- `II_LOOP`: for a clean loop that the user guarantees with
Independent Iterations and thus deserves a special adjoint
differentiation.
- `BINOMIAL_LOOP`: for a clean loop that the user guarantees is
time-stepping and thus deserves a special adjoint differentiation.
- `FIXEDPOINT_LOOP`: for a clean while loop that the user guarantees
implements a fixed-point iteration and thus deserves a special
adjoint differentiation.
- `CHECKPOINT`: for a (connected) subset of a set of \"sibling\"
structures, that we group because they will be treated in a special
way (*checkpointing* in adjoint differentiation).
- `NODIFF`: for a (connected) subset of a set of \"sibling\"
structures, that the user requests they are *not* differentiated.
- `PLAIN_GROUP`: for a (connected) subset of a set of \"sibling\"
structures, that are grouped for any other reason.
- `TOP_LEVEL`: for the top node of the tree of nested structures,
therefore representing the complete Flow-Graph of the current
`Unit`.
Each node of the tree of control structures is differentiated in turn,
recursively, bottom up. The recursive entry point in this algorithm is
`FlowGraphDifferentiator.differentiateFlowGraphLevel`.
Missing: details on fwd/bwd entry/exit diffs of one given FlowGraphLevel
At the level of one particular Basic Block
------------------------------------------
* Files `BlockDifferentiator`
Each time the `FlowGraphDifferentiator` must fill the derivative
`Block`'s of a given source `Block`, this is subcontracted to the
`BlockDifferentiator`.
Hint: to gain insight on how the `BlockDifferentiator` works and where
(in the Java source) are created the individual differentiated
instructions, one may run Tapenade with the command-line option
`-traceDifferentiation `. Upon creation, each new instruction
is printed with a short explanation text that one may then search for in
`BlockDifferentiator.java`.
The main entry point is `BlockDifferentiator.differentiateBlock`, that
fills a given empty forward-diff `Block` and a given empty backward-diff
`Block` with the differentiation of the instructions of the given source
`Block`, in the correct order, and obeying the given
`differentiationMode` (one of
`TANGENT_MODE, ADJOINT_MODE, ADJOINT_SPLIT_MODE`). Internally, the
`Instruction`'s of the source `Block` are swept from first to last, and
for each the forward and backward differentiated `Instruction`'s are
built. The differentiated `Instruction`'s are accumulated into two
data-dependency graphs (one for forward, one for backward). When all
instructions of the source `Block` have been swept, each data-dependency
graph is first searched for desirable node fusion (preserving
data-dependencies), then it is sorted (topological sort), and the
resulting list of differentiated `Instruction`'s is put into the
corresponding (forward- or backward-) diff `Block`.
The primitive that adds a new instruction into the forward
data-dependency graph is `addFuturePlainNodeFwd`.
The primitives that add a new instruction into the backward
data-dependency graph are
`addFuturePlainNodeBwd, addFutureZeroDiffNodeBwd, addFutureSetDiffNodeBwd, addFutureIncrDiffNodeBwd`,
plus a special `addFutureBwdSplitDecl`.
At the level of one particular procedure call
---------------------------------------------
* Files `ProcedureCallDifferentiator`
[Javadoc](../../../../../../../build/fr/inria/tapenade/differentiation/ProcedureCallDifferentiator.html)
At the level of one particular assignment
-----------------------------------------
* Files `ExpressionDifferentiator`
[Javadoc](../../../../../../../build/fr/inria/tapenade/differentiation/ExpressionDifferentiator.html)
In the most frequent situation, differentiating an `expression` happens in the context of an assignment `lhs=expression`. Method `tangentDifferentiateAssignedExpression` returns the tangent-differentiated statement of `lhs=expression`, whereas method `adjointDifferentiateAssignedExpression` returns the list of its adjoint-differentiated statements.
There are special situations where there is no `lhs`, for instance when `expression` is the return expression of a function. For these situations, which indeed occur only in tangent mode, method `tangentDifferentiatePlainExpression` returns the tangent-differentiated expression.
These three methods share most of their mechanism:
1. A "balanced" copy of `expression` is built. It is balanced in the sense that chains of associative operations are rearranged in the shape of a balanced binary tree. In the sequel, `expression` is indeed the balanced copy. A few sweeps through `expression` evaluate the execution costs and the number of occurences of each of its sub-expressions and of their derivatives in the future derivative code. These are used to take splitting decisions, i.e. decision to precompute a sub-expression or its derivative into a temporary variable. See method `findDiffOptimalCuts`. Final decision is stored at each node of the expression, in annotation `CutCosts`.
2. A "linearization tree" is built from `expression` (together with `lhs` if present), by a call to the `linearize` method. In the linearization tree, where each node is an object of (a sub-class of) `LinTree`. This linearization tree reflects the structure of the linearize expression. For example, the topmost node of the linearization tree is of type `LinAssign`. Available sub-classes of `LinTree` are:
* `LinLeaf` for a reference to a variable, i.e. a leaf of the linearization tree.
* `LinStd` for a standard arithmetic operation.
* `LinAssign` for the assignment
* `LinConvert` for a conversion operation
* `LinIf` fon an if-expression
* `LinCatenate` for a plain juxtaposition of values, e.g. an array constructor
* `LinReduce` for an array SUM reduction, possibly with DIM and MASK.
* `LinSpread` for an expansion of a scalar (or an array with one less dimension) into an array.
These sub-classes seem to cover all needs to-date. However, we provide a template to build new sub-classes of `LinTree` if needed [README-LinOther.md](README-LinOther.md).
Linearization observes splitting decisions: when the `CutCosts` annotation on one sub-tree of `expression` requires the derivative (happens only for the adjoint) to be split here, this branch of the current linearization tree is terminated with a `LinLeaf` standing for the temporary (derivative) variable, and a new linearization tree is built starting from a `LinAssign` of the temporary variable with the sub-expression. The linearization tree built is returnd by side effect into the second (`parent`) argument. Assignments of split primal sub-expressions into temporary variables are collected by side-effect into the `toPrecomputes` argument. Linearization trees of split derivative sub-expressions are collected by side-effect into the `toLinTrees` argument.
3. The linearization tree (each of them if splitting occured) is swept to produce the derivative statements. These sweeps are different in tangent and reverse mode.
* In tangent mode, method `buildTreeForward` is applied to the root of each linearization tree, root linearization tree last. The linearization tree is swept top-down, and the differentiated expression is built progressively on the way back up.
* In adjoint mode, method `buildTreesBackward` is applied to the root of each linearization tree, root linearization tree first. The linearization tree is swept top-down, bringing a copy of the differentiated lhs to each leaf, then the differentiated instruction corresponding to this leaf is built by a special sweep up from the leaf (method `buildTreeBackward`), and is accumulated in the result list of instructions.
4. Upon return from the method, argument `toPrecomputes` is filled with the precomputation instructions introduced by the splitting mechanism.
At the level of one particular dynamic memory command
-----------------------------------------------------
Missing
At the level of one particular declaration
------------------------------------------
Missing
At the level of one particular reference expression
---------------------------------------------------
* Files `VarRefDifferentiator`
[Javadoc](../../../../../../../build/fr/inria/tapenade/differentiation/VarRefDifferentiator.html)
Notes about reverse differentiation of Pass-By-Value variables
--------------------------------------------------------------
When variables are passed-by-value, a variable `v` used by
the called function `F` is indeed a new variable, holding a copy
of `v` and vanishing at the exit of `F`.
=> All passed-by-value formal parameter is always passive (because
not useful) and adjoint-dead (because vanishing) at the end of `F`.
EXCEPTION: because of the optimization described below, that identifies
`vb` and `vb0` when `v` is ONLY READ, we must consider that in this case
usefulness of `v` propagates backwards across call exit (cf C:lh14).
Therefore, for every formal parameter v of function `F`:
- if `v` is not active upon entry of `F`
then since it is also passive on exit of `F`, there will be no `vb`
parameter (there can be a `vb` inside `F_B`, but it will be local)
- else if `v` is ONLY READ during `F` and therefore
`vb` is ONLY incremented during `F_B`. In this easier
case, we can identify this `vb` in `F_B` with the outside `vb`.
- Each call to `F_B` must pass a reference `&vb`,
- the header of `F_B` expects a `*vb`,
- all occurences of `vb` inside `F_B` become `*vb`.
- else in the general case, we cannot identify the inside `vb`
with the outside `vb`:
- Each call to `F_B` must pass a reference `&vb`,
- the header of `F_B` expects a variable we call `*vb0`,
- all occurences of `vb` inside `F_B` remain accesses to
a local `vb` (not `*vb`) which holds the derivative
of the local copy of `v` inside `F_B`.
At the end of `F_B`, `vb` is transferred into `*vb0` by
`*vb0 += vb;`.