Creation of the differentiated IR - <Step 4>

BlockDifferentiator CallGraphDifferentiator DiffAssignmentNode DiffConstants DifferentiationEnv DynMemoryDifferentiator ExpressionDifferentiator FlowGraphDifferentiator NewBlockGraph NewBlockGraphArrow NewBlockGraphNode ProcedureCallDifferentiator SplitForSave UnitCreationContext UnitDiffInfo VarRefDifferentiator

       CallGraphDifferentiator.run()                 
       --> CallGraphDifferentiator.differentiateDownFrom()   
         --> FlowGraphDifferentiator.differentiateProcedure()       
           --> FlowGraphDifferentiator.differentiateProcedureInMode() 
             --> FlowGraphDifferentiator.precomputeDifferentiatedBlocks() 
               --> BlockDifferentiator.differentiateBlock()   
                 --> BlockDifferentiator.differentiateInstructionDeclaration() 
                 --> BlockDifferentiator.buildDiffInstructionsOfAssignAllocate() 
                 --> BlockDifferentiator.buildDiffInstructionsOfDeallocate() 
                 --> ProcedureCallDifferentiator.buildDiffInstructionsOfMPICall() 
                 --> ProcedureCallDifferentiator.buildDiffInstructions() 
                 --> BlockDifferentiator.buildDiffInstructionsOfPlainAssignment() 
                 --> BlockDifferentiator.buildDiffInstructionsOfPlainDeclaration() 
                 --> BlockDifferentiator.buildDiffInstructionsOfIncrement() 
                 --> BlockDifferentiator.buildDiffInstructionsOfReturn() 
                 --> BlockDifferentiator.buildDiffInstructionsOfIORead() 
                 --> BlockDifferentiator.buildDiffInstructionsOfOther() 
             --> FlowGraphDifferentiator.differentiateFlowGraphLevel() 
                --> FlowGraphDifferentiator.differentiateEntryBlockLevel() 
                --> FlowGraphDifferentiator.differentiatePlainBlockLevel() 
                --> FlowGraphDifferentiator.differentiateCkpPiece() 
                --> FlowGraphDifferentiator.differentiateNoDiffPiece() 
                --> FlowGraphDifferentiator.differentiateBinomialLoop() 
                --> FlowGraphDifferentiator.differentiateIILoop() 
                --> FlowGraphDifferentiator.differentiateNaturalLoop() 
                --> FlowGraphDifferentiator.differentiateFixedPointLoop() 
                --> FlowGraphDifferentiator.differentiateStructuredLoop() 

At the level of the complete Call-Graph

  • Files CallGraphDifferentiator

The top level of Differentiation CallGraphDifferentiator.run() operates at the Call-Graph level. It first creates a new DifferentiationEnv, that gathers a CallGraphDifferentiator, a FlowGraphDifferentiator, a BlockDifferentiator, a VarRefDifferentiator, i.e. the tools to differentiate the nested levels of constructs in a program. The DifferentiationEnv also keeps correspondences between original and differentiated objects. After creation of the DifferentiationEnv, differentiateDownFrom() is called, that does the main part of the work.

differentiateDownFrom() Consists of successive “STEP’s”, each one running on the complete Call-Graph:

  • one preliminary step runs for each Unit, and inside for each ActivityPattern existing for this Unit: the activity is computed for the elements of COMMON’s in Unit. Then the activity may be augmented to produce a differentiated code that is maybe marginally less efficient but less error-prone for the end-user, by forcing more initialization of derivatives.

  • the differentiated Call-Graph diffCallGraph is created and initialized

  • then another preliminary step runs for each Unit, and inside for each ActivityPattern existing for this Unit: the activity of the Unit’s formal arguments is computed and stored into the UnitDiffInfo of this Unit.

  • then another preliminary step (takeCopyUnitDecisions) decides, for each (source, aka “primal”) Unit, whether the differentiated code must contain a copy of this Unit. When there is a risk of ambiguity, the name of the copied Unit is often suffixed with _nodiff.

  • then another preliminary step runs for each Unit, and inside for each ActivityPattern existing for this Unit: In principally prepares info about pass-by-value arguents.

  • then the diffCallGraph is populated with all its Unit’s, correctly contained in one another when needed, but apart from that empty: no flow graph, no SymbolTable’s. Packages (Translation units (files), MODULE’s) are created first. Then procedures are created: the copied procedure first if needed, then as many differentiated procedures are there are ActivityPattern’s. Diff procedures are also created for INTRINSIC’s that need a derivative.

  • then the CallArrow’s are placed between the newly created Unit’s. Note: the CallArrow’s between differentiated Unit’s will be placed later.

  • then the differentiated SymbolTable’s are built. Root-most SymbolTable’s are built first, starting with the root SymbolTable of diffCallGraph. After this step the differentiated SymbolTable’s are essentially copies. The differentiated symbols will be added later.

  • then the type (FunctionTypeSpec) of differentiated Units is built, as well as the differentiated interface declarations.

  • then the differentiated Unit’s are each filled with their differentiated Flow-Graphs. This can be seen as the main part, and it calls the deeper level of differentiation: FlowGraphDifferentiator.

  • then a few adminstrative work is done about the parts of the differentiated CallGraph that are actually copies of parts of the source CallGraph: these copies may have received a different name (e.g. XXX_NODIFF instead of XXX). The copies may also refer to copied elements that now have a different name. This step updates these copied names everywhere, taking care of interface names, module names, allocation commands …. This step is a little messy as bits have been added progressively.

  • finally, all new names (for new, differentiated objects) are fixed (e.g. going from tentative #V_D# to final V_D), taking care of conflicts with existing names. For instance #V_D# can be turned into final V0_D if there is already a V_D that shares its scope.

Notes on copies of source code inside the generated code

At the level of one particular Unit’s Flow-Graph

  • Files FlowGraphDifferentiator

Each time the CallGraphDifferentiator must differentiate the Flow-Graph contents of a Unit, this is subcontracted to the FlowGraphDifferentiator

Hint: to gain insight on how the FlowGraphDifferentiator works and how differentiated Block’s are created and connected together, one may run Tapenade with the command-line option -traceDifferentiation <unit-name>, that prints a long log of this complex operation.

The main entry point is FlowGraphDifferentiator.differentiateProcedure, that fills each differentiated version of the current Unit with its own (differentiated) Flow-Graph. The source Flow-Graph is used to build a tree of FlowGraphLevel’s, which are nested control structures, mostly about nested loops (remember that Tapenade’s Flow-Graphs are flat graphs). Each FlowGraphLevel receives a kind based on its structure and on the differentiation directives present in the source. FlowGraphDifferentiator has a separate differentiation method for each kind. Kinds can be:

  • PLAIN_BLOCK: essentially for an individual Block

  • ENTRY_BLOCK: for the Unit’s EntryBlock

  • DO_LOOP: for a clean DO loop or a C clean for loop

  • WHILE_LOOP: for a while loop.

  • TIMES_LOOP: for a loop running a fixed number of times.

  • NATURAL_LOOP: for a loop not clean enough to fit in any of the above kinds (extra entries or exits).

  • II_LOOP: for a clean loop that the user guarantees with Independent Iterations and thus deserves a special adjoint differentiation.

  • BINOMIAL_LOOP: for a clean loop that the user guarantees is time-stepping and thus deserves a special adjoint differentiation.

  • FIXEDPOINT_LOOP: for a clean while loop that the user guarantees implements a fixed-point iteration and thus deserves a special adjoint differentiation.

  • CHECKPOINT: for a (connected) subset of a set of “sibling” structures, that we group because they will be treated in a special way (checkpointing in adjoint differentiation).

  • NODIFF: for a (connected) subset of a set of “sibling” structures, that the user requests they are not differentiated.

  • PLAIN_GROUP: for a (connected) subset of a set of “sibling” structures, that are grouped for any other reason.

  • TOP_LEVEL: for the top node of the tree of nested structures, therefore representing the complete Flow-Graph of the current Unit.

Each node of the tree of control structures is differentiated in turn, recursively, bottom up. The recursive entry point in this algorithm is FlowGraphDifferentiator.differentiateFlowGraphLevel.

Missing: details on fwd/bwd entry/exit diffs of one given FlowGraphLevel

At the level of one particular Basic Block

  • Files BlockDifferentiator

Each time the FlowGraphDifferentiator must fill the derivative Block’s of a given source Block, this is subcontracted to the BlockDifferentiator.

Hint: to gain insight on how the BlockDifferentiator works and where (in the Java source) are created the individual differentiated instructions, one may run Tapenade with the command-line option -traceDifferentiation <unit-name>. Upon creation, each new instruction is printed with a short explanation text that one may then search for in BlockDifferentiator.java.

The main entry point is BlockDifferentiator.differentiateBlock, that fills a given empty forward-diff Block and a given empty backward-diff Block with the differentiation of the instructions of the given source Block, in the correct order, and obeying the given differentiationMode (one of TANGENT_MODE, ADJOINT_MODE, ADJOINT_SPLIT_MODE). Internally, the Instruction’s of the source Block are swept from first to last, and for each the forward and backward differentiated Instruction’s are built. The differentiated Instruction’s are accumulated into two data-dependency graphs (one for forward, one for backward). When all instructions of the source Block have been swept, each data-dependency graph is first searched for desirable node fusion (preserving data-dependencies), then it is sorted (topological sort), and the resulting list of differentiated Instruction’s is put into the corresponding (forward- or backward-) diff Block.

The primitive that adds a new instruction into the forward data-dependency graph is addFuturePlainNodeFwd.

The primitives that add a new instruction into the backward data-dependency graph are addFuturePlainNodeBwd, addFutureZeroDiffNodeBwd, addFutureSetDiffNodeBwd, addFutureIncrDiffNodeBwd, plus a special addFutureBwdSplitDecl.

At the level of one particular procedure call

  • Files ProcedureCallDifferentiator

Javadoc

At the level of one particular assignment

  • Files ExpressionDifferentiator

Javadoc

In the most frequent situation, differentiating an expression happens in the context of an assignment lhs=expression. Method tangentDifferentiateAssignedExpression returns the tangent-differentiated statement of lhs=expression, whereas method adjointDifferentiateAssignedExpression returns the list of its adjoint-differentiated statements.

There are special situations where there is no lhs, for instance when expression is the return expression of a function. For these situations, which indeed occur only in tangent mode, method tangentDifferentiatePlainExpression returns the tangent-differentiated expression.

These three methods share most of their mechanism:

  1. A “balanced” copy of expression is built. It is balanced in the sense that chains of associative operations are rearranged in the shape of a balanced binary tree. In the sequel, expression is indeed the balanced copy. A few sweeps through expression evaluate the execution costs and the number of occurences of each of its sub-expressions and of their derivatives in the future derivative code. These are used to take splitting decisions, i.e. decision to precompute a sub-expression or its derivative into a temporary variable. See method findDiffOptimalCuts. Final decision is stored at each node of the expression, in annotation CutCosts.

  2. A “linearization tree” is built from expression (together with lhs if present), by a call to the linearize method. In the linearization tree, where each node is an object of (a sub-class of) LinTree. This linearization tree reflects the structure of the linearize expression. For example, the topmost node of the linearization tree is of type LinAssign. Available sub-classes of LinTree are:

    • LinLeaf for a reference to a variable, i.e. a leaf of the linearization tree.

    • LinStd for a standard arithmetic operation.

    • LinAssign for the assignment

    • LinConvert for a conversion operation

    • LinIf fon an if-expression

    • LinCatenate for a plain juxtaposition of values, e.g. an array constructor

    • LinReduce for an array SUM reduction, possibly with DIM and MASK.

    • LinSpread for an expansion of a scalar (or an array with one less dimension) into an array.

    These sub-classes seem to cover all needs to-date. However, we provide a template to build new sub-classes of LinTree if needed README-LinOther.md. Linearization observes splitting decisions: when the CutCosts annotation on one sub-tree of expression requires the derivative (happens only for the adjoint) to be split here, this branch of the current linearization tree is terminated with a LinLeaf standing for the temporary (derivative) variable, and a new linearization tree is built starting from a LinAssign of the temporary variable with the sub-expression. The linearization tree built is returnd by side effect into the second (parent) argument. Assignments of split primal sub-expressions into temporary variables are collected by side-effect into the toPrecomputes argument. Linearization trees of split derivative sub-expressions are collected by side-effect into the toLinTrees argument.

  3. The linearization tree (each of them if splitting occured) is swept to produce the derivative statements. These sweeps are different in tangent and reverse mode.

    • In tangent mode, method buildTreeForward is applied to the root of each linearization tree, root linearization tree last. The linearization tree is swept top-down, and the differentiated expression is built progressively on the way back up.

    • In adjoint mode, method buildTreesBackward is applied to the root of each linearization tree, root linearization tree first. The linearization tree is swept top-down, bringing a copy of the differentiated lhs to each leaf, then the differentiated instruction corresponding to this leaf is built by a special sweep up from the leaf (method buildTreeBackward), and is accumulated in the result list of instructions.

  4. Upon return from the method, argument toPrecomputes is filled with the precomputation instructions introduced by the splitting mechanism.

At the level of one particular dynamic memory command

Missing

At the level of one particular declaration

Missing

At the level of one particular reference expression

  • Files VarRefDifferentiator

Javadoc

Notes about reverse differentiation of Pass-By-Value variables

When variables are passed-by-value, a variable v used by the called function F is indeed a new variable, holding a copy of v and vanishing at the exit of F.

=> All passed-by-value formal parameter is always passive (because not useful) and adjoint-dead (because vanishing) at the end of F.

EXCEPTION: because of the optimization described below, that identifies vb and vb0 when v is ONLY READ, we must consider that in this case usefulness of v propagates backwards across call exit (cf C:lh14).

Therefore, for every formal parameter v of function F:

  • if v is not active upon entry of F then since it is also passive on exit of F, there will be no vb parameter (there can be a vb inside F_B, but it will be local)

  • else if v is ONLY READ during F and therefore vb is ONLY incremented during F_B. In this easier case, we can identify this vb in F_B with the outside vb.

    • Each call to F_B must pass a reference &vb,

    • the header of F_B expects a *vb,

    • all occurences of vb inside F_B become *vb.

  • else in the general case, we cannot identify the inside vb with the outside vb:

    • Each call to F_B must pass a reference &vb,

    • the header of F_B expects a variable we call *vb0,

    • all occurences of vb inside F_B remain accesses to a local vb (not *vb) which holds the derivative of the local copy of v inside F_B. At the end of F_B, vb is transferred into *vb0 by *vb0 += vb;.