Tapenade data-flow analyses¶
Files: Javadoc
ADActivityAnalyzer ADTBRAnalyzer ActivityPattern DataFlowAnalyzer DepsAnalyzer
DiffLivenessAnalyzer InOutAnalyzer LoopArrayAccessAnalyzer PointerAnalyzer ReqExplicit
OmpClauseAnalyzer
new XXAnalyzer()
XXAnalyzer.run()
--> runBottomUpAnalysis() or runTopDownAnalysis()
--> setCurUnitEtc()
--> initializeCGForUnit()
--> initializeCGForRootUnit()
--> analyze()
--> analyzeForward() or analyzeBackward()
--> initializeUnit()
--> setCurBlockEtc()
--> initializeFGForBlock()
--> initializeInitBlock()
--> accumulateValuesFromUpstream() or ...FromDownstream()
--> setEmptyCumulAndCycleValues()
--> getValueFlowingThrough() or ...FlowingBack()
--> cumulValueWithAdditional()
--> cumulCycleValueWithAdditional()
--> initializeCumulValue()
--> compareUpstreamValues()
--> propagateValuesForwardThroughBlock() or ...BackwardThroughBlock()
--> propagateValuesForwardThroughExpression() or ...BackwardThroughExpression()
--> compareDownstreamValues()
--> terminateFGForBlock()
--> terminateTermBlock()
--> terminateUnit()
--> terminateCGForUnit()
Data-flow analyses are context-sensitive (they run on the full CallGraph) and flow-sensitive (for each procedure, they run on its Flow-Graph). They behave well on circular Call-Graphs (recursion) and circular Flow-Graphs (loops) thanks to intensive use of fixed-point iterations. For efficiency, the order in which the graph nodes are visited is based on:
- DFST (Depth-First Spanning Tree) order for top-down and forward propagations, and the reverse of the DFST order for bottom-up and backward propagations;
- a Waiting List mechanism to avoid sweeps on nodes that propagate nothing new.
Each data-flow analysis is supposed to run on the whole IR; there is no equivalent of separate compilation. The advantage is that data-flow information on a procedure can be used to compute more precisely the data-flow information on the procedures that call it.
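The fixed-point sweep described above can be sketched as a generic worklist loop: blocks are kept in DFST order, and a block is re-swept only when the information flowing into it has changed. The following is an illustrative sketch only, not the actual DataFlowAnalyzer code; the block numbering, the set-of-strings lattice, and the transfer function are stand-ins.

```java
import java.util.*;
import java.util.function.*;

// Illustrative sketch of a forward fixed-point sweep with a waiting list.
// Blocks are assumed numbered 0..n-1 in DFST order; succ[b] lists the
// successors of block b. The lattice is a set of strings joined by union;
// transfer(b, in) computes the out-value of block b from its in-value.
class WorklistSketch {
    static List<Set<String>> solveForward(int n, int[][] succ,
            BiFunction<Integer, Set<String>, Set<String>> transfer) {
        List<Set<String>> in = new ArrayList<>();
        List<Set<String>> out = new ArrayList<>();
        for (int b = 0; b < n; b++) { in.add(new HashSet<>()); out.add(new HashSet<>()); }
        // Waiting list ordered by DFST rank, so each sweep is mostly forward:
        TreeSet<Integer> waiting = new TreeSet<>();
        for (int b = 0; b < n; b++) waiting.add(b);
        while (!waiting.isEmpty()) {
            int b = waiting.pollFirst();
            Set<String> newOut = transfer.apply(b, in.get(b));
            if (newOut.equals(out.get(b))) continue; // nothing new: no re-sweep
            out.set(b, newOut);
            for (int s : succ[b]) {
                // Accumulate upstream values; re-queue the successor if changed.
                if (in.get(s).addAll(newOut)) waiting.add(s);
            }
        }
        return out;
    }
}
```

On a cyclic flow graph (a loop between blocks 1 and 2), the loop keeps re-sweeping until the out-values stabilize, which is exactly the fixed-point behavior described above.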
Each data-flow analysis is defined by its own class, which inherits from DataFlowAnalyzer. In the following, we distinguish the (virtual) methods that a specific data-flow analysis (suppose it is named XXAnalyzer) must define from the predefined methods provided by DataFlowAnalyzer, which may be used in the implementation of the analysis-specific methods. In the case of data-flow analysis, these predefined methods very often call other virtual methods that must be defined by XXAnalyzer. This back-and-forth between XXAnalyzer and DataFlowAnalyzer is a natural consequence of the classical theory of data-flow analyses known as abstract interpretation.
For each data-flow analysis (e.g. the XX analysis), XXAnalyzer must at least define/overload:
- its constructor: XXAnalyzer(TapEnv, CallGraph, String, TapList<String>): creates the XXAnalyzer object.
- its launcher: run(TapList<Unit> rootUnits): runs the analysis on the Call-Graph below the given rootUnits.
One must implement one's own run(). It can be standalone, but in most cases one will make it call the predefined runTopDownAnalysis(TapList<Unit>) and/or runBottomUpAnalysis(TapList<Unit>). In that case, these predefined methods eventually call the following virtual methods. For each of those, class XXAnalyzer must provide its own implementation, or call a predefined implementation when provided:
- setCurUnitEtc(Unit): hook to declare that the given Unit becomes the new current curUnit.
- initializeCGForUnit(): hook to initialize things for the current curUnit before running the analysis on the current curCallGraph.
- initializeCGForRootUnit(Object): hook to initialize things for the curUnit, which is a root Unit, before running the analysis on the Call-Graph.
- analyze(): runs the analysis on the curUnit.
- concludeCGForUnit(): hook to terminate things for the curUnit after running the analysis on the Call-Graph.
One must implement one's own analyze(). It can be standalone, but in most cases one will make it call the predefined analyzeForward(TapList<FGArrow> entryArrows, TapList<Block> insideBlocks, TapList<FGArrow> exitArrows) and/or analyzeBackward(TapList<FGArrow> entryArrows, TapList<Block> insideBlocks, TapList<FGArrow> exitArrows). In that case, these predefined methods eventually call the following virtual methods. For each of those, class XXAnalyzer must provide its own implementation, or call a predefined implementation when provided:
- initializeUnit(): hook to initialize things for curUnit before starting the analysis on curUnit.
- setCurBlockEtc(Block): hook to declare that a Block becomes the new curBlock.
- initializeFGForBlock(): hook to initialize things for the curBlock before starting the analysis on curUnit.
- initializeInitBlock(): hook to initialize things for the curBlock, which is an initial block of the curUnit for the given analysis direction, before starting the analysis on curUnit.
- accumulateValuesFromUpstream() (resp. ...FromDownstream() when backwards): accumulate the data-flow information coming flow-wise (resp. back-flow-wise) through the FGArrows flowing to (resp. from) curBlock.
- compareUpstreamValues(): compare the data-flow information just upstream of curBlock with the data-flow information at the same location upon the previous pass. Return true when the information has changed and therefore a new fixed-point iteration is required.
- propagateValuesForwardThroughBlock() (resp. ...BackwardThroughBlock() when backwards): propagate the data-flow information forward (resp. backward) across curBlock. Must return true when the control flow can effectively go through curBlock (no stop nor exit).
- compareDownstreamValues(): compare the data-flow information just downstream of curBlock with the data-flow information at the same location upon the previous pass. Return true when the information has changed and therefore a new fixed-point iteration is required.
- terminateFGForBlock(): hook to terminate things for the curBlock after running the analysis on curUnit.
- terminateTermBlock(): hook to terminate things for the curBlock, which is a final block of the curUnit for the given analysis direction, after running the analysis on curUnit.
- terminateUnit(): hook to terminate things for curUnit after running the analysis on curUnit.
One can implement one's own accumulateValuesFromUpstream() (resp. ...FromDownstream() when backwards). One can alternatively do nothing and use the predefined accumulateValuesFromUpstream() (resp. ...FromDownstream()) provided in DataFlowAnalyzer. In that case the predefined implementation eventually calls the following virtual methods. For each of those, class XXAnalyzer must provide its own implementation, or call a predefined implementation when provided:
- setEmptyCumulAndCycleValues(): initialize a recipient for fusing the values arriving through several flow arrows.
- getValueFlowingThrough() (resp. getValueFlowingBack() when backwards): get the value that arrives through the current curArrow.
- cumulValueWithAdditional(SymbolTable): accumulates the arriving value into the non-cycling part of the accumulation recipient.
- cumulCycleValueWithAdditional(SymbolTable): accumulates the arriving value into the cycling part of the accumulation recipient.
- initializeCumulValue() (needed only when backwards): creates an empty info to propagate backwards. Necessary when curBlock contains stop or exit.
One can implement one's own propagateValuesForwardThroughBlock() (resp. ...BackwardThroughBlock() when backwards). One can alternatively do nothing and use the predefined propagateValuesForwardThroughBlock() (resp. ...BackwardThroughBlock()) provided in DataFlowAnalyzer. In that case the predefined implementation eventually calls the following virtual method, for which class XXAnalyzer must provide its own implementation, or call a predefined implementation when provided:
- propagateValuesForwardThroughExpression(Tree) (resp. ...BackwardThroughExpression(Tree) when backwards): propagate the data-flow information forward (resp. backward) across the given instruction.
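The division of labor described above between DataFlowAnalyzer and XXAnalyzer is the classical template-method pattern: the base class owns the sweep, the subclass owns the hooks. A minimal sketch, with method names simplified from the list above and made-up types, not the real Tapenade signatures:

```java
// Minimal template-method sketch of the DataFlowAnalyzer/XXAnalyzer split.
// The base class drives the sweep; the subclass supplies the hooks.
// Names and types are simplified stand-ins, not the actual Tapenade API.
abstract class AnalyzerSketch {
    protected final StringBuilder trace = new StringBuilder();

    // Predefined driver (cf. runTopDownAnalysis/analyzeForward):
    public final String run(String[] blocks) {
        initializeUnit();
        for (String block : blocks) {
            accumulateValuesFromUpstream(block);
            propagateValuesForwardThroughBlock(block);
        }
        terminateUnit();
        return trace.toString();
    }

    // Virtual hooks: some have a predefined default, some must be overridden.
    protected void initializeUnit() { trace.append("init;"); }
    protected void terminateUnit() { trace.append("term;"); }
    protected void accumulateValuesFromUpstream(String block) { }
    protected abstract void propagateValuesForwardThroughBlock(String block);
}

// A concrete "XXAnalyzer": here it merely records the blocks it visits.
class XXAnalyzerSketch extends AnalyzerSketch {
    @Override
    protected void propagateValuesForwardThroughBlock(String block) {
        trace.append(block).append(";");
    }
}
```

Calling `new XXAnalyzerSketch().run(...)` shows the back-and-forth: the predefined driver calls back into the subclass hook for every block it visits.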
AD-independent preliminary data-flow analyses - [Step 2]¶
Files:
LoopArrayAccessAnalyzer PointerAnalyzer InOutAnalyzer
CallGraph.prepareAnalyses()
--> LoopArrayAccessAnalyzer.runAnalysis()
--> PointerAnalyzer.runAnalysis()
--> InOutAnalyzer.runAnalysis()
AD-independent data-flow analyses are run in this order:
- preparatory analysis on loops (LoopArrayAccessAnalyzer),
- pointer destinations (PointerAnalyzer),
- In-Out, aka read-write (InOutAnalyzer).
Each of these 3 analyses is run completely on the full code (full CallGraph) before the next analysis starts.
Notes on LoopArrayAccessAnalyzer¶
Utility analysis that detects loop-localized array zones,
and array zones completely swept by a loop.
Vocabulary: regarding a loop "H", a given zone "i" can be:
-- "localized" (L) when accesses to "i" in the loop "H"
can access the same cell only during the same iteration.
In other words, there is no loop-carried dependency.
-- "uniqueAccess" (U) when each iteration inside "H"
only accesses one unique memory cell of "i"
-- "topUniqueAccess" (T) when "H" is the topmost "U" loop for "i"
-- "loopComplete" (C) when after all iterations of loop "H",
each and every cell of "i" has been visited.
We claim there are only the following possible combinations of UTC:
not-U ; U-and-not-T ; U-and-T-and-not-C ; U-and-T-and-C
Here are several examples. Consider 2 nested loops
B1 : Do i...
B2 : Do j...
EndDo
EndDo
When the only accesses | then the zone of T | and the zone of T
to array T inside are: | is (regarding B1): | is (regarding B2):
--------------------------------------------------------------------
T(i,j) | LUTC | LU
T(i,j,cste) | LUT | LU
T(i,j,3) and T(i,j,4) | L | L
T(j) | . | LUTC
T(i) | LUTC | .U
T(i,2j) and T(i,2j+1) | L | L
T(2i,j) and T(2i+1,j) | L | L
T(12,5) | .UT | .U
T(i/2) | . | .UT
This information is used to stop propagation of data-flow info
along nonexistent loop-carried dependencies, and to have a
minimal treatment of array-kill info by loops.
Source IR with basic data-flow info - [State 3]¶
AD-specific data-flow analyses - [Step 3]¶
Files:
ADActivityAnalyzer ReqExplicit DepsAnalyzer DiffLivenessAnalyzer ADTBRAnalyzer DiffRoot DiffPattern ActivityPattern
ADActivityAnalyzer.runAnalysis()
ReqExplicit.runAnalysis()
DepsAnalyzer.runAnalysis()
DiffLivenessAnalyzer.runAnalysis()
ADTBRAnalyzer.runAnalysis()
AD-specific data-flow analyses are run in this order: Activity; Pointer Activity; Dependencies; Diff-Liveness; TBR. Each of these 5 analyses is run completely on the full code (full CallGraph) before the next analysis starts.
Notes on the 4-phase Storage Activity Analysis (ReqExplicit)¶
[1] BOTTOMUP_1=="conditional maybe_required" : For each Unit curU, computes
** RE: reqXEffects(curU): The effect of any call curU() on ReqX analysis,
a BoolMatrix which, multiplied on the right with the ReqX
downstream the call curU(), returns the ReqX upstream the call curU().
** REA: reqXEffectsAdded(curU): The "added" effect of any call curU() on
ReqX analysis, a BoolVector of the zones to be added unconditionally
to the ReqX upstream the call curU(). REA could be an extra column in RE.
Computation: (Backwards from final point to entry point of each analyzed unit curU)
-- Initialization values written down at the final point:
RE_d := Identity
REA_d := Empty
Rationale: This is the "effect" of an empty-body routine,
i.e. a neutral element for the propagation through a call.
-- up through an active access (read or write) to *P :
RE_u := RE_d (i.e. unmodified!)
REA_u := REA_d U {P}
Rationale: If *P is active, this makes P' required up, unconditionally
-- up through a pointer assignment P=>P2 or P=>malloc or P=>&T :
RE_u := RE_d {set {P->zero} if P is total} U {P2->RE_d(P)}
REA_u := REA_d \ {P if P is total} U {P2 if P in REA_d}
Rationale: The reasons why P may be required down, get accumulated into
the reasons why P2 may be required up. If P is totally overwritten,
the upstream P doesn't yet require a P'.
-- up through a free(P) or nullify(P) (special case of the above):
RE_u := RE_d {set {P->zero} if P is total}
REA_u := REA_d \ {P if P is total}
Rationale: If P is total, P' is freed, so P' is not required up any more.
Notice this case implies the code is incorrect: P, like P', is required
by downstream code, but we found no allocation of P upstream.
-- up through a call e.g. call Foo(...,P,...)
RE_u := {RE(Foo) +Id for non-total actual-args} * RE_d
REA_u := ({RE(Foo) +Id for non-total actual-args} * REA_d) U REA(Foo)
U {actual-args of active formal args}
Rationale: For any P, P' is required upstream the call Foo()
either because Foo may require it depending on pointers required
downstream call Foo(), or because Foo may require it unconditionally.
-- Final result found at the initial point:
RE(curU) := RE_u
REA(curU) := REA_u
Rationale: this is what is needed for the TOPDOWN_2 phase on any call curU()
[2] TOPDOWN_2=="maybe_required" : For each Unit curU, computes
** ReqX: Required active storage
Between each instruction, the (pointer) zones for which the
diff pointer may need to be allocated when reaching that point.
Computation: (Backwards from final point to entry point of each analyzed unit curU)
-- Initialization value written down at the final point:
R_d := {accumulated R_d's from the analysis context on the exit of curU}, or Empty.
Rationale: no P' required at the final exit of the differentiated part.
-- up through an active access (read or write) to *P :
R_u := R_d U {P}
Rationale: active use of P requires to have a P' available.
-- up through a pointer assignment P=>P2 or P=>malloc or P=>&T :
R_u := R_d \ {P if P is total} U {P2 if P in R_d}
Rationale: a required down P' must be made here => not required up.
Furthermore, if P' required down, then P2' is required up.
-- up through a free(P) or nullify(P) (special case of the above):
R_u := R_d \ {P if P is total}
Rationale: a priori, the upstream P doesn't yet require a P'.
-- up through a call e.g. call Foo(...,P,...)
==> Accumulate R_d into the TOPDOWN_2 analysis context on the exit of Foo
R_u := ({RE(Foo) +Id for non-total actual-args} * R_d) U REA(Foo)
Rationale: by definition of RE(Foo) and of REA(Foo).
Non-total actual-args must retain their status from R_d.
-- Final result found at the initial point:
R_u
When curU is a top Unit, R_u(curU) can be used to require the calling
context of curU to provide hand-built derivative variables.
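The call-site rule of TOPDOWN_2 can be illustrated with small sets: represent RE(Foo) as a map giving, for each pointer, the set of downstream-required pointers that make it required upstream ("row p" of the BoolMatrix), so that multiplying on the right by R_d amounts to one disjointness test per row. This is a toy model with illustrative zone names, not the BoolMatrix/BoolVector machinery of ReqExplicit.

```java
import java.util.*;

// Toy model of the TOPDOWN_2 call-site rule:
//   R_u := ({RE(Foo) +Id for non-total actual-args} * R_d) U REA(Foo)
// re.get(p) = set of pointers required downstream of the call that make
// p required upstream of it (one "row" of the BoolMatrix RE(Foo)).
class ReqXCallSketch {
    static Set<String> requiredUpstream(Map<String, Set<String>> re,
                                        Set<String> rea,
                                        Set<String> nonTotalArgs,
                                        Set<String> rDown) {
        Set<String> rUp = new HashSet<>(rea);  // REA(Foo): unconditional additions
        for (Map.Entry<String, Set<String>> row : re.entrySet()) {
            // p is required upstream if some pointer in its row is in R_d:
            if (!Collections.disjoint(row.getValue(), rDown)) rUp.add(row.getKey());
        }
        for (String p : nonTotalArgs) {
            // +Id: a possibly only-partly-overwritten arg keeps its R_d status.
            if (rDown.contains(p)) rUp.add(p);
        }
        return rUp;
    }
}
```

For instance, with RE(Foo) saying "P is required upstream when Q is required downstream", REA(Foo) = {S}, and R_d = {Q}, the rule yields R_u = {P, S}.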
[3] BOTTOMUP_3=="conditional maybe_present" : For each Unit curU, computes
** AEV: unitsAvlXEffectsVisible(curU): The "Visibility" effect of any call curU() on
AvlX analysis, i.e. the pointer arguments that possibly at least
partly keep their allocation status through the call curU().
This may be seen as the list of P possibly not deallocated by curU.
** AEA: unitsAvlXEffectsAdded(curU): The "added" effect of any call curU() on AvlX analysis,
a BoolVector of the zones to be added to the AvlX downstream the call curU().
This may be seen as the list of pointers P outputs of call curU() whose
derivative P' may become (partly) available (hopefully
because allocated) upon return from curU.
Computation: (Forwards from entry point to final point of each analyzed unit curU)
-- Initialization values written down at the initial point:
AEV_u := Full
AEA_u := Empty
Rationale: neutral element, i.e. the effect of a no-op.
-- down through an active access (read or write) to *P :
AEV_d := AEV_u (i.e. unmodified!)
AEA_d := AEA_u (i.e. unmodified!)
Rationale: The "maybe-present" analysis considers that the
generating fact for a pointer variable P' to be present is
the place where it is effectively allocated and not the place
where this allocation has been required. Therefore we assume
that future A_u at this point will certainly contain P.
-- down through a pointer assignment P=>P2 or P=>malloc or P=>&T :
AEV_d := AEV_u \ {P if P is total}
AEA_d := AEA_u \ {P if P is total} U {P if P in R_d}
Rationale: the up P is overwritten, so its "presence" status vanishes.
If the down P is ReqX down, then P' must be made here and therefore
becomes present down.
-- down through a free(P) or nullify(P) (special case of the above):
AEV_d := AEV_u \ {P if P is total}
AEA_d := AEA_u \ {P if P is total}
Rationale: the up P is overwritten, so its "presence" status vanishes.
-- down through a call e.g. call Foo(...,P,...)
AEV_d := AEV_u \ {(!AEV(Foo)) except for non-total actual-args}
AEA_d := AEA_u \ {(!AEV(Foo)) except for non-total actual-args} U AEA(Foo)
Rationale: by definition of AEV(Foo) and of AEA(Foo).
-- Final result found at the final point:
AEV(curU) := AEV_d
AEA(curU) := AEA_d
Rationale: this is what is needed for the TOPDOWN_4 phase on any call curU()
[4] TOPDOWN_4=="maybe_present" : For each Unit curU, computes
** AvlX: MaybePresent active storage
Between each instruction, the (pointer) zones for which the
diff pointer may be present and allocated and therefore should be
deallocated (with if_present test) along with the source variable.
Computation: (Forwards from entry point to final point of each analyzed unit curU)
-- Initialization value written down at the initial point:
A_u := {accumulated A_u's from the analysis context on the entry of curU} U R_u
Rationale: we assume everything required up
will have been made available by the calling context.
-- down through an active access (read or write) to *P :
A_d := A_u (i.e. unmodified!)
Rationale: could be "A_u U {P}", but we believe that this analysis
has already put P into A_u.
-- down through a pointer assignment P=>P2 or P=>malloc or P=>&T :
A_d := A_u \ {P if P is total} U {P if P in R_d}
Rationale: the up P is overwritten, so its "presence" status vanishes.
Actually, even if we wanted to free P', this statement loses its address!
Then if P is required down, P' will be made here => it is now present!
Also for P2, we believe that this analysis has already placed the
presence status in A_u if P2' is present.
-- down through a free(P) or nullify(P) (special case of the above):
A_d := A_u \ {P if P is total}
Rationale: the up P is overwritten, so its "presence" status vanishes.
-- down through a call e.g. call Foo(...,P,...)
==> Accumulate A_u into the TOPDOWN_4 analysis context on the entry of Foo
A_d := A_u \ {(!AEV(Foo)) except for non-total actual-args} U AEA(Foo)
Rationale: by definition of AEV(Foo) and of AEA(Foo).
Non-total actual-args must retain their status from A_u.
-- Final result found at the final point:
A_d
When curU is a top Unit, A_d(curU) can be used to tell the calling
context of curU that it may free the available differentiated
derivatives if they exist.
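Symmetrically to the ReqX rule, the TOPDOWN_4 call-site rule can be modeled with small sets: pointers whose presence status does not survive the callee (the complement of AEV(Foo), except for non-total actual args) are masked out of A_u, then AEA(Foo) is added. Again a toy model with illustrative names, not the actual BoolVector code.

```java
import java.util.*;

// Toy model of the TOPDOWN_4 call-site rule:
//   A_d := A_u \ {(!AEV(Foo)) except for non-total actual-args} U AEA(Foo)
class AvlXCallSketch {
    static Set<String> presentDownstream(Set<String> aevVisible,
                                         Set<String> aeaAdded,
                                         Set<String> nonTotalArgs,
                                         Set<String> aUp) {
        Set<String> aDown = new HashSet<>();
        for (String p : aUp) {
            // p's presence survives if Foo may keep it allocated, or if it
            // is an actual arg that Foo only partly overwrites:
            if (aevVisible.contains(p) || nonTotalArgs.contains(p)) aDown.add(p);
        }
        aDown.addAll(aeaAdded); // pointers whose derivative Foo may allocate
        return aDown;
    }
}
```

For instance, with A_u = {P, Q}, AEV(Foo) = {Q}, no non-total args, and AEA(Foo) = {R}, the call kills P's presence, keeps Q's, and adds R: A_d = {Q, R}.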