Source IR with data-flow infrastructure - [State 2]
Tapenade represents the memory that each variable can access by abstract memory zones. This memory layout is abstract in the sense that it need not match the one that a compiler may build: its only purpose is to make memory overlaps easy to detect. The layout defines so-called memory zones, each one with a unique zone rank. Different zones are guaranteed not to overlap in memory. On the other hand, zones are not required to be consecutive in actual memory. Conversely, we can compute the set of zones possibly referenced by any reference expression. If this expression, in the context defined by its position in the IR, may access a zone, then this zone is guaranteed to be in the set. However, due to the uncertainties of static data-flow analysis, the set of zones for this expression may well contain zones that are never accessed in actual execution. In cases of aliasing, two different variables can share some zones in their sets of possibly referenced zones.
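As a minimal sketch of why zone sets make overlap detection easy, the fragment below tests whether two reference expressions may overlap by checking whether their sets of possibly referenced zones intersect. The names ZoneOverlapSketch and mayOverlap, and the zone ranks used, are hypothetical and are not part of Tapenade.

```java
import java.util.Set;

class ZoneOverlapSketch {
    /** True if the two zone sets share a zone rank, i.e. the references may overlap. */
    static boolean mayOverlap(Set<Integer> zonesA, Set<Integer> zonesB) {
        for (int zone : zonesA) {
            if (zonesB.contains(zone)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical zone sets of three reference expressions.
        Set<Integer> a = Set.of(17, 22);
        Set<Integer> b = Set.of(18, 23);
        Set<Integer> c = Set.of(22);
        System.out.println(mayOverlap(a, b)); // disjoint sets: guaranteed not to overlap
        System.out.println(mayOverlap(a, c)); // shared zone 22: may overlap
    }
}
```

Because the sets are conservative, a positive answer only means that an overlap is possible, whereas a negative answer is a guarantee.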
For efficiency, the set of all zones must remain small. Therefore we never distinguish array elements. In other words, a simple array will have only one zone. On the other hand, we want to distinguish record fields, and therefore a variable of “record” type will have as many zones as record fields.
To illustrate, consider variable Z declared in Fortran90 by
type point
real, dimension(3) :: x
real :: y
end type point
type(point), dimension(len), target :: Z
Z is an array of records. It only has two zones assigned, one (say 17) for all the Z(1:len)%x(1:3) and the other (say 18) for all the Z(1:len)%y. In passing, this also illustrates why a zone need not be consecutive in actual memory.
To allow for easy access to the zones of any expression, the zones assigned to individual variables are arranged in the form of a tree. These trees are built using TapList’s, and the leaves are TapIntList’s rather than individual int’s to capture the cases where the actual zone cannot be determined statically. Taking again the example above, the tree of zones for Z is (as Tapenade would print it)
((<17>) (<18>))
or equivalently (with a graphical convention that we use a lot):
.______.______()
|      |
.__()  .__()
|      |
<17>   <18>
The convention is: dots “.” stand for a TapList object, horizontal lines go from a TapList on the left to its tail on the right, and vertical lines go from a TapList on top to its head below. An empty list “()” stands for the Java null, and TapIntList’s are shown between angle brackets “< >”.
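The head/tail structure and the printed form can be sketched as follows. The classes Cell and IntCell are hypothetical stand-ins for TapList and TapIntList, written only for illustration; they do not reproduce the actual Tapenade classes or their methods.

```java
class ZoneTreeSketch {
    /** Stand-in for a TapIntList: a linked list of zone ranks. */
    static final class IntCell {
        final int head;
        final IntCell tail;
        IntCell(int head, IntCell tail) { this.head = head; this.tail = tail; }
    }

    /** Stand-in for a TapList: head is an IntCell, a Cell, or null; tail is the rest. */
    static final class Cell {
        final Object head;
        final Cell tail;
        Cell(Object head, Cell tail) { this.head = head; this.tail = tail; }
    }

    /** Prints a tree with "( )" around lists and "< >" around zone leaves. */
    static String show(Object node) {
        if (node == null) {
            return "()";
        }
        StringBuilder sb = new StringBuilder();
        if (node instanceof IntCell) {
            sb.append('<');
            for (IntCell c = (IntCell) node; c != null; c = c.tail) {
                sb.append(c.head);
                if (c.tail != null) sb.append(' ');
            }
            sb.append('>');
        } else {
            sb.append('(');
            for (Cell c = (Cell) node; c != null; c = c.tail) {
                sb.append(show(c.head));
                if (c.tail != null) sb.append(' ');
            }
            sb.append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Tree of zones for Z: one sub-list per record field.
        Cell x = new Cell(new IntCell(17, null), null);
        Cell y = new Cell(new IntCell(18, null), null);
        Cell z = new Cell(x, new Cell(y, null));
        System.out.println(show(z)); // ((<17>) (<18>))
    }
}
```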
The tree of zones of a single simple variable or array has the shape:
.__()
|
<9>
Similarly, the tree of zones of a reference to, say, Z(5)%y is easily found using the zones tree of Z, yielding:
.__()
|
<18>
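A minimal sketch of this lookup, assuming the zones tree of a record variable is modelled as one list of zone ranks per field: the array index is ignored, since array elements are never distinguished, and only the field position selects a subtree. FieldZonesSketch and zonesOfField are hypothetical names, not Tapenade methods.

```java
import java.util.List;

class FieldZonesSketch {
    /**
     * Zones of a reference such as var(i)%field. The array index is ignored
     * because all array elements share one zone; only the field position matters.
     */
    static List<Integer> zonesOfField(List<List<Integer>> recordZones, int fieldPosition) {
        return recordZones.get(fieldPosition);
    }

    public static void main(String[] args) {
        // Zones tree of Z from the example: field x has zone 17, field y has zone 18.
        List<List<Integer>> zonesOfZ = List.of(List.of(17), List.of(18));
        // Z(5)%y: y is the second field (position 1); the index 5 plays no role.
        System.out.println(zonesOfField(zonesOfZ, 1)); // [18]
    }
}
```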
This tree arrangement also captures the case of pointers, in the following way: The zones of a pointer variable are represented with a TapList whose head holds (like any simple variable) the zone of the pointer itself (therefore as a TapIntList), and whose tail (if given) holds the zone of the pointer destination or possible destinations. For example, suppose first that we have defined Z2 as
type(point), dimension(len2), target :: Z2
and Tapenade has assigned zones tree ((<22>) (<23>)) to Z2. Suppose furthermore that a pointer p declared as
type(point), pointer :: p
is made to point to Z(3) in one code branch and to Z2(1) in another branch. Then at some code location after the branches have merged, Tapenade will dynamically compute the following tree of zones for p:
.______._______.______()
|      |       |
<40>   .__()   .__()
       |       |
       <17 22> <18 23>
or equivalently (as Tapenade would print it)
(<40> (<17 22>) (<18 23>))
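One plausible way to reproduce this result, given here only as a sketch, is a leaf-wise union of the destination trees coming from each branch, prefixed with the zone of the pointer itself. This is an assumption made for illustration, not a description of the algorithm Tapenade actually uses. Trees are modelled with java.util.List and leaves with TreeSet; PointerZonesMergeSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class PointerZonesMergeSketch {
    /** Builds a leaf holding the given zone ranks. */
    static TreeSet<Integer> leaf(int... zones) {
        TreeSet<Integer> set = new TreeSet<>();
        for (int zone : zones) set.add(zone);
        return set;
    }

    /** Leaf-wise union of two zone trees that are assumed to have the same shape. */
    @SuppressWarnings("unchecked")
    static Object merge(Object a, Object b) {
        if (a instanceof TreeSet) {
            TreeSet<Integer> union = new TreeSet<>((TreeSet<Integer>) a);
            union.addAll((TreeSet<Integer>) b);
            return union;
        }
        List<?> la = (List<?>) a;
        List<?> lb = (List<?>) b;
        List<Object> out = new ArrayList<>();
        for (int i = 0; i < la.size(); i++) {
            out.add(merge(la.get(i), lb.get(i)));
        }
        return out;
    }

    /** Pointer tree: head is the pointer's own zone, tail is the destination tree. */
    static List<Object> pointerTree(int ownZone, Object destination) {
        List<Object> tree = new ArrayList<>();
        tree.add(leaf(ownZone));
        tree.addAll((List<?>) destination);
        return tree;
    }

    /** Prints lists between "( )" and zone leaves between "< >". */
    static String show(Object node) {
        StringBuilder sb = new StringBuilder();
        String sep = "";
        if (node instanceof TreeSet) {
            sb.append('<');
            for (Object zone : (TreeSet<?>) node) { sb.append(sep).append(zone); sep = " "; }
            sb.append('>');
        } else {
            sb.append('(');
            for (Object child : (List<?>) node) { sb.append(sep).append(show(child)); sep = " "; }
            sb.append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Object> z = List.of(List.of(leaf(17)), List.of(leaf(18)));   // zones of Z(3)
        List<Object> z2 = List.of(List.of(leaf(22)), List.of(leaf(23)));  // zones of Z2(1)
        System.out.println(show(pointerTree(40, merge(z, z2)))); // (<40> (<17 22>) (<18 23>))
    }
}
```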
Note that in C, arrays are in fact mostly pointers, so the same example gives rise to more pointers. Pointer p declared and defined as:
typedef struct {float x[3]; float y;} point;
point Z[10];
point Z2[20];
point *p = (someTest?&(Z[3]):&(Z2[1]));
point **pp = (someTest?&Z:&Z2);
will receive a slightly different zones tree:
.______.__________________.__()
|      |                  |
<39>   .________.__()     .__()
       |        |         |
       <17 22>  <19 24>   <18 23>
that reflects the additional flexibility of this code (one can e.g. reassign (*p)[5].x itself). The analyses of Tapenade further down the line must handle this difference and treat these codes in a similar way. For further comparison, the tree of zones for Z itself is:
.______.__________________.__()
|      |                  |
<16>   .________.__()     .__()
       |        |         |
       <17>     <19>      <18>
whereas the tree of zones for the new variable pp is:
.______._______.__________________.__()
|      |       |                  |
<40>   <16 21> .________.__()     .__()
               |        |         |
               <17 22>  <19 24>   <18 23>
Missing: details on ZoneInfo and PublicInfo
Missing: description of Unit.externalShape, Unit.translator, Unit.transferMatrix, CallArrow.translator
Missing: details on translation of Data-Flow info through calls