A key aspect of the STG is that each node
represents a set of instances of the task, one per
processor that executes the task at runtime.
Similarly, an edge in the STG actually represents a set of edge instances
connecting pairs of dynamic node instances.
We use symbolic integer sets to describe the set of instances for a given
node, e.g., a task executed by P processors
would be described by the set:
,
and
symbolic integer mappings to describe the edge instances,
e.g., an edge from a SEND task on processor
to a RECV task
on processor
(i.e., each processor
sends data to its right neighbor, if any) would be described by the mapping:
.
This kind of mapping enables precise symbolic representations of arbitrary
regular communication patterns.
Irregular patterns (i.e., data-dependent patterns that cannot be determined
until runtime) have to be represented as an all-to-all communication,
which is the best that can be done statically.
To capture high level communication patterns where possible (e.g., shift, pipeline, broadcast, etc. [21]) we group the communication operations in the program into related groups, each describing a single ``logical communication event''. A communication event descriptor, kept separate from the STG, captures all information about a single communication event. This includes the communication pattern, the set of communication tasks involved, and a symbolic expression for the communication size. The CPU components of each communication event are represented explicitly as communication tasks in the STG, allowing us to use task graph edges between these tasks to explicitly capture the synchronization implied by the underlying communication calls. The number of communication nodes and edges depends on the communication pattern and also on the type of message passing calls used. This technique does not work for MPI receive operations that use a wildcard message tag (because the matching send cannot be easily identified). It does work for receive operations that use a wildcard for the sending processor, but the symbolic mapping on the communication edges may be an all-to-all mapping (for the processors that execute the send and receive statements). Making the communication tasks explicit in the STG has proved valuable also because it allows us to describe arbitrary interleavings (i.e., overlaps) of communication and computation tasks on individual processors and across processors.
In addition to the symbolic sets and mappings above, each node and communication event in the STG includes a symbolic scaling function that describes how the task computation time or the message size scales as a function of program variables. Finally, note that the STG of a program containing multiple procedures is represented as a number of unconnected graphs, each corresponding to a single procedure. Each call site is represented by a CALL task that identifies the called procedure by name.