Introduction to Operator apply and arguments

This tutorial describes two fundamental user APIs:

We will use a trivial Operator that, at each time step, increments by 1 all points in the physical domain.

from devito import Grid, TimeFunction, Eq, Operator

grid = Grid(shape=(4, 4))
u = TimeFunction(name='u', grid=grid, save=3)
op = Operator(Eq(u.forward, u + 1))

To run op, we have to “apply” it.

#NBVAL_IGNORE_OUTPUT
summary = op.apply()
Operator `Kernel` run in 0.00 s

Under-the-hood, some code has been generated (print(op) to display the generated code), JIT-compiled, and executed. Since no additional arguments have been passed, op has used u as input. We can verify that the content of u.data is as expected

u.dimensions, u.shape
((time, x, y), (3, 4, 4))
u.data
Data([[[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]],

      [[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]],

      [[2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.]]], dtype=float32)

In particular, we observe that:

To access all default arguments used by op without running the Operator, one can run

#NBVAL_IGNORE_OUTPUT
op.arguments()
{'u': <cparam 'P' (0x7fb0d01d45a8)>,
 'time_m': 0,
 'time_size': 3,
 'time_M': 1,
 'x_m': 0,
 'x_size': 4,
 'x_M': 3,
 'y_m': 0,
 'y_size': 4,
 'y_M': 3,
 'timers': <cparam 'P' (0x7fb0d0550918)>}

'u' stores a pointer to the allocated data; 'timers' stores a pointer to a data structure used for C-level performance profiling.

One may want to replace some of these default arguments. For example, we could increase the minimum iteration point along the spatial Dimensions x and y, and execute only the very first timestep:

#NBVAL_IGNORE_OUTPUT
u.data[:] = 0.  # Explicit reset to initial value
summary = op.apply(x_m=2, y_m=2, time_M=0)
Operator `Kernel` run in 0.00 s

We look again at the computed data to convince ourselves that everything went as intended to go

u.data
Data([[[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]],

      [[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 1., 1.],
       [0., 0., 1., 1.]],

      [[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]]], dtype=float32)

Given a generic Dimension d, the naming convention is such that:

Hence, op.apply(..., d_m=4, d_M=7, ...) will run op in the compact interval [4, 7] along d. For historical reasons, d=... aliases to d_M=...; in many Devito examples it happens to see op.apply(..., time=10, ...) – this is just equivalent to op.apply(..., time_M=10, ...).

If we try to specify an invalid iteration extreme, Devito will raise an exception.

from devito.exceptions import InvalidArgument
try:
    op.apply(time_M=2)
except InvalidArgument as e:
    print(e)
OOB detected due to time_M=2

The same Operator can be applied to a different TimeFunction. For example:

#NBVAL_IGNORE_OUTPUT
u2 = TimeFunction(name='u', grid=grid, save=5)
summary = op.apply(u=u2)
Operator `Kernel` run in 0.00 s
u2.data
Data([[[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]],

      [[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]],

      [[2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.]],

      [[3., 3., 3., 3.],
       [3., 3., 3., 3.],
       [3., 3., 3., 3.],
       [3., 3., 3., 3.]],

      [[4., 4., 4., 4.],
       [4., 4., 4., 4.],
       [4., 4., 4., 4.],
       [4., 4., 4., 4.]]], dtype=float32)

Note that this was the third call to op.apply, but code generation and JIT-compilation only occurred upon the very first call.

There is one relevant case in which the maximum iteration point along the time dimension must be specified – whenever save is unset, as in such a case the Operator wouldn’t know for how many iterations to run.

v = TimeFunction(name='v', grid=grid)
op2 = Operator(Eq(v.forward, v + 1))
try:
    op2.apply()
except ValueError as e:
    print(e)
No value found for parameter time_M
#NBVAL_IGNORE_OUTPUT
summary = op2.apply(time_M=4)
Operator `Kernel` run in 0.00 s
v.data
Data([[[4., 4., 4., 4.],
       [4., 4., 4., 4.],
       [4., 4., 4., 4.],
       [4., 4., 4., 4.]],

      [[5., 5., 5., 5.],
       [5., 5., 5., 5.],
       [5., 5., 5., 5.],
       [5., 5., 5., 5.]]], dtype=float32)

The summary variable can be inspected to retrieve performance metrics.

#NBVAL_IGNORE_OUTPUT
summary
PerformanceSummary([('section0',
                     PerfEntry(time=3e-06, gflopss=0.0, gpointss=0.0, oi=0.0, ops=0, itershapes=[]))])

We observe that basically all entries except for the execution time are fixed at 0. This is because by default Devito avoids to compute performance metrics, to minimize the processing time before returning control to the user (in complex Operators, the processing time to retrieve, for instance, the operation count or the memory footprint could be significant). To compute all performance metrics, a user could either export the environment variable DEVITO_PROFILING to 'advanced' or change the profiling level programmatically before the Operator is constructed

#NBVAL_IGNORE_OUTPUT
from devito import configuration
configuration['profiling'] = 'advanced'

op = Operator(Eq(u.forward, u*u + 1.))
op.apply()
Operator `Kernel` run in 0.00 s
PerformanceSummary([('section0',
                     PerfEntry(time=3e-06, gflopss=0.021333333333333333, gpointss=0.010666666666666666, oi=0.16666666666666666, ops=2, itershapes=[(2, 4, 4)]))])

A PerformanceSummary will contain as many entries as “sections” in the generated code. Currently, there is no way to automatically tie a compiler-generated section to the user-provided Eqs (in general, there can be more than one Eq in a section). The only option is to look at the generated code and search for bodies of code wrapped within C comments such as

<code>

For example

# Uncomment me and search for START(section0) ... STOP(section0) */
# print(op)

In the PerformanceSummary, associated to section0 is a PerfEntry, whose entries represent: