Pencil configurations

A pencil configuration refers to a given distribution of multidimensional data among MPI processes. This information is encoded in the Pencil type.

A pencil configuration includes:

  • MPI topology information,
  • global and local dimensions of the numerical grid,
  • subset of decomposed dimensions,
  • definition of optional permutation of dimensions.

Construction

High-level interface

The simplest way of constructing a new Pencil is by passing the global dataset dimensions and an MPI communicator to the Pencil constructor:

julia> dims_global = (16, 32, 64);

julia> comm = MPI.COMM_WORLD;

julia> pen = Pencil(dims_global, comm)
Decomposition of 3D data
    Data dimensions: (16, 32, 64)
    Decomposed dimensions: (2, 3)
    Data permutation: NoPermutation()

This will decompose the data along the two last dimensions of the dataset:[1]

julia> decomposition(pen)
(2, 3)

For instance, if the communicator comm has 4 MPI processes, then each process will hold a subset of data of size (16, 16, 32):

julia> topology(pen)
MPI topology: 2D decomposition (2×2 processes)  # assuming MPI.Comm_size(comm) == 4

julia> size_local(pen)
(16, 16, 32)

Instead of the default, one may want to choose the subset of dimensions that should be decomposed. For instance, to decompose along the first dimension only:

julia> decomp_dims = (1,);

julia> pen = Pencil(dims_global, decomp_dims, comm)
Decomposition of 3D data
    Data dimensions: (16, 32, 64)
    Decomposed dimensions: (1,)
    Data permutation: NoPermutation()

julia> decomposition(pen)
(1,)

julia> topology(pen)
MPI topology: 1D decomposition (4 processes)  # again, assuming 4 MPI processes

Low-level interface

Note that the above high-level constructors don't require the definition of a MPITopology, which is constructed implicitly. For more control, one may want to manually construct a MPITopology, and then construct a Pencil from that. As above, one may also specify the list of decomposed dimensions.

For instance, we may want to decompose 32 MPI processes into a 8×4 Cartesian topology. This is done as follows:

julia> topo = MPITopology(comm, (8, 4))  # NOTE: fails if MPI.Comm_size(comm) ≠ 32
MPI topology: 2D decomposition (8×4 processes)

julia> dims_global = (16, 32, 64);

julia> pen = Pencil(topo, dims_global)
Decomposition of 3D data
    Data dimensions: (16, 32, 64)
    Decomposed dimensions: (2, 3)
    Data permutation: NoPermutation()

As before, the decomposed dimensions are the rightmost ones by default (in this case, dimensions 2 and 3). A different set of dimensions may be selected via an optional positional argument. For instance, to decompose along dimensions 1 and 3 instead:

julia> decomp_dims = (1, 3);

julia> pen = Pencil(topo, dims_global, decomp_dims)
Decomposition of 3D data
    Data dimensions: (16, 32, 64)
    Decomposed dimensions: (1, 3)
    Data permutation: NoPermutation()

Defining multiple pencils

One may also want to work with multiple pencil configurations that differ, for instance, on the selection of decomposed dimensions. For this case, a constructor is available that takes an already existing Pencil instance. Calling this constructor should be preferred when possible since it allows sharing memory buffers (used for instance for global transpositions) and thus reducing memory usage. The following creates a Pencil equivalent to the one above, but with different decomposed dimensions:

julia> pen_y = Pencil(pen; decomp_dims = (1, 3))
Decomposition of 3D data
    Data dimensions: (16, 32, 64)
    Decomposed dimensions: (1, 3)
    Data permutation: NoPermutation()

See the Pencil documentation for more details.

Dimension permutations

A Pencil may optionally be given information on dimension permutations. In this case, the layout of the data arrays in memory is different from the logical order of dimensions. For performance reasons, permutations are compile-time objects defined in the StaticPermutations package.

To make permutations clearer, consider the example above where the global data dimensions are $N_x × N_y × N_z = 16 × 32 × 64$. In this case, the logical order is $(x, y, z)$. Now let's say that we want the memory order of the data to be $(y, z, x)$,[2] which corresponds to the permutation (2, 3, 1).

Permutations are passed to the Pencil constructor via the permute keyword argument. Dimension permutations should be specified using a Permutation object. For instance,

julia> perm = Permutation(2, 3, 1);

julia> pen = Pencil(dims_global, comm; permute = perm)
Decomposition of 3D data
    Data dimensions: (16, 32, 64)
    Decomposed dimensions: (2, 3)
    Data permutation: Permutation(2, 3, 1)

One can also pass a NoPermutation to disable permutations (this is the default).

Types

PencilArrays.Pencils.PencilType
Pencil{N,M}

Describes the decomposition of an N-dimensional array among MPI processes along M directions (with M < N).


Pencil(
    [A = Array],
    topology::MPITopology{M}, size_global::Dims{N},
    decomp_dims::Dims{M} = default_decomposition(N, Val(M));
    permute::AbstractPermutation = NoPermutation(),
    timer = TimerOutput(),
)

Define the decomposition of an N-dimensional geometry along M dimensions.

The dimensions of the geometry are given by size_global = (N1, N2, ...). The Pencil describes the decomposition of an array of dimensions size_global across a group of MPI processes.

Data is distributed over the given M-dimensional MPI topology (with M ≤ N).

The decomposed dimensions may optionally be provided via the decomp_dims argument. By default, the M rightmost dimensions are decomposed. For instance, for a 2D decomposition of 5D data (M = 2 and N = 5), the dimensions (4, 5) are decomposed by default.

It is also possible to distribute over all dimensions (M = N). Note that, in this specific case, transpositions are currently not possible.

The optional argument A allows to work with arrays other than the base Array type. In particular, this should be useful for working with GPU array types such as CuArray.

The optional permute parameter may be used to indicate a permutation of the data indices from logical order (the order in which the arrays are accessed in code) to memory order (the actual order of indices in memory). Permutations must be specified using the exported Permutation type, as in permute = Permutation(3, 1, 2).

It is also possible to pass a TimerOutput to the constructor. See Measuring performance for details.

Examples

Decompose a 3D geometry of global dimensions $N_x × N_y × N_z = 4×8×12$ along the second ($y$) and third ($z$) dimensions:

julia> topo = MPITopology(MPI.COMM_WORLD, Val(2));

julia> Pencil(topo, (4, 8, 12), (2, 3))
Decomposition of 3D data
    Data dimensions: (4, 8, 12)
    Decomposed dimensions: (2, 3)
    Data permutation: NoPermutation()
    Array type: Array

julia> Pencil(topo, (4, 8, 12), (2, 3); permute = Permutation(3, 2, 1))
Decomposition of 3D data
    Data dimensions: (4, 8, 12)
    Decomposed dimensions: (2, 3)
    Data permutation: Permutation(3, 2, 1)
    Array type: Array

In the second case, the actual data is stored in (z, y, x) order within each MPI process.


Pencil([A = Array], size_global::Dims{N}, [decomp_dims = (2, …, N)], comm::MPI.Comm; kws...)

Convenience constructor that implicitly creates a MPITopology.

The number of decomposed dimensions specified by decomp_dims must be M < N. If decomp_dims is not passed, dimensions 2:N are decomposed.

Keyword arguments are passed to alternative constructor taking an MPITopology. That constructor should be used if more control is desired.

Examples

julia> Pencil((4, 8, 12), MPI.COMM_WORLD)
Decomposition of 3D data
    Data dimensions: (4, 8, 12)
    Decomposed dimensions: (2, 3)
    Data permutation: NoPermutation()
    Array type: Array

julia> Pencil((4, 8, 12), (1, ), MPI.COMM_WORLD)
Decomposition of 3D data
    Data dimensions: (4, 8, 12)
    Decomposed dimensions: (1,)
    Data permutation: NoPermutation()
    Array type: Array

Pencil(
    [A = Array],
    p::Pencil{N,M};
    decomp_dims::Dims{M} = decomposition(p),
    size_global::Dims{N} = size_global(p),
    permute::P = permutation(p),
    timer::TimerOutput = timer(p),
)

Create new pencil configuration from an existent one.

This constructor enables sharing temporary data buffers between the two pencil configurations, leading to reduced global memory usage.

source

Methods

PencilArrays.Permutations.permutationMethod
permutation(::Type{<:Pencil}) -> AbstractPermutation
permutation(p::Pencil)        -> AbstractPermutation

Get index permutation associated to the given pencil configuration.

Returns NoPermutation() if there is no associated permutation.

source
Base.lengthMethod
length(p::Pencil)

Get linear length of the local data associated to the decomposition.

Equivalent to length_local(p).

source
Base.sizeMethod
size(p::Pencil)

Returns the local data dimensions associated to the decomposition, in logical order.

This is defined as size_local(p, LogicalOrder()).

source
Base.ndimsMethod
ndims(t::MPITopology)

Get dimensionality of Cartesian topology.

source
ndims(p::Pencil)

Number of spatial dimensions associated to pencil data.

This corresponds to the total number of dimensions of the space, which includes the decomposed and non-decomposed dimensions.

source
PencilArrays.Pencils.range_remoteMethod
range_remote(p::Pencil, coords, [order = LogicalOrder()])
range_remote(p::Pencil, n::Integer, [order = LogicalOrder()])

Get data range held by a given MPI process.

In the first variant, coords are the coordinates of the MPI process in the Cartesian topology. They can be specified as a tuple (i, j, ...) or as a CartesianIndex.

In the second variant, n is the linear index of a given process in the topology.

source
PencilArrays.Pencils.range_localMethod
range_local(p::Pencil, [order = LogicalOrder()])

Local data range held by the pencil.

By default the dimensions are not permuted, i.e. they follow the logical order of dimensions.

source
PencilArrays.Pencils.size_globalMethod
size_global(p::Pencil, [order = LogicalOrder()])

Global dimensions of the Cartesian grid associated to the given domain decomposition.

Like size_local, by default the returned dimensions are in logical order.

source
PencilArrays.Pencils.size_localMethod
size_local(p::Pencil, [order = LogicalOrder()])

Local dimensions of the data held by the pencil.

By default the dimensions are not permuted, i.e. they follow the logical order of dimensions.

source
PencilArrays.Pencils.to_localMethod
to_local(p::Pencil, global_inds, [order = LogicalOrder()])

Convert non-permuted (logical) global indices to local indices.

If order = MemoryOrder(), returned indices will be permuted using the permutation associated to the pencil configuration p.

source
Base.similarMethod
similar(x::PencilArray, [element_type=eltype(x)], [dims]) -> PencilArray

Returns a PencilArray similar to x.

In particular, the new array shares the same parallel decomposition (the same Pencil) than x. This means that the dimensions of the new array must be the same as those of x. Note that the optional dims argument is allowed for the sole reason of making things work nicely with other packages (such as StructArrays.jl), but things will fail if dims ≠ size(x).

Examples

julia> pen = Pencil((20, 10, 12), MPI.COMM_WORLD);

julia> u = PencilArray{Float64}(undef, pen);

julia> similar(u) |> summary
"20×10×12 PencilArray{Float64, 3}(::Pencil{3, 2, NoPermutation, Array})"

julia> similar(u, size(u)) |> summary
"20×10×12 PencilArray{Float64, 3}(::Pencil{3, 2, NoPermutation, Array})"

julia> similar(u, ComplexF32) |> summary
"20×10×12 PencilArray{ComplexF32, 3}(::Pencil{3, 2, NoPermutation, Array})"

julia> similar(u, (4, 3, 8))
ERROR: DimensionMismatch: cannot construct a similar PencilArray with different size

julia> similar(u, (4, 3)) |> summary
ERROR: DimensionMismatch: cannot construct a similar PencilArray with different size

julia> similar(u, ComplexF32) |> summary
"20×10×12 PencilArray{ComplexF32, 3}(::Pencil{3, 2, NoPermutation, Array})"

julia> similar(u, ComplexF32, (4, 3))
ERROR: DimensionMismatch: cannot construct a similar PencilArray with different size

similar(x::PencilArray, [element_type = eltype(x)], p::Pencil)

Create a PencilArray with the decomposition described by the given Pencil.

This variant may be used to create a PencilArray that has a different decomposition than the input PencilArray.

Examples

julia> pen_u = Pencil((20, 10, 12), (2, 3), MPI.COMM_WORLD);

julia> u = PencilArray{Float64}(undef, pen_u);

julia> pen_v = Pencil(pen_u; decomp_dims = (1, 3), permute = Permutation(2, 3, 1))
Decomposition of 3D data
    Data dimensions: (20, 10, 12)
    Decomposed dimensions: (1, 3)
    Data permutation: Permutation(2, 3, 1)
    Array type: Array

julia> v = similar(u, pen_v);

julia> summary(v)
"20×10×12 PencilArray{Float64, 3}(::Pencil{3, 2, Permutation{(2, 3, 1), 3}, Array})"

julia> pencil(v) === pen_v
true

julia> vint = similar(u, Int, pen_v);

julia> summary(vint)
"20×10×12 PencilArray{Int64, 3}(::Pencil{3, 2, Permutation{(2, 3, 1), 3}, Array})"

julia> pencil(vint) === pen_v
true
source
similar(p::Pencil, [A = typeof_array(p)], [dims = size_global(p)])

Returns a Pencil decomposition with global dimensions dims and with underlying array type A.

Typically, A should be something like Array or CuArray (see Pencil for details).

source

Index

  • 1More generally, an $N$-dimensional dataset is by default decomposed along its $N - 1$ last dimensions.
  • 2Why would we want this? One application is to efficiently perform FFTs along $y$, which, under this permutation, would be the fastest dimension. This is used by the PencilFFTs package.