FFTW 3.3.5: MPI Plan Creation


6.12.5 MPI Plan Creation

Complex-data MPI DFTs

Plans for complex-data DFTs (see 2d MPI example) are created by:

fftw_plan fftw_mpi_plan_dft_1d(ptrdiff_t n0, fftw_complex *in, fftw_complex *out,
                               MPI_Comm comm, int sign, unsigned flags);
fftw_plan fftw_mpi_plan_dft_2d(ptrdiff_t n0, ptrdiff_t n1,
                               fftw_complex *in, fftw_complex *out,
                               MPI_Comm comm, int sign, unsigned flags);
fftw_plan fftw_mpi_plan_dft_3d(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t n2,
                               fftw_complex *in, fftw_complex *out,
                               MPI_Comm comm, int sign, unsigned flags);
fftw_plan fftw_mpi_plan_dft(int rnk, const ptrdiff_t *n,
                            fftw_complex *in, fftw_complex *out,
                            MPI_Comm comm, int sign, unsigned flags);
fftw_plan fftw_mpi_plan_many_dft(int rnk, const ptrdiff_t *n,
                                 ptrdiff_t howmany, ptrdiff_t block, ptrdiff_t tblock,
                                 fftw_complex *in, fftw_complex *out,
                                 MPI_Comm comm, int sign, unsigned flags);

These are similar to their serial counterparts (see Complex DFTs) in specifying the dimensions, sign, and flags of the transform. The comm argument gives an MPI communicator that specifies the set of processes to participate in the transform; plan creation is a collective function that must be called by all processes in the communicator. The in and out pointers refer only to a portion of the overall transform data (see MPI Data Distribution) as specified by the ‘local_size’ functions in the previous section. Unless flags contains FFTW_ESTIMATE, these arrays are overwritten during plan creation as for the serial interface. For multi-dimensional transforms, any dimension sizes greater than 1 are supported; for one-dimensional transforms, only composite (non-prime) n0 are currently supported (unlike the serial FFTW). Requesting an unsupported transform size will yield a NULL plan. (As in the serial interface, highly composite sizes generally yield the best performance.)
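Because an unsupported size only shows up as a NULL plan at run time, it can be convenient to check a one-dimensional size before planning. The following is a minimal sketch of such a check. The name has_nontrivial_factor is hypothetical and not part of FFTW; the function only mirrors the composite-size rule stated above, and treating n0 < 4 as not composite (which covers 1 and non-positive values) is an assumption of this sketch. The planner's return value should still be tested against NULL.

```c
#include <stddef.h>

/* Hypothetical helper, not an FFTW function: returns 1 if n0 is composite,
   i.e. has a divisor d with 1 < d < n0, and 0 otherwise. */
int has_nontrivial_factor(ptrdiff_t n0)
{
    ptrdiff_t d;

    if (n0 < 4)
        return 0;               /* 2 and 3 are prime; smaller values are not composite */

    /* Trial division up to sqrt(n0), written as d <= n0 / d to avoid overflow. */
    for (d = 2; d <= n0 / d; ++d) {
        if (n0 % d == 0)
            return 1;
    }
    return 0;
}
```

A caller might skip the 1d MPI planner, or choose a nearby composite size, when this returns 0.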

The advanced-interface fftw_mpi_plan_many_dft additionally allows you to specify the block sizes for the first dimension (block) of the n0 × n1 × n2 × … × n_{d-1} input data and the first dimension (tblock) of the n1 × n0 × n2 × … × n_{d-1} transposed data (at intermediate steps of the transform, and for the output if FFTW_MPI_TRANSPOSED_OUT is specified in flags). These must be the same block sizes as were passed to the corresponding ‘local_size’ function; you can pass FFTW_MPI_DEFAULT_BLOCK to use FFTW’s default block size as in the basic interface. Also, the howmany parameter specifies that the transform is of contiguous howmany-tuples rather than individual complex numbers; this corresponds to the same parameter in the serial advanced interface (see Advanced Complex DFTs) with stride = howmany and dist = 1.
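The layout implied by stride = howmany and dist = 1 can be illustrated with a small index calculation. This is a sketch only: tuple_offset is a hypothetical helper rather than an FFTW function, and it assumes a two-dimensional, non-transposed local block stored in row-major order with n1 columns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an FFTW function. Returns the offset, counted in
   complex numbers, of component k of the howmany-tuple at local row i and
   column j. Components of one tuple are adjacent (dist = 1), and consecutive
   tuples are howmany elements apart (stride = howmany). */
ptrdiff_t tuple_offset(ptrdiff_t i, ptrdiff_t j, ptrdiff_t k,
                       ptrdiff_t n1, ptrdiff_t howmany)
{
    assert(n1 > 0 && howmany > 0);
    assert(i >= 0 && j >= 0 && j < n1);
    assert(k >= 0 && k < howmany);
    return (i * n1 + j) * howmany + k;
}
```

With howmany = 3 and n1 = 4, for instance, the three components of the tuple at (0, 1) occupy offsets 3, 4, and 5.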

MPI flags

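The flags can be any of those for the serial FFTW (see Planner Flags), and in addition may include one or more of the following MPI-specific flags, which improve performance at the cost of changing the output or input data formats.

FFTW_MPI_SCRAMBLED_OUT, FFTW_MPI_SCRAMBLED_IN: valid for 1d transforms only, these flags indicate that the output/input of the transform are in an undocumented “scrambled” order. A forward FFTW_MPI_SCRAMBLED_OUT transform can be inverted by a backward FFTW_MPI_SCRAMBLED_IN transform (times the usual 1/N normalization).

FFTW_MPI_TRANSPOSED_OUT, FFTW_MPI_TRANSPOSED_IN: valid for multidimensional (rnk > 1) transforms only, these flags specify that the output or input of an n0 × n1 × n2 × … × n_{d-1} transform is transposed to n1 × n0 × n2 × … × n_{d-1}.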

Real-data MPI DFTs

Plans for real-input/output (r2c/c2r) DFTs (see Multi-dimensional MPI DFTs of Real Data) are created by:

fftw_plan fftw_mpi_plan_dft_r2c_2d(ptrdiff_t n0, ptrdiff_t n1,
                                   double *in, fftw_complex *out,
                                   MPI_Comm comm, unsigned flags);
fftw_plan fftw_mpi_plan_dft_r2c_3d(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t n2,
                                   double *in, fftw_complex *out,
                                   MPI_Comm comm, unsigned flags);
fftw_plan fftw_mpi_plan_dft_r2c(int rnk, const ptrdiff_t *n,
                                double *in, fftw_complex *out,
                                MPI_Comm comm, unsigned flags);
fftw_plan fftw_mpi_plan_dft_c2r_2d(ptrdiff_t n0, ptrdiff_t n1,
                                   fftw_complex *in, double *out,
                                   MPI_Comm comm, unsigned flags);
fftw_plan fftw_mpi_plan_dft_c2r_3d(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t n2,
                                   fftw_complex *in, double *out,
                                   MPI_Comm comm, unsigned flags);
fftw_plan fftw_mpi_plan_dft_c2r(int rnk, const ptrdiff_t *n,
                                fftw_complex *in, double *out,
                                MPI_Comm comm, unsigned flags);

Similar to the serial interface (see Real-data DFTs), these transform logically n0 × n1 × n2 × … × n_{d-1} real data to/from n0 × n1 × n2 × … × (n_{d-1}/2 + 1) complex data (with the division rounded down), representing the non-redundant half of the conjugate-symmetric output of a real-input DFT (see Multi-dimensional Transforms). However, the real array must be stored within a padded n0 × n1 × n2 × … × [2 (n_{d-1}/2 + 1)] array (much like the in-place serial r2c transforms, but here for out-of-place transforms as well). Currently, only multi-dimensional (rnk > 1) r2c/c2r transforms are supported (requesting a plan for rnk = 1 will yield NULL). As explained above (see Multi-dimensional MPI DFTs of Real Data), the data distribution of both the real and complex arrays is given by the ‘local_size’ function called for the dimensions of the complex array. Similar to the other planning functions, the input and output arrays are overwritten when the plan is created except in FFTW_ESTIMATE mode.
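The padded sizes above reduce to a little integer arithmetic, sketched below. The helper names are hypothetical and not part of FFTW, and the offset function assumes a two-dimensional, row-major local block whose last dimension is n1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: last dimension of the complex array, n_last/2 + 1,
   using integer division (rounding down). */
ptrdiff_t complex_last_dim(ptrdiff_t n_last)
{
    assert(n_last > 0);
    return n_last / 2 + 1;
}

/* Hypothetical helper: last dimension of the padded real array,
   2 * (n_last/2 + 1). */
ptrdiff_t padded_real_last_dim(ptrdiff_t n_last)
{
    return 2 * complex_last_dim(n_last);
}

/* Hypothetical helper: offset, counted in doubles, of real element (i, j)
   within the padded local array. Only columns j < n1 hold data; the
   remaining one or two entries of each row are padding. */
ptrdiff_t padded_real_offset(ptrdiff_t i, ptrdiff_t j, ptrdiff_t n1)
{
    assert(i >= 0 && j >= 0 && j < n1);
    return i * padded_real_last_dim(n1) + j;
}
```

For n1 = 8 each row holds 5 complex outputs and is stored as 10 doubles, so two entries per row are padding; for n1 = 7 each row holds 4 complex outputs and is stored as 8 doubles, with one entry of padding.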

As for the complex DFTs above, there is an advanced interface that allows you to manually specify block sizes and to transform contiguous howmany-tuples of real/complex numbers:

fftw_plan fftw_mpi_plan_many_dft_r2c
              (int rnk, const ptrdiff_t *n, ptrdiff_t howmany,
               ptrdiff_t iblock, ptrdiff_t oblock,
               double *in, fftw_complex *out,
               MPI_Comm comm, unsigned flags);
fftw_plan fftw_mpi_plan_many_dft_c2r
              (int rnk, const ptrdiff_t *n, ptrdiff_t howmany,
               ptrdiff_t iblock, ptrdiff_t oblock,
               fftw_complex *in, double *out,
               MPI_Comm comm, unsigned flags);

MPI r2r transforms

There are corresponding plan-creation routines for r2r transforms (see More DFTs of Real Data), currently supporting multidimensional (rnk > 1) transforms only (rnk = 1 will yield a NULL plan):

fftw_plan fftw_mpi_plan_r2r_2d(ptrdiff_t n0, ptrdiff_t n1,
                               double *in, double *out,
                               MPI_Comm comm,
                               fftw_r2r_kind kind0, fftw_r2r_kind kind1,
                               unsigned flags);
fftw_plan fftw_mpi_plan_r2r_3d(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t n2,
                               double *in, double *out,
                               MPI_Comm comm,
                               fftw_r2r_kind kind0, fftw_r2r_kind kind1, fftw_r2r_kind kind2,
                               unsigned flags);
fftw_plan fftw_mpi_plan_r2r(int rnk, const ptrdiff_t *n,
                            double *in, double *out,
                            MPI_Comm comm, const fftw_r2r_kind *kind,
                            unsigned flags);
fftw_plan fftw_mpi_plan_many_r2r(int rnk, const ptrdiff_t *n,
                                 ptrdiff_t howmany,
                                 ptrdiff_t iblock, ptrdiff_t oblock,
                                 double *in, double *out,
                                 MPI_Comm comm, const fftw_r2r_kind *kind,
                                 unsigned flags);

The parameters are much the same as for the complex DFTs above, except that the arrays are of real numbers (and hence the outputs of the ‘local_size’ data-distribution functions should be interpreted as counts of real rather than complex numbers). Also, the kind parameters specify the r2r kinds along each dimension as for the serial interface (see Real-to-Real Transform Kinds). See Other Multi-dimensional Real-data MPI Transforms.

MPI transposition

FFTW also provides routines to plan a transpose of a distributed n0 by n1 array of real numbers, or an array of howmany-tuples of real numbers with specified block sizes (see FFTW MPI Transposes):

fftw_plan fftw_mpi_plan_transpose(ptrdiff_t n0, ptrdiff_t n1,
                                  double *in, double *out,
                                  MPI_Comm comm, unsigned flags);
fftw_plan fftw_mpi_plan_many_transpose
                (ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t howmany,
                 ptrdiff_t block0, ptrdiff_t block1,
                 double *in, double *out, MPI_Comm comm, unsigned flags);

These plans are used with the fftw_mpi_execute_r2r new-array execute function (see Using MPI Plans), since they count as (rank zero) r2r plans from FFTW’s perspective.
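When testing a distributed transpose on a small problem, it can help to compare the gathered result against a serial reference. The sketch below is one such reference, under stated assumptions: reference_transpose is a hypothetical name, the code is not FFTW's implementation, the whole n0 by n1 array of howmany-tuples is held in row-major order on one process, and the input and output buffers do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Serial reference, not FFTW code: transposes an n0 x n1 row-major array of
   howmany-tuples of doubles into an n1 x n0 row-major array. Each tuple is
   moved as a unit, so the order of components inside a tuple is preserved.
   The buffers in and out must not overlap. */
void reference_transpose(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t howmany,
                         const double *in, double *out)
{
    ptrdiff_t i, j, k;

    assert(n0 > 0 && n1 > 0 && howmany > 0);
    assert(in != NULL && out != NULL);

    for (i = 0; i < n0; ++i) {
        for (j = 0; j < n1; ++j) {
            for (k = 0; k < howmany; ++k) {
                out[(j * n0 + i) * howmany + k] = in[(i * n1 + j) * howmany + k];
            }
        }
    }
}
```

For a 2 by 3 input holding 1 through 6 with howmany = 1, the output is the 3 by 2 array 1, 4, 2, 5, 3, 6.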
