6.12.4 MPI Data Distribution Functions


As described above (see MPI Data Distribution), before creating a plan you must first call one of the following routines to determine the required allocation size and the portion of the array stored locally on a given process, so that you can allocate your arrays. The MPI_Comm communicator passed here must be equivalent to the communicator used below for plan creation.

The basic interface for multidimensional transforms consists of the functions:


     ptrdiff_t fftw_mpi_local_size_2d(ptrdiff_t n0, ptrdiff_t n1, MPI_Comm comm,
                                      ptrdiff_t *local_n0, ptrdiff_t *local_0_start);
     ptrdiff_t fftw_mpi_local_size_3d(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t n2,
                                      MPI_Comm comm,
                                      ptrdiff_t *local_n0, ptrdiff_t *local_0_start);
     ptrdiff_t fftw_mpi_local_size(int rnk, const ptrdiff_t *n, MPI_Comm comm,
                                   ptrdiff_t *local_n0, ptrdiff_t *local_0_start);
     
     ptrdiff_t fftw_mpi_local_size_2d_transposed(ptrdiff_t n0, ptrdiff_t n1, MPI_Comm comm,
                                                 ptrdiff_t *local_n0, ptrdiff_t *local_0_start,
                                                 ptrdiff_t *local_n1, ptrdiff_t *local_1_start);
     ptrdiff_t fftw_mpi_local_size_3d_transposed(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t n2,
                                                 MPI_Comm comm,
                                                 ptrdiff_t *local_n0, ptrdiff_t *local_0_start,
                                                 ptrdiff_t *local_n1, ptrdiff_t *local_1_start);
     ptrdiff_t fftw_mpi_local_size_transposed(int rnk, const ptrdiff_t *n, MPI_Comm comm,
                                              ptrdiff_t *local_n0, ptrdiff_t *local_0_start,
                                              ptrdiff_t *local_n1, ptrdiff_t *local_1_start);

These functions return the number of elements to allocate (complex numbers for DFT/r2c/c2r plans, real numbers for r2r plans), while local_n0 and local_0_start return the portion (local_0_start to local_0_start + local_n0 - 1) of the first dimension of an n0 × n1 × n2 × … × nd-1 array that is stored on the local process. See Basic and advanced distribution interfaces. For FFTW_MPI_TRANSPOSED_OUT plans, the '_transposed' variants are useful in order to also return the local portion of the first dimension in the n1 × n0 × n2 × … × nd-1 transposed output. See Transposed distributions.
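For concreteness, here is a minimal sketch of the basic pattern: query the local size, allocate, and only then create a plan on the same communicator. The transform size and the use of MPI_COMM_WORLD below are illustrative assumptions, and error handling is omitted:

     #include <fftw3-mpi.h>
     
     int main(int argc, char **argv)
     {
         const ptrdiff_t n0 = 256, n1 = 256;   /* hypothetical transform size */
         ptrdiff_t alloc_local, local_n0, local_0_start;
         fftw_complex *data;
     
         MPI_Init(&argc, &argv);
         fftw_mpi_init();
     
         /* number of complex elements to allocate locally, plus this
            process's portion of the first dimension */
         alloc_local = fftw_mpi_local_size_2d(n0, n1, MPI_COMM_WORLD,
                                              &local_n0, &local_0_start);
         data = fftw_alloc_complex(alloc_local);
     
         /* ... create a plan with the same communicator, initialize rows
            local_0_start to local_0_start + local_n0 - 1, execute ... */
     
         fftw_free(data);
         fftw_mpi_cleanup();
         MPI_Finalize();
         return 0;
     }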

The advanced interface for multidimensional transforms is:

     ptrdiff_t fftw_mpi_local_size_many(int rnk, const ptrdiff_t *n, ptrdiff_t howmany,
                                        ptrdiff_t block0, MPI_Comm comm,
                                        ptrdiff_t *local_n0, ptrdiff_t *local_0_start);
     ptrdiff_t fftw_mpi_local_size_many_transposed(int rnk, const ptrdiff_t *n, ptrdiff_t howmany,
                                                   ptrdiff_t block0, ptrdiff_t block1, MPI_Comm comm,
                                                   ptrdiff_t *local_n0, ptrdiff_t *local_0_start,
                                                   ptrdiff_t *local_n1, ptrdiff_t *local_1_start);

These differ from the basic interface in only two ways. First, they allow you to specify block sizes block0 and block1 (the latter for the transposed output); you can pass FFTW_MPI_DEFAULT_BLOCK to use FFTW's default block size as in the basic interface. Second, you can pass a howmany parameter, corresponding to the advanced planning interface below: this is for transforms of contiguous howmany-tuples of numbers (howmany = 1 in the basic interface).
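For illustration, a sketch of the advanced interface for howmany = 3 contiguous tuples of a rank-2 transform, using the default block size (the sizes here are hypothetical):

     const ptrdiff_t n[2] = {128, 128};    /* hypothetical transform size */
     ptrdiff_t local_n0, local_0_start;
     ptrdiff_t alloc_local = fftw_mpi_local_size_many(
              2, n, 3, FFTW_MPI_DEFAULT_BLOCK, MPI_COMM_WORLD,
              &local_n0, &local_0_start);
     /* the returned count already accounts for the 3-tuples; allocate
        once and pass the same howmany when creating the plan */
     fftw_complex *data = fftw_alloc_complex(alloc_local);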

The corresponding basic and advanced routines for one-dimensional transforms (currently only complex DFTs) are:


     ptrdiff_t fftw_mpi_local_size_1d(
                  ptrdiff_t n0, MPI_Comm comm, int sign, unsigned flags,
                  ptrdiff_t *local_ni, ptrdiff_t *local_i_start,
                  ptrdiff_t *local_no, ptrdiff_t *local_o_start);
     ptrdiff_t fftw_mpi_local_size_many_1d(
                  ptrdiff_t n0, ptrdiff_t howmany,
                  MPI_Comm comm, int sign, unsigned flags,
                  ptrdiff_t *local_ni, ptrdiff_t *local_i_start,
                  ptrdiff_t *local_no, ptrdiff_t *local_o_start);

As above, the return value is the number of elements to allocate (complex numbers, for complex DFTs). The local_ni and local_i_start arguments return the portion (local_i_start to local_i_start + local_ni - 1) of the 1d array that is stored on this process for the transform input, and local_no and local_o_start are the corresponding quantities for the output. The sign (FFTW_FORWARD or FFTW_BACKWARD) and flags must match the arguments passed when creating a plan. Although the inputs and outputs have different data distributions in general, it is guaranteed that the output data distribution of an FFTW_FORWARD plan will match the input data distribution of an FFTW_BACKWARD plan and vice versa; similarly for the FFTW_MPI_SCRAMBLED_OUT and FFTW_MPI_SCRAMBLED_IN flags. See One-dimensional distributions.
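As a sketch, one might query both distributions for a forward 1d transform and allocate input and output separately (the size and FFTW_ESTIMATE flag below are illustrative assumptions; whatever sign and flags are used here must be reused at planning time):

     const ptrdiff_t n0 = 1024;            /* hypothetical transform size */
     ptrdiff_t local_ni, local_i_start, local_no, local_o_start;
     ptrdiff_t alloc_local = fftw_mpi_local_size_1d(
              n0, MPI_COMM_WORLD, FFTW_FORWARD, FFTW_ESTIMATE,
              &local_ni, &local_i_start, &local_no, &local_o_start);
     fftw_complex *in = fftw_alloc_complex(alloc_local);
     fftw_complex *out = fftw_alloc_complex(alloc_local);
     /* input occupies indices local_i_start to local_i_start + local_ni - 1;
        output occupies local_o_start to local_o_start + local_no - 1 */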