FFTW MPI Fortran Interface

FFTW 3.3.3


6.13 FFTW MPI Fortran Interface


The FFTW MPI interface is callable from modern Fortran compilers supporting the Fortran 2003 iso_c_binding standard for calling C functions. As described in Calling FFTW from Modern Fortran, this means that you can directly call FFTW's C interface from Fortran with only minor changes in syntax. There are, however, a few things specific to the MPI interface to keep in mind:

Instead of including fftw3.f03 as in Overview of Fortran interface, you should include 'fftw3-mpi.f03' (after use, intrinsic :: iso_c_binding as before). The fftw3-mpi.f03 file includes fftw3.f03, so you should not include them both yourself. (You will also want to include the MPI header file, usually via include 'mpif.h' or similar, although this is not needed by fftw3-mpi.f03 per se.) (To use the ‘fftwl_’ long double extended-precision routines in supporting compilers, you should include fftw3l-mpi.f03 in addition to fftw3-mpi.f03. See Extended and quadruple precision in Fortran.)
Because of the different storage conventions between C and Fortran, you reverse the order of your array dimensions when passing them to FFTW (see Reversing array dimensions). This is merely a difference in notation and incurs no performance overhead. However, it means that, whereas in C the first dimension is distributed, in Fortran the last dimension of your array is distributed.
In Fortran, communicators are stored as integer types; there is no MPI_Comm type, nor is there any way to access a C MPI_Comm. Fortunately, this is taken care of for you by the FFTW Fortran interface: whenever the C interface expects an MPI_Comm type, you should pass the Fortran communicator as an integer. [1]
Because you need to call the ‘local_size’ function to find out how much space to allocate, and this may be larger than the local portion of the array (see MPI Data Distribution), you should always allocate your arrays dynamically using FFTW's allocation routines as described in Allocating aligned memory in Fortran. (Coincidentally, this also provides the best performance by guaranteeing proper data alignment.)
Because all sizes in the MPI FFTW interface are declared as ptrdiff_t in C, you should use integer(C_INTPTR_T) in Fortran (see FFTW Fortran type reference).
In Fortran, because of the language semantics, we generally recommend using the new-array execute functions for all plans, even in the common case where you are executing the plan on the same arrays for which the plan was created (see Plan execution in Fortran). However, note that in the MPI interface these functions are changed: fftw_execute_dft becomes fftw_mpi_execute_dft, etcetera. See Using MPI Plans.

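The points above can be sketched as a program skeleton. This is only an illustration under stated assumptions: the program name, the variable names, and the size value are hypothetical, and fftw_mpi_cleanup is taken from the FFTW MPI reference rather than from this section.

```fortran
program fftw_mpi_skeleton
  use, intrinsic :: iso_c_binding
  implicit none
  include 'mpif.h'          ! MPI constants such as MPI_COMM_WORLD
  include 'fftw3-mpi.f03'   ! pulls in fftw3.f03; do not include that separately
  integer :: ierr
  integer :: n_default      ! hypothetical size held in a default integer
  integer(C_INTPTR_T) :: n  ! the kind expected for sizes by the MPI interface

  call MPI_Init(ierr)
  call fftw_mpi_init()

  ! Convert default-kind sizes before passing them to FFTW MPI routines
  n_default = 256
  n = int(n_default, C_INTPTR_T)

  ! ... local_size query, fftw_alloc_* allocation, planning, and
  !     fftw_mpi_execute_* calls would go here ...

  call fftw_mpi_cleanup()
  call MPI_Finalize(ierr)
end program fftw_mpi_skeleton
```

Such a program is typically built with the MPI compiler wrapper and linked against both the MPI and serial FFTW libraries (for example with -lfftw3_mpi -lfftw3); the exact wrapper name and include path depend on your installation.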

For example, here is a Fortran code snippet to perform a distributed L × M complex DFT in-place. (This assumes you have already initialized MPI with MPI_Init and have also performed call fftw_mpi_init.)

       use, intrinsic :: iso_c_binding
       include 'mpif.h'   ! MPI header; provides MPI_COMM_WORLD
       include 'fftw3-mpi.f03'
       integer(C_INTPTR_T), parameter :: L = ...
       integer(C_INTPTR_T), parameter :: M = ...
       type(C_PTR) :: plan, cdata
       complex(C_DOUBLE_COMPLEX), pointer :: data(:,:)
       integer(C_INTPTR_T) :: i, j, alloc_local, local_M, local_j_offset

     !   get local data size and allocate (note dimension reversal)
       alloc_local = fftw_mpi_local_size_2d(M, L, MPI_COMM_WORLD, &
                                            local_M, local_j_offset)
       cdata = fftw_alloc_complex(alloc_local)
       call c_f_pointer(cdata, data, [L,local_M])

     !   create MPI plan for in-place forward DFT (note dimension reversal)
       plan = fftw_mpi_plan_dft_2d(M, L, data, data, MPI_COMM_WORLD, &
                                   FFTW_FORWARD, FFTW_MEASURE)

     ! initialize data to some function my_function(i,j)
       do j = 1, local_M
         do i = 1, L
           data(i, j) = my_function(i, j + local_j_offset)
         end do
       end do

     ! compute transform (as many times as desired)
       call fftw_mpi_execute_dft(plan, data, data)

       call fftw_destroy_plan(plan)
       call fftw_free(cdata)


Note that we called fftw_mpi_local_size_2d and fftw_mpi_plan_dft_2d with the dimensions in reversed order, since an L × M Fortran array is viewed by FFTW in C as an M × L array. This means that the array was distributed over the M dimension, the local portion of which is an L × local_M array in Fortran. (You must not use an allocate statement to allocate an L × local_M array, however; you must allocate alloc_local complex numbers, which may be greater than L * local_M, in order to reserve space for intermediate steps of the transform.) Finally, we mention that because C's array indices are zero-based, the local_j_offset argument can conveniently be interpreted as an offset in the 1-based j index (rather than as a starting index as in C).
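
To make the distribution concrete, the following sketch queries the local layout and prints, on each process, which global j indices it owns and how alloc_local compares with L * local_M. The sizes L = 64 and M = 48, the program name, and the rank and ierr variables are assumptions chosen for illustration; the actual split depends on the number of processes.

```fortran
program show_distribution
  use, intrinsic :: iso_c_binding
  implicit none
  include 'mpif.h'
  include 'fftw3-mpi.f03'
  ! Illustrative sizes (assumptions, not taken from the text)
  integer(C_INTPTR_T), parameter :: L = 64, M = 48
  integer(C_INTPTR_T) :: alloc_local, local_M, local_j_offset
  integer :: rank, ierr

  call MPI_Init(ierr)
  call fftw_mpi_init()
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  ! Dimensions are passed in reversed order: (M, L) for an L x M Fortran array
  alloc_local = fftw_mpi_local_size_2d(M, L, MPI_COMM_WORLD, &
                                       local_M, local_j_offset)

  if (local_M > 0) then
    ! Local column j corresponds to global column j + local_j_offset
    print *, 'rank', rank, 'owns global j =', local_j_offset + 1, &
             'to', local_j_offset + local_M
  else
    print *, 'rank', rank, 'owns no columns'
  end if

  ! alloc_local may exceed L*local_M; always allocate alloc_local elements
  print *, 'rank', rank, 'alloc_local =', alloc_local, &
           'L*local_M =', L*local_M

  call fftw_mpi_cleanup()
  call MPI_Finalize(ierr)
end program show_distribution
```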

If instead you had used the ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_OUT) flag, the output of the transform would be a transposed M × local_L array, associated with the same cdata allocation (since the transform is in-place), and which you could declare with:

       complex(C_DOUBLE_COMPLEX), pointer :: tdata(:,:)
       ...
       call c_f_pointer(cdata, tdata, [M,local_L])


where local_L would have been obtained by changing the fftw_mpi_local_size_2d call to:

       alloc_local = fftw_mpi_local_size_2d_transposed(M, L, MPI_COMM_WORLD, &
                                local_M, local_j_offset, local_L, local_i_offset)

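Putting the transposed-output pieces together, a complete program might look like the sketch below. It is not a definitive implementation: the sizes L = 64 and M = 48, the program name, and the fill values are illustrative assumptions, and the MPI_Init, MPI_Finalize, and fftw_mpi_cleanup calls are added only so that the example stands alone.

```fortran
program transposed_out
  use, intrinsic :: iso_c_binding
  implicit none
  include 'mpif.h'
  include 'fftw3-mpi.f03'
  ! Illustrative sizes (assumptions, not taken from the text)
  integer(C_INTPTR_T), parameter :: L = 64, M = 48
  type(C_PTR) :: plan, cdata
  complex(C_DOUBLE_COMPLEX), pointer :: data(:,:), tdata(:,:)
  integer(C_INTPTR_T) :: i, j, alloc_local
  integer(C_INTPTR_T) :: local_M, local_j_offset, local_L, local_i_offset
  integer :: ierr

  call MPI_Init(ierr)
  call fftw_mpi_init()

  ! Query both the input layout (local_M) and the transposed output
  ! layout (local_L); note the dimension reversal
  alloc_local = fftw_mpi_local_size_2d_transposed(M, L, MPI_COMM_WORLD, &
                    local_M, local_j_offset, local_L, local_i_offset)

  ! One allocation, viewed two ways: L x local_M on input,
  ! M x local_L on (transposed) output
  cdata = fftw_alloc_complex(alloc_local)
  call c_f_pointer(cdata, data, [L, local_M])
  call c_f_pointer(cdata, tdata, [M, local_L])

  plan = fftw_mpi_plan_dft_2d(M, L, data, data, MPI_COMM_WORLD, &
             FFTW_FORWARD, ior(FFTW_MEASURE, FFTW_MPI_TRANSPOSED_OUT))

  ! Fill the input after planning, since FFTW_MEASURE may overwrite it
  do j = 1, local_M
    do i = 1, L
      data(i, j) = cmplx(real(i, C_DOUBLE), &
                         real(j + local_j_offset, C_DOUBLE), &
                         kind=C_DOUBLE_COMPLEX)
    end do
  end do

  call fftw_mpi_execute_dft(plan, data, data)

  ! The result is now read through tdata(j, i), with j = 1..M and
  ! i = 1..local_L; the global i index is i + local_i_offset
  print *, 'local output shape:', shape(tdata)

  call fftw_destroy_plan(plan)
  call fftw_free(cdata)
  call fftw_mpi_cleanup()
  call MPI_Finalize(ierr)
end program transposed_out
```

After the transform, the data pointer still refers to the same memory but no longer describes a meaningful L × local_M layout, so only tdata should be used to read the output.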

Footnotes

[1] Technically, this is because you aren't actually calling the C functions directly. You are calling wrapper functions that translate the communicator with MPI_Comm_f2c before calling the ordinary C interface. This is all done transparently, however, since the fftw3-mpi.f03 interface file renames the wrappers so that they are called in Fortran with the same names as the C interface functions.
