FFTW's “wisdom” facility (see Words of Wisdom-Saving Plans) can be
used to save MPI plans as well as to save uniprocessor plans.
However, for MPI there are several unavoidable complications.

First, the MPI standard does not guarantee that every process can
perform file I/O (at least, not using C stdio routines); only the
process of rank MPI_IO can do so in general (unless MPI_IO ==
MPI_ANY_SOURCE).  (In practice, MPI_IO is commonly assumed to be
zero, i.e. at least process 0 can perform I/O, since otherwise a
single-process run could not perform I/O.)  So, if we want to export
the wisdom from a single process to a file, we must first export the
wisdom to a string, then send it to process 0, then write it to a
file.

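For concreteness, this string-based route could be coded by hand
roughly as in the following sketch (error handling omitted; the
helper name save_wisdom_by_hand and the file name "mywisdom" are
arbitrary choices for this illustration).  In practice, the
fftw_mpi_gather_wisdom function introduced below takes care of this
bookkeeping for you.

     /* Sketch only: serialize wisdom to a string on each nonzero rank,
        merge the strings on rank 0, and write the file there. */
     #include <stdio.h>
     #include <stdlib.h>
     #include <string.h>
     #include <fftw3-mpi.h>

     void save_wisdom_by_hand(MPI_Comm comm)  /* hypothetical helper */
     {
         int rank, size;
         MPI_Comm_rank(comm, &rank);
         MPI_Comm_size(comm, &size);

         char *s = fftw_export_wisdom_to_string();  /* allocated by FFTW */
         if (rank != 0) {
             int len = (int) strlen(s) + 1;
             MPI_Send(&len, 1, MPI_INT, 0, 0, comm);
             MPI_Send(s, len, MPI_CHAR, 0, 1, comm);
         }
         else {
             int p, len;
             FILE *f;
             for (p = 1; p < size; ++p) {  /* merge every rank's wisdom */
                 char *buf;
                 MPI_Recv(&len, 1, MPI_INT, p, 0, comm, MPI_STATUS_IGNORE);
                 buf = malloc(len);
                 MPI_Recv(buf, len, MPI_CHAR, p, 1, comm, MPI_STATUS_IGNORE);
                 fftw_import_wisdom_from_string(buf);
                 free(buf);
             }
             if ((f = fopen("mywisdom", "w"))) {
                 fftw_export_wisdom_to_file(f);
                 fclose(f);
             }
         }
         free(s);  /* the exported string must be freed by the caller */
     }
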
Second, in principle we may want to have separate wisdom for every
process, since in general the processes may run on different hardware
even for a single MPI program.  However, in practice FFTW's MPI code
is designed for the case of homogeneous hardware (see Load
balancing), and in this case it is convenient to use the same wisdom
for every process.  Thus, we need a mechanism to synchronize the
wisdom.

To address both of these problems, FFTW provides the following two
functions:

     void fftw_mpi_broadcast_wisdom(MPI_Comm comm);
     void fftw_mpi_gather_wisdom(MPI_Comm comm);

Given a communicator comm, fftw_mpi_broadcast_wisdom will broadcast
the wisdom from process 0 to all other processes.  Conversely,
fftw_mpi_gather_wisdom will collect wisdom from all processes onto
process 0.  (If the plans created for the same problem by different
processes are not the same, fftw_mpi_gather_wisdom will arbitrarily
choose one of the plans.)  Both of these functions may result in
suboptimal plans for different processes if the processes are running
on non-identical hardware.  Both of these functions are collective
calls, which means that they must be executed by all processes in the
communicator.

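In particular, because these are collective calls, they must not be
guarded by a rank test; only the file I/O should be restricted to
process 0.  A brief sketch of the distinction (assuming rank was
obtained from MPI_Comm_rank):

     /* WRONG: ranks other than 0 never reach the collective call,
        so the program deadlocks. */
     if (rank == 0)
         fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);

     /* RIGHT: every process in the communicator makes the call;
        restrict only the file I/O (as in the snippets below) to rank 0. */
     fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);
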
So, for example, a typical code snippet to import wisdom from a file
and use it on all processes would be:

     {
         int rank;
         FILE *f;

         fftw_mpi_init();
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
         if (rank == 0 && (f = fopen("mywisdom", "r"))) {
             fftw_import_wisdom_from_file(f);
             fclose(f);
         }
         fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);
     }

(Note that we must call fftw_mpi_init before importing any wisdom
that might contain MPI plans.)  Similarly, a typical code snippet to
export wisdom from all processes to a file is:

     {
         int rank;
         FILE *f;

         fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
         if (rank == 0 && (f = fopen("mywisdom", "w"))) {
             fftw_export_wisdom_to_file(f);
             fclose(f);
         }
     }

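For reference, one way these pieces might fit together in a complete
program is sketched below, assuming a 2d in-place complex DFT of size
N0 by N1 and the wisdom file name "mywisdom" (both arbitrary choices
for this example):

     #include <fftw3-mpi.h>

     int main(int argc, char **argv)
     {
         const ptrdiff_t N0 = 256, N1 = 256;  /* illustrative size */
         fftw_plan plan;
         fftw_complex *data;
         ptrdiff_t alloc_local, local_n0, local_0_start;
         int rank;
         FILE *f;

         MPI_Init(&argc, &argv);
         fftw_mpi_init();
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);

         /* import wisdom on rank 0, then share it with all processes */
         if (rank == 0 && (f = fopen("mywisdom", "r"))) {
             fftw_import_wisdom_from_file(f);
             fclose(f);
         }
         fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);

         /* plan and use a transform as usual */
         alloc_local = fftw_mpi_local_size_2d(N0, N1, MPI_COMM_WORLD,
                                              &local_n0, &local_0_start);
         data = fftw_alloc_complex(alloc_local);
         plan = fftw_mpi_plan_dft_2d(N0, N1, data, data, MPI_COMM_WORLD,
                                     FFTW_FORWARD, FFTW_MEASURE);
         /* ... initialize data, fftw_execute(plan), etc. ... */

         /* collect everyone's wisdom and save it from rank 0 */
         fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
         if (rank == 0 && (f = fopen("mywisdom", "w"))) {
             fftw_export_wisdom_to_file(f);
             fclose(f);
         }

         fftw_destroy_plan(plan);
         fftw_free(data);
         MPI_Finalize();
         return 0;
     }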