FFTW's “wisdom” facility (see Words of Wisdom-Saving Plans) can be used to save MPI plans as well as to save uniprocessor plans. However, for MPI there are several unavoidable complications.

First, the MPI standard does not guarantee that every process can perform file I/O (at least, not using C stdio routines); in general, we may only assume that process 0 is capable of I/O.[1] So, if we want to export the wisdom from a single process to a file, we must first export the wisdom to a string, then send it to process 0, then write it to a file.

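As a concrete illustration of that procedure, here is a minimal sketch; the helper is our own, not part of FFTW's API, and the function name, message tag, and omitted error handling are all assumptions. In practice you would normally just use fftw_mpi_gather_wisdom, described below.

     /* Sketch only: write the wisdom of process `src' to `filename'.
        Assumes <stdio.h>, <stdlib.h>, <string.h>, <mpi.h>, and
        <fftw3-mpi.h> are included; tag 0 is arbitrary and error
        checking is omitted. */
     void write_wisdom_of_process(int src, const char *filename)
     {
          int rank, len;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          if (src == 0) { /* process 0 can write directly */
               if (rank == 0) fftw_export_wisdom_to_filename(filename);
               return;
          }
          if (rank == src) { /* serialize wisdom, send it to process 0 */
               char *w = fftw_export_wisdom_to_string();
               len = strlen(w) + 1; /* include the terminating NUL */
               MPI_Send(&len, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
               MPI_Send(w, len, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
               free(w);
          }
          else if (rank == 0) { /* receive the string, write the file */
               char *w;
               FILE *f;
               MPI_Recv(&len, 1, MPI_INT, src, 0, MPI_COMM_WORLD,
                        MPI_STATUS_IGNORE);
               w = malloc(len);
               MPI_Recv(w, len, MPI_CHAR, src, 0, MPI_COMM_WORLD,
                        MPI_STATUS_IGNORE);
               f = fopen(filename, "w");
               fputs(w, f);
               fclose(f);
               free(w);
          }
     }
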
Second, in principle we may want to have separate wisdom for every process, since in general the processes may run on different hardware even for a single MPI program. However, in practice FFTW's MPI code is designed for the case of homogeneous hardware (see Load balancing), and in this case it is convenient to use the same wisdom for every process. Thus, we need a mechanism to synchronize the wisdom.

To address both of these problems, FFTW provides the following two functions:

     void fftw_mpi_broadcast_wisdom(MPI_Comm comm);
     void fftw_mpi_gather_wisdom(MPI_Comm comm);

Given a communicator comm, fftw_mpi_broadcast_wisdom will broadcast the wisdom from process 0 to all other processes. Conversely, fftw_mpi_gather_wisdom will collect wisdom from all processes onto process 0. (If the plans created for the same problem by different processes are not the same, fftw_mpi_gather_wisdom will arbitrarily choose one of the plans.) Both functions may result in suboptimal plans for different processes if the processes are running on non-identical hardware. Both functions are also collective calls, which means that they must be executed by all processes in the communicator.

So, for example, a typical code snippet to import wisdom from a file and use it on all processes would be:

     {
          int rank;

          fftw_mpi_init();
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          if (rank == 0) fftw_import_wisdom_from_filename("mywisdom");
          fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);
     }

(Note that we must call fftw_mpi_init before importing any wisdom that might contain MPI plans.) Similarly, a typical code snippet to export wisdom from all processes to a file is:

     {
          int rank;

          fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          if (rank == 0) fftw_export_wisdom_to_filename("mywisdom");
     }

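Putting the two snippets together, a complete program skeleton might look like the following; this assembly is our own sketch (the file name "mywisdom" is arbitrary, and planning and execution are elided), not a verbatim example from the manual:

     #include <mpi.h>
     #include <fftw3-mpi.h>

     int main(int argc, char **argv)
     {
          int rank;
          MPI_Init(&argc, &argv);
          fftw_mpi_init();
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          /* import wisdom on process 0, then broadcast it to everyone */
          if (rank == 0) fftw_import_wisdom_from_filename("mywisdom");
          fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);

          /* ... create and execute plans as usual ... */

          /* gather wisdom onto process 0, which saves it to disk */
          fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
          if (rank == 0) fftw_export_wisdom_to_filename("mywisdom");

          fftw_mpi_cleanup();
          MPI_Finalize();
          return 0;
     }
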
[1] In fact, even this assumption is not technically guaranteed by the standard, although it seems to be universal in actual MPI implementations and is widely assumed by MPI-using software. Technically, you need to query the MPI_IO attribute of MPI_COMM_WORLD with MPI_Attr_get. If this attribute is MPI_PROC_NULL, no I/O is possible. If it is MPI_ANY_SOURCE, any process can perform I/O. Otherwise, it is the rank of a process that can perform I/O ... but since it is not guaranteed to yield the same rank on all processes, you have to do an MPI_Allreduce of some kind if you want all processes to agree about which is going to do I/O. And even then, the standard only guarantees that this process can perform output, but not input. See e.g. Parallel Programming with MPI by P. S. Pacheco, section 8.1.3. Needless to say, in our experience virtually no MPI programmers worry about this.
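
For the curious, one such scheme might look like the sketch below. This is our own illustration, not from the FFTW manual: we use the modern name MPI_Comm_get_attr for the deprecated MPI_Attr_get, and the MPI_MAX reduction is just one arbitrary way to agree on a rank, relying on MPI_PROC_NULL being negative in common implementations (which the standard does not promise).

     /* Sketch: have all processes agree on one I/O-capable rank,
        or a negative value if none is known.  Assumes <mpi.h>.
        (MPI_IO is predefined only on MPI_COMM_WORLD.) */
     int agree_on_io_rank(void)
     {
          int *attr, flag, mine, io_rank;
          MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_IO, &attr, &flag);
          mine = flag ? *attr : MPI_PROC_NULL;
          if (mine == MPI_ANY_SOURCE)                /* any process can do I/O, */
               MPI_Comm_rank(MPI_COMM_WORLD, &mine); /* so nominate ourselves */
          /* MPI_MAX lets any real rank beat MPI_PROC_NULL, which is
             negative in common implementations (not guaranteed!) */
          MPI_Allreduce(&mine, &io_rank, 1, MPI_INT, MPI_MAX, MPI_COMM_WORLD);
          return io_rank; /* negative means no I/O-capable process found */
     }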