



6.8 FFTW MPI Wisdom


FFTW's “wisdom” facility (see Words of Wisdom-Saving Plans) can be used to save MPI plans as well as to save uniprocessor plans. However, for MPI there are several unavoidable complications.

First, the MPI standard does not guarantee that every process can perform file I/O (at least, not using C stdio routines); in general, we may only assume that process 0 is capable of I/O.[1] So, if we want to export the wisdom from a single process to a file, we must first export the wisdom to a string, then send it to process 0, then write it to a file.
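The following is a minimal sketch, not part of FFTW's API, of that string-send-write sequence, using a hypothetical helper send_wisdom_to_rank0 and an arbitrary filename "mywisdom" for illustration; in practice, fftw_mpi_gather_wisdom (described below) takes care of this for you:

     /* Illustrative sketch only: ship the wisdom string from process `src'
        to process 0, which writes it to a file.  Assumes <stdlib.h>,
        <string.h>, <mpi.h>, and <fftw3-mpi.h> have been included. */
     void send_wisdom_to_rank0(int src, MPI_Comm comm)
     {
         int rank;
         MPI_Comm_rank(comm, &rank);
         if (src == 0) { /* process 0 already has the wisdom */
             if (rank == 0) fftw_export_wisdom_to_filename("mywisdom");
             return;
         }
         if (rank == src) {
             char *s = fftw_export_wisdom_to_string(); /* malloc'ed string */
             int len = strlen(s) + 1;
             MPI_Send(&len, 1, MPI_INT, 0, 0, comm);
             MPI_Send(s, len, MPI_CHAR, 0, 1, comm);
             free(s);
         }
         else if (rank == 0) {
             int len;
             MPI_Recv(&len, 1, MPI_INT, src, 0, comm, MPI_STATUS_IGNORE);
             char *s = malloc(len);
             MPI_Recv(s, len, MPI_CHAR, src, 1, comm, MPI_STATUS_IGNORE);
             fftw_import_wisdom_from_string(s); /* merge into process 0's wisdom */
             fftw_export_wisdom_to_filename("mywisdom");
             free(s);
         }
     }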

Second, in principle we may want to have separate wisdom for every process, since in general the processes may run on different hardware even for a single MPI program. However, in practice FFTW's MPI code is designed for the case of homogeneous hardware (see Load balancing), and in this case it is convenient to use the same wisdom for every process. Thus, we need a mechanism to synchronize the wisdom.

To address both of these problems, FFTW provides the following two functions:

     void fftw_mpi_broadcast_wisdom(MPI_Comm comm);
     void fftw_mpi_gather_wisdom(MPI_Comm comm);

Given a communicator comm, fftw_mpi_broadcast_wisdom will broadcast the wisdom from process 0 to all other processes. Conversely, fftw_mpi_gather_wisdom will collect wisdom from all processes onto process 0. (If the plans created for the same problem by different processes are not the same, fftw_mpi_gather_wisdom will arbitrarily choose one of the plans.) Both of these functions may result in suboptimal plans for different processes if the processes are running on non-identical hardware. Both of these functions are collective calls, which means that they must be executed by all processes in the communicator.

So, for example, a typical code snippet to import wisdom from a file and use it on all processes would be:

     {
         int rank;

         fftw_mpi_init();
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
         if (rank == 0) fftw_import_wisdom_from_filename("mywisdom");
         fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);
     }

(Note that we must call fftw_mpi_init before importing any wisdom that might contain MPI plans.) Similarly, a typical code snippet to export wisdom from all processes to a file is:

     {
         int rank;

         fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
         if (rank == 0) fftw_export_wisdom_to_filename("mywisdom");
     }

Footnotes

[1] In fact, even this assumption is not technically guaranteed by the standard, although it seems to be universal in actual MPI implementations and is widely assumed by MPI-using software. Technically, you need to query the MPI_IO attribute of MPI_COMM_WORLD with MPI_Attr_get. If this attribute is MPI_PROC_NULL, no I/O is possible. If it is MPI_ANY_SOURCE, any process can perform I/O. Otherwise, it is the rank of a process that can perform I/O ... but since it is not guaranteed to yield the same rank on all processes, you have to do an MPI_Allreduce of some kind if you want all processes to agree about which is going to do I/O. And even then, the standard only guarantees that this process can perform output, but not input. See e.g. Parallel Programming with MPI by P. S. Pacheco, section 8.1.3. Needless to say, in our experience virtually no MPI programmers worry about this.
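For reference, a minimal sketch of that query, assuming the (deprecated) MPI_Attr_get interface mentioned above and arbitrary variable names, might look like:

     /* Query which process, if any, may perform I/O.  Assumes <mpi.h>. */
     int *io_rank, flag, rank, can_do_io = 0;
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
     MPI_Attr_get(MPI_COMM_WORLD, MPI_IO, &io_rank, &flag);
     if (flag) { /* the MPI_IO attribute is attached to MPI_COMM_WORLD */
         if (*io_rank == MPI_ANY_SOURCE)
             can_do_io = 1;                  /* every process may perform I/O */
         else if (*io_rank != MPI_PROC_NULL)
             can_do_io = (rank == *io_rank); /* only this rank may perform I/O;
                                                an MPI_Allreduce would still be
                                                needed if processes disagree */
     }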
