FFTW MPI Wisdom
FFTW’s “wisdom” facility (see Words of Wisdom-Saving Plans) can be used to save MPI plans as well as to save uniprocessor plans. However, for MPI there are several unavoidable complications.
First, the MPI standard does not guarantee that every process can perform file I/O (at least, not using C stdio routines); in general, we may only assume that process 0 is capable of I/O.[7] So, if we want to export the wisdom from a single process to a file, we must first export the wisdom to a string, then send it to process 0, then write it to a file.
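As a concrete illustration of that round-trip, here is a minimal by-hand sketch, assuming the usual headers (<stdio.h>, <stdlib.h>, <string.h>, <mpi.h>, and <fftw3-mpi.h>) and, purely for illustration, that the wisdom to be saved lives on process 1 and that the output file is named "mywisdom":

{
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 1) {
        /* serialize this process's wisdom to a NUL-terminated string */
        char *s = fftw_export_wisdom_to_string();
        int len = (int) strlen(s) + 1;
        MPI_Send(&len, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        MPI_Send(s, len, MPI_CHAR, 0, 1, MPI_COMM_WORLD);
        free(s);   /* the exported string is deallocated with free */
    } else if (rank == 0) {
        int len;
        char *s;
        FILE *f;
        MPI_Recv(&len, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        s = malloc(len);
        MPI_Recv(s, len, MPI_CHAR, 1, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        f = fopen("mywisdom", "w");
        if (f) { fputs(s, f); fclose(f); }
        free(s);
    }
}

You need not write this yourself, however; the fftw_mpi_gather_wisdom function introduced below handles the general case for you.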
Second, in principle we may want to have separate wisdom for every process, since in general the processes may run on different hardware even for a single MPI program. However, in practice FFTW’s MPI code is designed for the case of homogeneous hardware (see Load balancing), and in this case it is convenient to use the same wisdom for every process. Thus, we need a mechanism to synchronize the wisdom.
To address both of these problems, FFTW provides the following two functions:
void fftw_mpi_broadcast_wisdom(MPI_Comm comm);
void fftw_mpi_gather_wisdom(MPI_Comm comm);
Given a communicator comm, fftw_mpi_broadcast_wisdom will broadcast the wisdom from process 0 to all other processes. Conversely, fftw_mpi_gather_wisdom will collect wisdom from all processes onto process 0. (If the plans created for the same problem by different processes are not the same, fftw_mpi_gather_wisdom will arbitrarily choose one of the plans.) Both of these functions may result in suboptimal plans for different processes if the processes are running on non-identical hardware. Both are collective calls, which means that they must be executed by all processes in the communicator.
So, for example, a typical code snippet to import wisdom from a file and use it on all processes would be:
{
    int rank;

    fftw_mpi_init();
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) fftw_import_wisdom_from_filename("mywisdom");
    fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);
}
(Note that we must call fftw_mpi_init before importing any wisdom that might contain MPI plans.) Similarly, a typical code snippet to export wisdom from all processes to a file is:
{
    int rank;

    fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) fftw_export_wisdom_to_filename("mywisdom");
}
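Putting the two snippets together, a complete program might look something like the following sketch; the 2d complex DFT, the 128-by-128 problem size, and the file name "mywisdom" are only illustrative stand-ins:

#include <fftw3-mpi.h>

int main(int argc, char **argv)
{
    const ptrdiff_t N0 = 128, N1 = 128;     /* illustrative problem size */
    ptrdiff_t alloc_local, local_n0, local_0_start;
    fftw_complex *data;
    fftw_plan plan;
    int rank;

    MPI_Init(&argc, &argv);
    fftw_mpi_init();
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* import any saved wisdom on process 0 and share it with all processes */
    if (rank == 0) fftw_import_wisdom_from_filename("mywisdom");
    fftw_mpi_broadcast_wisdom(MPI_COMM_WORLD);

    /* plan and use a transform as usual */
    alloc_local = fftw_mpi_local_size_2d(N0, N1, MPI_COMM_WORLD,
                                         &local_n0, &local_0_start);
    data = fftw_alloc_complex(alloc_local);
    plan = fftw_mpi_plan_dft_2d(N0, N1, data, data, MPI_COMM_WORLD,
                                FFTW_FORWARD, FFTW_MEASURE);
    /* ... initialize data, fftw_execute(plan), and so on ... */

    /* collect any newly created wisdom onto process 0 and save it */
    fftw_mpi_gather_wisdom(MPI_COMM_WORLD);
    if (rank == 0) fftw_export_wisdom_to_filename("mywisdom");

    fftw_destroy_plan(plan);
    fftw_free(data);
    MPI_Finalize();
    return 0;
}

The only wisdom-specific parts are the broadcast after the import and the gather before the export; everything else is the usual MPI FFTW planning and execution sequence.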
[7] In fact, even this assumption is not technically guaranteed by the standard, although it seems to be universal in actual MPI implementations and is widely assumed by MPI-using software. Technically, you need to query the MPI_IO attribute of MPI_COMM_WORLD with MPI_Attr_get. If this attribute is MPI_PROC_NULL, no I/O is possible. If it is MPI_ANY_SOURCE, any process can perform I/O. Otherwise, it is the rank of a process that can perform I/O ... but since it is not guaranteed to yield the same rank on all processes, you have to do an MPI_Allreduce of some kind if you want all processes to agree about which is going to do I/O. And even then, the standard only guarantees that this process can perform output, but not input. See e.g. Parallel Programming with MPI by P. S. Pacheco, section 8.1.3. Needless to say, in our experience virtually no MPI programmers worry about this.
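For illustration only, one possible way to implement that recipe is sketched below; the helper name io_rank, the choice of MPI_MIN as the reduction, and the fallback to rank 0 for MPI_ANY_SOURCE are assumptions of this sketch, not anything FFTW itself does:

#include <mpi.h>

/* Return a rank (in MPI_COMM_WORLD) that every process agrees may
   perform I/O, or MPI_PROC_NULL if no such rank is reported.
   MPI_Attr_get is deprecated; MPI_Comm_get_attr is the modern
   equivalent with the same argument pattern. */
static int io_rank(void)
{
    int *io_val, flag, size, candidate, chosen;

    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Attr_get(MPI_COMM_WORLD, MPI_IO, &io_val, &flag);

    if (!flag || *io_val == MPI_PROC_NULL)
        candidate = size;          /* "no I/O here": larger than any rank */
    else if (*io_val == MPI_ANY_SOURCE)
        candidate = 0;             /* any rank will do; propose rank 0 */
    else
        candidate = *io_val;       /* a specific rank that can do I/O */

    /* processes may disagree, so agree on the smallest proposed rank */
    MPI_Allreduce(&candidate, &chosen, 1, MPI_INT, MPI_MIN, MPI_COMM_WORLD);
    return (chosen == size) ? MPI_PROC_NULL : chosen;
}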