Distributed-memory FFTW with MPI
In this chapter we document the parallel FFTW routines for parallel systems supporting the MPI message-passing interface. Unlike the shared-memory threads described in the previous chapter, MPI allows you to use distributed-memory parallelism, where each CPU has its own separate memory, and which can scale up to clusters of many thousands of processors. This capability comes at a price, however: each process only stores a portion of the data to be transformed, which means that the data structures and programming interface are quite different from the serial or threads versions of FFTW.
Distributed-memory parallelism is especially useful when you are transforming arrays so large that they do not fit into the memory of a single processor. The storage required per process by FFTW’s MPI routines is proportional to the total array size divided by the number of processes. Conversely, distributed-memory parallelism can easily pose an unacceptably high communications overhead for small problems; the threshold problem size at which parallelism becomes advantageous will depend on the precise problem you are interested in, your hardware, and your MPI implementation.
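To make the per-process storage concrete, here is a minimal sketch of how a program asks FFTW’s MPI interface how much of a 2d complex array the current process must store locally. The 1024x1024 size is an arbitrary illustration, and a complete transform example appears under “2d MPI example” below.

#include <fftw3-mpi.h>

int main(int argc, char **argv)
{
    const ptrdiff_t N0 = 1024, N1 = 1024; /* arbitrary example size */
    ptrdiff_t alloc_local, local_n0, local_0_start;
    fftw_complex *data;

    MPI_Init(&argc, &argv);
    fftw_mpi_init();

    /* Ask FFTW how many complex values *this* process must allocate.
       FFTW divides the first dimension among the processes, so
       alloc_local is roughly N0*N1 divided by the number of processes
       (it may be slightly larger, to leave room for intermediate
       steps of the transform). */
    alloc_local = fftw_mpi_local_size_2d(N0, N1, MPI_COMM_WORLD,
                                         &local_n0, &local_0_start);
    data = fftw_alloc_complex(alloc_local);

    /* ... create a plan, initialize the local data, execute ... */

    fftw_free(data);
    MPI_Finalize();
    return 0;
}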
A note on terminology: in MPI, you divide the data among a set of “processes” which each run in their own memory address space. Generally, each process runs on a different physical processor, but this is not required. A set of processes in MPI is described by an opaque data structure called a “communicator,” the most common of which is the predefined communicator MPI_COMM_WORLD, which refers to all processes. For more information on these and other concepts common to all MPI programs, we refer the reader to the documentation at the MPI home page.
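As a minimal illustration of these terms (plain MPI, nothing FFTW-specific), each process in the following sketch reports its rank within the predefined communicator:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);

    /* MPI_COMM_WORLD is the predefined communicator containing all
       processes; each process learns its own index (rank) and the
       total number of processes in the communicator. */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    printf("I am process %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}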
We assume in this chapter that the reader is familiar with the usage of the serial (uniprocessor) FFTW, and focus only on the concepts new to the MPI interface.
• FFTW MPI Installation
• Linking and Initializing MPI FFTW
• 2d MPI example
• MPI Data Distribution
• Multi-dimensional MPI DFTs of Real Data
• Other Multi-dimensional Real-data MPI Transforms
• FFTW MPI Transposes
• FFTW MPI Wisdom
• Avoiding MPI Deadlocks
• FFTW MPI Performance Tips
• Combining MPI and Threads
• FFTW MPI Reference
• FFTW MPI Fortran Interface