annotate src/fftw-3.3.3/doc/html/MPI-Plan-Creation.html @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI incompatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents 37bf6b4a2645
children
rev   line source
Chris@10 1 <html lang="en">
Chris@10 2 <head>
Chris@10 3 <title>MPI Plan Creation - FFTW 3.3.3</title>
Chris@10 4 <meta http-equiv="Content-Type" content="text/html">
Chris@10 5 <meta name="description" content="FFTW 3.3.3">
Chris@10 6 <meta name="generator" content="makeinfo 4.13">
Chris@10 7 <link title="Top" rel="start" href="index.html#Top">
Chris@10 8 <link rel="up" href="FFTW-MPI-Reference.html#FFTW-MPI-Reference" title="FFTW MPI Reference">
Chris@10 9 <link rel="prev" href="MPI-Data-Distribution-Functions.html#MPI-Data-Distribution-Functions" title="MPI Data Distribution Functions">
Chris@10 10 <link rel="next" href="MPI-Wisdom-Communication.html#MPI-Wisdom-Communication" title="MPI Wisdom Communication">
Chris@10 11 <link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
Chris@10 12 <!--
Chris@10 13 This manual is for FFTW
Chris@10 14 (version 3.3.3, 25 November 2012).
Chris@10 15
Chris@10 16 Copyright (C) 2003 Matteo Frigo.
Chris@10 17
Chris@10 18 Copyright (C) 2003 Massachusetts Institute of Technology.
Chris@10 19
Chris@10 20 Permission is granted to make and distribute verbatim copies of
Chris@10 21 this manual provided the copyright notice and this permission
Chris@10 22 notice are preserved on all copies.
Chris@10 23
Chris@10 24 Permission is granted to copy and distribute modified versions of
Chris@10 25 this manual under the conditions for verbatim copying, provided
Chris@10 26 that the entire resulting derived work is distributed under the
Chris@10 27 terms of a permission notice identical to this one.
Chris@10 28
Chris@10 29 Permission is granted to copy and distribute translations of this
Chris@10 30 manual into another language, under the above conditions for
Chris@10 31 modified versions, except that this permission notice may be
Chris@10 32 stated in a translation approved by the Free Software Foundation.
Chris@10 33 -->
Chris@10 34 <meta http-equiv="Content-Style-Type" content="text/css">
Chris@10 35 <style type="text/css"><!--
Chris@10 36 pre.display { font-family:inherit }
Chris@10 37 pre.format { font-family:inherit }
Chris@10 38 pre.smalldisplay { font-family:inherit; font-size:smaller }
Chris@10 39 pre.smallformat { font-family:inherit; font-size:smaller }
Chris@10 40 pre.smallexample { font-size:smaller }
Chris@10 41 pre.smalllisp { font-size:smaller }
Chris@10 42 span.sc { font-variant:small-caps }
Chris@10 43 span.roman { font-family:serif; font-weight:normal; }
Chris@10 44 span.sansserif { font-family:sans-serif; font-weight:normal; }
Chris@10 45 --></style>
Chris@10 46 </head>
Chris@10 47 <body>
Chris@10 48 <div class="node">
Chris@10 49 <a name="MPI-Plan-Creation"></a>
Chris@10 50 <p>
Chris@10 51 Next:&nbsp;<a rel="next" accesskey="n" href="MPI-Wisdom-Communication.html#MPI-Wisdom-Communication">MPI Wisdom Communication</a>,
Chris@10 52 Previous:&nbsp;<a rel="previous" accesskey="p" href="MPI-Data-Distribution-Functions.html#MPI-Data-Distribution-Functions">MPI Data Distribution Functions</a>,
Chris@10 53 Up:&nbsp;<a rel="up" accesskey="u" href="FFTW-MPI-Reference.html#FFTW-MPI-Reference">FFTW MPI Reference</a>
Chris@10 54 <hr>
Chris@10 55 </div>
Chris@10 56
Chris@10 57 <h4 class="subsection">6.12.5 MPI Plan Creation</h4>
Chris@10 58
Chris@10 59 <h5 class="subsubheading">Complex-data MPI DFTs</h5>
Chris@10 60
Chris@10 61 <p>Plans for complex-data DFTs (see <a href="2d-MPI-example.html#g_t2d-MPI-example">2d MPI example</a>) are created by:
Chris@10 62
Chris@10 63 <p><a name="index-fftw_005fmpi_005fplan_005fdft_005f1d-461"></a><a name="index-fftw_005fmpi_005fplan_005fdft_005f2d-462"></a><a name="index-fftw_005fmpi_005fplan_005fdft_005f3d-463"></a><a name="index-fftw_005fmpi_005fplan_005fdft-464"></a><a name="index-fftw_005fmpi_005fplan_005fmany_005fdft-465"></a>
Chris@10 64 <pre class="example"> fftw_plan fftw_mpi_plan_dft_1d(ptrdiff_t n0, fftw_complex *in, fftw_complex *out,
Chris@10 65 MPI_Comm comm, int sign, unsigned flags);
Chris@10 66 fftw_plan fftw_mpi_plan_dft_2d(ptrdiff_t n0, ptrdiff_t n1,
Chris@10 67 fftw_complex *in, fftw_complex *out,
Chris@10 68 MPI_Comm comm, int sign, unsigned flags);
Chris@10 69 fftw_plan fftw_mpi_plan_dft_3d(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t n2,
Chris@10 70 fftw_complex *in, fftw_complex *out,
Chris@10 71 MPI_Comm comm, int sign, unsigned flags);
Chris@10 72 fftw_plan fftw_mpi_plan_dft(int rnk, const ptrdiff_t *n,
Chris@10 73 fftw_complex *in, fftw_complex *out,
Chris@10 74 MPI_Comm comm, int sign, unsigned flags);
Chris@10 75 fftw_plan fftw_mpi_plan_many_dft(int rnk, const ptrdiff_t *n,
Chris@10 76 ptrdiff_t howmany, ptrdiff_t block, ptrdiff_t tblock,
Chris@10 77 fftw_complex *in, fftw_complex *out,
Chris@10 78 MPI_Comm comm, int sign, unsigned flags);
Chris@10 79 </pre>
Chris@10 80 <p><a name="index-MPI-communicator-466"></a><a name="index-collective-function-467"></a>These are similar to their serial counterparts (see <a href="Complex-DFTs.html#Complex-DFTs">Complex DFTs</a>)
Chris@10 81 in specifying the dimensions, sign, and flags of the transform. The
Chris@10 82 <code>comm</code> argument gives an MPI communicator that specifies the set
Chris@10 83 of processes to participate in the transform; plan creation is a
Chris@10 84 collective function that must be called by all processes in the
Chris@10 85 communicator. The <code>in</code> and <code>out</code> pointers refer only to a
Chris@10 86 portion of the overall transform data (see <a href="MPI-Data-Distribution.html#MPI-Data-Distribution">MPI Data Distribution</a>)
Chris@10 87 as specified by the &lsquo;<samp><span class="samp">local_size</span></samp>&rsquo; functions in the previous
Chris@10 88 section. Unless <code>flags</code> contains <code>FFTW_ESTIMATE</code>, these
Chris@10 89 arrays are overwritten during plan creation as for the serial
Chris@10 90 interface. For multi-dimensional transforms, any dimensions <code>&gt;
Chris@10 91 1</code> are supported; for one-dimensional transforms, only composite
Chris@10 92 (non-prime) <code>n0</code> are currently supported (unlike the serial
Chris@10 93 FFTW). Requesting an unsupported transform size will yield a
Chris@10 94 <code>NULL</code> plan. (As in the serial interface, highly composite sizes
Chris@10 95 generally yield the best performance.)
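<p>Since a one-dimensional MPI plan is only available for composite <code>n0</code>, a caller may want to test the size before planning. The following is a minimal sketch under that assumption: <code>is_composite</code> is a hypothetical helper, not part of FFTW, and checking the returned plan against <code>NULL</code> remains the authoritative test.

```c
#include <stddef.h>

/* Hypothetical helper (not an FFTW function): returns 1 if n0 is
   composite, i.e. n0 > 1 and n0 has a divisor other than 1 and itself,
   and 0 otherwise. */
int is_composite(ptrdiff_t n0)
{
    ptrdiff_t d;

    if (n0 < 4)
        return 0;            /* negatives, 0, 1, 2 and 3 are not composite */

    /* Trial division; "d <= n0 / d" avoids overflow in d * d. */
    for (d = 2; d <= n0 / d; ++d)
        if (n0 % d == 0)
            return 1;        /* found a nontrivial divisor */

    return 0;                /* no divisor found, so n0 is prime */
}
```

<p>A program could call this before <code>fftw_mpi_plan_dft_1d</code> and choose a different length when it returns 0; the plan pointer should still be checked afterwards.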
Chris@10 96
Chris@10 97 <p><a name="index-advanced-interface-468"></a><a name="index-FFTW_005fMPI_005fDEFAULT_005fBLOCK-469"></a><a name="index-stride-470"></a>The advanced-interface <code>fftw_mpi_plan_many_dft</code> additionally
Chris@10 98 allows you to specify the block sizes for the first dimension
Chris@10 99 (<code>block</code>) of the n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub> input data and the first dimension
Chris@10 100 (<code>tblock</code>) of the n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&hellip;&times;&nbsp;n<sub>d-1</sub> transposed data (at intermediate
Chris@10 101 steps of the transform, and for the output if
Chris@10 102 <code>FFTW_MPI_TRANSPOSED_OUT</code> is specified in <code>flags</code>). These must
Chris@10 103 be the same block sizes as were passed to the corresponding
Chris@10 104 &lsquo;<samp><span class="samp">local_size</span></samp>&rsquo; function; you can pass <code>FFTW_MPI_DEFAULT_BLOCK</code>
Chris@10 105 to use FFTW's default block size as in the basic interface. Also, the
Chris@10 106 <code>howmany</code> parameter specifies that the transform is of contiguous
Chris@10 107 <code>howmany</code>-tuples rather than individual complex numbers; this
Chris@10 108 corresponds to the same parameter in the serial advanced interface
Chris@10 109 (see <a href="Advanced-Complex-DFTs.html#Advanced-Complex-DFTs">Advanced Complex DFTs</a>) with <code>stride = howmany</code> and
Chris@10 110 <code>dist = 1</code>.
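<p>The tuple layout described above can be illustrated with a small indexing sketch. The helper name <code>tuple_index</code> and its parameters are assumptions made for illustration only; FFTW provides no such function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: flat index of component t (0 <= t < howmany) of
   the howmany-tuple stored at row-major position pos of the local array.
   Component t of successive positions is howmany elements apart
   (stride = howmany), and components of one tuple are adjacent
   (dist = 1). */
ptrdiff_t tuple_index(ptrdiff_t pos, ptrdiff_t howmany, ptrdiff_t t)
{
    assert(pos >= 0 && howmany > 0 && t >= 0 && t < howmany);
    return pos * howmany + t;
}
```

<p>For example, with <code>howmany = 3</code>, component 1 of the tuple at position 4 lives at flat index 13.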
Chris@10 111
Chris@10 112 <h5 class="subsubheading">MPI flags</h5>
Chris@10 113
Chris@10 114 <p>The <code>flags</code> can be any of those for the serial FFTW
Chris@10 115 (see <a href="Planner-Flags.html#Planner-Flags">Planner Flags</a>), and in addition may include one or more of
Chris@10 116 the following MPI-specific flags, which improve performance at the
Chris@10 117 cost of changing the output or input data formats.
Chris@10 118
Chris@10 119 <ul>
Chris@10 120 <li><a name="index-FFTW_005fMPI_005fSCRAMBLED_005fOUT-471"></a><a name="index-FFTW_005fMPI_005fSCRAMBLED_005fIN-472"></a><code>FFTW_MPI_SCRAMBLED_OUT</code>, <code>FFTW_MPI_SCRAMBLED_IN</code>: valid for
Chris@10 121 1d transforms only, these flags indicate that the output/input of the
Chris@10 122 transform are in an undocumented &ldquo;scrambled&rdquo; order. A forward
Chris@10 123 <code>FFTW_MPI_SCRAMBLED_OUT</code> transform can be inverted by a backward
Chris@10 124 <code>FFTW_MPI_SCRAMBLED_IN</code> (times the usual 1/<i>N</i> normalization).
Chris@10 125 See <a href="One_002ddimensional-distributions.html#One_002ddimensional-distributions">One-dimensional distributions</a>.
Chris@10 126
Chris@10 127 <li><a name="index-FFTW_005fMPI_005fTRANSPOSED_005fOUT-473"></a><a name="index-FFTW_005fMPI_005fTRANSPOSED_005fIN-474"></a><code>FFTW_MPI_TRANSPOSED_OUT</code>, <code>FFTW_MPI_TRANSPOSED_IN</code>: valid
Chris@10 128 for multidimensional (<code>rnk &gt; 1</code>) transforms only, these flags
Chris@10 129 specify that the output or input of an n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub> transform is
Chris@10 130 transposed to n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&hellip;&times;&nbsp;n<sub>d-1</sub>. See <a href="Transposed-distributions.html#Transposed-distributions">Transposed distributions</a>.
Chris@10 131
Chris@10 132 </ul>
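<p>As a rough illustration of the transposed layout, the sketch below computes the logical dimensions seen with the transposed flags, that is, the original dimensions with the first two swapped. <code>transposed_dims</code> is a hypothetical helper, not an FFTW routine, and it says nothing about how the data is distributed across processes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given the rnk dimensions n[0..rnk-1] of a
   multidimensional transform, write to out[] the dimensions of the
   transposed layout n1 x n0 x n2 x ... x n(d-1). */
void transposed_dims(int rnk, const ptrdiff_t *n, ptrdiff_t *out)
{
    int i;

    assert(rnk > 1);         /* the transposed flags need rnk > 1 */
    out[0] = n[1];           /* first two dimensions are swapped */
    out[1] = n[0];
    for (i = 2; i < rnk; ++i)
        out[i] = n[i];       /* remaining dimensions are unchanged */
}
```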
Chris@10 133
Chris@10 134 <h5 class="subsubheading">Real-data MPI DFTs</h5>
Chris@10 135
Chris@10 136 <p><a name="index-r2c-475"></a>Plans for real-input/output (r2c/c2r) DFTs (see <a href="Multi_002ddimensional-MPI-DFTs-of-Real-Data.html#Multi_002ddimensional-MPI-DFTs-of-Real-Data">Multi-dimensional MPI DFTs of Real Data</a>) are created by:
Chris@10 137
Chris@10 138 <p><a name="index-fftw_005fmpi_005fplan_005fdft_005fr2c_005f2d-476"></a><a name="index-fftw_005fmpi_005fplan_005fdft_005fr2c_005f2d-477"></a><a name="index-fftw_005fmpi_005fplan_005fdft_005fr2c_005f3d-478"></a><a name="index-fftw_005fmpi_005fplan_005fdft_005fr2c-479"></a><a name="index-fftw_005fmpi_005fplan_005fdft_005fc2r_005f2d-480"></a><a name="index-fftw_005fmpi_005fplan_005fdft_005fc2r_005f2d-481"></a><a name="index-fftw_005fmpi_005fplan_005fdft_005fc2r_005f3d-482"></a><a name="index-fftw_005fmpi_005fplan_005fdft_005fc2r-483"></a>
Chris@10 139 <pre class="example"> fftw_plan fftw_mpi_plan_dft_r2c_2d(ptrdiff_t n0, ptrdiff_t n1,
Chris@10 140 double *in, fftw_complex *out,
Chris@10 141 MPI_Comm comm, unsigned flags);
Chris@10 145 fftw_plan fftw_mpi_plan_dft_r2c_3d(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t n2,
Chris@10 146 double *in, fftw_complex *out,
Chris@10 147 MPI_Comm comm, unsigned flags);
Chris@10 148 fftw_plan fftw_mpi_plan_dft_r2c(int rnk, const ptrdiff_t *n,
Chris@10 149 double *in, fftw_complex *out,
Chris@10 150 MPI_Comm comm, unsigned flags);
Chris@10 151 fftw_plan fftw_mpi_plan_dft_c2r_2d(ptrdiff_t n0, ptrdiff_t n1,
Chris@10 152 fftw_complex *in, double *out,
Chris@10 153 MPI_Comm comm, unsigned flags);
Chris@10 157 fftw_plan fftw_mpi_plan_dft_c2r_3d(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t n2,
Chris@10 158 fftw_complex *in, double *out,
Chris@10 159 MPI_Comm comm, unsigned flags);
Chris@10 160 fftw_plan fftw_mpi_plan_dft_c2r(int rnk, const ptrdiff_t *n,
Chris@10 161 fftw_complex *in, double *out,
Chris@10 162 MPI_Comm comm, unsigned flags);
Chris@10 163 </pre>
Chris@10 164 <p>Similar to the serial interface (see <a href="Real_002ddata-DFTs.html#Real_002ddata-DFTs">Real-data DFTs</a>), these
Chris@10 165 transform logically n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub> real data to/from n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;(n<sub>d-1</sub>/2 + 1) complex
Chris@10 166 data, representing the non-redundant half of the conjugate-symmetry
Chris@10 167 output of a real-input DFT (see <a href="Multi_002ddimensional-Transforms.html#Multi_002ddimensional-Transforms">Multi-dimensional Transforms</a>).
Chris@10 168 However, the real array must be stored within a padded n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;[2&nbsp;(n<sub>d-1</sub>/2 + 1)]
Chris@10 169
Chris@10 170 array (much like the in-place serial r2c transforms, but here for
Chris@10 171 out-of-place transforms as well). Currently, only multi-dimensional
Chris@10 172 (<code>rnk &gt; 1</code>) r2c/c2r transforms are supported (requesting a plan
Chris@10 173 for <code>rnk = 1</code> will yield <code>NULL</code>). As explained above
Chris@10 174 (see <a href="Multi_002ddimensional-MPI-DFTs-of-Real-Data.html#Multi_002ddimensional-MPI-DFTs-of-Real-Data">Multi-dimensional MPI DFTs of Real Data</a>), the data
Chris@10 175 distribution of both the real and complex arrays is given by the
Chris@10 176 &lsquo;<samp><span class="samp">local_size</span></samp>&rsquo; function called for the dimensions of the
Chris@10 177 <em>complex</em> array. Similar to the other planning functions, the
Chris@10 178 input and output arrays are overwritten when the plan is created
Chris@10 179 except in <code>FFTW_ESTIMATE</code> mode.
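<p>The padding arithmetic for the last dimension can be sketched as follows. The function names are hypothetical and chosen for illustration; the 2d index helper assumes the row index <code>i</code> is local to the calling process and that rows are stored contiguously.

```c
#include <assert.h>
#include <stddef.h>

/* Number of complex elements kept along the last dimension of the
   complex array: n_last/2 + 1, using integer division. */
ptrdiff_t r2c_complex_last_dim(ptrdiff_t n_last)
{
    assert(n_last > 0);
    return n_last / 2 + 1;
}

/* Length, in doubles, of one padded row of the real array:
   2 * (n_last/2 + 1). */
ptrdiff_t r2c_padded_last_dim(ptrdiff_t n_last)
{
    return 2 * r2c_complex_last_dim(n_last);
}

/* Flat index of real element (i, j) of a 2d array whose rows are padded
   as above; only 0 <= j < n1 holds meaningful data. */
ptrdiff_t r2c_real_index_2d(ptrdiff_t i, ptrdiff_t j, ptrdiff_t n1)
{
    assert(i >= 0 && j >= 0 && j < n1);
    return i * r2c_padded_last_dim(n1) + j;
}
```

<p>For <code>n1 = 8</code> each real row occupies 10 doubles (two of them padding), and for <code>n1 = 7</code> it occupies 8 doubles (one of them padding).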
Chris@10 180
Chris@10 181 <p>As for the complex DFTs above, there is an advanced interface that
Chris@10 182 allows you to manually specify block sizes and to transform contiguous
Chris@10 183 <code>howmany</code>-tuples of real/complex numbers:
Chris@10 184
Chris@10 185 <p><a name="index-fftw_005fmpi_005fplan_005fmany_005fdft_005fr2c-484"></a><a name="index-fftw_005fmpi_005fplan_005fmany_005fdft_005fc2r-485"></a>
Chris@10 186 <pre class="example"> fftw_plan fftw_mpi_plan_many_dft_r2c
Chris@10 187 (int rnk, const ptrdiff_t *n, ptrdiff_t howmany,
Chris@10 188 ptrdiff_t iblock, ptrdiff_t oblock,
Chris@10 189 double *in, fftw_complex *out,
Chris@10 190 MPI_Comm comm, unsigned flags);
Chris@10 191 fftw_plan fftw_mpi_plan_many_dft_c2r
Chris@10 192 (int rnk, const ptrdiff_t *n, ptrdiff_t howmany,
Chris@10 193 ptrdiff_t iblock, ptrdiff_t oblock,
Chris@10 194 fftw_complex *in, double *out,
Chris@10 195 MPI_Comm comm, unsigned flags);
Chris@10 196 </pre>
Chris@10 197 <h5 class="subsubheading">MPI r2r transforms</h5>
Chris@10 198
Chris@10 199 <p><a name="index-r2r-486"></a>There are corresponding plan-creation routines for r2r
Chris@10 200 transforms (see <a href="More-DFTs-of-Real-Data.html#More-DFTs-of-Real-Data">More DFTs of Real Data</a>), currently supporting
Chris@10 201 multidimensional (<code>rnk &gt; 1</code>) transforms only (<code>rnk = 1</code> will
Chris@10 202 yield a <code>NULL</code> plan):
Chris@10 203
Chris@10 204 <pre class="example"> fftw_plan fftw_mpi_plan_r2r_2d(ptrdiff_t n0, ptrdiff_t n1,
Chris@10 205 double *in, double *out,
Chris@10 206 MPI_Comm comm,
Chris@10 207 fftw_r2r_kind kind0, fftw_r2r_kind kind1,
Chris@10 208 unsigned flags);
Chris@10 209 fftw_plan fftw_mpi_plan_r2r_3d(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t n2,
Chris@10 210 double *in, double *out,
Chris@10 211 MPI_Comm comm,
Chris@10 212 fftw_r2r_kind kind0, fftw_r2r_kind kind1, fftw_r2r_kind kind2,
Chris@10 213 unsigned flags);
Chris@10 214 fftw_plan fftw_mpi_plan_r2r(int rnk, const ptrdiff_t *n,
Chris@10 215 double *in, double *out,
Chris@10 216 MPI_Comm comm, const fftw_r2r_kind *kind,
Chris@10 217 unsigned flags);
Chris@10 218 fftw_plan fftw_mpi_plan_many_r2r(int rnk, const ptrdiff_t *n,
Chris@10 219 ptrdiff_t howmany, ptrdiff_t iblock, ptrdiff_t oblock,
Chris@10 220 double *in, double *out,
Chris@10 221 MPI_Comm comm, const fftw_r2r_kind *kind,
Chris@10 222 unsigned flags);
Chris@10 223 </pre>
Chris@10 224 <p>The parameters are much the same as for the complex DFTs above, except
Chris@10 225 that the arrays are of real numbers (and hence the outputs of the
Chris@10 226 &lsquo;<samp><span class="samp">local_size</span></samp>&rsquo; data-distribution functions should be interpreted as
Chris@10 227 counts of real rather than complex numbers). Also, the <code>kind</code>
Chris@10 228 parameters specify the r2r kinds along each dimension as for the
Chris@10 229 serial interface (see <a href="Real_002dto_002dReal-Transform-Kinds.html#Real_002dto_002dReal-Transform-Kinds">Real-to-Real Transform Kinds</a>). See <a href="Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms.html#Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms">Other Multi-dimensional Real-data MPI Transforms</a>.
Chris@10 230
Chris@10 231 <h5 class="subsubheading">MPI transposition</h5>
Chris@10 232
Chris@10 233 <p><a name="index-transpose-487"></a>
Chris@10 234 FFTW also provides routines to plan a transpose of a distributed
Chris@10 235 <code>n0</code> by <code>n1</code> array of real numbers, or an array of
Chris@10 236 <code>howmany</code>-tuples of real numbers with specified block sizes
Chris@10 237 (see <a href="FFTW-MPI-Transposes.html#FFTW-MPI-Transposes">FFTW MPI Transposes</a>):
Chris@10 238
Chris@10 239 <p><a name="index-fftw_005fmpi_005fplan_005ftranspose-488"></a><a name="index-fftw_005fmpi_005fplan_005fmany_005ftranspose-489"></a>
Chris@10 240 <pre class="example"> fftw_plan fftw_mpi_plan_transpose(ptrdiff_t n0, ptrdiff_t n1,
Chris@10 241 double *in, double *out,
Chris@10 242 MPI_Comm comm, unsigned flags);
Chris@10 243 fftw_plan fftw_mpi_plan_many_transpose
Chris@10 244 (ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t howmany,
Chris@10 245 ptrdiff_t block0, ptrdiff_t block1,
Chris@10 246 double *in, double *out, MPI_Comm comm, unsigned flags);
Chris@10 247 </pre>
Chris@10 248 <p><a name="index-new_002darray-execution-490"></a><a name="index-fftw_005fmpi_005fexecute_005fr2r-491"></a>These plans are used with the <code>fftw_mpi_execute_r2r</code> new-array
Chris@10 249 execute function (see <a href="Using-MPI-Plans.html#Using-MPI-Plans">Using MPI Plans</a>), since they count as (rank
Chris@10 250 zero) r2r plans from FFTW's perspective.
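<p>For reference, the logical result of such a transpose can be written as a single-process loop. This is only a sketch of what the output should contain; it ignores the data distribution entirely and is not how FFTW implements the operation. <code>transpose_tuples</code> is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical serial reference: out-of-place transpose of a row-major
   n0 x n1 array of howmany-tuples of doubles into an n1 x n0 array.
   The in and out arrays must not overlap. */
void transpose_tuples(ptrdiff_t n0, ptrdiff_t n1, ptrdiff_t howmany,
                      const double *in, double *out)
{
    ptrdiff_t i, j, t;

    assert(in != out && howmany > 0);
    for (i = 0; i < n0; ++i)
        for (j = 0; j < n1; ++j)
            for (t = 0; t < howmany; ++t)
                /* tuple (i, j) of the input becomes tuple (j, i) */
                out[(j * n0 + i) * howmany + t] =
                    in[(i * n1 + j) * howmany + t];
}
```

<p>A loop like this can serve as a small-size correctness check against the gathered output of a distributed transpose plan.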
Chris@10 251
Chris@10 252 </body></html>
Chris@10 253