annotate src/fftw-3.3.3/doc/html/Multi_002ddimensional-MPI-DFTs-of-Real-Data.html @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI compatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents 37bf6b4a2645
children
rev   line source
Chris@10 1 <html lang="en">
Chris@10 2 <head>
Chris@10 3 <title>Multi-dimensional MPI DFTs of Real Data - FFTW 3.3.3</title>
Chris@10 4 <meta http-equiv="Content-Type" content="text/html">
Chris@10 5 <meta name="description" content="FFTW 3.3.3">
Chris@10 6 <meta name="generator" content="makeinfo 4.13">
Chris@10 7 <link title="Top" rel="start" href="index.html#Top">
Chris@10 8 <link rel="up" href="Distributed_002dmemory-FFTW-with-MPI.html#Distributed_002dmemory-FFTW-with-MPI" title="Distributed-memory FFTW with MPI">
Chris@10 9 <link rel="prev" href="MPI-Data-Distribution.html#MPI-Data-Distribution" title="MPI Data Distribution">
Chris@10 10 <link rel="next" href="Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms.html#Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms" title="Other Multi-dimensional Real-data MPI Transforms">
Chris@10 11 <link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
Chris@10 12 <!--
Chris@10 13 This manual is for FFTW
Chris@10 14 (version 3.3.3, 25 November 2012).
Chris@10 15
Chris@10 16 Copyright (C) 2003 Matteo Frigo.
Chris@10 17
Chris@10 18 Copyright (C) 2003 Massachusetts Institute of Technology.
Chris@10 19
Chris@10 20 Permission is granted to make and distribute verbatim copies of
Chris@10 21 this manual provided the copyright notice and this permission
Chris@10 22 notice are preserved on all copies.
Chris@10 23
Chris@10 24 Permission is granted to copy and distribute modified versions of
Chris@10 25 this manual under the conditions for verbatim copying, provided
Chris@10 26 that the entire resulting derived work is distributed under the
Chris@10 27 terms of a permission notice identical to this one.
Chris@10 28
Chris@10 29 Permission is granted to copy and distribute translations of this
Chris@10 30 manual into another language, under the above conditions for
Chris@10 31 modified versions, except that this permission notice may be
Chris@10 32 stated in a translation approved by the Free Software Foundation.
Chris@10 33 -->
Chris@10 34 <meta http-equiv="Content-Style-Type" content="text/css">
Chris@10 35 <style type="text/css"><!--
Chris@10 36 pre.display { font-family:inherit }
Chris@10 37 pre.format { font-family:inherit }
Chris@10 38 pre.smalldisplay { font-family:inherit; font-size:smaller }
Chris@10 39 pre.smallformat { font-family:inherit; font-size:smaller }
Chris@10 40 pre.smallexample { font-size:smaller }
Chris@10 41 pre.smalllisp { font-size:smaller }
Chris@10 42 span.sc { font-variant:small-caps }
Chris@10 43 span.roman { font-family:serif; font-weight:normal; }
Chris@10 44 span.sansserif { font-family:sans-serif; font-weight:normal; }
Chris@10 45 --></style>
Chris@10 46 </head>
Chris@10 47 <body>
Chris@10 48 <div class="node">
Chris@10 49 <a name="Multi-dimensional-MPI-DFTs-of-Real-Data"></a>
Chris@10 50 <a name="Multi_002ddimensional-MPI-DFTs-of-Real-Data"></a>
Chris@10 51 <p>
Chris@10 52 Next:&nbsp;<a rel="next" accesskey="n" href="Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms.html#Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms">Other Multi-dimensional Real-data MPI Transforms</a>,
Chris@10 53 Previous:&nbsp;<a rel="previous" accesskey="p" href="MPI-Data-Distribution.html#MPI-Data-Distribution">MPI Data Distribution</a>,
Chris@10 54 Up:&nbsp;<a rel="up" accesskey="u" href="Distributed_002dmemory-FFTW-with-MPI.html#Distributed_002dmemory-FFTW-with-MPI">Distributed-memory FFTW with MPI</a>
Chris@10 55 <hr>
Chris@10 56 </div>
Chris@10 57
Chris@10 58 <h3 class="section">6.5 Multi-dimensional MPI DFTs of Real Data</h3>
Chris@10 59
Chris@10 60 <p>FFTW's MPI interface also supports multi-dimensional DFTs of real
Chris@10 61 data, similar to the serial r2c and c2r interfaces. (Parallel
Chris@10 62 one-dimensional real-data DFTs are not currently supported; you must
Chris@10 63 use a complex transform and set the imaginary parts of the inputs to
Chris@10 64 zero.)
Chris@10 65
Chris@10 66 <p>The key points to understand for r2c and c2r MPI transforms (compared
Chris@10 67 to the MPI complex DFTs or the serial r2c/c2r transforms), are:
Chris@10 68
Chris@10 69 <ul>
Chris@10 70 <li>Just as for serial transforms, r2c/c2r DFTs transform n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub> real
Chris@10 71 data to/from n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;(n<sub>d-1</sub>/2 + 1) complex data: the last dimension of the
Chris@10 72 complex data is cut in half (rounded down), plus one. As for the
Chris@10 73 serial transforms, the sizes you pass to the &lsquo;<samp><span class="samp">plan_dft_r2c</span></samp>&rsquo; and
Chris@10 74 &lsquo;<samp><span class="samp">plan_dft_c2r</span></samp>&rsquo; are the n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub> dimensions of the real data.
Chris@10 75
Chris@10 76 <li><a name="index-padding-386"></a>Although the real data is <em>conceptually</em> n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub>, it is
Chris@10 77 <em>physically</em> stored as an n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;[2&nbsp;(n<sub>d-1</sub>/2 + 1)] array, where the last
Chris@10 78 dimension has been <em>padded</em> to make it the same size as the
Chris@10 79 complex output. This is much like the in-place serial r2c/c2r
Chris@10 80 interface (see <a href="Multi_002dDimensional-DFTs-of-Real-Data.html#Multi_002dDimensional-DFTs-of-Real-Data">Multi-Dimensional DFTs of Real Data</a>), except that
Chris@10 81 in MPI the padding is required even for out-of-place data. The extra
Chris@10 82 padding numbers are ignored by FFTW (they are <em>not</em> like
Chris@10 83 zero-padding the transform to a larger size); they are only used to
Chris@10 84 determine the data layout.
Chris@10 85
Chris@10 86 <li><a name="index-data-distribution-387"></a>The data distribution in MPI for <em>both</em> the real and complex data
Chris@10 87 is determined by the shape of the <em>complex</em> data. That is, you
Chris@10 88 call the appropriate &lsquo;<samp><span class="samp">local size</span></samp>&rsquo; function for the n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;(n<sub>d-1</sub>/2 + 1)
Chris@10 89
Chris@10 90 <p>complex data, and then use the <em>same</em> distribution for the real
Chris@10 91 data except that the last complex dimension is replaced by a (padded)
Chris@10 92 real dimension of twice the length.
Chris@10 93
Chris@10 94 </ul>
Chris@10 95
Chris@10 96 <p>For example suppose we are performing an out-of-place r2c transform of
Chris@10 97 L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;N real data [padded to L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;2(N/2+1)],
Chris@10 98 resulting in L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;N/2+1 complex data. Similar to the
Chris@10 99 example in <a href="2d-MPI-example.html#g_t2d-MPI-example">2d MPI example</a>, we might do something like:
Chris@10 100
Chris@10 101 <pre class="example"> #include &lt;fftw3-mpi.h&gt;
Chris@10 102
Chris@10 103 int main(int argc, char **argv)
Chris@10 104 {
Chris@10 105 const ptrdiff_t L = ..., M = ..., N = ...;
Chris@10 106 fftw_plan plan;
Chris@10 107 double *rin;
Chris@10 108 fftw_complex *cout;
Chris@10 109 ptrdiff_t alloc_local, local_n0, local_0_start, i, j, k;
Chris@10 110
Chris@10 111 MPI_Init(&amp;argc, &amp;argv);
Chris@10 112 fftw_mpi_init();
Chris@10 113
Chris@10 114 /* <span class="roman">get local data size and allocate</span> */
Chris@10 115 alloc_local = fftw_mpi_local_size_3d(L, M, N/2+1, MPI_COMM_WORLD,
Chris@10 116 &amp;local_n0, &amp;local_0_start);
Chris@10 117 rin = fftw_alloc_real(2 * alloc_local);
Chris@10 118 cout = fftw_alloc_complex(alloc_local);
Chris@10 119
Chris@10 120 /* <span class="roman">create plan for out-of-place r2c DFT</span> */
Chris@10 121 plan = fftw_mpi_plan_dft_r2c_3d(L, M, N, rin, cout, MPI_COMM_WORLD,
Chris@10 122 FFTW_MEASURE);
Chris@10 123
Chris@10 124 /* <span class="roman">initialize rin to some function</span> my_func(x,y,z) */
Chris@10 125 for (i = 0; i &lt; local_n0; ++i)
Chris@10 126 for (j = 0; j &lt; M; ++j)
Chris@10 127 for (k = 0; k &lt; N; ++k)
Chris@10 128 rin[(i*M + j) * (2*(N/2+1)) + k] = my_func(local_0_start+i, j, k);
Chris@10 129
Chris@10 130 /* <span class="roman">compute transforms as many times as desired</span> */
Chris@10 131 fftw_execute(plan);
Chris@10 132
Chris@10 133 fftw_destroy_plan(plan);
Chris@10 134
Chris@10 135 MPI_Finalize();
Chris@10 136 }
Chris@10 137 </pre>
Chris@10 138 <p><a name="index-fftw_005falloc_005freal-388"></a><a name="index-row_002dmajor-389"></a>Note that we allocated <code>rin</code> using <code>fftw_alloc_real</code> with an
Chris@10 139 argument of <code>2 * alloc_local</code>: since <code>alloc_local</code> is the
Chris@10 140 number of <em>complex</em> values to allocate, the number of <em>real</em>
Chris@10 141 values is twice as many. The <code>rin</code> array is then
Chris@10 142 local_n0&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;2(N/2+1) in row-major order, so its
Chris@10 143 <code>(i,j,k)</code> element is at the index <code>(i*M + j) * (2*(N/2+1)) +
Chris@10 144 k</code> (see <a href="Multi_002ddimensional-Array-Format.html#Multi_002ddimensional-Array-Format">Multi-dimensional Array Format</a>).
Chris@10 145
Chris@10 146 <p><a name="index-transpose-390"></a><a name="index-FFTW_005fTRANSPOSED_005fOUT-391"></a><a name="index-FFTW_005fTRANSPOSED_005fIN-392"></a>As for the complex transforms, improved performance can be obtained by
Chris@10 147 specifying that the output is the transpose of the input or vice versa
Chris@10 148 (see <a href="Transposed-distributions.html#Transposed-distributions">Transposed distributions</a>). In our L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;N r2c
Chris@10 149 example, including <code>FFTW_TRANSPOSED_OUT</code> in the flags means that
Chris@10 150 the input would be a padded L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;2(N/2+1) real array
Chris@10 151 distributed over the <code>L</code> dimension, while the output would be a
Chris@10 152 M&nbsp;&times;&nbsp;L&nbsp;&times;&nbsp;N/2+1 complex array distributed over the <code>M</code>
Chris@10 153 dimension. To perform the inverse c2r transform with the same data
Chris@10 154 distributions, you would use the <code>FFTW_TRANSPOSED_IN</code> flag.
Chris@10 155
Chris@10 156 <!-- -->
Chris@10 157 </body></html>
Chris@10 158