annotate src/fftw-3.3.8/doc/html/Multi_002ddimensional-MPI-DFTs-of-Real-Data.html @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI incompatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents d0c2a83c1364
children
rev   line source
Chris@82 1 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
Chris@82 2 <html>
Chris@82 3 <!-- This manual is for FFTW
Chris@82 4 (version 3.3.8, 24 May 2018).
Chris@82 5
Chris@82 6 Copyright (C) 2003 Matteo Frigo.
Chris@82 7
Chris@82 8 Copyright (C) 2003 Massachusetts Institute of Technology.
Chris@82 9
Chris@82 10 Permission is granted to make and distribute verbatim copies of this
Chris@82 11 manual provided the copyright notice and this permission notice are
Chris@82 12 preserved on all copies.
Chris@82 13
Chris@82 14 Permission is granted to copy and distribute modified versions of this
Chris@82 15 manual under the conditions for verbatim copying, provided that the
Chris@82 16 entire resulting derived work is distributed under the terms of a
Chris@82 17 permission notice identical to this one.
Chris@82 18
Chris@82 19 Permission is granted to copy and distribute translations of this manual
Chris@82 20 into another language, under the above conditions for modified versions,
Chris@82 21 except that this permission notice may be stated in a translation
Chris@82 22 approved by the Free Software Foundation. -->
Chris@82 23 <!-- Created by GNU Texinfo 6.3, http://www.gnu.org/software/texinfo/ -->
Chris@82 24 <head>
Chris@82 25 <title>FFTW 3.3.8: Multi-dimensional MPI DFTs of Real Data</title>
Chris@82 26
Chris@82 27 <meta name="description" content="FFTW 3.3.8: Multi-dimensional MPI DFTs of Real Data">
Chris@82 28 <meta name="keywords" content="FFTW 3.3.8: Multi-dimensional MPI DFTs of Real Data">
Chris@82 29 <meta name="resource-type" content="document">
Chris@82 30 <meta name="distribution" content="global">
Chris@82 31 <meta name="Generator" content="makeinfo">
Chris@82 32 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Chris@82 33 <link href="index.html#Top" rel="start" title="Top">
Chris@82 34 <link href="Concept-Index.html#Concept-Index" rel="index" title="Concept Index">
Chris@82 35 <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
Chris@82 36 <link href="Distributed_002dmemory-FFTW-with-MPI.html#Distributed_002dmemory-FFTW-with-MPI" rel="up" title="Distributed-memory FFTW with MPI">
Chris@82 37 <link href="Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms.html#Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms" rel="next" title="Other Multi-dimensional Real-data MPI Transforms">
Chris@82 38 <link href="One_002ddimensional-distributions.html#One_002ddimensional-distributions" rel="prev" title="One-dimensional distributions">
Chris@82 39 <style type="text/css">
Chris@82 40 <!--
Chris@82 41 a.summary-letter {text-decoration: none}
Chris@82 42 blockquote.indentedblock {margin-right: 0em}
Chris@82 43 blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
Chris@82 44 blockquote.smallquotation {font-size: smaller}
Chris@82 45 div.display {margin-left: 3.2em}
Chris@82 46 div.example {margin-left: 3.2em}
Chris@82 47 div.lisp {margin-left: 3.2em}
Chris@82 48 div.smalldisplay {margin-left: 3.2em}
Chris@82 49 div.smallexample {margin-left: 3.2em}
Chris@82 50 div.smalllisp {margin-left: 3.2em}
Chris@82 51 kbd {font-style: oblique}
Chris@82 52 pre.display {font-family: inherit}
Chris@82 53 pre.format {font-family: inherit}
Chris@82 54 pre.menu-comment {font-family: serif}
Chris@82 55 pre.menu-preformatted {font-family: serif}
Chris@82 56 pre.smalldisplay {font-family: inherit; font-size: smaller}
Chris@82 57 pre.smallexample {font-size: smaller}
Chris@82 58 pre.smallformat {font-family: inherit; font-size: smaller}
Chris@82 59 pre.smalllisp {font-size: smaller}
Chris@82 60 span.nolinebreak {white-space: nowrap}
Chris@82 61 span.roman {font-family: initial; font-weight: normal}
Chris@82 62 span.sansserif {font-family: sans-serif; font-weight: normal}
Chris@82 63 ul.no-bullet {list-style: none}
Chris@82 64 -->
Chris@82 65 </style>
Chris@82 66
Chris@82 67
Chris@82 68 </head>
Chris@82 69
Chris@82 70 <body lang="en">
Chris@82 71 <a name="Multi_002ddimensional-MPI-DFTs-of-Real-Data"></a>
Chris@82 72 <div class="header">
Chris@82 73 <p>
Chris@82 74 Next: <a href="Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms.html#Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms" accesskey="n" rel="next">Other Multi-dimensional Real-data MPI Transforms</a>, Previous: <a href="MPI-Data-Distribution.html#MPI-Data-Distribution" accesskey="p" rel="prev">MPI Data Distribution</a>, Up: <a href="Distributed_002dmemory-FFTW-with-MPI.html#Distributed_002dmemory-FFTW-with-MPI" accesskey="u" rel="up">Distributed-memory FFTW with MPI</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
Chris@82 75 </div>
Chris@82 76 <hr>
Chris@82 77 <a name="Multi_002ddimensional-MPI-DFTs-of-Real-Data-1"></a>
Chris@82 78 <h3 class="section">6.5 Multi-dimensional MPI DFTs of Real Data</h3>
Chris@82 79
Chris@82 80 <p>FFTW&rsquo;s MPI interface also supports multi-dimensional DFTs of real
Chris@82 81 data, similar to the serial r2c and c2r interfaces. (Parallel
Chris@82 82 one-dimensional real-data DFTs are not currently supported; you must
Chris@82 83 use a complex transform and set the imaginary parts of the inputs to
Chris@82 84 zero.)
Chris@82 85 </p>
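<p>As a minimal sketch of the workaround mentioned above, the helper below
copies real samples into a complex array with zero imaginary parts. The names
<code>complex_pair</code> and <code>real_to_complex_input</code> are
hypothetical; <code>complex_pair</code> is only a stand-in that assumes the
default <code>double[2]</code> layout of <code>fftw_complex</code>.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for fftw_complex, assuming its default
   layout of two doubles: [0] = real part, [1] = imaginary part. */
typedef double complex_pair[2];

/* Copy n real samples into a complex array, setting every imaginary
   part to zero, so that a complex DFT can be applied to real data. */
void real_to_complex_input(const double *re, complex_pair *out, ptrdiff_t n)
{
    ptrdiff_t idx;

    assert(n >= 0);
    for (idx = 0; idx < n; ++idx) {
        out[idx][0] = re[idx];  /* real part */
        out[idx][1] = 0.0;      /* imaginary part */
    }
}
```

<p>In an MPI program each process would apply this only to its own local
portion of the data.
</p>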
Chris@82 86 <p>The key points to understand for r2c and c2r MPI transforms (compared
Chris@82 87 to the MPI complex DFTs or the serial r2c/c2r transforms) are:
Chris@82 88 </p>
Chris@82 89 <ul>
Chris@82 90 <li> Just as for serial transforms, r2c/c2r DFTs transform n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub>
Chris@82 91 real
Chris@82 92 data to/from n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;(n<sub>d-1</sub>/2 + 1)
Chris@82 93 complex data: the last dimension of the
Chris@82 94 complex data is half that of the real data (rounded down), plus one. As for the
Chris@82 95 serial transforms, the sizes you pass to the &lsquo;<samp>plan_dft_r2c</samp>&rsquo; and
Chris@82 96 &lsquo;<samp>plan_dft_c2r</samp>&rsquo; planners are the n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub>
Chris@82 97 dimensions of the real data.
Chris@82 98
Chris@82 99 </li><li> <a name="index-padding-4"></a>
Chris@82 100 Although the real data is <em>conceptually</em> n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub>
Chris@82 101 , it is
Chris@82 102 <em>physically</em> stored as an n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;[2&nbsp;(n<sub>d-1</sub>/2 + 1)]
Chris@82 103 array, where the last
Chris@82 104 dimension has been <em>padded</em> to make it the same size as the
Chris@82 105 complex output. This is much like the in-place serial r2c/c2r
Chris@82 106 interface (see <a href="Multi_002dDimensional-DFTs-of-Real-Data.html#Multi_002dDimensional-DFTs-of-Real-Data">Multi-Dimensional DFTs of Real Data</a>), except that
Chris@82 107 in MPI the padding is required even for out-of-place data. The values
Chris@82 108 stored in the padding elements are ignored by FFTW (this is <em>not</em> like
Chris@82 109 zero-padding the transform to a larger size); the padding only
Chris@82 110 determines the data layout.
Chris@82 111
Chris@82 112 </li><li> <a name="index-data-distribution-3"></a>
Chris@82 113 The data distribution in MPI for <em>both</em> the real and complex data
Chris@82 114 is determined by the shape of the <em>complex</em> data. That is, you
Chris@82 115 call the appropriate &lsquo;<samp>local size</samp>&rsquo; function for the n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;(n<sub>d-1</sub>/2 + 1)
Chris@82 116
Chris@82 117 complex data, and then use the <em>same</em> distribution for the real
Chris@82 118 data except that the last complex dimension is replaced by a (padded)
Chris@82 119 real dimension of twice the length.
Chris@82 120
Chris@82 121 </li></ul>
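<p>The size arithmetic in the points above can be sketched with two small
helpers. The function names are hypothetical and not part of FFTW; they only
restate the n<sub>d-1</sub>/2 + 1 and 2&nbsp;(n<sub>d-1</sub>/2 + 1) formulas,
relying on C integer division to round down.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Length of the last dimension of the complex data: n/2 + 1,
   where the integer division rounds down. */
ptrdiff_t complex_last_dim(ptrdiff_t n_last)
{
    assert(n_last > 0);
    return n_last / 2 + 1;
}

/* Physical (padded) length of the last dimension of the real data:
   twice the complex length, so both arrays share one layout. */
ptrdiff_t padded_real_last_dim(ptrdiff_t n_last)
{
    return 2 * complex_last_dim(n_last);
}
```

<p>For a last dimension of 8, this gives 5 complex values and a padded real
length of 10; for 7, it gives 4 complex values and a padded real length of 8.
</p>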
Chris@82 122
Chris@82 123 <p>For example, suppose we are performing an out-of-place r2c transform of
Chris@82 124 L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;N
Chris@82 125 real data [padded to L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;2(N/2+1)
Chris@82 126 ],
Chris@82 127 resulting in L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;(N/2+1)
Chris@82 128 complex data. Similar to the
Chris@82 129 example in <a href="2d-MPI-example.html#g_t2d-MPI-example">2d MPI example</a>, we might do something like:
Chris@82 130 </p>
Chris@82 131 <div class="example">
Chris@82 132 <pre class="example">#include &lt;fftw3-mpi.h&gt;
Chris@82 133
Chris@82 134 int main(int argc, char **argv)
Chris@82 135 {
Chris@82 136     const ptrdiff_t L = ..., M = ..., N = ...;
Chris@82 137     fftw_plan plan;
Chris@82 138     double *rin;
Chris@82 139     fftw_complex *cout;
Chris@82 140     ptrdiff_t alloc_local, local_n0, local_0_start, i, j, k;
Chris@82 141
Chris@82 142     MPI_Init(&amp;argc, &amp;argv);
Chris@82 143     fftw_mpi_init();
Chris@82 144
Chris@82 145     /* <span class="roman">get local data size and allocate</span> */
Chris@82 146     alloc_local = fftw_mpi_local_size_3d(L, M, N/2+1, MPI_COMM_WORLD,
Chris@82 147                                          &amp;local_n0, &amp;local_0_start);
Chris@82 148     rin = fftw_alloc_real(2 * alloc_local);
Chris@82 149     cout = fftw_alloc_complex(alloc_local);
Chris@82 150
Chris@82 151     /* <span class="roman">create plan for out-of-place r2c DFT</span> */
Chris@82 152     plan = fftw_mpi_plan_dft_r2c_3d(L, M, N, rin, cout, MPI_COMM_WORLD,
Chris@82 153                                     FFTW_MEASURE);
Chris@82 154
Chris@82 155     /* <span class="roman">initialize rin to some function</span> my_func(x,y,z) */
Chris@82 156     for (i = 0; i &lt; local_n0; ++i)
Chris@82 157         for (j = 0; j &lt; M; ++j)
Chris@82 158             for (k = 0; k &lt; N; ++k)
Chris@82 159                 rin[(i*M + j) * (2*(N/2+1)) + k] = my_func(local_0_start+i, j, k);
Chris@82 160
Chris@82 161     /* <span class="roman">compute transforms as many times as desired</span> */
Chris@82 162     fftw_execute(plan);
Chris@82 163
Chris@82 164     fftw_destroy_plan(plan);
Chris@82 165
Chris@82 166     MPI_Finalize();
Chris@82 167 }
Chris@82 168 </pre></div>
Chris@82 169
Chris@82 170 <a name="index-fftw_005falloc_005freal-2"></a>
Chris@82 171 <a name="index-row_002dmajor-5"></a>
Chris@82 172 <p>Note that we allocated <code>rin</code> using <code>fftw_alloc_real</code> with an
Chris@82 173 argument of <code>2 * alloc_local</code>: since <code>alloc_local</code> is the
Chris@82 174 number of <em>complex</em> values to allocate, the number of <em>real</em>
Chris@82 175 values is twice as many. The <code>rin</code> array is then
Chris@82 176 local_n0&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;2(N/2+1)
Chris@82 177 in row-major order, so its
Chris@82 178 <code>(i,j,k)</code> element is at the index <code>(i*M + j) * (2*(N/2+1)) +
Chris@82 179 k</code> (see <a href="Multi_002ddimensional-Array-Format.html#Multi_002ddimensional-Array-Format">Multi-dimensional Array Format</a>).
Chris@82 180 </p>
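<p>The index formula above can be wrapped in a small helper, shown here as a
sketch. The names <code>padded_real_index</code> and
<code>complex_index</code> are hypothetical, and <code>i</code> is assumed to
be the <em>local</em> first index (that is, the global index minus
<code>local_0_start</code>). The second helper assumes the complex output is
likewise stored in row-major order with an unpadded last dimension of
<code>N/2+1</code>.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Offset of real element (i,j,k) in the local padded array of shape
   local_n0 x M x 2(N/2+1), stored in row-major order.
   i is the local first index; k must address real data, so k < N. */
ptrdiff_t padded_real_index(ptrdiff_t i, ptrdiff_t j, ptrdiff_t k,
                            ptrdiff_t M, ptrdiff_t N)
{
    assert(i >= 0 && j >= 0 && j < M && k >= 0 && k < N);
    return (i * M + j) * (2 * (N / 2 + 1)) + k;
}

/* Offset of complex element (i,j,k) in the local array of shape
   local_n0 x M x (N/2+1), stored in row-major order. */
ptrdiff_t complex_index(ptrdiff_t i, ptrdiff_t j, ptrdiff_t k,
                        ptrdiff_t M, ptrdiff_t N)
{
    assert(i >= 0 && j >= 0 && j < M && k >= 0 && k < N / 2 + 1);
    return (i * M + j) * (N / 2 + 1) + k;
}
```

<p>With <code>M = 3</code> and <code>N = 4</code>, each padded real row holds
6 values, so the real element <code>(1,2,3)</code> sits at offset 33 and the
complex element <code>(1,2,2)</code> at offset 17.
</p>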
Chris@82 181 <a name="index-transpose-1"></a>
Chris@82 182 <a name="index-FFTW_005fTRANSPOSED_005fOUT"></a>
Chris@82 183 <a name="index-FFTW_005fTRANSPOSED_005fIN"></a>
Chris@82 184 <p>As for the complex transforms, improved performance can be obtained by
Chris@82 185 specifying that the output is the transpose of the input or vice versa
Chris@82 186 (see <a href="Transposed-distributions.html#Transposed-distributions">Transposed distributions</a>). In our L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;N
Chris@82 187 r2c
Chris@82 188 example, including <code>FFTW_TRANSPOSED_OUT</code> in the flags means that
Chris@82 189 the input would be a padded L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;2(N/2+1)
Chris@82 190 real array
Chris@82 191 distributed over the <code>L</code> dimension, while the output would be an
Chris@82 192 M&nbsp;&times;&nbsp;L&nbsp;&times;&nbsp;(N/2+1)
Chris@82 193 complex array distributed over the <code>M</code>
Chris@82 194 dimension. To perform the inverse c2r transform with the same data
Chris@82 195 distributions, you would use the <code>FFTW_TRANSPOSED_IN</code> flag.
Chris@82 196 </p>
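<p>To illustrate the transposed layout, the sketch below computes the offset
of an output element under <code>FFTW_TRANSPOSED_OUT</code>. It assumes each
process holds a row-major slab of shape
local_n1&nbsp;&times;&nbsp;L&nbsp;&times;&nbsp;(N/2+1), with
<code>local_n1</code> and <code>local_1_start</code> obtained from the
transposed variant of the local-size function
(<code>fftw_mpi_local_size_3d_transposed</code>). The helper name
<code>transposed_out_index</code> is hypothetical.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Offset of complex output element (j,i,k) in the local transposed
   array of shape local_n1 x L x (N/2+1), stored in row-major order.
   j_local is the local index along the M dimension, i.e. the global
   index j minus local_1_start (an assumption of this sketch). */
ptrdiff_t transposed_out_index(ptrdiff_t j_local, ptrdiff_t i, ptrdiff_t k,
                               ptrdiff_t L, ptrdiff_t N)
{
    assert(j_local >= 0 && i >= 0 && i < L && k >= 0 && k < N / 2 + 1);
    return (j_local * L + i) * (N / 2 + 1) + k;
}
```

<p>Compared with the non-transposed case, the roles of the first two
dimensions are swapped: the slowest-varying local index now runs over
<code>M</code> rather than <code>L</code>.
</p>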
Chris@82 197 <hr>
Chris@82 198 <div class="header">
Chris@82 199 <p>
Chris@82 200 Next: <a href="Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms.html#Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms" accesskey="n" rel="next">Other Multi-dimensional Real-data MPI Transforms</a>, Previous: <a href="MPI-Data-Distribution.html#MPI-Data-Distribution" accesskey="p" rel="prev">MPI Data Distribution</a>, Up: <a href="Distributed_002dmemory-FFTW-with-MPI.html#Distributed_002dmemory-FFTW-with-MPI" accesskey="u" rel="up">Distributed-memory FFTW with MPI</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
Chris@82 201 </div>
Chris@82 202
Chris@82 203
Chris@82 204
Chris@82 205 </body>
Chris@82 206 </html>