comparison src/fftw-3.3.3/doc/html/Multi_002ddimensional-MPI-DFTs-of-Real-Data.html @ 10:37bf6b4a2645

Add FFTW3
author Chris Cannam
date Wed, 20 Mar 2013 15:35:50 +0000
9:c0fb53affa76 10:37bf6b4a2645
1 <html lang="en">
2 <head>
3 <title>Multi-dimensional MPI DFTs of Real Data - FFTW 3.3.3</title>
4 <meta http-equiv="Content-Type" content="text/html">
5 <meta name="description" content="FFTW 3.3.3">
6 <meta name="generator" content="makeinfo 4.13">
7 <link title="Top" rel="start" href="index.html#Top">
8 <link rel="up" href="Distributed_002dmemory-FFTW-with-MPI.html#Distributed_002dmemory-FFTW-with-MPI" title="Distributed-memory FFTW with MPI">
9 <link rel="prev" href="MPI-Data-Distribution.html#MPI-Data-Distribution" title="MPI Data Distribution">
10 <link rel="next" href="Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms.html#Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms" title="Other Multi-dimensional Real-data MPI Transforms">
11 <link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
12 <!--
13 This manual is for FFTW
14 (version 3.3.3, 25 November 2012).
15
16 Copyright (C) 2003 Matteo Frigo.
17
18 Copyright (C) 2003 Massachusetts Institute of Technology.
19
20 Permission is granted to make and distribute verbatim copies of
21 this manual provided the copyright notice and this permission
22 notice are preserved on all copies.
23
24 Permission is granted to copy and distribute modified versions of
25 this manual under the conditions for verbatim copying, provided
26 that the entire resulting derived work is distributed under the
27 terms of a permission notice identical to this one.
28
29 Permission is granted to copy and distribute translations of this
30 manual into another language, under the above conditions for
31 modified versions, except that this permission notice may be
32 stated in a translation approved by the Free Software Foundation.
33 -->
34 <meta http-equiv="Content-Style-Type" content="text/css">
35 <style type="text/css"><!--
36 pre.display { font-family:inherit }
37 pre.format { font-family:inherit }
38 pre.smalldisplay { font-family:inherit; font-size:smaller }
39 pre.smallformat { font-family:inherit; font-size:smaller }
40 pre.smallexample { font-size:smaller }
41 pre.smalllisp { font-size:smaller }
42 span.sc { font-variant:small-caps }
43 span.roman { font-family:serif; font-weight:normal; }
44 span.sansserif { font-family:sans-serif; font-weight:normal; }
45 --></style>
46 </head>
47 <body>
48 <div class="node">
49 <a name="Multi-dimensional-MPI-DFTs-of-Real-Data"></a>
50 <a name="Multi_002ddimensional-MPI-DFTs-of-Real-Data"></a>
51 <p>
52 Next:&nbsp;<a rel="next" accesskey="n" href="Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms.html#Other-Multi_002ddimensional-Real_002ddata-MPI-Transforms">Other Multi-dimensional Real-data MPI Transforms</a>,
53 Previous:&nbsp;<a rel="previous" accesskey="p" href="MPI-Data-Distribution.html#MPI-Data-Distribution">MPI Data Distribution</a>,
54 Up:&nbsp;<a rel="up" accesskey="u" href="Distributed_002dmemory-FFTW-with-MPI.html#Distributed_002dmemory-FFTW-with-MPI">Distributed-memory FFTW with MPI</a>
55 <hr>
56 </div>
57
58 <h3 class="section">6.5 Multi-dimensional MPI DFTs of Real Data</h3>
59
60 <p>FFTW's MPI interface also supports multi-dimensional DFTs of real
61 data, similar to the serial r2c and c2r interfaces. (Parallel
62 one-dimensional real-data DFTs are not currently supported; you must
63 use a complex transform and set the imaginary parts of the inputs to
64 zero.)
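<p>The workaround for one-dimensional real data amounts to copying each real sample into the real part of a complex element and zeroing the imaginary part. The following is only a minimal sketch of that copy: <code>cplx_t</code> and <code>real_to_complex</code> are hypothetical names, not part of FFTW, and <code>cplx_t</code> merely stands in for <code>fftw_complex</code>. In a distributed transform each process would apply this to its own local portion of the data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for fftw_complex, which fftw3.h defines as
   double[2] (real part, then imaginary part) unless <complex.h> is
   included first. */
typedef double cplx_t[2];

/* Copy n real samples into a complex array, setting every imaginary
   part to zero so that a complex DFT can be applied to real input. */
void real_to_complex(const double *in, cplx_t *out, ptrdiff_t n)
{
    ptrdiff_t i;
    assert(in != NULL && out != NULL && n >= 0);
    for (i = 0; i < n; ++i) {
        out[i][0] = in[i];  /* real part */
        out[i][1] = 0.0;    /* imaginary part */
    }
}
```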
65
66 <p>The key points to understand for r2c and c2r MPI transforms (compared
67 to the MPI complex DFTs or the serial r2c/c2r transforms) are:
68
69 <ul>
70 <li>Just as for serial transforms, r2c/c2r DFTs transform n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub> real
71 data to/from n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;(n<sub>d-1</sub>/2 + 1) complex data: the last dimension of the
72 complex data is cut in half (rounded down), plus one. As for the
73 serial transforms, the sizes you pass to the &lsquo;<samp><span class="samp">plan_dft_r2c</span></samp>&rsquo; and
74 &lsquo;<samp><span class="samp">plan_dft_c2r</span></samp>&rsquo; planner functions are the n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub> dimensions of the real data.
75
76 <li><a name="index-padding-386"></a>Although the real data is <em>conceptually</em> n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;n<sub>d-1</sub>, it is
77 <em>physically</em> stored as an n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;[2&nbsp;(n<sub>d-1</sub>/2 + 1)] array, where the last
78 dimension has been <em>padded</em> to make it the same size as the
79 complex output. This is much like the in-place serial r2c/c2r
80 interface (see <a href="Multi_002dDimensional-DFTs-of-Real-Data.html#Multi_002dDimensional-DFTs-of-Real-Data">Multi-Dimensional DFTs of Real Data</a>), except that
81 in MPI the padding is required even for out-of-place data. The extra
82 padding numbers are ignored by FFTW (they are <em>not</em> like
83 zero-padding the transform to a larger size); they are only used to
84 determine the data layout.
85
86 <li><a name="index-data-distribution-387"></a>The data distribution in MPI for <em>both</em> the real and complex data
87 is determined by the shape of the <em>complex</em> data. That is, you
88 call the appropriate &lsquo;<samp><span class="samp">local size</span></samp>&rsquo; function for the n<sub>0</sub>&nbsp;&times;&nbsp;n<sub>1</sub>&nbsp;&times;&nbsp;n<sub>2</sub>&nbsp;&times;&nbsp;&hellip;&nbsp;&times;&nbsp;(n<sub>d-1</sub>/2 + 1)
89
90 complex data, and then use the <em>same</em> distribution for the real
91 data except that the last complex dimension is replaced by a (padded)
92 real dimension of twice the length.
93
94 </ul>
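<p>The size arithmetic in the points above can be sketched in a few lines of plain C. The helper names <code>complex_last_dim</code> and <code>padded_real_last_dim</code> are hypothetical, chosen only for illustration; they are not FFTW functions.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the last dimension of the complex data, given a real last
   dimension of length n: n/2 + 1, where the integer division rounds
   down. */
ptrdiff_t complex_last_dim(ptrdiff_t n)
{
    assert(n > 0);
    return n / 2 + 1;
}

/* Length of the padded last dimension of the real array: twice the
   complex length, so that a real row and a complex row occupy the same
   number of doubles. */
ptrdiff_t padded_real_last_dim(ptrdiff_t n)
{
    return 2 * complex_last_dim(n);
}
```

<p>For a last dimension of 8, the complex length is 5 and the padded real length is 10, leaving two unused padding values per row; for a last dimension of 7, the lengths are 4 and 8, leaving one.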
95
96 <p>For example, suppose we are performing an out-of-place r2c transform of
97 L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;N real data [padded to L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;2(N/2+1)],
98 resulting in L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;(N/2+1) complex data. Similar to the
99 example in <a href="2d-MPI-example.html#g_t2d-MPI-example">2d MPI example</a>, we might do something like:
100
101 <pre class="example">     #include &lt;fftw3-mpi.h&gt;
102
103      int main(int argc, char **argv)
104      {
105          const ptrdiff_t L = ..., M = ..., N = ...;
106          fftw_plan plan;
107          double *rin;
108          fftw_complex *cout;
109          ptrdiff_t alloc_local, local_n0, local_0_start, i, j, k;
110
111          MPI_Init(&amp;argc, &amp;argv);
112          fftw_mpi_init();
113
114          /* <span class="roman">get local data size and allocate</span> */
115          alloc_local = fftw_mpi_local_size_3d(L, M, N/2+1, MPI_COMM_WORLD,
116                                               &amp;local_n0, &amp;local_0_start);
117          rin = fftw_alloc_real(2 * alloc_local);
118          cout = fftw_alloc_complex(alloc_local);
119
120          /* <span class="roman">create plan for out-of-place r2c DFT</span> */
121          plan = fftw_mpi_plan_dft_r2c_3d(L, M, N, rin, cout, MPI_COMM_WORLD,
122                                          FFTW_MEASURE);
123
124          /* <span class="roman">initialize rin to some function</span> my_func(x,y,z) */
125          for (i = 0; i &lt; local_n0; ++i)
126              for (j = 0; j &lt; M; ++j)
127                  for (k = 0; k &lt; N; ++k)
128                      rin[(i*M + j) * (2*(N/2+1)) + k] = my_func(local_0_start+i, j, k);
129
130          /* <span class="roman">compute transforms as many times as desired</span> */
131          fftw_execute(plan);
132
133          fftw_destroy_plan(plan);
134
135          MPI_Finalize();
136      }
137 </pre>
138 <p><a name="index-fftw_005falloc_005freal-388"></a><a name="index-row_002dmajor-389"></a>Note that we allocated <code>rin</code> using <code>fftw_alloc_real</code> with an
139 argument of <code>2 * alloc_local</code>: since <code>alloc_local</code> is the
140 number of <em>complex</em> values to allocate, the number of <em>real</em>
141 values is twice as many. The <code>rin</code> array is then
142 local_n0&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;2(N/2+1) in row-major order, so its
143 <code>(i,j,k)</code> element is at the index <code>(i*M + j) * (2*(N/2+1)) +
144 k</code> (see <a href="Multi_002ddimensional-Array-Format.html#Multi_002ddimensional-Array-Format">Multi-dimensional Array Format</a>).
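<p>To make the effect of the padding on indexing concrete, the row-major formula can be wrapped in a small helper. This is only an illustrative sketch: <code>rin_index</code> is a hypothetical name, not an FFTW function, and it assumes that <code>i</code> is the local (per-process) first index.

```c
#include <assert.h>
#include <stddef.h>

/* Offset, in doubles, of logical element (i, j, k) within the local
   padded real array.  i is the local first index (0 <= i < local_n0),
   j runs over 0..M-1, and k runs over the N logical (unpadded) values
   of the last dimension. */
ptrdiff_t rin_index(ptrdiff_t i, ptrdiff_t j, ptrdiff_t k,
                    ptrdiff_t M, ptrdiff_t N)
{
    /* physical row length, including padding */
    const ptrdiff_t padded = 2 * (N / 2 + 1);
    assert(i >= 0 && j >= 0 && j < M && k >= 0 && k < N);
    return (i * M + j) * padded + k;
}
```

<p>With M = 3 and N = 4 the padded row length is 6: element (0,0,3) sits at offset 3 and element (0,1,0) at offset 6, so offsets 4 and 5 are padding that the loop over <code>k</code> never touches.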
145
146 <p><a name="index-transpose-390"></a><a name="index-FFTW_005fTRANSPOSED_005fOUT-391"></a><a name="index-FFTW_005fTRANSPOSED_005fIN-392"></a>As for the complex transforms, improved performance can be obtained by
147 specifying that the output is the transpose of the input or vice versa
148 (see <a href="Transposed-distributions.html#Transposed-distributions">Transposed distributions</a>). In our L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;N r2c
149 example, including <code>FFTW_MPI_TRANSPOSED_OUT</code> in the flags means that
150 the input would be a padded L&nbsp;&times;&nbsp;M&nbsp;&times;&nbsp;2(N/2+1) real array
151 distributed over the <code>L</code> dimension, while the output would be an
152 M&nbsp;&times;&nbsp;L&nbsp;&times;&nbsp;(N/2+1) complex array distributed over the <code>M</code>
153 dimension. To perform the inverse c2r transform with the same data
154 distributions, you would use the <code>FFTW_MPI_TRANSPOSED_IN</code> flag.
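<p>With transposed output, the local complex block is indexed with the <code>M</code> index first. The sketch below shows the corresponding row-major offset under that assumption. <code>cout_transposed_index</code> is a hypothetical helper, not an FFTW function; the number of local <code>M</code> rows would presumably come from a call such as <code>fftw_mpi_local_size_3d_transposed</code>.

```c
#include <assert.h>
#include <stddef.h>

/* Offset, in complex elements, of output element (j, i, k) within the
   local block of a transposed M x L x (N/2+1) complex array that is
   distributed over the M dimension.  j is the local index along M,
   i runs over 0..L-1, and k runs over 0..N/2. */
ptrdiff_t cout_transposed_index(ptrdiff_t j, ptrdiff_t i, ptrdiff_t k,
                                ptrdiff_t L, ptrdiff_t N)
{
    /* length of the last complex dimension */
    const ptrdiff_t nc = N / 2 + 1;
    assert(j >= 0 && i >= 0 && i < L && k >= 0 && k < nc);
    return (j * L + i) * nc + k;
}
```

<p>For L = 2 and N = 4 the last complex dimension has length 3, so element (0,1,2) sits at offset 5 and element (1,0,0) at offset 6.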
155
156 <!-- -->
157 </body></html>
158