comparison Lib/fftw-3.2.1/doc/html/.svn/text-base/MPI-data-distribution.html.svn-base @ 0:25bf17994ef1

First commit. VS2013, Codeblocks and Mac OSX configuration
author Geogaddi\David <d.m.ronan@qmul.ac.uk>
date Thu, 09 Jul 2015 01:12:16 +0100
1 <html lang="en">
2 <head>
3 <title>MPI data distribution - FFTW 3.2alpha3</title>
4 <meta http-equiv="Content-Type" content="text/html">
5 <meta name="description" content="FFTW 3.2alpha3">
6 <meta name="generator" content="makeinfo 4.8">
7 <link title="Top" rel="start" href="index.html#Top">
8 <link rel="up" href="Distributed_002dmemory-FFTW-with-MPI.html#Distributed_002dmemory-FFTW-with-MPI" title="Distributed-memory FFTW with MPI">
9 <link rel="prev" href="Simple-MPI-example.html#Simple-MPI-example" title="Simple MPI example">
10 <link rel="next" href="Multi_002ddimensional-MPI-DFT-of-Real-Data.html#Multi_002ddimensional-MPI-DFT-of-Real-Data" title="Multi-dimensional MPI DFT of Real Data">
11 <link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
12 <!--
13 This manual is for FFTW
14 (version 3.2alpha3, 14 August 2007).
15
16 Copyright (C) 2003 Matteo Frigo.
17
18 Copyright (C) 2003 Massachusetts Institute of Technology.
19
20 Permission is granted to make and distribute verbatim copies of
21 this manual provided the copyright notice and this permission
22 notice are preserved on all copies.
23
24 Permission is granted to copy and distribute modified versions of
25 this manual under the conditions for verbatim copying, provided
26 that the entire resulting derived work is distributed under the
27 terms of a permission notice identical to this one.
28
29 Permission is granted to copy and distribute translations of this
30 manual into another language, under the above conditions for
31 modified versions, except that this permission notice may be
32 stated in a translation approved by the Free Software Foundation.
33 -->
34 <meta http-equiv="Content-Style-Type" content="text/css">
35 <style type="text/css"><!--
36 pre.display { font-family:inherit }
37 pre.format { font-family:inherit }
38 pre.smalldisplay { font-family:inherit; font-size:smaller }
39 pre.smallformat { font-family:inherit; font-size:smaller }
40 pre.smallexample { font-size:smaller }
41 pre.smalllisp { font-size:smaller }
42 span.sc { font-variant:small-caps }
43 span.roman { font-family:serif; font-weight:normal; }
44 span.sansserif { font-family:sans-serif; font-weight:normal; }
45 --></style>
46 </head>
47 <body>
48 <div class="node">
49 <p>
50 <a name="MPI-data-distribution"></a>
51 Next:&nbsp;<a rel="next" accesskey="n" href="Multi_002ddimensional-MPI-DFT-of-Real-Data.html#Multi_002ddimensional-MPI-DFT-of-Real-Data">Multi-dimensional MPI DFT of Real Data</a>,
52 Previous:&nbsp;<a rel="previous" accesskey="p" href="Simple-MPI-example.html#Simple-MPI-example">Simple MPI example</a>,
53 Up:&nbsp;<a rel="up" accesskey="u" href="Distributed_002dmemory-FFTW-with-MPI.html#Distributed_002dmemory-FFTW-with-MPI">Distributed-memory FFTW with MPI</a>
54 <hr>
55 </div>
56
57 <h3 class="section">6.4 MPI data distribution</h3>
58
59 <p><a name="index-data-distribution-350"></a>
60 The most important concept to understand in using FFTW's MPI interface
61 is the data distribution. With a serial or multithreaded FFT, all of
62 the inputs and outputs are stored as a single contiguous chunk of
63 memory. With a distributed-memory FFT, the inputs and outputs are
64 broken into disjoint blocks, one per process.
65
66 <p>In particular, FFTW uses a <em>1d block distribution</em> of the data,
67 distributed along the <em>first dimension</em>. For example, if you
68 want to perform a 100&nbsp;&times;&nbsp;200 complex DFT, distributed over 4
69 processes, each process will get a 25&nbsp;&times;&nbsp;200 slice of the data.
70 That is, process 0 will get rows 0 through 24, process 1 will get rows
71 25 through 49, process 2 will get rows 50 through 74, and process 3
72 will get rows 75 through 99. If you instead distribute the same
73 array over 3 processes, the 100 rows do not divide evenly, so the
74 processes will hold unequal chunks. FFTW's default choice
75 in this case is to assign 34 rows to processes 0 and 1, and 32 rows to
76 process 2.
77 <a name="index-block-distribution-351"></a>
78 FFTW provides several `<samp><span class="samp">fftw_mpi_local_size</span></samp>' routines that you can
79 call to find out what portion of an array is stored on the current
80 process. In most cases, you should use the default block sizes picked
81 by FFTW, but it is also possible to specify your own block size. For
82 example, with a 100&nbsp;&times;&nbsp;200 array on three processes, you can
83 tell FFTW to use a block size of 40, which would assign 40 rows to
84 processes 0 and 1, and 20 rows to process 2. FFTW's default is to
85 divide the data equally among the processes if possible, and as best
86 it can otherwise. The rows are always assigned in &ldquo;rank order,&rdquo;
87 i.e. process 0 gets the first block of rows, then process 1, and so
88 on. (You can change this by using <code>MPI_Comm_split</code> to create a
89 new communicator with re-ordered processes.) However, you should
90 always call the `<samp><span class="samp">fftw_mpi_local_size</span></samp>' routines, if possible,
91 rather than trying to predict FFTW's distribution choices.
92
93 <ul class="menu">
94 <li><a accesskey="1" href="Basic-and-advanced-distribution-interfaces.html#Basic-and-advanced-distribution-interfaces">Basic and advanced distribution interfaces</a>
95 <li><a accesskey="2" href="Load-balancing.html#Load-balancing">Load balancing</a>
96 <li><a accesskey="3" href="Transposed-distributions.html#Transposed-distributions">Transposed distributions</a>
97 <li><a accesskey="4" href="One_002ddimensional-distributions.html#One_002ddimensional-distributions">One-dimensional distributions</a>
98 </ul>
99
100 </body></html>
101