annotate src/fftw-3.3.3/doc/html/One_002ddimensional-distributions.html @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI incompatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents 37bf6b4a2645
children
rev   line source
Chris@10 1 <html lang="en">
Chris@10 2 <head>
Chris@10 3 <title>One-dimensional distributions - FFTW 3.3.3</title>
Chris@10 4 <meta http-equiv="Content-Type" content="text/html">
Chris@10 5 <meta name="description" content="FFTW 3.3.3">
Chris@10 6 <meta name="generator" content="makeinfo 4.13">
Chris@10 7 <link title="Top" rel="start" href="index.html#Top">
Chris@10 8 <link rel="up" href="MPI-Data-Distribution.html#MPI-Data-Distribution" title="MPI Data Distribution">
Chris@10 9 <link rel="prev" href="Transposed-distributions.html#Transposed-distributions" title="Transposed distributions">
Chris@10 10 <link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
Chris@10 11 <!--
Chris@10 12 This manual is for FFTW
Chris@10 13 (version 3.3.3, 25 November 2012).
Chris@10 14
Chris@10 15 Copyright (C) 2003 Matteo Frigo.
Chris@10 16
Chris@10 17 Copyright (C) 2003 Massachusetts Institute of Technology.
Chris@10 18
Chris@10 19 Permission is granted to make and distribute verbatim copies of
Chris@10 20 this manual provided the copyright notice and this permission
Chris@10 21 notice are preserved on all copies.
Chris@10 22
Chris@10 23 Permission is granted to copy and distribute modified versions of
Chris@10 24 this manual under the conditions for verbatim copying, provided
Chris@10 25 that the entire resulting derived work is distributed under the
Chris@10 26 terms of a permission notice identical to this one.
Chris@10 27
Chris@10 28 Permission is granted to copy and distribute translations of this
Chris@10 29 manual into another language, under the above conditions for
Chris@10 30 modified versions, except that this permission notice may be
Chris@10 31 stated in a translation approved by the Free Software Foundation.
Chris@10 32 -->
Chris@10 33 <meta http-equiv="Content-Style-Type" content="text/css">
Chris@10 34 <style type="text/css"><!--
Chris@10 35 pre.display { font-family:inherit }
Chris@10 36 pre.format { font-family:inherit }
Chris@10 37 pre.smalldisplay { font-family:inherit; font-size:smaller }
Chris@10 38 pre.smallformat { font-family:inherit; font-size:smaller }
Chris@10 39 pre.smallexample { font-size:smaller }
Chris@10 40 pre.smalllisp { font-size:smaller }
Chris@10 41 span.sc { font-variant:small-caps }
Chris@10 42 span.roman { font-family:serif; font-weight:normal; }
Chris@10 43 span.sansserif { font-family:sans-serif; font-weight:normal; }
Chris@10 44 --></style>
Chris@10 45 </head>
Chris@10 46 <body>
Chris@10 47 <div class="node">
Chris@10 48 <a name="One-dimensional-distributions"></a>
Chris@10 49 <a name="One_002ddimensional-distributions"></a>
Chris@10 50 <p>
Chris@10 51 Previous:&nbsp;<a rel="previous" accesskey="p" href="Transposed-distributions.html#Transposed-distributions">Transposed distributions</a>,
Chris@10 52 Up:&nbsp;<a rel="up" accesskey="u" href="MPI-Data-Distribution.html#MPI-Data-Distribution">MPI Data Distribution</a>
Chris@10 53 <hr>
Chris@10 54 </div>
Chris@10 55
Chris@10 56 <h4 class="subsection">6.4.4 One-dimensional distributions</h4>
Chris@10 57
Chris@10 58 <p>For one-dimensional distributed DFTs using FFTW, matters are slightly
Chris@10 59 more complicated because the data distribution is more closely tied to
Chris@10 60 how the algorithm works. In particular, you can no longer pass an
Chris@10 61 arbitrary block size and must accept FFTW's default, and the block
Chris@10 62 sizes may differ between input and output. The data distribution
Chris@10 63 also depends on the flags and the transform direction, so that
Chris@10 64 forward and backward transforms work correctly.
Chris@10 65
Chris@10 66 <pre class="example"> ptrdiff_t fftw_mpi_local_size_1d(ptrdiff_t n0, MPI_Comm comm,
Chris@10 67 int sign, unsigned flags,
Chris@10 68 ptrdiff_t *local_ni, ptrdiff_t *local_i_start,
Chris@10 69 ptrdiff_t *local_no, ptrdiff_t *local_o_start);
Chris@10 70 </pre>
Chris@10 71 <p><a name="index-fftw_005fmpi_005flocal_005fsize_005f1d-383"></a>
Chris@10 72 This function computes the data distribution for a 1d transform of
Chris@10 73 size <code>n0</code> with the given transform <code>sign</code> and <code>flags</code>.
Chris@10 74 Both input and output data use block distributions. The input on the
Chris@10 75 current process will consist of <code>local_ni</code> numbers starting at
Chris@10 76 index <code>local_i_start</code>; e.g. if only a single process is used,
Chris@10 77 then <code>local_ni</code> will be <code>n0</code> and <code>local_i_start</code> will
Chris@10 78 be <code>0</code>. Similarly for the output, with <code>local_no</code> numbers
Chris@10 79 starting at index <code>local_o_start</code>. The return value of
Chris@10 80 <code>fftw_mpi_local_size_1d</code> will be the total number of elements to
Chris@10 81 allocate on the current process (which might be slightly larger than
Chris@10 82 the local size due to intermediate steps in the algorithm).
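<p>A minimal sketch of how the call above might be used for an in-place
forward transform. The size <code>n0 = 1024</code> and the ramp input are
illustrative assumptions, not part of the manual; the program needs an MPI
installation and FFTW built with MPI support (typically linked with
<code>-lfftw3_mpi -lfftw3 -lm</code>).

```c
#include <stdio.h>
#include <stddef.h>
#include <fftw3-mpi.h>   /* also pulls in fftw3.h and mpi.h */

int main(int argc, char **argv)
{
    const ptrdiff_t n0 = 1024;  /* illustrative composite size (assumption) */
    ptrdiff_t local_ni, local_i_start, local_no, local_o_start;
    int rank;

    MPI_Init(&argc, &argv);
    fftw_mpi_init();
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Ask FFTW how the 1d data is distributed for this sign and flags. */
    ptrdiff_t alloc_local = fftw_mpi_local_size_1d(
        n0, MPI_COMM_WORLD, FFTW_FORWARD, FFTW_ESTIMATE,
        &local_ni, &local_i_start, &local_no, &local_o_start);

    /* Allocate the returned count, which may exceed local_ni and local_no. */
    fftw_complex *data = fftw_alloc_complex(alloc_local);

    /* Plan with the same sign and flags that were used for the size query. */
    fftw_plan plan = fftw_mpi_plan_dft_1d(n0, data, data, MPI_COMM_WORLD,
                                          FFTW_FORWARD, FFTW_ESTIMATE);
    if (data == NULL || plan == NULL) {
        fprintf(stderr, "rank %d: allocation or planning failed\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* Fill only this process's input block, using global indices. */
    for (ptrdiff_t i = 0; i < local_ni; ++i) {
        data[i][0] = (double)(local_i_start + i);
        data[i][1] = 0.0;
    }

    fftw_execute(plan);

    /* data[0 .. local_no) now holds outputs local_o_start onward. */
    printf("rank %d: in %td @ %td, out %td @ %td, alloc %td\n", rank,
           local_ni, local_i_start, local_no, local_o_start, alloc_local);

    fftw_destroy_plan(plan);
    fftw_free(data);
    fftw_mpi_cleanup();
    MPI_Finalize();
    return 0;
}
```

<p>The size query and the planner are given the same <code>sign</code> and
<code>flags</code> here because, as noted above, the distribution depends on both.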
Chris@10 83
Chris@10 84 <p>As mentioned above (see <a href="Load-balancing.html#Load-balancing">Load balancing</a>), the data will be divided
Chris@10 85 equally among the processes if <code>n0</code> is divisible by the
Chris@10 86 <em>square</em> of the number of processes. In this case,
Chris@10 87 <code>local_ni</code> will equal <code>local_no</code>. Otherwise, they may be
Chris@10 88 different.
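<p>The divisibility condition above can be checked before choosing a process
count. The helper below is a hypothetical illustration and not part of the
FFTW API; it only tests the arithmetic condition and does not predict the
actual block sizes FFTW will choose when the condition fails.

```c
#include <stddef.h>

/* Hypothetical helper (not an FFTW function): returns 1 when n0 is
 * divisible by the square of n_procs, the case in which the text says
 * the data is divided equally and local_ni equals local_no. */
static int divisible_by_square_of_procs(ptrdiff_t n0, int n_procs)
{
    if (n0 <= 0 || n_procs <= 0)
        return 0;

    ptrdiff_t p = (ptrdiff_t)n_procs;

    /* p*p divides n0 exactly when p divides n0 and p divides n0/p.
     * Testing it this way avoids overflow in p*p. */
    return (n0 % p == 0) && ((n0 / p) % p == 0);
}
```

<p>For example, with 4 processes a size of 64 satisfies the condition
(16 divides 64), while a size of 24 does not, so for 24 the input and output
block sizes may differ.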
Chris@10 89
Chris@10 90 <p>For some applications, such as convolutions, the order of the output
Chris@10 91 data is irrelevant. In this case, performance can be improved by
Chris@10 92 specifying that the output data be stored in an FFTW-defined
Chris@10 93 &ldquo;scrambled&rdquo; format. (This is the analogue of
Chris@10 94 transposed output in the multidimensional case: scrambled output saves
Chris@10 95 a communication step.) If you pass <code>FFTW_MPI_SCRAMBLED_OUT</code> in
Chris@10 96 the flags, then the output is stored in this (undocumented) scrambled
Chris@10 97 order. Conversely, to perform the inverse transform of data in
Chris@10 98 scrambled order, pass the <code>FFTW_MPI_SCRAMBLED_IN</code> flag.
Chris@10 99 <a name="index-FFTW_005fMPI_005fSCRAMBLED_005fOUT-384"></a><a name="index-FFTW_005fMPI_005fSCRAMBLED_005fIN-385"></a>
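<p>A sketch of how the two flags might be paired, assuming an in-place
round trip with an order-independent pointwise operation in between (here, a
constant scaling by <code>1/n0</code> standing in for the pointwise product of
a convolution). The size and input values are illustrative assumptions.

```c
#include <stdio.h>
#include <stddef.h>
#include <fftw3-mpi.h>

int main(int argc, char **argv)
{
    const ptrdiff_t n0 = 4096;  /* illustrative composite size (assumption) */
    const unsigned fflags = FFTW_ESTIMATE | FFTW_MPI_SCRAMBLED_OUT;
    const unsigned bflags = FFTW_ESTIMATE | FFTW_MPI_SCRAMBLED_IN;
    ptrdiff_t ni_f, is_f, no_f, os_f;   /* forward distribution  */
    ptrdiff_t ni_b, is_b, no_b, os_b;   /* backward distribution */

    MPI_Init(&argc, &argv);
    fftw_mpi_init();

    /* The distribution depends on sign and flags, so query each plan's. */
    ptrdiff_t alloc_f = fftw_mpi_local_size_1d(n0, MPI_COMM_WORLD,
        FFTW_FORWARD, fflags, &ni_f, &is_f, &no_f, &os_f);
    ptrdiff_t alloc_b = fftw_mpi_local_size_1d(n0, MPI_COMM_WORLD,
        FFTW_BACKWARD, bflags, &ni_b, &is_b, &no_b, &os_b);
    ptrdiff_t alloc = alloc_f > alloc_b ? alloc_f : alloc_b;

    fftw_complex *data = fftw_alloc_complex(alloc);
    fftw_plan fwd = fftw_mpi_plan_dft_1d(n0, data, data, MPI_COMM_WORLD,
                                         FFTW_FORWARD, fflags);
    fftw_plan bwd = fftw_mpi_plan_dft_1d(n0, data, data, MPI_COMM_WORLD,
                                         FFTW_BACKWARD, bflags);
    if (data == NULL || fwd == NULL || bwd == NULL) {
        fprintf(stderr, "allocation or planning failed\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* Natural-order input block for the forward transform. */
    for (ptrdiff_t i = 0; i < ni_f; ++i) {
        data[i][0] = (double)(is_f + i);
        data[i][1] = 0.0;
    }

    fftw_execute(fwd);  /* local output is now in scrambled order */

    /* Pointwise work that does not care which frequency is where. */
    for (ptrdiff_t k = 0; k < no_f; ++k) {
        data[k][0] /= (double)n0;
        data[k][1] /= (double)n0;
    }

    fftw_execute(bwd);  /* consumes scrambled input, emits natural order */

    /* data[0 .. no_b) holds this process's block, starting at os_b. */

    fftw_destroy_plan(bwd);
    fftw_destroy_plan(fwd);
    fftw_free(data);
    fftw_mpi_cleanup();
    MPI_Finalize();
    return 0;
}
```

<p>Because the scrambled order is undocumented, the middle loop deliberately
avoids any use of the frequency index.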
Chris@10 100
Chris@10 101 <p>In MPI FFTW, only composite sizes <code>n0</code> can be parallelized; we
Chris@10 102 have not yet implemented a parallel algorithm for large prime sizes.
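<p>Since only composite sizes are parallelized, an application might check
<code>n0</code> before setting up a distributed 1d transform. The helper below
is a hypothetical illustration using plain trial division, not an FFTW
function; what FFTW itself does when given a prime size is not stated here.

```c
#include <stddef.h>

/* Hypothetical helper (not an FFTW function): returns 1 if n has a
 * divisor other than 1 and itself, i.e. n is composite. Trial division
 * is adequate for a one-off check of a transform size. */
static int is_composite(ptrdiff_t n)
{
    if (n < 4)
        return 0;   /* values below 4 have no proper divisor */

    /* d <= n / d is the overflow-safe form of d * d <= n. */
    for (ptrdiff_t d = 2; d <= n / d; ++d) {
        if (n % d == 0)
            return 1;
    }
    return 0;
}
```

<p>A caller could, for instance, pad a prime-length signal to the next
composite length when the application permits it, or fall back to a serial
transform.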
Chris@10 103
Chris@10 104 <!-- -->
Chris@10 105 </body></html>
Chris@10 106