<html lang="en">
<head>
<title>Load balancing - FFTW 3.3.3</title>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="FFTW 3.3.3">
<meta name="generator" content="makeinfo 4.13">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="MPI-Data-Distribution.html#MPI-Data-Distribution" title="MPI Data Distribution">
<link rel="prev" href="Basic-and-advanced-distribution-interfaces.html#Basic-and-advanced-distribution-interfaces" title="Basic and advanced distribution interfaces">
<link rel="next" href="Transposed-distributions.html#Transposed-distributions" title="Transposed distributions">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
This manual is for FFTW
(version 3.3.3, 25 November 2012).

Copyright (C) 2003 Matteo Frigo.

Copyright (C) 2003 Massachusetts Institute of Technology.

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission
notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided
that the entire resulting derived work is distributed under the
terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for
modified versions, except that this permission notice may be
stated in a translation approved by the Free Software Foundation.
-->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
  pre.display { font-family:inherit }
  pre.format  { font-family:inherit }
  pre.smalldisplay { font-family:inherit; font-size:smaller }
  pre.smallformat  { font-family:inherit; font-size:smaller }
  pre.smallexample { font-size:smaller }
  pre.smalllisp    { font-size:smaller }
  span.sc    { font-variant:small-caps }
  span.roman { font-family:serif; font-weight:normal; }
  span.sansserif { font-family:sans-serif; font-weight:normal; }
--></style>
</head>
<body>
<div class="node">
<a name="Load-balancing"></a>
<p>
Next: <a rel="next" accesskey="n" href="Transposed-distributions.html#Transposed-distributions">Transposed distributions</a>,
Previous: <a rel="previous" accesskey="p" href="Basic-and-advanced-distribution-interfaces.html#Basic-and-advanced-distribution-interfaces">Basic and advanced distribution interfaces</a>,
Up: <a rel="up" accesskey="u" href="MPI-Data-Distribution.html#MPI-Data-Distribution">MPI Data Distribution</a>
<hr>
</div>

<h4 class="subsection">6.4.2 Load balancing</h4>

<p><a name="index-load-balancing-378"></a>
Ideally, when you parallelize a transform over some P
processes, each process should end up with work that takes equal time.
Otherwise, all of the processes end up waiting on whichever process is
slowest. This goal is known as “load balancing.” In this section,
we describe the circumstances under which FFTW is able to load-balance
well, and in particular how you should choose your transform size in
order to load balance.
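
<p>As a rough illustration of how a candidate transform size might be
checked in advance, the following C sketch tests the divisibility
conditions that this section describes for P processes. The helper names
<code>balanced_multidim</code> and <code>balanced_1d</code> are
hypothetical and are not part of the FFTW API. The sketch also assumes
that all arguments are positive and small enough that the products do
not overflow a <code>long</code>.

```c
/* Hypothetical helpers (NOT part of FFTW): they only test the
   divisibility conditions described in this section. All arguments
   are assumed positive and small enough to avoid overflow. */

/* Multi-dimensional complex DFT: n0 is the first dimension, rest is
   the product of the subsequent dimensions, howmany is the vector
   length (use 1 for the basic interface), and P is the number of
   processes. Returns 1 if both conditions hold, 0 otherwise. */
int balanced_multidim(long n0, long rest, long howmany, long P)
{
    return n0 % P == 0 && (rest * howmany) % P == 0;
}

/* One-dimensional complex DFT of length N over P processes.
   Returns 1 if N is divisible by P squared, 0 otherwise. */
int balanced_1d(long N, long P)
{
    return N % (P * P) == 0;
}
```

<p>For instance, under these assumptions a 128 by 96 transform on 8
processes passes the multi-dimensional check, whereas a one-dimensional
transform of length 96 on 8 processes does not, since 96 is not
divisible by 64.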

<p>Load balancing is especially difficult when you are parallelizing over
heterogeneous machines; for example, if one of your processors is an
old 486 and another is a Pentium IV, obviously you should give the
Pentium more work to do than the 486, since the latter is much slower.
FFTW does not deal with this problem, however: it assumes that your
processes run on hardware of comparable speed, and that the goal is
therefore to divide the problem as equally as possible.

<p>For a multi-dimensional complex DFT, FFTW can divide the problem
equally among the processes if: (i) the <em>first</em> dimension
<code>n0</code> is divisible by P; and (ii) the <em>product</em> of
the subsequent dimensions is divisible by P. (For the advanced
interface, where you can specify multiple simultaneous transforms via
some “vector” length <code>howmany</code>, a factor of <code>howmany</code> is
included in the product of the subsequent dimensions.)

<p>For a one-dimensional complex DFT, the length <code>N</code> of the data
should be divisible by P <em>squared</em> for FFTW to be able to divide
the problem equally among the processes.

</body></html>