annotate src/fftw-3.3.8/doc/html/SIMD-alignment-and-fftw_005fmalloc.html @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI incompatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents d0c2a83c1364
children
rev   line source
Chris@82 1 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
Chris@82 2 <html>
Chris@82 3 <!-- This manual is for FFTW
Chris@82 4 (version 3.3.8, 24 May 2018).
Chris@82 5
Chris@82 6 Copyright (C) 2003 Matteo Frigo.
Chris@82 7
Chris@82 8 Copyright (C) 2003 Massachusetts Institute of Technology.
Chris@82 9
Chris@82 10 Permission is granted to make and distribute verbatim copies of this
Chris@82 11 manual provided the copyright notice and this permission notice are
Chris@82 12 preserved on all copies.
Chris@82 13
Chris@82 14 Permission is granted to copy and distribute modified versions of this
Chris@82 15 manual under the conditions for verbatim copying, provided that the
Chris@82 16 entire resulting derived work is distributed under the terms of a
Chris@82 17 permission notice identical to this one.
Chris@82 18
Chris@82 19 Permission is granted to copy and distribute translations of this manual
Chris@82 20 into another language, under the above conditions for modified versions,
Chris@82 21 except that this permission notice may be stated in a translation
Chris@82 22 approved by the Free Software Foundation. -->
Chris@82 23 <!-- Created by GNU Texinfo 6.3, http://www.gnu.org/software/texinfo/ -->
Chris@82 24 <head>
Chris@82 25 <title>FFTW 3.3.8: SIMD alignment and fftw_malloc</title>
Chris@82 26
Chris@82 27 <meta name="description" content="FFTW 3.3.8: SIMD alignment and fftw_malloc">
Chris@82 28 <meta name="keywords" content="FFTW 3.3.8: SIMD alignment and fftw_malloc">
Chris@82 29 <meta name="resource-type" content="document">
Chris@82 30 <meta name="distribution" content="global">
Chris@82 31 <meta name="Generator" content="makeinfo">
Chris@82 32 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Chris@82 33 <link href="index.html#Top" rel="start" title="Top">
Chris@82 34 <link href="Concept-Index.html#Concept-Index" rel="index" title="Concept Index">
Chris@82 35 <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
Chris@82 36 <link href="Other-Important-Topics.html#Other-Important-Topics" rel="up" title="Other Important Topics">
Chris@82 37 <link href="Multi_002ddimensional-Array-Format.html#Multi_002ddimensional-Array-Format" rel="next" title="Multi-dimensional Array Format">
Chris@82 38 <link href="Other-Important-Topics.html#Other-Important-Topics" rel="prev" title="Other Important Topics">
Chris@82 39 <style type="text/css">
Chris@82 40 <!--
Chris@82 41 a.summary-letter {text-decoration: none}
Chris@82 42 blockquote.indentedblock {margin-right: 0em}
Chris@82 43 blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
Chris@82 44 blockquote.smallquotation {font-size: smaller}
Chris@82 45 div.display {margin-left: 3.2em}
Chris@82 46 div.example {margin-left: 3.2em}
Chris@82 47 div.lisp {margin-left: 3.2em}
Chris@82 48 div.smalldisplay {margin-left: 3.2em}
Chris@82 49 div.smallexample {margin-left: 3.2em}
Chris@82 50 div.smalllisp {margin-left: 3.2em}
Chris@82 51 kbd {font-style: oblique}
Chris@82 52 pre.display {font-family: inherit}
Chris@82 53 pre.format {font-family: inherit}
Chris@82 54 pre.menu-comment {font-family: serif}
Chris@82 55 pre.menu-preformatted {font-family: serif}
Chris@82 56 pre.smalldisplay {font-family: inherit; font-size: smaller}
Chris@82 57 pre.smallexample {font-size: smaller}
Chris@82 58 pre.smallformat {font-family: inherit; font-size: smaller}
Chris@82 59 pre.smalllisp {font-size: smaller}
Chris@82 60 span.nolinebreak {white-space: nowrap}
Chris@82 61 span.roman {font-family: initial; font-weight: normal}
Chris@82 62 span.sansserif {font-family: sans-serif; font-weight: normal}
Chris@82 63 ul.no-bullet {list-style: none}
Chris@82 64 -->
Chris@82 65 </style>
Chris@82 66
Chris@82 67
Chris@82 68 </head>
Chris@82 69
Chris@82 70 <body lang="en">
Chris@82 71 <a name="SIMD-alignment-and-fftw_005fmalloc"></a>
Chris@82 72 <div class="header">
Chris@82 73 <p>
Chris@82 74 Next: <a href="Multi_002ddimensional-Array-Format.html#Multi_002ddimensional-Array-Format" accesskey="n" rel="next">Multi-dimensional Array Format</a>, Previous: <a href="Other-Important-Topics.html#Other-Important-Topics" accesskey="p" rel="prev">Other Important Topics</a>, Up: <a href="Other-Important-Topics.html#Other-Important-Topics" accesskey="u" rel="up">Other Important Topics</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
Chris@82 75 </div>
Chris@82 76 <hr>
Chris@82 77 <a name="SIMD-alignment-and-fftw_005fmalloc-1"></a>
Chris@82 78 <h3 class="section">3.1 SIMD alignment and fftw_malloc</h3>
Chris@82 79
Chris@82 80 <p>SIMD, which stands for &ldquo;Single Instruction Multiple Data,&rdquo; refers to
Chris@82 81 special instructions, supported by some processors, that perform a single
Chris@82 82 operation on several numbers (usually 2 or 4) simultaneously. SIMD
Chris@82 83 floating-point instructions are available on several popular CPUs:
Chris@82 84 SSE/SSE2/AVX/AVX2/AVX512/KCVI on some x86/x86-64 processors, AltiVec and
Chris@82 85 VSX on some POWER/PowerPC processors, and NEON on some ARM models. FFTW can be
Chris@82 86 compiled to support the SIMD instructions on any of these systems.
Chris@82 87 <a name="index-SIMD-1"></a>
Chris@82 88 <a name="index-SSE"></a>
Chris@82 89 <a name="index-SSE2"></a>
Chris@82 90 <a name="index-AVX"></a>
Chris@82 91 <a name="index-AVX2"></a>
Chris@82 92 <a name="index-AVX512"></a>
Chris@82 93 <a name="index-AltiVec"></a>
Chris@82 94 <a name="index-VSX"></a>
Chris@82 95 <a name="index-precision-2"></a>
Chris@82 96 </p>
Chris@82 97
Chris@82 98 <p>A program linking to an FFTW library compiled with SIMD support can
Chris@82 99 obtain a nonnegligible speedup for most complex and r2c/c2r
Chris@82 100 transforms. In order to obtain this speedup, however, the arrays of
Chris@82 101 complex (or real) data passed to FFTW must be specially aligned in
Chris@82 102 memory (typically 16-byte aligned), and often this alignment is more
Chris@82 103 stringent than that provided by the usual <code>malloc</code> (etc.)
Chris@82 104 allocation routines.
Chris@82 105 </p>
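<p>As a rough illustration of what &ldquo;16-byte aligned&rdquo; means, the
following sketch tests whether an address is a multiple of a given
power-of-two alignment. The helper <code>sketch_is_aligned</code> is
hypothetical and is not part of FFTW; it also assumes a platform where
converting a pointer to <code>uintptr_t</code> exposes the address bits,
which holds on mainstream systems but is implementation-defined in C.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of FFTW): returns 1 if the address p is a
   multiple of `alignment` bytes, and 0 otherwise.
   `alignment` must be a nonzero power of two, e.g. 16. */
int sketch_is_aligned(const void *p, size_t alignment)
{
    /* A power of two has exactly one bit set. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* The low bits of an aligned address are all zero. */
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}
```

<p>Calling <code>sketch_is_aligned(buf, 16)</code> on a buffer obtained from
<code>malloc</code> would report whether that particular buffer happens to
meet a 16-byte requirement; the result may differ between platforms and
allocators.
</p>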
Chris@82 106 <a name="index-portability"></a>
Chris@82 107 <p>Therefore, to guarantee proper alignment for SIMD in case
Chris@82 108 your program is ever linked against a SIMD-enabled FFTW, we recommend
Chris@82 109 allocating your transform data with <code>fftw_malloc</code> and
Chris@82 110 de-allocating it with <code>fftw_free</code>.
Chris@82 111 <a name="index-fftw_005fmalloc-1"></a>
Chris@82 112 <a name="index-fftw_005ffree-1"></a>
Chris@82 113 These have exactly the same interface and behavior as
Chris@82 114 <code>malloc</code>/<code>free</code>, except that for a SIMD FFTW they ensure
Chris@82 115 that the returned pointer has the necessary alignment (by calling
Chris@82 116 <code>memalign</code> or its equivalent on your OS).
Chris@82 117 </p>
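<p>To make the idea of an alignment-guaranteeing allocator concrete, here is
a minimal sketch of such an allocate/free pair. It is not FFTW's
implementation: the names <code>sketch_aligned_malloc</code>,
<code>sketch_aligned_free</code> and <code>SKETCH_ALIGNMENT</code> are
hypothetical, the 16-byte value is an assumption taken from the
&ldquo;typically 16-byte&rdquo; figure above, and the C11 function
<code>aligned_alloc</code> stands in for &ldquo;<code>memalign</code> or its
equivalent&rdquo;.
</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment for this sketch; a real SIMD build may need more. */
#define SKETCH_ALIGNMENT 16

/* Hypothetical stand-in for an aligned allocator: same interface as malloc,
   but the returned pointer is a multiple of SKETCH_ALIGNMENT. */
void *sketch_aligned_malloc(size_t n)
{
    /* C11 aligned_alloc expects the size to be a multiple of the alignment,
       so round the request up. */
    size_t rounded = (n + SKETCH_ALIGNMENT - 1) / SKETCH_ALIGNMENT * SKETCH_ALIGNMENT;

    if (rounded < n)      /* the rounding overflowed size_t */
        return NULL;
    if (rounded == 0)     /* avoid a zero-size request */
        rounded = SKETCH_ALIGNMENT;

    return aligned_alloc(SKETCH_ALIGNMENT, rounded);
}

/* Memory from aligned_alloc is released with plain free. */
void sketch_aligned_free(void *p)
{
    free(p);
}
```

<p>In a real program you would call <code>fftw_malloc</code> and
<code>fftw_free</code> instead, and, as with any matched allocator pair,
release each block with the routine that corresponds to the one that
allocated it.
</p>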
Chris@82 118 <p>You are not <em>required</em> to use <code>fftw_malloc</code>. You can
Chris@82 119 allocate your data in any way that you like, from <code>malloc</code> to
Chris@82 120 <code>new</code> (in C++) to a fixed-size array declaration. If the array
Chris@82 121 happens not to be properly aligned, FFTW will not use the SIMD
Chris@82 122 extensions.
Chris@82 123 <a name="index-C_002b_002b-1"></a>
Chris@82 124 </p>
Chris@82 125 <a name="index-fftw_005falloc_005freal"></a>
Chris@82 126 <a name="index-fftw_005falloc_005fcomplex-1"></a>
Chris@82 127 <p>Since <code>fftw_malloc</code> only ever needs to be used for real and
Chris@82 128 complex arrays, we provide two convenient wrapper routines
Chris@82 129 <code>fftw_alloc_real(N)</code> and <code>fftw_alloc_complex(N)</code> that are
Chris@82 130 equivalent to <code>(double*)fftw_malloc(sizeof(double) * N)</code> and
Chris@82 131 <code>(fftw_complex*)fftw_malloc(sizeof(fftw_complex) * N)</code>,
Chris@82 132 respectively (or their equivalents in other precisions).
Chris@82 133 </p>
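<p>The equivalence described above can be sketched with stand-in names. In
the block below, <code>sketch_malloc</code> is a hypothetical placeholder for
<code>fftw_malloc</code> (here it simply forwards to <code>malloc</code>, so
it gives no SIMD alignment guarantee), and <code>sketch_complex</code>
mirrors the default <code>double[2]</code> layout of
<code>fftw_complex</code>. None of these names belong to FFTW.
</p>

```c
#include <stddef.h>
#include <stdlib.h>

/* Mirrors the default layout of fftw_complex: [0] = real, [1] = imaginary. */
typedef double sketch_complex[2];

/* Hypothetical placeholder for fftw_malloc; forwards to malloc. */
void *sketch_malloc(size_t n)
{
    return malloc(n);
}

/* Counterpart of the real-array wrapper: room for n doubles. */
double *sketch_alloc_real(size_t n)
{
    return (double *)sketch_malloc(sizeof(double) * n);
}

/* Counterpart of the complex-array wrapper: room for n complex values. */
sketch_complex *sketch_alloc_complex(size_t n)
{
    return (sketch_complex *)sketch_malloc(sizeof(sketch_complex) * n);
}
```

<p>The wrappers only save the caller from writing the cast and the
<code>sizeof</code> multiplication by hand; the allocated size is the same
either way.
</p>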
Chris@82 139
Chris@82 140
Chris@82 141
Chris@82 142 </body>
Chris@82 143 </html>