annotate src/fftw-3.3.3/doc/html/SIMD-alignment-and-fftw_005fmalloc.html @ 83:ae30d91d2ffe

Replace these with versions built using an older toolset (so as to avoid ABI incompatibilities when linking on Ubuntu 14.04 for packaging purposes)
author Chris Cannam
date Fri, 07 Feb 2020 11:51:13 +0000
parents 37bf6b4a2645
rev   line source
Chris@10 1 <html lang="en">
Chris@10 2 <head>
Chris@10 3 <title>SIMD alignment and fftw_malloc - FFTW 3.3.3</title>
Chris@10 4 <meta http-equiv="Content-Type" content="text/html">
Chris@10 5 <meta name="description" content="FFTW 3.3.3">
Chris@10 6 <meta name="generator" content="makeinfo 4.13">
Chris@10 7 <link title="Top" rel="start" href="index.html#Top">
Chris@10 8 <link rel="up" href="Other-Important-Topics.html#Other-Important-Topics" title="Other Important Topics">
Chris@10 9 <link rel="prev" href="Other-Important-Topics.html#Other-Important-Topics" title="Other Important Topics">
Chris@10 10 <link rel="next" href="Multi_002ddimensional-Array-Format.html#Multi_002ddimensional-Array-Format" title="Multi-dimensional Array Format">
Chris@10 11 <link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
Chris@10 12 <!--
Chris@10 13 This manual is for FFTW
Chris@10 14 (version 3.3.3, 25 November 2012).
Chris@10 15
Chris@10 16 Copyright (C) 2003 Matteo Frigo.
Chris@10 17
Chris@10 18 Copyright (C) 2003 Massachusetts Institute of Technology.
Chris@10 19
Chris@10 20 Permission is granted to make and distribute verbatim copies of
Chris@10 21 this manual provided the copyright notice and this permission
Chris@10 22 notice are preserved on all copies.
Chris@10 23
Chris@10 24 Permission is granted to copy and distribute modified versions of
Chris@10 25 this manual under the conditions for verbatim copying, provided
Chris@10 26 that the entire resulting derived work is distributed under the
Chris@10 27 terms of a permission notice identical to this one.
Chris@10 28
Chris@10 29 Permission is granted to copy and distribute translations of this
Chris@10 30 manual into another language, under the above conditions for
Chris@10 31 modified versions, except that this permission notice may be
Chris@10 32 stated in a translation approved by the Free Software Foundation.
Chris@10 33 -->
Chris@10 34 <meta http-equiv="Content-Style-Type" content="text/css">
Chris@10 35 <style type="text/css"><!--
Chris@10 36 pre.display { font-family:inherit }
Chris@10 37 pre.format { font-family:inherit }
Chris@10 38 pre.smalldisplay { font-family:inherit; font-size:smaller }
Chris@10 39 pre.smallformat { font-family:inherit; font-size:smaller }
Chris@10 40 pre.smallexample { font-size:smaller }
Chris@10 41 pre.smalllisp { font-size:smaller }
Chris@10 42 span.sc { font-variant:small-caps }
Chris@10 43 span.roman { font-family:serif; font-weight:normal; }
Chris@10 44 span.sansserif { font-family:sans-serif; font-weight:normal; }
Chris@10 45 --></style>
Chris@10 46 </head>
Chris@10 47 <body>
Chris@10 48 <div class="node">
Chris@10 49 <a name="SIMD-alignment-and-fftw_malloc"></a>
Chris@10 50 <a name="SIMD-alignment-and-fftw_005fmalloc"></a>
Chris@10 51 <p>
Chris@10 52 Next:&nbsp;<a rel="next" accesskey="n" href="Multi_002ddimensional-Array-Format.html#Multi_002ddimensional-Array-Format">Multi-dimensional Array Format</a>,
Chris@10 53 Previous:&nbsp;<a rel="previous" accesskey="p" href="Other-Important-Topics.html#Other-Important-Topics">Other Important Topics</a>,
Chris@10 54 Up:&nbsp;<a rel="up" accesskey="u" href="Other-Important-Topics.html#Other-Important-Topics">Other Important Topics</a>
Chris@10 55 <hr>
Chris@10 56 </div>
Chris@10 57
Chris@10 58 <h3 class="section">3.1 SIMD alignment and fftw_malloc</h3>
Chris@10 59
Chris@10 60 <p>SIMD, which stands for &ldquo;Single Instruction Multiple Data,&rdquo; is a set of
Chris@10 61 special operations supported by some processors to perform a single
Chris@10 62 operation on several numbers (usually 2 or 4) simultaneously. SIMD
Chris@10 63 floating-point instructions are available on several popular CPUs:
Chris@10 64 SSE/SSE2/AVX on recent x86/x86-64 processors, AltiVec (single precision)
Chris@10 65 on some PowerPCs (Apple G4 and higher), NEON on some ARM models, and MIPS Paired Single
Chris@10 66 (currently only in FFTW 3.2.x). FFTW can be compiled to support the
Chris@10 67 SIMD instructions on any of these systems.
Chris@10 68 <a name="index-SIMD-102"></a><a name="index-SSE-103"></a><a name="index-SSE2-104"></a><a name="index-AVX-105"></a><a name="index-AltiVec-106"></a><a name="index-MIPS-PS-107"></a><a name="index-precision-108"></a>
Chris@10 69
Chris@10 70 <p>A program linking to an FFTW library compiled with SIMD support can
Chris@10 71 obtain a nonnegligible speedup for most complex and r2c/c2r
Chris@10 72 transforms. In order to obtain this speedup, however, the arrays of
Chris@10 73 complex (or real) data passed to FFTW must be specially aligned in
Chris@10 74 memory (typically 16-byte aligned), and often this alignment is more
Chris@10 75 stringent than that provided by the usual <code>malloc</code> (etc.)
Chris@10 76 allocation routines.
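<p>As a minimal sketch of what &ldquo;16-byte aligned&rdquo; means in practice, the following C helper tests whether a pointer's address is a multiple of a given power-of-two alignment. The name <code>is_aligned</code> is hypothetical and is not part of the FFTW API; it only illustrates the property that a SIMD-enabled FFTW looks for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFTW function): returns 1 if the address
 * held in p is a multiple of `alignment` bytes, and 0 otherwise.
 * `alignment` must be a nonzero power of two, e.g. 16. */
static int is_aligned(const void *p, size_t alignment)
{
    /* Reject alignments that are zero or not a power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* For a power-of-two alignment, the low bits of an aligned
     * address are all zero. */
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}
```

<p>For instance, <code>is_aligned(p, 16)</code> applied to a pointer returned by <code>malloc</code> reports whether that particular buffer happens to meet the typical 16-byte requirement.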
Chris@10 77
Chris@10 78 <p><a name="index-portability-109"></a>To guarantee proper alignment for SIMD in case your program is ever
Chris@10 79 linked against a SIMD-using FFTW, we therefore recommend
Chris@10 80 allocating your transform data with <code>fftw_malloc</code> and
Chris@10 81 de-allocating it with <code>fftw_free</code>.
Chris@10 82 <a name="index-fftw_005fmalloc-110"></a><a name="index-fftw_005ffree-111"></a>These have exactly the same interface and behavior as
Chris@10 83 <code>malloc</code>/<code>free</code>, except that for a SIMD FFTW they ensure
Chris@10 84 that the returned pointer has the necessary alignment (by calling
Chris@10 85 <code>memalign</code> or its equivalent on your OS).
Chris@10 86
Chris@10 87 <p>You are not <em>required</em> to use <code>fftw_malloc</code>. You can
Chris@10 88 allocate your data in any way that you like, from <code>malloc</code> to
Chris@10 89 <code>new</code> (in C++) to a fixed-size array declaration. If the array
Chris@10 90 happens not to be properly aligned, FFTW will not use the SIMD
Chris@10 91 extensions.
Chris@10 92 <a name="index-C_002b_002b-112"></a>
Chris@10 93 <a name="index-fftw_005falloc_005freal-113"></a><a name="index-fftw_005falloc_005fcomplex-114"></a>Since <code>fftw_malloc</code> only ever needs to be used for real and
Chris@10 94 complex arrays, we provide two convenient wrapper routines
Chris@10 95 <code>fftw_alloc_real(N)</code> and <code>fftw_alloc_complex(N)</code> that are
Chris@10 96 equivalent to <code>(double*)fftw_malloc(sizeof(double) * N)</code> and
Chris@10 97 <code>(fftw_complex*)fftw_malloc(sizeof(fftw_complex) * N)</code>,
Chris@10 98 respectively (or their equivalents in other precisions).
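<p>The relationship between the aligned allocator and its typed wrappers can be sketched in POSIX C without linking FFTW. This is an illustration under stated assumptions, not FFTW's actual implementation: every <code>sketch_</code> name is a hypothetical stand-in, the 16-byte alignment is assumed from the typical figure given above, and <code>posix_memalign</code> is used as one possible equivalent of <code>memalign</code>.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* ask <stdlib.h> for posix_memalign */
#endif
#include <stddef.h>
#include <stdlib.h>

/* Assumption: 16 bytes, the "typical" alignment; a real FFTW build
 * may need a different value depending on its SIMD instruction set. */
#define SKETCH_ALIGNMENT 16

/* Same layout as the default double-precision fftw_complex:
 * element [0] is the real part, element [1] the imaginary part. */
typedef double sketch_complex[2];

/* Stand-in for fftw_malloc: same interface as malloc, but the
 * returned pointer is aligned to SKETCH_ALIGNMENT bytes. */
static void *sketch_malloc(size_t n)
{
    void *p = NULL;
    if (posix_memalign(&p, SKETCH_ALIGNMENT, n) != 0)
        return NULL;  /* allocation failed */
    return p;
}

/* Stand-in for fftw_free: memory from posix_memalign is released
 * with the ordinary free. */
static void sketch_free(void *p)
{
    free(p);
}

/* Stand-in for fftw_alloc_real(N). */
static double *sketch_alloc_real(size_t n)
{
    return (double *)sketch_malloc(sizeof(double) * n);
}

/* Stand-in for fftw_alloc_complex(N). */
static sketch_complex *sketch_alloc_complex(size_t n)
{
    return (sketch_complex *)sketch_malloc(sizeof(sketch_complex) * n);
}
```

<p>In a real program you would call <code>fftw_alloc_real</code>, <code>fftw_alloc_complex</code> and <code>fftw_free</code> from <code>fftw3.h</code> instead of these stand-ins, so that the alignment always matches the FFTW library you link against.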
Chris@10 99
Chris@10 100 <!-- -->
Chris@10 101 </body></html>
Chris@10 102