<html lang="en">
<head>
<title>Stack alignment on x86 - FFTW 3.2.1</title>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="FFTW 3.2.1">
<meta name="generator" content="makeinfo 4.8">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="Data-Alignment.html#Data-Alignment" title="Data Alignment">
<link rel="prev" href="SIMD-alignment-and-fftw_005fmalloc.html#SIMD-alignment-and-fftw_005fmalloc" title="SIMD alignment and fftw_malloc">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
This manual is for FFTW
(version 3.2.1, 5 February 2009).

Copyright (C) 2003 Matteo Frigo.

Copyright (C) 2003 Massachusetts Institute of Technology.

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission
notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided
that the entire resulting derived work is distributed under the
terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for
modified versions, except that this permission notice may be
stated in a translation approved by the Free Software Foundation.
-->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
  pre.display { font-family:inherit }
  pre.format  { font-family:inherit }
  pre.smalldisplay { font-family:inherit; font-size:smaller }
  pre.smallformat  { font-family:inherit; font-size:smaller }
  pre.smallexample { font-size:smaller }
  pre.smalllisp    { font-size:smaller }
  span.sc    { font-variant:small-caps }
  span.roman { font-family:serif; font-weight:normal; }
  span.sansserif { font-family:sans-serif; font-weight:normal; }
--></style>
</head>
<body>
<div class="node">
<p>
<a name="Stack-alignment-on-x86"></a>
Previous:&nbsp;<a rel="previous" accesskey="p" href="SIMD-alignment-and-fftw_005fmalloc.html#SIMD-alignment-and-fftw_005fmalloc">SIMD alignment and fftw_malloc</a>,
Up:&nbsp;<a rel="up" accesskey="u" href="Data-Alignment.html#Data-Alignment">Data Alignment</a>
<hr>
</div>

<h4 class="subsection">3.1.2 Stack alignment on x86</h4>

<p>On the Pentium and subsequent x86 processors, there is a substantial
performance penalty if double-precision variables are not stored
8-byte aligned; a slowdown by a factor of two or more is not unusual.
Unfortunately, the stack (the place where local variables and
subroutine arguments live) is not guaranteed by the Intel ABI to be
8-byte aligned.

<p>Recent versions of <code>gcc</code> (as well as most other compilers, we are
told, such as Intel's, Metrowerks', and Microsoft's) are able to keep
the stack 8-byte aligned; <code>gcc</code> does this by default (see
<code>-mpreferred-stack-boundary</code> in the <code>gcc</code> documentation).
If you are not certain whether your compiler maintains stack alignment
by default, it is a good idea to check.

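<p>One rough way to check is to look at the address of a double-precision
local variable at run time. The following is only a sketch and not part of
FFTW; the helper names <code>is_aligned</code> and
<code>local_double_is_aligned</code> are hypothetical. A single aligned
result does not prove that every stack frame is aligned, and on platforms
that always align the stack to 8 bytes or more the check simply succeeds.

```c
#include <stdint.h>

/* Returns 1 if p is a multiple of `alignment` bytes, 0 otherwise.
   `alignment` is assumed to be a power of two. */
static int is_aligned(const volatile void *p, uintptr_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Returns 1 if a double local to this stack frame is 8-byte aligned.
   `volatile` discourages the compiler from keeping x in a register,
   so that x really has an address on the stack. */
static int local_double_is_aligned(void)
{
    volatile double x = 0.0;
    return is_aligned(&x, 8);
}
```

<p>Calling <code>local_double_is_aligned()</code> from several places in a
program (for example from <code>main</code> and from a deeply nested
function) gives a quick indication of whether the compiler and the startup
code keep the stack aligned.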
<p>Unfortunately, <code>gcc</code> only <em>preserves</em> the stack
alignment: if the stack starts off misaligned, it will
always be misaligned, with a disastrous effect on performance (in
double precision).  To prevent this, FFTW includes hacks to align its
own stack if necessary, so it should perform well even if you call it
from a program with a misaligned stack.  Currently, our hacks support
<code>gcc</code> and the Intel C compiler; if you use another compiler you
are on your own.  Fortunately, recent versions of glibc (on GNU/Linux)
provide a properly aligned starting stack, but this was not the case
with a number of older versions, and we are not certain of the
situation on other operating systems.  Hopefully, as time goes by this
will become less of a concern.

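<p>For your own entry points compiled with <code>gcc</code>, one possible
approach is the <code>force_align_arg_pointer</code> function attribute,
which on x86 targets asks the compiler to realign the stack in the function
prologue. The sketch below illustrates that attribute only; it is not the
mechanism FFTW uses internally, and the macro <code>REALIGN_STACK</code> and
the function <code>sum_of_squares</code> are hypothetical names chosen for
this example.

```c
/* The attribute is gcc-specific and x86-only, so on other compilers
   or targets the macro expands to nothing. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* Hypothetical entry point that may be called with a misaligned stack.
   Where the attribute is available, the stack is realigned on entry,
   so the double-precision local below, and the frames of any functions
   called from here, start from an aligned stack. */
REALIGN_STACK
double sum_of_squares(const double *in, int n)
{
    double acc = 0.0;
    int i;

    for (i = 0; i < n; ++i)
        acc += in[i] * in[i];
    return acc;
}
```

<p>Because <code>gcc</code> preserves alignment once it is established, it
should be enough to mark only the outermost functions that foreign code
calls, rather than every function in the program.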
<!--  -->
</body></html>