Stack alignment on x86

3.1.2 Stack alignment on x86

On the Pentium and subsequent x86 processors, there is a substantial performance penalty if double-precision variables are not stored 8-byte aligned; a factor of two or more is not unusual. Unfortunately, the stack (the place where local variables and subroutine arguments live) is not guaranteed by the Intel ABI to be 8-byte aligned.
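As a rough illustration of what "8-byte aligned" means here, the following sketch tests whether an address is a multiple of a given alignment and applies that test to a local variable. The function names are hypothetical and are not part of FFTW; this is only a diagnostic one might use under those assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an FFTW function): returns 1 if the address p
   is a multiple of `alignment` bytes, and 0 otherwise.
   `alignment` must be a nonzero power of two. */
int is_aligned(const void *p, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: reports whether a double-precision local variable
   in this function's stack frame happens to be 8-byte aligned. */
int local_double_is_aligned(void)
{
    volatile double x = 0.0; /* volatile so that x really gets a stack slot */
    return is_aligned((const void *)&x, 8);
}
```

Calling `local_double_is_aligned()` from different places in a program gives a quick, if informal, indication of whether the compiler and the calling environment are keeping the stack aligned.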

Recent versions of gcc (as well as most other compilers, we are told, such as Intel's, Metrowerks', and Microsoft's) are able to keep the stack 8-byte aligned; gcc does this by default (see -mpreferred-stack-boundary in the gcc documentation). If you are not certain whether your compiler maintains stack alignment by default, it is a good idea to make sure.

Unfortunately, gcc only preserves the stack alignment; as a result, if the stack starts off misaligned, it will always be misaligned, with a disastrous effect on performance (in double precision). To prevent this, FFTW includes hacks to align its own stack if necessary, so it should perform well even if you call it from a program with a misaligned stack. Currently, our hacks support gcc and the Intel C compiler; if you use another compiler, you are on your own. Fortunately, recent versions of glibc (on GNU/Linux) provide a properly aligned starting stack, but this was not the case with a number of older versions, and we are not certain of the situation on other operating systems. Hopefully, as time goes by, this will become less of a concern.
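For readers who need a similar safeguard in their own entry points, one possible approach with gcc on 32-bit x86 is the `force_align_arg_pointer` function attribute, which makes the function realign the stack on entry. This is a sketch under that assumption and is not necessarily how FFTW's own hacks are implemented; the macro and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical macro: on 32-bit x86 with a gcc-compatible compiler, ask for
   a function prologue that realigns the stack. Elsewhere it expands to
   nothing, so the code still compiles. */
#if defined(__GNUC__) && defined(__i386__)
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* Hypothetical entry point: even if the caller's stack is misaligned, the
   locals of this function (and of the functions it calls, since gcc
   preserves alignment) should be properly aligned.
   Returns 1 if its double-precision local is 8-byte aligned, else 0. */
REALIGN_STACK
int entry_local_is_aligned(void)
{
    volatile double x = 0.0; /* volatile so that x really gets a stack slot */
    return ((uintptr_t)&x % 8) == 0;
}
```

The attribute is applied only at the boundary where code of unknown alignment calls in; everything below that point relies on the compiler preserving the alignment, as described above.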