Real even/odd DFTs (cosine/sine transforms)

The Fourier transform of a real-even function f(-x) = f(x) is d@0: real-even, and i times the Fourier transform of a real-odd d@0: function f(-x) = -f(x) is real-odd. Similar results hold for a d@0: discrete Fourier transform, and thus for these symmetries the need for d@0: complex inputs/outputs is entirely eliminated. Moreover, one gains a d@0: factor of two in speed/space from the fact that the data are real, and d@0: an additional factor of two from the even/odd symmetry: only the d@0: non-redundant (first) half of the array need be stored. The result is d@0: the real-even DFT (REDFT) and the real-odd DFT (RODFT), also d@0: known as the discrete cosine and sine transforms (DCT and d@0: DST), respectively. d@0: d@0: (In this section, we describe the 1d transforms; multi-dimensional d@0: transforms are just a separable product of these transforms operating d@0: along each dimension.) d@0: d@0:

Because of the discrete sampling, one has an additional choice: is the d@0: data even/odd around a sampling point, or around the point halfway d@0: between two samples? The latter corresponds to shifting the d@0: samples by half an interval, and gives rise to several transform d@0: variants denoted by REDFTab and RODFTab: a and d@0: b are 0 or 1, and indicate whether the input d@0: (a) and/or output (b) are shifted by half a sample d@0: (1 means it is shifted). These are also known as types I-IV of d@0: the DCT and DST, and all four types are supported by FFTW's r2r d@0: interface.¹ d@0: d@0:

The r2r kinds for the various REDFT and RODFT types supported by FFTW, d@0: along with the boundary conditions at both ends of the input d@0: array (n real numbers in[j=0..n-1]), are: d@0: d@0:

Note that these symmetries apply to the “logical” array being d@0: transformed; there are no constraints on your physical input d@0: data. So, for example, if you specify a size-5 REDFT00 (DCT-I) of the d@0: data abcde, it corresponds to the DFT of the logical even array d@0: abcdedcb of size 8. A size-4 REDFT10 (DCT-II) of the data d@0: abcd corresponds to the size-8 logical DFT of the even array d@0: abcddcba, shifted by half a sample. d@0: d@0:

All of these transforms are invertible. The inverse of R*DFT00 is d@0: R*DFT00; of R*DFT10 is R*DFT01 and vice versa (these are often called d@0: simply “the” DCT and IDCT, respectively); and of R*DFT11 is R*DFT11. d@0: However, the transforms computed by FFTW are unnormalized, exactly d@0: like the corresponding real and complex DFTs, so computing a transform d@0: followed by its inverse yields the original array scaled by N, d@0: where N is the logical DFT size. For REDFT00, d@0: N=2(n-1); for RODFT00, N=2(n+1); otherwise, N=2n. d@0: d@0: Note that the boundary conditions of the transform output array are d@0: given by the input boundary conditions of the inverse transform. d@0: Thus, the above transforms are all inequivalent in terms of d@0: input/output boundary conditions, even neglecting the 0.5 shift d@0: difference. d@0: d@0:

FFTW is most efficient when N is a product of small factors; note d@0: that this differs from the factorization of the physical size d@0: n for REDFT00 and RODFT00! There is another oddity: n=1 d@0: REDFT00 transforms correspond to N=0, and so are not d@0: defined (the planner will return NULL). Otherwise, any positive d@0: n is supported. d@0: d@0:

For the precise mathematical definitions of these transforms as used by d@0: FFTW, see What FFTW Really Computes. (For people accustomed to d@0: the DCT/DST, FFTW's definitions have a coefficient of 2 in front d@0: of the cos/sin functions so that they correspond precisely to an d@0: even/odd DFT of size N. Some authors also include additional d@0: multiplicative factors of d@0: √2for selected inputs and outputs; this makes d@0: the transform orthogonal, but sacrifices the direct equivalence to a d@0: symmetric DFT.) d@0: d@0:

Which type do you need?

Since the required flavor of even/odd DFT depends upon your problem, d@0: you are the best judge of this choice, but we can make a few comments d@0: on relative efficiency to help you in your selection. In particular, d@0: R*DFT01 and R*DFT10 tend to be slightly faster than R*DFT11 d@0: (especially for odd sizes), while the R*DFT00 transforms are sometimes d@0: significantly slower (especially for even sizes).² d@0: d@0:

Thus, if only the boundary conditions on the transform inputs are d@0: specified, we generally recommend R*DFT10 over R*DFT00 and R*DFT01 over d@0: R*DFT11 (unless the half-sample shift or the self-inverse property is d@0: significant for your problem). d@0: d@0:

If performance is important to you and you are using only small sizes d@0: (say n<200), e.g. for multi-dimensional transforms, then you d@0: might consider generating hard-coded transforms of those sizes and types d@0: that you are interested in (see Generating your own code). d@0: d@0:

We are interested in hearing what types of symmetric transforms you find d@0: most useful. d@0: d@0: d@0:

d@0:

Footnotes

[1] There are also type V-VIII transforms, which d@0: correspond to a logical DFT of odd size N, independent of d@0: whether the physical size n is odd, but we do not support these d@0: variants.

d@0: d@0:

[2] R*DFT00 is d@0: sometimes slower in FFTW because we discovered that the standard d@0: algorithm for computing this by a pre/post-processed real DFT—the d@0: algorithm used in FFTPACK, Numerical Recipes, and other sources for d@0: decades now—has serious numerical problems: it already loses several d@0: decimal places of accuracy for 16k sizes. There seem to be only two d@0: alternatives in the literature that do not suffer similarly: a d@0: recursive decomposition into smaller DCTs, which would require a large d@0: set of codelets for efficiency and generality, or sacrificing a factor of d@0: ~2 d@0: in speed to use a real DFT of twice the size. We currently d@0: employ the latter technique for general n, as well as a limited d@0: form of the former method: a split-radix decomposition when n d@0: is odd (N a multiple of 4). For N containing many d@0: factors of 2, the split-radix method seems to recover most of the d@0: speed of the standard algorithm without the accuracy tradeoff.

d@0: d@0:

2.5.2 Real even/odd DFTs (cosine/sine transforms)

Which type do you need?

Footnotes