Chris@19: TODO before FFTW-$2\pi$:
Chris@19: 
Chris@19: * Wisdom: make it clear that it is specific to the exact FFTW version
Chris@19:   and configuration.  Report error codes when reading wisdom.  Maybe
Chris@19:   have multiple system wisdom files, one per version?
Chris@19: 
Chris@19: * DCT/DST codelets?  which kinds?
Chris@19: 
Chris@19: * investigate the addition-chain trig computation
Chris@19: 
Chris@19: * I can't believe that there isn't a closed form for the omega
Chris@19:   array in Rader.
Chris@19: 
Chris@19: * convolution problem type(s)
Chris@19: 
Chris@19: * Explore the idea of having n < 0 in tensors, possibly to mean
Chris@19:   inverse DFT.
Chris@19: 
Chris@19: * better estimator: possibly, let "other" cost be coef * n, where
Chris@19:   coef is a per-solver constant determined via some big numerical
Chris@19:   optimization/fit.
Chris@19: 
Chris@19: * vector radix, multidimensional codelets
Chris@19: 
Chris@19: * it may be a good idea to unify all those little loops that do
Chris@19:   copying, (X[i], X[n-i]) <- (X[i] + X[n-i], X[i] - X[n-i]),
Chris@19:   and multiplication of vectors by twiddle factors.
Chris@19: 
Chris@19: * Pruned FFTs (basically, a vecloop that skips zeros).
Chris@19: 
Chris@19: * Try FFTPACK-style back-and-forth (Stockham) FFT.  (We tried this a
Chris@19:   few years ago and it was slower, but perhaps matters have changed.)
Chris@19: 
Chris@19: * Generate assembly directly for more processors, or maybe fork gcc.  =)
Chris@19: 
Chris@19: * ensure that threaded solvers generate block sizes with
Chris@19:   (block_size % 4 == 0), to allow SIMD to be used.
Chris@19: 
Chris@19: * memoize triggen.
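
Two of the items above are concrete enough to sketch in C.  The following
is only an illustration, not code from FFTW: the names sumdiff_pairs and
round_up_block are hypothetical, and the use of double and size_t is an
assumption made to keep the example self-contained.  The first helper is
one possible shape for the shared "(X[i], X[n-i]) <- (X[i] + X[n-i],
X[i] - X[n-i])" loop; the second shows one way a block size could be
rounded up so that (block_size % 4 == 0) holds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of FFTW): in-place sum/difference pass
   over mirrored pairs,
     (X[i], X[n-i]) <- (X[i] + X[n-i], X[i] - X[n-i])   for 0 < i < n-i.
   X[0] and, for even n, X[n/2] have no distinct partner and are left
   untouched. */
static void sumdiff_pairs(double *X, size_t n)
{
     size_t i;

     assert(X != NULL || n == 0);
     for (i = 1; i + i < n; ++i) {
          double a = X[i], b = X[n - i];
          X[i] = a + b;
          X[n - i] = a - b;
     }
}

/* Hypothetical helper (not part of FFTW): round a block size up to the
   next multiple of 4, so that (block_size % 4 == 0) holds. */
static size_t round_up_block(size_t block_size)
{
     return (block_size + 3) / 4 * 4;
}
```

Whether rounding up (rather than down) is the right policy for the
threaded solvers is left open here; the sketch only shows the arithmetic.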