KISS FFT - A mixed-radix Fast Fourier Transform based upon the principle,
"Keep It Simple, Stupid."

There are many great fft libraries already around.  Kiss FFT is not trying
to be better than any of them.  It only attempts to be a reasonably efficient,
moderately useful FFT that can use fixed or floating data types and can be
incorporated into someone's C program in a few minutes with trivial licensing.

USAGE:

    The basic usage for 1-d complex FFT is:

        #include "kiss_fft.h"

        kiss_fft_cfg cfg = kiss_fft_alloc( nfft ,is_inverse_fft ,0,0 );

        while ...

            ... // put kth sample in cx_in[k].r and cx_in[k].i

            kiss_fft( cfg , cx_in , cx_out );

            ... // transformed. DC is in cx_out[0].r and cx_out[0].i

        free(cfg);

    Note: frequency-domain data is stored from dc up to 2pi,
    so cx_out[0] is the dc bin of the FFT
    and cx_out[nfft/2] is the Nyquist bin (if it exists, i.e. nfft is even).

    Declarations are in "kiss_fft.h", along with a brief description of the
    functions you'll need to use.

    Code definitions for 1d complex FFTs are in kiss_fft.c.

    You can do other cool stuff with the extras you'll find in tools/

        * multi-dimensional FFTs
        * real-optimized FFTs  (returns the positive half-spectrum: (nfft/2+1) complex frequency bins)
        * fast convolution FIR filtering (not available for fixed point)
        * spectrum image creation

    The core fft and most tools/ code can be compiled to use float, double,
    Q15 short or Q31 samples. The default is float.


BACKGROUND:

I started coding this because I couldn't find a fixed point FFT that didn't
use assembly code.  I started with floating point numbers so I could get the
theory straight before working on fixed point issues.  In the end, I had a
little bit of code that could be recompiled easily to do ffts with short, float
or double (other types should be easy too).

Once I got my FFT working, I was curious about the speed compared to
a well respected and highly optimized fft library.  I don't want to criticize
this great library, so let's call it FFT_BRANDX.
During this process, I learned:

    1. FFT_BRANDX has more than 100K lines of code. The core of kiss_fft is about 500 lines (cpx 1-d).
    2. It took me an embarrassingly long time to get FFT_BRANDX working.
    3. A simple program using FFT_BRANDX is 522KB. A similar program using kiss_fft is 18KB (without optimizing for size).
    4. FFT_BRANDX is roughly twice as fast as KISS FFT in default mode.

It is wonderful that free, highly optimized libraries like FFT_BRANDX exist.
But such libraries carry a huge burden of complexity necessary to extract every
last bit of performance.

Sometimes simpler is better, even if it's not better.

FREQUENTLY ASKED QUESTIONS:
    Q: Can I use kissfft in a project with a ___ license?
    A: Yes.  See LICENSE below.

    Q: Why don't I get the output I expect?
    A: The two most common causes of this are
        1) scaling : is there a constant multiplier between what you got and what you want?
        2) mixed build environment -- all code must be compiled with the same preprocessor
           definitions for FIXED_POINT and kiss_fft_scalar

    Q: Will you write/debug my code for me?
    A: Probably not unless you pay me.  I am happy to answer pointed and topical questions, but
       I may refer you to a book, a forum, or some other resource.


PERFORMANCE:
    (on Athlon XP 2100+, with gcc 2.96, float data type)

    Kiss performed 10000 1024-pt cpx ffts in .63 s of cpu time.
    For comparison, it took md5sum twice as long to process the same amount of data.

    Transforming 5 minutes of CD quality audio takes less than a second (nfft=1024).

DO NOT:
    ... use Kiss if you need the Fastest Fourier Transform in the World
    ... ask me to add features that will bloat the code

UNDER THE HOOD:

    Kiss FFT uses a time decimation, mixed-radix, out-of-place FFT. If you give it an input buffer
    and output buffer that are the same, a temporary buffer will be created to hold the data.

    No static data is used.  The core routines of kiss_fft are thread-safe (but not all of the tools directory).

    No scaling is done for the floating point version (for speed), so a forward transform
    followed by an inverse transform returns the input multiplied by nfft.
    Scaling is done both ways for the fixed-point version (for overflow prevention).

    Optimized butterflies are used for factors 2,3,4, and 5.

    The real (i.e. not complex) optimization code only works for even length ffts.  It does two half-length
    FFTs in parallel (packed into real&imag), and then combines them via twiddling.  The result is
    nfft/2+1 complex frequency bins from DC to Nyquist.  If you don't know what this means, search the web.

    The fast convolution filtering uses the overlap-scrap method, slightly
    modified to put the scrap at the tail.

LICENSE:
    Revised BSD License, see COPYING for verbiage.
    Basically, "free to use&change, give credit where due, no guarantees"
    Note this license is compatible with GPL at one end of the spectrum and closed, commercial software at
    the other end.  See http://www.fsf.org/licensing/licenses

    A commercial license is available which removes the requirement for attribution.  Contact me for details.


TODO:
    *) Add real optimization for odd length FFTs
    *) Document/revisit the input/output fft scaling
    *) Make doc describing the overlap (tail) scrap fast convolution filtering in kiss_fastfir.c
    *) Test all the ./tools/ code with fixed point (kiss_fastfir.c doesn't work, maybe others)

AUTHOR:
    Mark Borgerding
    Mark@Borgerding.net