Version 14 (Chris Cannam, 2013-10-22 02:34 PM) → Version 15/22 (Chris Cannam, 2013-10-22 02:36 PM)
h1. Summary of results
{{>toc}}
h2. What we're looking at
Here we're only looking at causal methods, so no forward/backward filtering. The question in my head is really whether faster IIR filters are still so much faster as to be worth using in preference to linear-phase methods with better (?) theoretical quality. Of course that would always depend on the application, but it's interesting to compare.
We compared
* @decimate@: the "Decimator":http://code.soundsoftware.ac.uk/projects/qm-dsp/embedded/classDecimator.html implementation in the "qm-dsp":/projects/qm-dsp library, which uses an IIR lowpass filter (perhaps an elliptic filter?) with 8 coefficient pairs;
* @decimate_b@: the "DecimatorB":http://code.soundsoftware.ac.uk/projects/qm-dsp/embedded/classDecimatorB.html class in the "qm-dsp":/projects/qm-dsp library, which uses a Butterworth IIR lowpass filter of order 6;
* @resample_hq@, @resample_mq@, @resample_lq@: the "Resampler":http://code.soundsoftware.ac.uk/projects/qm-dsp/embedded/classResampler.html implementation in the "qm-dsp":/projects/qm-dsp library, which uses a lengthy Kaiser-windowed sinc filter, at three different quality settings;
* @src@: the sndfile-resample program which uses "libsamplerate":http://mega-nerd.com/SRC/, a well trusted resampler also using a Kaiser-windowed sinc implementation, at its default quality setting;
* @zoh@: the sndfile-resample zero-order hold resampler, which just takes every Nth sample without any filtering, serving as a baseline.
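As a rough illustration of what the @zoh@ baseline amounts to as described above, here is a minimal Python sketch that decimates by keeping every Nth frame. The function name @keep_every_nth@ is hypothetical and this is not the sndfile-resample code, only the idea: with no lowpass filter in front, anything above the new Nyquist frequency aliases into the output, which is why it serves as a speed baseline rather than a usable method.

```python
def keep_every_nth(frames, factor):
    """Decimate by keeping frames 0, factor, 2*factor, ... with no filtering."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    # Slicing with a step copies every factor-th element, starting at index 0.
    return frames[::factor]


if __name__ == "__main__":
    ramp = list(range(16))
    print(keep_every_nth(ramp, 4))
```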
h2. Speed
Input is 11520000 frames (two minutes at 96kHz sample rate). Listed kfps values correspond to 1000s of input frames per second.
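The Kfps figures in the tables below appear to be the input frame count divided by the clock time in seconds, expressed in thousands and truncated to an integer. That is inferred from the listed numbers rather than taken from the benchmark scripts, and the helper name @kfps@ is made up for this sketch.

```python
def kfps(input_frames, clock_seconds):
    """Thousands of input frames processed per second, truncated to an integer."""
    # Work in whole milliseconds so the division is exact integer arithmetic:
    # frames / seconds / 1000 is the same as frames / milliseconds.
    clock_ms = round(clock_seconds * 1000)
    if clock_ms <= 0:
        raise ValueError("clock time must be at least one millisecond")
    return input_frames // clock_ms


if __name__ == "__main__":
    # Two minutes at 96kHz, as used for the tables.
    frames = 120 * 96000
    print(kfps(frames, 0.187))
```

With the 0.187 second time listed for @zoh@ at factor 2 this gives 61604, matching the table.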
CPU is a Core 2 Quad Q9550 at 2.83GHz. All code is 64-bit. The qm-dsp implementations (resample_* and decimate) were compiled with -O3 -ffast-math while the libsamplerate implementations (src and zoh) were standard Ubuntu packages, so probably -O2. This is likely to make a very significant difference, so these results are more useful for comparison among the qm-dsp implementations than between qm-dsp and libsamplerate.
The decimate implementation supports factors up to 8 only, so 16x, 32x and 64x are handled in two passes.
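How the larger factors were split between the two passes is not recorded here. One plausible split, sketched in Python below, takes the largest supported factor first and the remainder second. The function name @split_factor@ and the choice of split are assumptions for illustration only.

```python
def split_factor(factor, max_single=8):
    """Split a decimation factor into at most two passes of <= max_single each."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    if factor <= max_single:
        return [factor]
    first = max_single
    second, remainder = divmod(factor, first)
    # Only exact two-pass splits are handled in this sketch.
    if remainder != 0 or second > max_single:
        raise ValueError("factor cannot be split into two supported passes")
    return [first, second]


if __name__ == "__main__":
    for f in (2, 4, 8, 16, 32, 64):
        print(f, split_factor(f))
```

Under this assumption 16x runs as 8 then 2, 32x as 8 then 4, and 64x as 8 then 8; the second pass always sees at most an eighth of the original frames.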
All implementations use libsndfile for audio file I/O, so that should not be a factor in overall speed.
These frames-per-second figures look terribly precise, but I imagine there's a good 10% margin of error from run to run.
h3. Implementations by decimation factor
For 11520000 input frames.
h4. Factor 02
|Kfps|Clock time (s)|Implementation|
| 61604|0.187|zoh|
| 52602|0.219|decimate_b|
| 52363|0.220|decimate|
| 17668|0.652|resample_lq|
| 9365|1.230|resample_mq|
| 4768|2.416|resample_hq|
| 2176|5.294|src|
h4. Factor 04
|Kfps|Clock time (s)|Implementation|
| 93658|0.123|zoh|
| 58181|0.198|decimate|
| 47213|0.244|decimate_b|
| 19896|0.579|resample_lq|
| 9982|1.154|resample_mq|
| 4965|2.320|resample_hq|
| 2292|5.026|src|
h4. Factor 08
|Kfps|Clock time (s)|Implementation|
| 128000|0.090|zoh|
| 60952|0.189|decimate|
| 44651|0.258|decimate_b|
| 21215|0.543|resample_lq|
| 10331|1.115|resample_mq|
| 3480|3.310|resample_hq|
| 2361|4.879|src|
h4. Factor 16
|Kfps|Clock time (s)|Implementation|
| 160000|0.072|zoh|
| 46080|0.250|decimate|
| 43636|0.264|decimate_b|
| 22068|0.522|resample_lq|
| 7700|1.496|resample_mq|
| 3529|3.264|resample_hq|
| 2119|5.435|src|
h4. Factor 32
|Kfps|Clock time (s)|Implementation|
| 182857|0.063|zoh|
| 53333|0.216|decimate|
| 42825|0.269|decimate_b|
| 21021|0.548|resample_lq|
| 7379|1.561|resample_mq|
| 3443|3.345|resample_hq|
| 2179|5.286|src|
h4. Factor 64
|Kfps|Clock time (s)|Implementation|
| 188852|0.061|zoh|
| 53581|0.215|decimate|
| 42666|0.270|decimate_b|
| 16202|0.711|resample_lq|
| 7417|1.553|resample_mq|
| 3489|3.301|resample_hq|
| 2423|4.753|src|
h3. Decimation factors by implementation
For 11520000 input frames.
h4. Implementation zoh
|Kfps|Clock time (s)|Factor|
| 188852|0.061|64|
| 182857|0.063|32|
| 160000|0.072|16|
| 128000|0.090|08|
| 93658|0.123|04|
| 61604|0.187|02|
h4. Implementation decimate
|Kfps|Clock time (s)|Factor|
| 60952|0.189|08|
| 58181|0.198|04|
| 53581|0.215|64|
| 53333|0.216|32|
| 52363|0.220|02|
| 46080|0.250|16|
h4. Implementation decimate_b
|Kfps|Clock time (s)|Factor|
| 52602|0.219|02|
| 47213|0.244|04|
| 44651|0.258|08|
| 43636|0.264|16|
| 42825|0.269|32|
| 42666|0.270|64|
h4. Implementation resample_hq
|Kfps|Clock time (s)|Factor|
| 4965|2.320|04|
| 4768|2.416|02|
| 3529|3.264|16|
| 3489|3.301|64|
| 3480|3.310|08|
| 3443|3.345|32|
h4. Implementation resample_mq
|Kfps|Clock time (s)|Factor|
| 10331|1.115|08|
| 9982|1.154|04|
| 9365|1.230|02|
| 7700|1.496|16|
| 7417|1.553|64|
| 7379|1.561|32|
h4. Implementation resample_lq
|Kfps|Clock time (s)|Factor|
| 22068|0.522|16|
| 21215|0.543|08|
| 21021|0.548|32|
| 19896|0.579|04|
| 17668|0.652|02|
| 16202|0.711|64|
h4. Implementation src
|Kfps|Clock time (s)|Factor|
| 2423|4.753|64|
| 2361|4.879|08|
| 2292|5.026|04|
| 2179|5.286|32|
| 2176|5.294|02|
| 2119|5.435|16|
h3. Resampler filter lengths
Filter lengths the qm-dsp Resamplers decided to use:
|Factor|Length (hq)|Length (mq)|Length (lq)|
|2|643|291|119|
|4|1285|579|237|
|8|2567|1155|471|
|16|5131|2307|939|
|32|10261|4613|1877|
|64|20519|9223|3751|
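The filter lengths roughly double each time the factor doubles. If a decimating FIR filter is evaluated only at the output points (an assumption about how such a resampler would typically be run, not a statement about the qm-dsp code), the multiply-adds per input frame come to about the filter length divided by the factor. The Python sketch below computes that ratio from the table; the names @FILTER_LENGTHS@ and @taps_per_input_frame@ are made up for illustration.

```python
# Filter lengths copied from the table above, keyed by decimation factor.
# Each tuple is (hq, mq, lq).
FILTER_LENGTHS = {
    2: (643, 291, 119),
    4: (1285, 579, 237),
    8: (2567, 1155, 471),
    16: (5131, 2307, 939),
    32: (10261, 4613, 1877),
    64: (20519, 9223, 3751),
}


def taps_per_input_frame(filter_length, factor):
    """Filter taps evaluated per input frame when only outputs are computed."""
    # One output of filter_length taps is produced for every `factor` inputs.
    return filter_length / factor


if __name__ == "__main__":
    for factor, lengths in sorted(FILTER_LENGTHS.items()):
        ratios = [round(taps_per_input_frame(n, factor), 1) for n in lengths]
        print(factor, ratios)
```

The ratio stays close to 321 (hq), 144 to 146 (mq) and 59 (lq) at every factor, which fits with the per-quality throughput staying within the same general range across factors in the speed tables.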