Version 12 (Chris Cannam, 2013-10-22 09:17 AM) → Version 13/22 (Chris Cannam, 2013-10-22 09:19 AM)

h1. Summary of results

{{>toc}}

h2. What we're looking at

Here we're only looking at causal methods, so no forward/backward filtering. The question in my head is really whether the faster IIR filters are still so much faster that they are worth using in preference to linear-phase methods with better (?) theoretical quality. Of course that will always depend on the application, but it's interesting to compare.

We compared:

* @decimate@: the "Decimator":http://code.soundsoftware.ac.uk/projects/qm-dsp/embedded/classDecimator.html implementation in the "qm-dsp":/projects/qm-dsp library, which uses an IIR lowpass filter (possibly an elliptic filter) with 8 coefficient pairs;

* @decimate_b@: the "DecimatorB":http://code.soundsoftware.ac.uk/projects/qm-dsp/embedded/classDecimatorB.html class in the "qm-dsp":/projects/qm-dsp library, which uses a Butterworth IIR lowpass filter of order 6;

* @resample_hq@, @resample_mq@, @resample_lq@: the "Resampler":http://code.soundsoftware.ac.uk/projects/qm-dsp/embedded/classResampler.html implementation in the "qm-dsp":/projects/qm-dsp library, which uses a lengthy Kaiser-windowed sinc filter, at three different quality settings;

* @src@: the sndfile-resample program, which uses "libsamplerate":http://mega-nerd.com/SRC/, a well-trusted resampler that also uses a Kaiser-windowed sinc implementation, at its default quality setting;

* @zoh@: the sndfile-resample zero-order hold resampler, which just takes every Nth sample without any filtering, serving as a baseline.
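To make the @zoh@ baseline concrete, here is a minimal Python sketch of decimation with no filtering at all. It is not the libsamplerate code; the function name, sample rate and test tone are all made up for illustration. It shows why this is only a baseline: a tone above the new Nyquist frequency isn't removed, it just reappears at a lower frequency.

```python
import math


def decimate_without_filter(samples, factor):
    """Keep every factor-th sample, starting with the first; no lowpass filter."""
    return samples[::factor]


# Illustrative values only (not taken from the tests on this page).
rate = 8000      # original sample rate in Hz
factor = 4       # new rate is 2000 Hz, so the new Nyquist frequency is 1000 Hz
tone_hz = 1500   # above the new Nyquist frequency

signal = [math.sin(2 * math.pi * tone_hz * n / rate) for n in range(64)]
decimated = decimate_without_filter(signal, factor)

# At the new rate the 1500 Hz tone is indistinguishable from an inverted
# 500 Hz tone: that is the aliasing a lowpass filter would have prevented.
new_rate = rate // factor
alias = [-math.sin(2 * math.pi * 500 * m / new_rate) for m in range(len(decimated))]
worst_difference = max(abs(a - b) for a, b in zip(decimated, alias))
```

All the other implementations in the list differ from this only in the lowpass filter they apply before (or while) discarding samples.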



h2. Speed

All timings are for 5292000 input frames on a Core i3-3229Y low-voltage CPU. Clock times are in seconds, and frames-per-second values count input frames.

All code is 64-bit. The qm-dsp implementations (@resample_*@, @decimate@ and @decimate_b@) were compiled with @-O3 -ffast-math@, while the libsamplerate implementations (@src@ and @zoh@) came from standard Ubuntu packages, so were probably built with @-O2@. This is likely to make a very significant difference, so these results are more useful for comparison among the qm-dsp implementations than between qm-dsp and libsamplerate.

The decimate implementation supports factors up to 8 only, so 16x, 32x and 64x are handled in two passes.
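As a rough illustration of the two-pass arrangement, here is a Python sketch that splits a power-of-two factor into passes of at most 8. The function name is hypothetical, and the actual split used in these tests isn't recorded here, so treat the particular choice (largest pass first) as an assumption.

```python
def split_into_passes(factor, max_factor=8):
    """Split a power-of-two decimation factor into passes of at most max_factor.

    Hypothetical helper: takes the largest allowed pass first, which is
    only one of several possible ways to split the factor.
    """
    if factor < 1 or factor & (factor - 1):
        raise ValueError("factor must be a power of two")
    passes = []
    remaining = factor
    while remaining > max_factor:
        passes.append(max_factor)
        remaining //= max_factor
    if remaining > 1 or not passes:
        passes.append(remaining)
    return passes


two_pass_factors = {f: split_into_passes(f) for f in (16, 32, 64)}
```

With this choice the first pass does most of the work, since every later pass sees at most an eighth of the input frames.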

All implementations use libsndfile for audio file I/O, so that should not be a factor in overall speed.

These frames-per-second figures look terribly precise, but I imagine there's a good 10% margin of error from one run to the next.
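For reference, the frames-per-second values in the tables below appear to be just the input frame count divided by the clock time, truncated to a whole number. A small Python sketch of that arithmetic, with the 10% margin treated as a simple plus-or-minus band (the function names are mine, not from any of the test scripts):

```python
INPUT_FRAMES = 5292000  # input frames used for every run on this page


def frames_per_second(clock_seconds, input_frames=INPUT_FRAMES):
    """Input frames processed per second of clock time, truncated."""
    return int(input_frames / clock_seconds)


def within_margin(fps_a, fps_b, margin=0.10):
    """True if two figures are closer than the assumed run-to-run margin."""
    return abs(fps_a - fps_b) <= margin * max(fps_a, fps_b)


decimate_fps = frames_per_second(0.372)   # 14225806, as in the factor 2 table
src_fps = frames_per_second(10.253)       # 516141, as in the factor 2 table
```

By this measure the factor 2 results for @decimate@ and @decimate_b@ (0.372 and 0.373 seconds) can't really be told apart, while the gap between either of them and @src@ is far outside the margin.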

h3. Implementations by decimation factor

h4. Factor 2

|Frames per second|Clock time|Implementation|
|14225806|0.372|decimate|
|14187667|0.373|decimate_b|
| 8939189|0.592|zoh|
| 3732016|1.418|resample_lq|
| 1856842|2.850|resample_mq|
| 989158|5.350|resample_hq|
| 516141|10.253|src|

h4. Factor 4

|Frames per second|Clock time|Implementation|
|17070967|0.310|zoh|
|14659279|0.361|decimate|
|13465648|0.393|decimate_b|
| 4285020|1.235|resample_lq|
| 2186776|2.420|resample_mq|
| 1056287|5.010|resample_hq|
| 610099|8.674|src|

h4. Factor 8

|Frames per second|Clock time|Implementation|
|26328358|0.201|zoh|
|13926315|0.380|decimate|
|12027272|0.440|decimate_b|
| 4895467|1.081|resample_lq|
| 2470588|2.142|resample_mq|
| 1166409|4.537|resample_hq|
| 614919|8.606|src|

h4. Factor 16

|Frames per second|Clock time|Implementation|
|33493670|0.158|zoh|
|12721153|0.416|decimate|
|11141052|0.475|decimate_b|
| 5093358|1.039|resample_lq|
| 2515209|2.104|resample_mq|
| 1182041|4.477|resample_hq|
| 668857|7.912|src|

h4. Factor 32

|Frames per second|Clock time|Implementation|
|41669291|0.127|zoh|
|14498630|0.365|decimate|
|12540284|0.422|decimate_b|
| 5318592|0.995|resample_lq|
| 2312937|2.288|resample_mq|
| 1148936|4.606|resample_hq|
| 670467|7.893|src|

h4. Factor 64

|Frames per second|Clock time|Implementation|
|42000000|0.126|zoh|
|13397468|0.395|decimate|
|11972850|0.442|decimate_b|
| 5040000|1.050|resample_lq|
| 2365668|2.237|resample_mq|
| 1232704|4.293|resample_hq|
| 636057|8.320|src|

h3. Decimation factors by implementation

h4. Implementation zoh

|Frames per second|Clock time|Factor|
|42000000|0.126|factor 64|
|41669291|0.127|factor 32|
|33493670|0.158|factor 16|
|26328358|0.201|factor 8|
|17070967|0.310|factor 4|
| 8939189|0.592|factor 2|

h4. Implementation decimate

|Frames per second|Clock time|Factor|
|14659279|0.361|factor 4|
|14498630|0.365|factor 32|
|14225806|0.372|factor 2|
|13926315|0.380|factor 8|
|13397468|0.395|factor 64|
|12721153|0.416|factor 16|

h4. Implementation decimate_b

|Frames per second|Clock time|Factor|
|14187667|0.373|factor 2|
|13465648|0.393|factor 4|
|12540284|0.422|factor 32|
|12027272|0.440|factor 8|
|11972850|0.442|factor 64|
|11141052|0.475|factor 16|

h4. Implementation resample_hq

|Frames per second|Clock time|Factor|
| 1232704|4.293|factor 64|
| 1182041|4.477|factor 16|
| 1166409|4.537|factor 8|
| 1148936|4.606|factor 32|
| 1056287|5.010|factor 4|
| 989158|5.350|factor 2|

h4. Implementation resample_mq

|Frames per second|Clock time|Factor|
| 2515209|2.104|factor 16|
| 2470588|2.142|factor 8|
| 2365668|2.237|factor 64|
| 2312937|2.288|factor 32|
| 2186776|2.420|factor 4|
| 1856842|2.850|factor 2|

h4. Implementation resample_lq

|Frames per second|Clock time|Factor|
| 5318592|0.995|factor 32|
| 5093358|1.039|factor 16|
| 5040000|1.050|factor 64|
| 4895467|1.081|factor 8|
| 4285020|1.235|factor 4|
| 3732016|1.418|factor 2|

h4. Implementation src

|Frames per second|Clock time|Factor|
| 670467|7.893|factor 32|
| 668857|7.912|factor 16|
| 636057|8.320|factor 64|
| 614919|8.606|factor 8|
| 610099|8.674|factor 4|
| 516141|10.253|factor 2|

h3. Resampler filter lengths

Filter lengths chosen by the qm-dsp Resampler at each quality setting:

|Factor|Length (hq)|Length (mq)|Length (lq)|
|2|643|291|119|
|4|1285|579|237|
|8|2567|1155|471|
|16|5131|2307|939|
|32|10261|4613|1877|
|64|20519|9223|3751|
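The lengths grow almost exactly in proportion to the factor. Assuming a decimating FIR filter only needs to be evaluated once per output frame (my assumption, not something checked against the qm-dsp source), the work per input frame is roughly the filter length divided by the factor. A quick Python sketch of that ratio, using the lengths from the table above:

```python
# Filter lengths from the table above: factor -> (hq, mq, lq)
FILTER_LENGTHS = {
    2: (643, 291, 119),
    4: (1285, 579, 237),
    8: (2567, 1155, 471),
    16: (5131, 2307, 939),
    32: (10261, 4613, 1877),
    64: (20519, 9223, 3751),
}


def taps_per_input_frame(length, factor):
    """Multiply-adds per input frame if the filter runs once per output frame."""
    return length / factor


for factor, (hq, mq, lq) in sorted(FILTER_LENGTHS.items()):
    ratios = [taps_per_input_frame(n, factor) for n in (hq, mq, lq)]
    print(factor, " ".join(f"{r:7.2f}" for r in ratios))
```

If that assumption holds, the near-constant ratio for each quality setting would be consistent with the @resample_*@ timings staying fairly flat across factors even though the filters get much longer.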