changeset 10:dc1b5f16d000
More notes (about the frequency range retained from the constant Q)
author | Chris Cannam |
---|---|
date | Thu, 20 Mar 2014 17:47:14 +0000 |
parents | 6d4df772f108 |
children | f268212ac499 |
files | notes/transcriptionMultipleTemplates-annotated.m |
diffstat | 1 file changed, 16 insertions(+), 1 deletion(-) |
--- a/notes/transcriptionMultipleTemplates-annotated.m Thu Mar 20 11:18:22 2014 +0000
+++ b/notes/transcriptionMultipleTemplates-annotated.m Thu Mar 20 17:47:14 2014 +0000
@@ -36,7 +36,22 @@
 %% range 27.5 -> samplerate/3, 60 bins per octave, a 'q' of 0.8 (lower
 %% than the maximum, and default, value of 1), 'atomHopFactor' 0.3
 %% rather than the default 0.25 (why?), Hann window, default sparsity
-%% threshold.
+%% threshold. The CQT obtained is the interpolated real-valued
+%% magnitude spectrogram rather than the complex output.
+
+%% The audio is always resampled to 44100Hz (if it isn't at that rate
+%% already) and mixed down to mono.
+
+%% The computed CQT parameters actually obtained are:
+%% 10 octaves
+%% highest frequency 14700Hz
+%% lowest frequency 14.5223Hz
+%% column height 600 (60 bpo * 10 oct)
+%% But only bins 56:600 are used, the first 55 are dropped, leaving
+%% 545 bins per column. I *think* the spectrogram is the "right" way
+%% up at this point so those first 55 bins are the lowest-frequency
+%% ones, meaning the frequency range actually returned is 27.4144Hz
+%% to 14700Hz.
 
 %% for a 43.5 second 44.1 KHz audio file, intCQT will be a 545x30941
 %% array, one column every 0.0014 seconds.
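
The frequency figures in the added notes can be checked with a little arithmetic. What follows is a minimal Python sketch, not part of the changeset. It assumes bins are numbered from 1 as in the MATLAB range 56:600, that bin 600 is centred on samplerate/3, and that bins are spaced 60 per octave below it. The names `bin_frequency` and `FIRST_KEPT_BIN` are illustrative only and do not come from the original code.

```python
# Sanity check of the CQT bin frequencies quoted in the notes.
# Assumption: bin N_BINS (1-based) is centred on samplerate/3 and each
# lower bin is 1/60 of an octave below its neighbour.

SAMPLE_RATE = 44100.0     # audio is resampled to this rate
BINS_PER_OCTAVE = 60
OCTAVES = 10
N_BINS = BINS_PER_OCTAVE * OCTAVES   # column height, 600
FIRST_KEPT_BIN = 56                  # bins 56:600 are kept, 1..55 dropped


def bin_frequency(k, fmax=SAMPLE_RATE / 3):
    """Centre frequency in Hz of 1-based bin k under the assumption above."""
    return fmax * 2.0 ** ((k - N_BINS) / BINS_PER_OCTAVE)


lowest = bin_frequency(1)                    # lowest computed bin
lowest_kept = bin_frequency(FIRST_KEPT_BIN)  # lowest bin actually returned
highest = bin_frequency(N_BINS)              # highest bin
kept_bins = N_BINS - FIRST_KEPT_BIN + 1      # bins per column after dropping

print(f"lowest computed: {lowest:.4f} Hz")
print(f"lowest kept:     {lowest_kept:.4f} Hz")
print(f"highest:         {highest:.1f} Hz")
print(f"bins per column: {kept_bins}")
```

Under these assumptions the sketch reproduces the 14.5223Hz, 27.4144Hz and 14700Hz figures and the 545 bins per column given in the notes. This is consistent with the author's guess that the dropped bins are the lowest-frequency ones, but it does not confirm the orientation of the spectrogram in the original code.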