Chris Cannam, 2015-07-15 02:22 PM

h1. Audio Features and Vamp Plugins

h3. General outline

To be propagated to / synchronised with https://www.doc.gold.ac.uk/~mas01cr/teaching/dhoxss15/

# Introductory notes and slides on acoustics and audio (CR, 25 min)
# Sonic Visualiser - hands on with waveform and spectrograms (CC, 25 min)
# Introductory notes and slides on audio features (CC, 5 min)
# Sonic Visualiser - hands on with Vamp plugins (CC, 20 min)
# Python/IPython intro (CR, 35 min)
# Break (20 min)
# Feature Extraction using Vamp Plugins in Python (IPython Notebook) (CC, 40 min)
# Audio Indexing and Search in Python (IPython Notebook) (CR, 40 min)

h3. Breakdown of CC sections

h5. Sonic Visualiser - hands on with waveform and spectrograms

* Waveform
## Start Sonic Visualiser and open "A Friendly Warning"
## Show dragging through the file using the Navigate tool, and also using the overview at the bottom
## Play from the start, just to get an idea what it sounds like
## Return to the start and zoom in (using the zoom wheel, but noting that the mouse wheel also works)
## Notice the different shapes in the waveform resulting from different types of synthetic percussive sound (low-frequency kick drum / higher-frequency cymbal-type sounds) - refer back to Christophe's notes about the correspondence between e.g. signal voltage and speaker cone deflection
## Continue until the vocal starts, and observe that we can see very little that relates to e.g. sung pitch, although if we zoom in we can quite clearly see sibilance (these frequencies around 10kHz are pretty much the sweet spot for visibility in a 44.1kHz waveform)
## We will return to this track
* Spectrogram
## New session, open piano-scale.wav and play it
## Some information can sort-of be perceived and measured from the waveform here: we can see when the notes start, and can get a simple fundamental frequency estimate. Zoom in to the first note, switch to Select mode, and drag out one cycle: it's about 170 samples, so 44100/170 = 259 Hz. The note is a middle C, so the true value should be nearer to 261.6 Hz, but this is a fair approximation. (But this is a very simple example!)
## Now open a plain spectrogram - Pane -> Add Spectrogram (or G key). Observe full range on frequency scale; x axis is time, this is a simple time-frequency breakdown.
## Notice that, for each note, we can see the fundamental frequency most strongly and then the harmonic sequence. The harmonics are spaced more widely for higher notes because they are multiples of the fundamental frequency, which is larger. The noise floor is visible because we're using a dB scale; we can switch to Linear to isolate only the strong frequencies. There isn't really enough detail to measure much here. (NB the default colour scheme is unhelpful to colour-blind users, so it might be worth changing to the Sunset scheme.)
## Close that pane and open a "melodic-range spectrogram" - Pane -> Add Melodic Range Spectrogram (or M key). Observe the much more limited frequency range and the fact that this spectrogram uses both Linear colour and Sunset scheme by default.
## Although the higher harmonics quickly disappear off the top of the scale, we can clearly see that the spacing between harmonics is now the same for each note, and that the semitones are equally spaced (note spacing corresponding to the major scale intervals). This is because the vertical scale is now logarithmic in frequency, which makes it (fudging the issue a little) linear in pitch. Correspondingly there is now a little representation of a piano keyboard shown at left, with middle C highlighted. (This assumes A=440Hz. What if that's not true? We'll come back to that in a moment.)
## Select the Measure tool and show that we can get a frequency readout with harmonic markers. Return to the Navigate tool and contrast with the readout that is displayed as you move the pointer over the pane.
## Close that pane and open the "peak-frequency spectrogram" - Pane -> Add Peak Frequency Spectrogram (or K key). Notice that here we can just wave the Navigate tool over a bin to get an estimate of the instantaneous frequency there.
## New session, open A Friendly Warning again and open both the plain spectrogram and the melodic-range one. Observe and contrast the various visible elements: in particular, vertical lines across the full frequency range for noisy percussion, the relative invisibility of such broadband sounds in the melodic-range spectrogram, curved slides in the vocal, etc.
## Go to File -> Replace Main Audio, open King Henry. Note among other things that we need to increase the gain on the melodic-range spectrogram, and that the vibrato is visible and things like vibrato rate could be approximately measured. Leave this file open.
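
The arithmetic in the piano-scale steps above can be checked with a few lines of Python. This is only a rough sketch, not part of the session materials: the function names are made up for illustration, the 170-sample period is the hand measurement quoted in the notes, and A=440Hz equal temperament is assumed throughout. It converts the measured period to a frequency, compares it with middle C, and shows why harmonics sit at the same vertical offsets for every note on a logarithmic frequency scale.

```python
import math

SAMPLE_RATE = 44100   # Hz, the sample rate used in the notes above
A4_HZ = 440.0         # assumed concert pitch (the keyboard display assumes this too)
A4_MIDI = 69          # MIDI note number of A4; middle C is 60


def frequency_from_period(period_samples, sample_rate=SAMPLE_RATE):
    """Frequency in Hz of a waveform whose cycle lasts period_samples samples."""
    return sample_rate / period_samples


def midi_pitch(frequency_hz, a4_hz=A4_HZ):
    """Fractional MIDI pitch: 12 semitones per doubling of frequency."""
    return A4_MIDI + 12 * math.log2(frequency_hz / a4_hz)


def cents_between(f1_hz, f2_hz):
    """Signed interval from f1 to f2 in cents (100 cents = 1 semitone)."""
    return 1200 * math.log2(f2_hz / f1_hz)


# One cycle dragged out by hand in Select mode: about 170 samples
estimate = frequency_from_period(170)

# Equal-tempered middle C, nine semitones below A4
middle_c = A4_HZ * 2 ** ((60 - A4_MIDI) / 12)

print(f"estimate: {estimate:.1f} Hz, middle C: {middle_c:.1f} Hz")
print(f"error: {cents_between(middle_c, estimate):.1f} cents")

# Offsets of harmonics 1..6 above the fundamental, in semitones.
# They depend only on the harmonic number, not on the note played,
# which is why the spacing looks identical for every note on a log scale.
harmonic_offsets = [12 * math.log2(n) for n in range(1, 7)]
print([round(x, 2) for x in harmonic_offsets])
```

Passing a different @a4_hz@ to @midi_pitch@ shifts every pitch reading by the same amount, which is one way to explore what happens when a recording is not tuned to A=440Hz.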

h5. Introductory notes and slides on audio features

h5. Sonic Visualiser - hands on with Vamp plugins

h5. Feature Extraction using Vamp Plugins in Python

h3. Materials

* Audio features slides: "PDF":https://code.soundsoftware.ac.uk/projects/dhoxss15/repository/raw/vamp-sv.pdf, "PowerPoint":https://code.soundsoftware.ac.uk/projects/dhoxss15/repository/raw/vamp-sv.pptx
* "IPython Notebook for Vamp Plugins in Python":https://code.soundsoftware.ac.uk/projects/dhoxss15/repository/raw/Vamp.v3.ipynb