Chris Cannam, 2013-03-05 01:24 PM
1 | 1 | Chris Cannam | h1. About the Measure tool and its limitations |
---|---|---|---|
2 | 1 | Chris Cannam | |
3 | 1 | Chris Cannam | The Sonic Visualiser Help reference "describes the Measure tool":http://www.sonicvisualiser.org/doc/reference/2.0/en/#measurements like this: |
4 | 1 | Chris Cannam | |
5 | 5 | Chris Cannam | > The measure tool enables you to obtain measurements in scale units (such as time in the X coordinate, or whatever the Y coordinate of the current layer represents) corresponding to certain pixel positions. To measure a region, just click and drag a rectangle covering it, using the left mouse button with the measure tool selected ... |
6 | 4 | Chris Cannam | > It's important to note that the measurements shown in this way are based entirely on the pixel coordinates of the measurement rectangle, not on properties of the data being displayed. |
7 | 2 | Chris Cannam | |
8 | 6 | Chris Cannam | The measure tool also has the ability to produce an automatic bounding box for a graphical feature, when double-clicked: |
9 | 2 | Chris Cannam | |
10 | 2 | Chris Cannam | > The area enclosed by the rectangle is based on the extent of similarly-coloured pixels surrounding the click position: it is entirely graphical, involving no audio analysis, and so depends on the gain and colour scheme in use in the spectrogram. |
11 | 7 | Chris Cannam | |
12 | 17 | Chris Cannam | Here's an example of what that means in terms of the practical limitations of the tool. |
13 | 7 | Chris Cannam | |
14 | 24 | Chris Cannam | h2. Measurement box examples |
15 | 24 | Chris Cannam | |
16 | 10 | Chris Cannam | !>measure.png! |
17 | 1 | Chris Cannam | |
18 | 26 | Chris Cannam | To the right is an image showing two spectrogram panes of a recording of an amateur male singer. |
19 | 26 | Chris Cannam | |
20 | 26 | Chris Cannam | For this illustration I have switched off all of the spectrogram interpolation options in the preferences, so as to be able to talk about individual spectrogram bins with more precision. The image shown here has three measure boxes across the two panes. It is a composite image, because it isn't possible to highlight all three at once in SV. |
21 | 26 | Chris Cannam | |
22 | 27 | Chris Cannam | Let's imagine we want to measure the variation in pitch of the singer's vibrato (or wobble, or what have you). |
23 | 13 | Chris Cannam | |
24 | 28 | Chris Cannam | The *bottom* measurement box shows a naive measurement. |
25 | 16 | Chris Cannam | |
26 | 21 | Chris Cannam | The box is shown with a frequency-scale extent from 268.3 to 296.2Hz. These are the values that you would find if you took the green-line pixel positions and read them off against the scale on the left, interpolating appropriately (and taking into account that it's a log frequency scale). The frequency difference from top to bottom is 27.9Hz, but we want the range in semitones, which depends on the frequency ratio rather than the frequency difference. Our ratio here is 296.2/268.3 = 1.104, for a range of 12 * log2(1.104) = 1.71 semitones. |
27 | 1 | Chris Cannam | |
28 | 30 | Chris Cannam | How accurate is that as an estimate of the singer's vibrato range? Probably not very. This is an 8192-bin spectrogram; the bin resolution of 44100/8192 = 5.38Hz corresponds to 12 * log2((268.3 + 5.38) / 268.3) = 0.34 semitones at this pitch, so the whole box is only about five bins tall. And are these lines in the right places anyway? |
29 | 21 | Chris Cannam | |
30 | 17 | Chris Cannam | Moving up, the *middle* measurement box shows the same rectangle around a higher harmonic of the same note. This time the range from 537.9 to 581.8Hz gives a 1.36 semitone range. This harmonic has double the frequency of the fundamental measured in the first box; if we just doubled the frequency limits from that box, we'd have 536.6 to 592.4Hz, so this box is tighter at both top and bottom, but particularly at the top, where our placement of the green line was a bit optimistic in the first box. |
31 | 18 | Chris Cannam | |
32 | 20 | Chris Cannam | However, this still overestimates the vibrato range. That's partly because the spectrogram still has limited bin resolution here: we're at double the earlier frequency, so it now has about 0.17 semitones per bin. It's partly because of the width of the central spectral lobe: even a pure sine tone gets smudged across about three visible bins at this colour scale when the transform uses any window that has less leakage elsewhere than a rectangular one. And it's partly because the signal is moving, so it contributes to more than one vertical bin per time division and so looks "thicker". |
33 | 20 | Chris Cannam | |
34 | 1 | Chris Cannam | The *top* measurement box appears around a far higher harmonic (the 23rd) of this note, in a 2048-bin spectrogram. Up here there is much higher resolution available in semitone terms. Also, I've made more effort to place the green lines through the "middle" of the trace rather than simply drawing around it. Here the frequency range from 6229.8 to 6580.2Hz gives us a semitone range of 0.95, a lot more conservative than the other estimates. |
35 | 1 | Chris Cannam | |
36 | 24 | Chris Cannam | h2. Some conclusions |
37 | 1 | Chris Cannam | |
38 | 24 | Chris Cannam | # Even a high resolution spectrogram has surprisingly limited discrimination around the fundamental of a typical singing voice. |
39 | 1 | Chris Cannam | # The "thickness" of a line on the spectrogram is deceptive. In fact, you can make a line appear as thick as you like by simply turning up the gain, because that magnifies the analysis window lobes. To get an estimate of a partial frequency, you need to measure "through" it rather than "around" it. |
40 | 1 | Chris Cannam | # Interpolation can make it easier to "see" features, but doesn't make any significant difference to measuring them. |
41 | 24 | Chris Cannam | # The measure tool is just a tool -- all it does is map a pixel position on the plot onto a frequency scale. If you double-click with it, it selects some contiguous pixels of similar-looking colour and puts a box around those. Whether any of this means anything musically is really up to the user. |
42 | 1 | Chris Cannam | |
43 | 24 | Chris Cannam | h2. In light of this... |
44 | 22 | Chris Cannam | |
45 | 24 | Chris Cannam | Returning to the bottom measurement box, this is what the fundamental "wiggle" looks like with the default settings, i.e. with interpolation switched on in an 8192-bin spectrogram. |
46 | 24 | Chris Cannam | |
47 | 22 | Chris Cannam | !measure2.png! |
48 | 24 | Chris Cannam | |
49 | 25 | Chris Cannam | This time, I've just aimed the measurement box roughly "through" the lines rather than "around" them. With an estimated range of 0.95 semitones, this is much closer to the estimate obtained from the high overtone above. (Well, actually it's the same value, but that's a bit lucky.) Using interpolation makes this sort of estimate by eye slightly easier to do, but it's still only an estimate. |
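
The arithmetic used in the examples above can be reproduced with a few lines of Python. This is only a sketch for checking the numbers: the function names @semitone_span@ and @bin_width_semitones@ are made up for illustration and are not part of Sonic Visualiser, and the 44100Hz sample rate is the one implied by the 44100/8192 calculation above (assumed here to apply to the 2048-bin pane as well).

```python
import math


def semitone_span(f_low_hz, f_high_hz):
    """Interval between two frequencies in semitones (12 per octave)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def bin_width_semitones(freq_hz, sample_rate_hz, fft_size):
    """Height of one spectrogram bin, in semitones, at the given frequency."""
    bin_hz = sample_rate_hz / fft_size
    return semitone_span(freq_hz, freq_hz + bin_hz)


# Box edges in Hz as quoted in the text, with the spectrogram size
# (in bins) of the pane each box was drawn in.
BOXES = {
    "bottom": (268.3, 296.2, 8192),
    "middle": (537.9, 581.8, 8192),
    "top": (6229.8, 6580.2, 2048),
}

SAMPLE_RATE_HZ = 44100  # assumption: same recording rate for both panes

for name, (f_low, f_high, fft_size) in BOXES.items():
    span = semitone_span(f_low, f_high)
    one_bin = bin_width_semitones(f_low, SAMPLE_RATE_HZ, fft_size)
    print(f"{name}: box spans {span:.2f} semitones, "
          f"one bin is {one_bin:.2f} semitones")
```

Comparing each box's span with the height of a single bin at that pitch shows why the boxes around the lower partials are the least trustworthy: near the fundamental, the whole box is only a handful of bins tall.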