cannam@6
|
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
cannam@6
|
2 <html>
|
cannam@6
|
3 <head>
|
cannam@6
|
4 <link rel="stylesheet" media="screen" type="text/css" href="/screen.css"/>
|
cannam@6
|
5 <link rel="icon" type="image/png" href="/images/waveform.png"/>
|
cannam@6
|
6 <link rel="shortcut icon" type="image/png" href="/images/waveform.png"/>
|
cannam@6
|
7 <title>Vamp Example Plugins: User Documentation</title>
|
cannam@6
|
8 <meta name="robots" content="index"/>
|
cannam@6
|
9 </head>
|
cannam@6
|
10 <body>
|
cannam@8
|
11 <h1 id="header"><span>Vamp Plugins</span></h1>
|
cannam@8
|
12
|
cannam@8
|
13 <h2>Vamp Example Plugins</h2>
|
cannam@6
|
14
|
cannam@7
|
15 <p>The “vamp-example-plugins” library contains a number of
|
cannam@6
|
16 <a href="http://www.vamp-plugins.org/">Vamp audio analysis
|
cannam@6
|
17 plugins</a> provided as part of the Vamp plugin SDK.
|
cannam@6
|
18
|
cannam@6
|
19 </p>
|
cannam@6
|
20 <p>These are simple, but sometimes useful, plugins whose source code you
|
cannam@6
|
21 are free to study and reuse in any proprietary or non-proprietary
|
cannam@6
|
22 plugins of your own without any licensing obligation.
|
cannam@6
|
23 </p>
|
cannam@6
|
24 <p>User documentation for the individual plugins in this library follows.
|
cannam@6
|
25 </p>
|
cannam@6
|
26 <div class="toc2">1. <a href="#amplitudefollower">Amplitude Follower</a></div>
|
cannam@6
|
27 <div class="toc2">2. <a href="#fixedtempo">Simple Fixed Tempo Estimator</a></div>
|
cannam@6
|
28 <div class="toc2">3. <a href="#percussiononsets">Simple Percussion Onset Detector</a></div>
|
cannam@6
|
29 <div class="toc2">4. <a href="#powerspectrum">Simple Power Spectrum</a></div>
|
cannam@6
|
30 <div class="toc2">5. <a href="#spectralcentroid">Spectral Centroid</a></div>
|
cannam@6
|
31 <div class="toc2">6. <a href="#zerocrossing">Zero Crossings</a></div>
|
cannam@6
|
32
|
cannam@6
|
33 <div class="oddcontent"><a name="amplitudefollower"></a><h2>1. Amplitude Follower</h2>
|
cannam@6
|
34
|
cannam@6
|
35 <p><b>System identifier</b> – <code>vamp-example-plugins:amplitudefollower</code><br>
|
cannam@6
|
36 <b>RDF URI</b> – <a href="http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#amplitudefollower">http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#amplitudefollower</a>
|
cannam@6
|
37 </p>
|
cannam@6
|
38 <p>Amplitude Follower tracks the amplitude of the audio signal sample
|
cannam@6
|
39 by sample, and returns the peak tracked value for each processing block.
|
cannam@6
|
40 </p>
|
cannam@6
|
41 </div><div class="evencontent"><a name="toc2"></a><h3>1.1. Parameters</h3>
|
cannam@6
|
42
|
cannam@6
|
43 <p><b>Attack time</b> (seconds) – The 60dB convergence time for an increase in amplitude.<br>
|
cannam@6
|
44 <b>Release time</b> (seconds) – The 60dB convergence time for a decrease in amplitude.
|
cannam@6
|
45 </p>
|
cannam@6
|
46 <p>For example, if you feed the plugin with a simple step function that
|
cannam@6
|
47 jumps from level A to level B, then the output will start off as A,
|
cannam@6
|
48 then at the moment of stepping it will start to converge exponentially
|
cannam@6
|
49 to B, reaching within 60dB of the actual value within the time specified
|
cannam@6
|
50 by the Attack time parameter.
|
cannam@6
|
51 </p>
|
cannam@6
|
52 <p>Similarly, if the plugin's input then steps down from B to A, the
|
cannam@6
|
53 output will start converging at the moment of stepping, reaching
|
cannam@6
|
54 within 60dB of the new value within the time specified by the Release
|
cannam@6
|
55 time parameter.
|
cannam@6
|
56 </p>
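<p>As a rough illustration of the step response described above, the
following Python sketch implements a one-pole follower whose coefficients
are chosen so that the remaining error falls by 60dB (a factor of 0.001)
over the given time. The function names, the coefficient formula and the
starting level of zero are assumptions made for this example; they are
not taken from the plugin's source code.
</p>

```python
def follow_amplitude(samples, sample_rate, attack_s, release_s):
    """Track the rectified signal with separate attack and release times.

    Assumption: "60dB convergence" is read as the error shrinking to
    0.001 of its starting size after the given number of seconds.
    """
    def coefficient(seconds):
        if seconds <= 0.0:
            return 0.0  # no smoothing: jump straight to the target
        return 0.001 ** (1.0 / (seconds * sample_rate))

    attack = coefficient(attack_s)
    release = coefficient(release_s)
    level = 0.0  # assumed starting level
    envelope = []
    for x in samples:
        target = abs(x)
        c = attack if target > level else release
        # Move a fixed proportion of the way towards the target.
        level = target + (level - target) * c
        envelope.append(level)
    return envelope


def block_peaks(envelope, block_size):
    """Return the peak tracked value for each processing block."""
    return [max(envelope[i:i + block_size])
            for i in range(0, len(envelope), block_size)]
```

<p>With a 1000Hz sample rate and an attack time of 0.1 seconds, a step
from 0 to 1 leaves an error of about 0.001 after 100 samples in this
sketch.
</p>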
|
cannam@6
|
57 </div><div class="oddcontent"><a name="toc3"></a><h3>1.2. Outputs</h3>
|
cannam@6
|
58
|
cannam@6
|
59 </div><div class="evencontent"><a name="toc4"></a><h4>1.2.1. Amplitude</h4>
|
cannam@6
|
60
|
cannam@6
|
61 <p>The peak tracked amplitude (in volts) for the current processing block.
|
cannam@6
|
62 </p>
|
cannam@6
|
63 </div><div class="oddcontent"><a name="toc5"></a><h3>1.3. References and Credits</h3>
|
cannam@6
|
64
|
cannam@6
|
65 <p>Amplitude Follower uses a method from the SuperCollider audio
|
cannam@6
|
66 processing language. It was implemented as a Vamp plugin by Dan
|
cannam@6
|
67 Stowell.
|
cannam@6
|
68 </p>
|
cannam@6
|
69 </div><div class="evencontent"><a name="fixedtempo"></a><h2>2. Simple Fixed Tempo Estimator</h2>
|
cannam@6
|
70
|
cannam@6
|
71 <p><b>System identifier</b> – <code>vamp-example-plugins:fixedtempo</code><br>
|
cannam@6
|
72 <b>RDF URI</b> – <a href="http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo">http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo</a>
|
cannam@6
|
73 </p>
|
cannam@6
|
74 <p>Simple Fixed Tempo Estimator analyses a fragment of audio and
|
cannam@6
|
75 estimates its tempo. It assumes that its input is of fixed tempo, and
|
cannam@6
|
76 it analyses only the first (small but configurable number of) seconds
|
cannam@6
|
77 before returning a result, discarding all subsequent input.
|
cannam@6
|
78 </p>
|
cannam@6
|
79 <p>The plugin calculates an overall energy rise function across a series
|
cannam@6
|
80 of short frequency-domain input frames, takes the autocorrelation of
|
cannam@6
|
81 this function, filters it to stress possible metrical patterns,
|
cannam@6
|
82 locates peaks, and converts from autocorrelation lag to the
|
cannam@6
|
83 corresponding tempo.
|
cannam@6
|
84 </p>
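<p>The last two steps of that chain, locating an autocorrelation peak and
converting its lag to a tempo, might be sketched as follows in Python.
This is a simplified illustration only: the function names are
hypothetical, the detection function is assumed to be sampled at
<code>frame_rate</code> frames per second, and no metrical filtering or
perceptual weighting is applied.
</p>

```python
import math


def autocorrelate(df):
    """Unnormalised autocorrelation for lags 0 .. len(df)/2 - 1."""
    n = len(df)
    return [sum(df[j] * df[j + lag] for j in range(n - lag))
            for lag in range(n // 2)]


def lag_to_tempo(lag, frame_rate):
    """A lag of `lag` frames is one beat; convert it to beats per minute."""
    return 60.0 * frame_rate / lag


def estimate_tempo(df, frame_rate, min_bpm, max_bpm):
    """Pick the strongest lag whose tempo lies within [min_bpm, max_bpm]."""
    acf = autocorrelate(df)
    shortest = max(1, math.ceil(60.0 * frame_rate / max_bpm))
    longest = min(len(acf) - 1, math.floor(60.0 * frame_rate / min_bpm))
    best = max(range(shortest, longest + 1), key=lambda lag: acf[lag])
    return lag_to_tempo(best, frame_rate)
```

<p>For example, a detection function sampled 100 times per second with an
impulse every 50 frames has its strongest in-range peak at a lag of 50
frames, which corresponds to 120 beats per minute.
</p>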
|
cannam@6
|
85 <p>The filtering process involves searching for peaks at simple
|
cannam@6
|
86 metrically related intervals (at a given autocorrelation lag as well
|
cannam@6
|
87 as at 2, 3, and 4 times that lag), boosting each peak that shows
|
cannam@6
|
88 strong related peaks. A simplistic perceptual curve is also applied
|
cannam@6
|
89 in order to increase the probability of detecting a "likely" tempo.
|
cannam@6
|
90 For improved tempo precision, each tempo with strong related peaks is
|
cannam@6
|
91 averaged with the tempi calculated from those peaks.
|
cannam@6
|
92 </p>
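<p>One way to picture the boosting step is the Python sketch below. It is
an assumption-laden simplification rather than the plugin's actual
filter: the set of related lag multiples is supplied by the caller, the
<code>weight</code> parameter is hypothetical, and the perceptual curve
and tempo averaging are left out.
</p>

```python
def is_peak(values, k):
    """True if index k is a local maximum strictly inside the list."""
    return (0 < k < len(values) - 1
            and values[k] > values[k - 1]
            and values[k] >= values[k + 1])


def boost_related(acf, multiples, weight=0.5):
    """Boost each autocorrelation peak that has peaks at related lags.

    `multiples` lists the lag ratios to inspect, and `weight` scales
    how much of each related peak is added (both are illustrative).
    """
    boosted = list(acf)
    for lag in range(1, len(acf) - 1):
        if not is_peak(acf, lag):
            continue
        for m in multiples:
            related = round(lag * m)
            if is_peak(acf, related):
                boosted[lag] += weight * acf[related]
    return boosted
```

<p>A peak whose related lags also hold peaks ends up higher than an
equally strong peak that stands alone, which is the effect the filtering
is meant to achieve.
</p>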
|
cannam@6
|
93 <p>The method is best suited for 4/4 pop and dance rhythms.
|
cannam@6
|
94 </p>
|
cannam@6
|
95 <p>This plugin returns many of its intermediate calculations as
|
cannam@6
|
96 additional outputs, as well as the most favoured tempo. Although as a
|
cannam@6
|
97 tempo estimator it's still fairly primitive, it is intended to provide
|
cannam@6
|
98 a useful example of a slightly more complex feature extraction plugin
|
cannam@6
|
99 than the other examples, as well as one that returns several different
|
cannam@6
|
100 types of output at a time.
|
cannam@6
|
101 </p>
|
cannam@6
|
102 </div><div class="oddcontent"><a name="toc7"></a><h3>2.1. Parameters</h3>
|
cannam@6
|
103
|
cannam@6
|
104 <p><b>Minimum estimated tempo</b>, <b>Maximum estimated tempo</b> (bpm) – These
|
cannam@6
|
105 parameters control the range of values within which the tempo
|
cannam@6
|
106 estimator will return its estimate.
|
cannam@6
|
107 </p>
|
cannam@6
|
108 <p><b>Input duration to study</b> (seconds) – The tempo estimator uses only the
|
cannam@6
|
109 first part of its input, discarding any that follows. This parameter
|
cannam@6
|
110 controls how much input it will use. There is no value in increasing
|
cannam@6
|
111 this beyond 8x the duration of the slowest returned beat. The default
|
cannam@6
|
112 of 10 seconds is likely to be appropriate for most purposes.
|
cannam@6
|
113 </p>
|
cannam@6
|
114 </div><div class="evencontent"><a name="toc8"></a><h3>2.2. Outputs</h3>
|
cannam@6
|
115
|
cannam@6
|
116 </div><div class="oddcontent"><a name="toc9"></a><h4>2.2.1. Tempo</h4>
|
cannam@6
|
117
|
cannam@6
|
118 <p>The tempo estimator's best guess at the tempo of its input, in beats
|
cannam@6
|
119 per minute.
|
cannam@6
|
120 </p>
|
cannam@6
|
121 <p>This is returned as a feature whose timestamp and duration cover the
|
cannam@6
|
122 range of the input which was used in estimating the tempo, with a
|
cannam@6
|
123 single value containing the tempo.
|
cannam@6
|
124 </p>
|
cannam@6
|
125 </div><div class="evencontent"><a name="toc10"></a><h4>2.2.2. Tempo candidates</h4>
|
cannam@6
|
126
|
cannam@6
|
127 <p>Several guesses at the possible tempo. This output is returned as a
|
cannam@6
|
128 single feature whose timestamp and duration cover the range of the
|
cannam@6
|
129 input which was used in estimating the tempo, with up to 10 bins
|
cannam@6
|
130 containing one tempo value in each bin, with the "best guess" tempo in
|
cannam@6
|
131 bin 0.
|
cannam@6
|
132 </p>
|
cannam@6
|
133 </div><div class="oddcontent"><a name="toc11"></a><h4>2.2.3. Detection function</h4>
|
cannam@6
|
134
|
cannam@6
|
135 <p>The basic onset detection function used in tempo estimation.
|
cannam@6
|
136 </p>
|
cannam@6
|
137 </div><div class="evencontent"><a name="toc12"></a><h4>2.2.4. Autocorrelation function</h4>
|
cannam@6
|
138
|
cannam@6
|
139 <p>The autocorrelation of the onset detection function.
|
cannam@6
|
140 </p>
|
cannam@6
|
141 </div><div class="oddcontent"><a name="toc13"></a><h4>2.2.5. Filtered Autocorrelation</h4>
|
cannam@6
|
142
|
cannam@6
|
143 <p>The autocorrelation after filtering to boost values with possible
|
cannam@6
|
144 metrically related peaks and to apply perceptual weighting. The peak
|
cannam@6
|
145 value of this function is the one that will be used as the "best
|
cannam@6
|
146 guess".
|
cannam@6
|
147 </p>
|
cannam@6
|
148 </div><div class="evencontent"><a name="toc14"></a><h3>2.3. References and Credits</h3>
|
cannam@6
|
149
|
cannam@6
|
150 <p>Simple Fixed Tempo Estimator uses a method derived from work by
|
cannam@6
|
151 Matthew Davies: see for example M. E. P. Davies and M. D. Plumbley,
|
cannam@6
|
152 <i>Beat Tracking With A Two State Model</i>, in Proceedings of the IEEE
|
cannam@6
|
153 International Conference on Acoustics, Speech and Signal Processing
|
cannam@6
|
154 2005. This plugin, made by Chris Cannam, is only an unsubtle
|
cannam@6
|
155 simplification of a small part of the published method.
|
cannam@6
|
156 </p>
|
cannam@6
|
157 <p>The Queen Mary plugin set
|
cannam@6
|
158 (<a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins</a>)
|
cannam@6
|
159 contains a Tempo and Beat Tracker plugin by Matthew Davies providing a
|
cannam@6
|
160 more realistic implementation.
|
cannam@6
|
161 </p>
|
cannam@6
|
162 </div><div class="oddcontent"><a name="percussiononsets"></a><h2>3. Simple Percussion Onset Detector</h2>
|
cannam@6
|
163
|
cannam@6
|
164 <p><b>System identifier</b> – <code>vamp-example-plugins:percussiononsets</code><br>
|
cannam@6
|
165 <b>RDF URI</b> – <a href="http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#percussiononsets">http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#percussiononsets</a>
|
cannam@6
|
166 </p>
|
cannam@6
|
167 <p>Simple Percussion Onset Detector estimates the locations of percussive
|
cannam@6
|
168 onsets in the audio signal.
|
cannam@6
|
169 </p>
|
cannam@6
|
170 <p>The principle is to exploit the broadband nature of noisy percussive
|
cannam@6
|
171 onsets by identifying only those frames in which the energy rise shows
|
cannam@6
|
172 a broadband profile.
|
cannam@6
|
173 </p>
|
cannam@6
|
174 <p>The plugin takes a series of frequency domain frames, and examines
|
cannam@6
|
175 each frame to count the number of bins whose energy content has
|
cannam@6
|
176 increased by more than a certain threshold since the prior frame.
|
cannam@6
|
177 Frames in which this number is at a peak relative to prior and
|
cannam@6
|
178 following frames and also exceeds another threshold value are
|
cannam@6
|
179 classified as percussive onsets.
|
cannam@6
|
180 </p>
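<p>The counting and peak-picking steps above can be sketched in Python as
follows. This is a minimal illustration under stated assumptions: each
frame is taken to be a list of per-bin power values, and the names
<code>rise_db</code> and <code>min_fraction</code> are hypothetical
stand-ins for the two thresholds, not the plugin's own parameter
identifiers or scaling.
</p>

```python
import math


def rise_counts(frames, rise_db):
    """Count, per frame, the bins whose power rose by at least rise_db."""
    counts = [0]  # the first frame has no predecessor to compare with
    for prev, cur in zip(frames, frames[1:]):
        n = 0
        for p, c in zip(prev, cur):
            if p > 0.0 and c > 0.0 and 10.0 * math.log10(c / p) >= rise_db:
                n += 1
        counts.append(n)
    return counts


def percussive_onsets(counts, n_bins, min_fraction):
    """Return frame indices where the count peaks and is large enough."""
    hits = []
    for i in range(1, len(counts) - 1):
        peak = counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]
        if peak and counts[i] > min_fraction * n_bins:
            hits.append(i)
    return hits
```

<p>A frame in which every bin jumps in energy is reported as an onset,
whereas a frame in which only one bin rises is not, since the rise is not
broadband.
</p>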
|
cannam@6
|
181 </div><div class="evencontent"><a name="toc16"></a><h3>3.1. Parameters</h3>
|
cannam@6
|
182
|
cannam@6
|
183 <p><b>Energy rise threshold</b> (dB) – The rise in energy within a bin from one
|
cannam@6
|
184 frame to the next that is required for a bin to be counted toward the
|
cannam@6
|
185 detection function's bin count. This roughly corresponds to how
|
cannam@6
|
186 "loud" a percussive sound must be in order to be detected.
|
cannam@6
|
187 </p>
|
cannam@6
|
188 <p><b>Sensitivity</b> (%) – The proportion of bins that must exceed the energy
|
cannam@6
|
189 rise threshold in order for an onset to be detected (at frames in
|
cannam@6
|
190 which the detection function peaks). This roughly corresponds to how
|
cannam@6
|
191 "noisy" a percussive sound must be in order to be detected.
|
cannam@6
|
192 </p>
|
cannam@6
|
193 </div><div class="oddcontent"><a name="toc17"></a><h3>3.2. Outputs</h3>
|
cannam@6
|
194
|
cannam@6
|
195 </div><div class="evencontent"><a name="toc18"></a><h4>3.2.1. Onsets</h4>
|
cannam@6
|
196
|
cannam@6
|
197 <p>The estimated onset locations.
|
cannam@6
|
198 </p>
|
cannam@6
|
199 </div><div class="oddcontent"><a name="toc19"></a><h4>3.2.2. Detection Function</h4>
|
cannam@6
|
200
|
cannam@6
|
201 <p>The energy rise detection function whose peaks were used to estimate
|
cannam@6
|
202 onset locations.
|
cannam@6
|
203 </p>
|
cannam@6
|
204 </div><div class="evencontent"><a name="toc20"></a><h3>3.3. References and Credits</h3>
|
cannam@6
|
205
|
cannam@6
|
206 <p>The method used in Simple Percussion Onset Detector was described in
|
cannam@6
|
207 Dan Barry, Derry Fitzgerald, Eugene Coyle and
|
cannam@6
|
208 Bob Lawlor, <i>Drum Source Separation using Percussive Feature Detection and
|
cannam@6
|
209 Spectral Modulation</i>, ISSC 2005. The plugin was made by Chris Cannam.
|
cannam@6
|
210 </p>
|
cannam@6
|
211 </div><div class="oddcontent"><a name="powerspectrum"></a><h2>4. Simple Power Spectrum</h2>
|
cannam@6
|
212
|
cannam@6
|
213 <p><b>System identifier</b> – <code>vamp-example-plugins:powerspectrum</code><br>
|
cannam@6
|
214 <b>RDF URI</b> – <a href="http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#powerspectrum">http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#powerspectrum</a>
|
cannam@6
|
215 </p>
|
cannam@6
|
216 <p>Simple Power Spectrum returns a power spectrum calculated from
|
cannam@6
|
217 windowed short-time Fourier transforms of the input audio. (The power
|
cannam@6
|
218 spectrum for a frame consists of a sequence of the squares of the
|
cannam@6
|
219 magnitudes of the complex values for each frequency bin in the result
|
cannam@6
|
220 of the Fourier transform.)
|
cannam@6
|
221 </p>
|
cannam@6
|
222 <p>This very simple plugin is an illustration of the fact that if a
|
cannam@6
|
223 plugin requests frequency-domain input, its input will already be in
|
cannam@6
|
224 the form needed for a spectrum such as this. The plugin has no work
|
cannam@6
|
225 left to do except to calculate the squared magnitude from the
|
cannam@6
|
226 cartesian complex representation.
|
cannam@6
|
227 </p>
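<p>The remaining calculation is small enough to show in full. The Python
sketch below assumes the frequency-domain frame arrives as a flat
sequence of interleaved real and imaginary parts, one pair per bin; the
function name is made up for this example.
</p>

```python
def power_spectrum(interleaved):
    """Squared magnitude of each (real, imaginary) pair in the frame."""
    n_bins = len(interleaved) // 2
    return [interleaved[2 * i] ** 2 + interleaved[2 * i + 1] ** 2
            for i in range(n_bins)]
```

<p>A bin holding the complex value 3+4i therefore yields a power of 25.
</p>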
|
cannam@6
|
228 <p>This plugin also illustrates how to return "grid-type" visualisation
|
cannam@6
|
229 data from a Vamp plugin.
|
cannam@6
|
230 </p>
|
cannam@6
|
231 </div><div class="evencontent"><a name="toc22"></a><h3>4.1. Parameters</h3>
|
cannam@6
|
232
|
cannam@6
|
233 <p>None.
|
cannam@6
|
234 </p>
|
cannam@6
|
235 </div><div class="oddcontent"><a name="toc23"></a><h3>4.2. Outputs</h3>
|
cannam@6
|
236
|
cannam@6
|
237 </div><div class="evencontent"><a name="toc24"></a><h4>4.2.1. Power Spectrum</h4>
|
cannam@6
|
238
|
cannam@6
|
239 <p>The power spectrum calculated from the input frame. This output
|
cannam@6
|
240 returns a single feature per processing block, containing
|
cannam@6
|
241 blocksize/2+1 power values corresponding to the FFT bins from DC to
|
cannam@6
|
242 Nyquist inclusive. The DC bin is always returned.
|
cannam@6
|
243 </p>
|
cannam@6
|
244 </div><div class="oddcontent"><a name="spectralcentroid"></a><h2>5. Spectral Centroid</h2>
|
cannam@6
|
245
|
cannam@6
|
246 <p><b>System identifier</b> – <code>vamp-example-plugins:spectralcentroid</code><br>
|
cannam@6
|
247 <b>RDF URI</b> – <a href="http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#spectralcentroid">http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#spectralcentroid</a>
|
cannam@6
|
248 </p>
|
cannam@6
|
249 <p>Spectral Centroid calculates the "centre of gravity" of the frequency
|
cannam@6
|
250 spectrum for each input frame.
|
cannam@6
|
251 </p>
|
cannam@6
|
252 </div><div class="evencontent"><a name="toc26"></a><h3>5.1. Parameters</h3>
|
cannam@6
|
253
|
cannam@6
|
254 <p>None.
|
cannam@6
|
255 </p>
|
cannam@6
|
256 </div><div class="oddcontent"><a name="toc27"></a><h3>5.2. Outputs</h3>
|
cannam@6
|
257
|
cannam@6
|
258 </div><div class="evencontent"><a name="toc28"></a><h4>5.2.1. Log Frequency Centroid</h4>
|
cannam@6
|
259
|
cannam@6
|
260 <p>The centroid of the log-weighted frequency spectrum. That is, the sum
|
cannam@6
|
261 across Fourier transform output bins of the logarithm of the bin
|
cannam@6
|
262 frequency multiplied by the bin magnitude, divided by the sum of the
|
cannam@6
|
263 bin magnitudes, with the inverse logarithm of the result then taken to
|
cannam@6
|
264 give a frequency in Hz.
|
cannam@6
|
265 </p>
|
cannam@6
|
266 </div><div class="oddcontent"><a name="toc29"></a><h4>5.2.2. Linear Frequency Centroid</h4>
|
cannam@6
|
267
|
cannam@6
|
268 <p>The centroid of the linear-weighted frequency spectrum. That is, the
|
cannam@6
|
269 sum across Fourier transform output bins of the bin frequency
|
cannam@6
|
270 multiplied by the bin magnitude, divided by the sum of the bin
|
cannam@6
|
271 magnitudes. The result is a frequency in Hz.
|
cannam@6
|
272 </p>
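<p>Both centroids can be written out as a short Python sketch. The
function name and argument layout are assumptions for illustration, as
are the choices to use base-10 logarithms and to leave the DC bin out of
both sums (its frequency of 0Hz has no logarithm).
</p>

```python
import math


def spectral_centroids(magnitudes, sample_rate, block_size):
    """Return (log_centroid_hz, linear_centroid_hz) for one frame.

    `magnitudes` holds block_size/2+1 bin magnitudes from DC to Nyquist.
    Returns None when the frame carries no energy above DC.
    """
    total = sum(magnitudes[1:])
    if total <= 0.0:
        return None
    linear_sum = 0.0
    log_sum = 0.0
    for i in range(1, len(magnitudes)):
        freq = i * sample_rate / block_size  # centre frequency of bin i
        linear_sum += freq * magnitudes[i]
        log_sum += math.log10(freq) * magnitudes[i]
    return 10.0 ** (log_sum / total), linear_sum / total
```

<p>With equal magnitudes at 100Hz and 200Hz, the linear centroid is the
arithmetic mean (150Hz) while the log centroid is the geometric mean
(about 141.4Hz).
</p>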
|
cannam@6
|
273 </div><div class="evencontent"><a name="zerocrossing"></a><h2>6. Zero Crossings</h2>
|
cannam@6
|
274
|
cannam@6
|
275 <p><b>System identifier</b> – <code>vamp-example-plugins:zerocrossing</code><br>
|
cannam@6
|
276 <b>RDF URI</b> – <a href="http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#zerocrossing">http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#zerocrossing</a>
|
cannam@6
|
277 </p>
|
cannam@6
|
278 <p>Zero Crossings calculates the positions and density of "zero-crossing"
|
cannam@6
|
279 points in an audio waveform. For the purposes of this plugin, that
|
cannam@6
|
280 means those positions at which the sampled value switches from
|
cannam@6
|
281 zero-or-less to greater-than-zero, or vice versa.
|
cannam@6
|
282 </p>
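<p>That definition translates almost directly into code. The Python
sketch below is illustrative only; the function name and the
<code>previous</code> argument, which carries the last sample of the
preceding block, are assumptions of this example.
</p>

```python
def zero_crossings(samples, previous=0.0):
    """Return the indices at which the signal crosses zero.

    A crossing is a switch between "zero or less" and "greater than
    zero" relative to the sample before it.
    """
    positions = []
    for i, sample in enumerate(samples):
        if (sample > 0.0) != (previous > 0.0):
            positions.append(i)
        previous = sample
    return positions
```

<p>The length of the returned list gives the crossing count for the
block, and the indices give the crossing locations.
</p>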
|
cannam@6
|
283 </div><div class="oddcontent"><a name="toc31"></a><h3>6.1. Parameters</h3>
|
cannam@6
|
284
|
cannam@6
|
285 <p>None.
|
cannam@6
|
286 </p>
|
cannam@6
|
287 </div><div class="evencontent"><a name="toc32"></a><h3>6.2. Outputs</h3>
|
cannam@6
|
288
|
cannam@6
|
289 </div><div class="oddcontent"><a name="toc33"></a><h4>6.2.1. Zero Crossing Counts</h4>
|
cannam@6
|
290
|
cannam@6
|
291 <p>The number of zero-crossing points found in the current block of
|
cannam@6
|
292 samples, as a single-valued feature returned per processing block.
|
cannam@6
|
293 </p>
|
cannam@6
|
294 </div><div class="evencontent"><a name="toc34"></a><h4>6.2.2. Zero Crossings</h4>
|
cannam@6
|
295
|
cannam@6
|
296 <p>The locations of zero-crossing points, returning one feature
|
cannam@6
|
297 timestamped to the zero-crossing location, without values, for each
|
cannam@6
|
298 crossing point.
|
cannam@6
|
299 </p>
|
cannam@6
|
301
|
cannam@6
|
302 </div>
|
cannam@6
|
303 </body>
|
cannam@6
|
304 </html>
|