annotate examples/vamp-example-plugins.txt @ 255:88ef5ffdbe8d

* docs
author cannam
date Wed, 12 Nov 2008 14:11:01 +0000

Vamp Example Plugins
====================

The vamp-example-plugins library contains a number of Vamp audio
analysis plugins provided as part of the Vamp plugin SDK.

These are simple, but sometimes useful, plugins whose source code you
are free to study and reuse in any proprietary or non-proprietary
plugins of your own without any licensing obligation.

User documentation for the individual plugins in this library follows.


Amplitude Follower
==================

System identifier: vamp-example-plugins:amplitudefollower
RDF URI: http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#amplitudefollower

Amplitude Follower tracks and returns the amplitude of the audio
signal, block by block. It uses a method from the SuperCollider audio
processing language, implemented as a Vamp plugin by Dan Stowell.

Parameters
----------

Attack time (seconds)
Release time (seconds)

Outputs
-------

Amplitude
~~~~~~~~~

The estimated peak amplitude for the current processing block.

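As a rough illustration of how attack and release times can shape an
amplitude estimate, here is a minimal Python sketch of a one-pole
envelope follower. It is not taken from the plugin source: the
coefficient formula (the envelope covers 90% of a step change within
the given time), the function name follow_amplitude and the default
times are all assumptions made for this example.

```python
import math


def follow_amplitude(samples, sample_rate, attack=0.01, release=0.01, prev=0.0):
    """Track amplitude with separate attack and release smoothing.

    Returns (peak_for_block, final_envelope_state). Pass the returned
    state back in as `prev` when processing the next block.
    """
    def coeff(time):
        # Assumed formula: the envelope closes 90% of a step in `time` seconds.
        if time <= 0:
            return 0.0
        return math.exp(math.log(0.1) / (time * sample_rate))

    attack_coef = coeff(attack)
    release_coef = coeff(release)
    peak = 0.0
    for x in samples:
        val = abs(x)
        # Rising input is smoothed by the attack coefficient,
        # falling input by the release coefficient.
        c = attack_coef if val > prev else release_coef
        prev = val + (prev - val) * c
        peak = max(peak, prev)
    return peak, prev
```

Carrying the envelope state from one block to the next keeps the
estimate continuous across block boundaries.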

Simple Fixed Tempo Estimator
============================

System identifier: vamp-example-plugins:fixedtempo
RDF URI: http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo

Simple Fixed Tempo Estimator analyses a fragment of audio and
estimates its tempo. It assumes that its input is of fixed tempo, and
it analyses only the first few seconds of input (a small but
configurable number) before returning a result, discarding all
subsequent input.

The plugin calculates an overall energy rise function across a series
of short frequency-domain input frames, takes the autocorrelation of
this function, filters it to stress possible metrical patterns,
locates peaks, and converts from autocorrelation lag to the
corresponding tempo.

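The autocorrelation and lag-to-tempo steps can be sketched roughly as
follows. This is a simplified illustration rather than the plugin's
own code: it leaves out the metrical filtering and perceptual
weighting, and the function names and the 50 to 190 bpm default range
are assumptions made for the example.

```python
def autocorrelation(df):
    """Unnormalised autocorrelation of a detection function, lags 0..n/2-1."""
    n = len(df)
    return [sum(df[i] * df[i + lag] for i in range(n - lag))
            for lag in range(n // 2)]


def lag_to_tempo(lag, sample_rate, step_size):
    """Convert a lag measured in detection-function frames to beats per minute."""
    return 60.0 * sample_rate / (lag * step_size)


def estimate_tempo(df, sample_rate, step_size, min_bpm=50.0, max_bpm=190.0):
    """Return the tempo of the strongest autocorrelation lag in range, or None."""
    acf = autocorrelation(df)
    # Only consider lags whose tempo lies within [min_bpm, max_bpm].
    candidates = [lag for lag in range(1, len(acf))
                  if min_bpm <= lag_to_tempo(lag, sample_rate, step_size) <= max_bpm]
    if not candidates:
        return None
    best = max(candidates, key=lambda lag: acf[lag])
    return lag_to_tempo(best, sample_rate, step_size)
```

Here step_size is the number of audio samples between successive
detection-function frames, so a lag of 50 frames at 100 frames per
second corresponds to 120 bpm.
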
The filtering process involves searching for peaks at simple
metrically related intervals (at a given autocorrelation lag as well
as at 0.5, 2, and 4 times that lag), boosting each peak that shows
strong related peaks. A simplistic perceptual curve is also applied
in order to increase the probability of detecting a "likely" tempo.
For improved tempo precision, each tempo with strong related peaks is
averaged with the tempi calculated from those peaks.

The method is mainly tuned for 4/4 pop and dance rhythms.

This plugin returns many of its intermediate calculations as
additional outputs, as well as the most favoured tempo. Although as a
tempo estimator it's still fairly primitive, it is intended to provide
a useful example of a slightly more complex feature extraction plugin
than the other examples, as well as one that returns several different
types of output at a time.

Parameters
----------

Minimum estimated tempo, Maximum estimated tempo (bpm) - These
parameters control the range of values within which the tempo
estimator will return its estimate.

Input duration to study (seconds) - The tempo estimator uses only the
first part of its input, discarding any that follows. This parameter
controls how much input it will use. There is little value in
increasing this beyond 8 times the duration of the slowest beat the
estimator can return. The default of 10 seconds is likely to be
appropriate for most purposes.

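The guidance on input duration can be checked with a little
arithmetic. The sketch below assumes that the slowest beat is the one
at the minimum estimated tempo; the function name and the 50 bpm
figure used in the note afterwards are hypothetical, chosen only for
illustration.

```python
def max_useful_duration(min_bpm, beats=8):
    """Seconds of input spanning `beats` beats at the slowest tempo considered."""
    slowest_beat = 60.0 / min_bpm  # duration of one beat, in seconds
    return beats * slowest_beat
```

With a minimum tempo of 50 bpm, for example, this gives 9.6 seconds,
which is just under the 10-second default.
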
Outputs
-------

Tempo
~~~~~

The tempo estimator's best guess at the tempo of its input, in beats
per minute.

This is returned as a feature whose timestamp and duration cover the
range of the input which was used in estimating the tempo, with a
single value containing the tempo.

Tempo candidates
~~~~~~~~~~~~~~~~

Several guesses at the possible tempo. This output is returned as a
single feature whose timestamp and duration cover the range of the
input which was used in estimating the tempo, with up to 10 bins
containing one tempo value in each bin, with the "best guess" tempo in
bin 0.

Detection function
~~~~~~~~~~~~~~~~~~

The basic onset detection function used in tempo estimation.

Autocorrelation function
~~~~~~~~~~~~~~~~~~~~~~~~

The autocorrelation of the onset detection function.

Filtered Autocorrelation
~~~~~~~~~~~~~~~~~~~~~~~~

The autocorrelation after filtering to boost values with possible
metrically related peaks and to apply perceptual weighting. The peak
value of this function is the one that will be used as the "best
guess".


Simple Percussion Onset Detector
================================

System identifier: vamp-example-plugins:percussiononsets
RDF URI: http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#percussiononsets

Simple Percussion Onset Detector estimates the locations of percussive
onsets in the audio signal. It uses a method described in "Drum
Source Separation using Percussive Feature Detection and Spectral
Modulation" by Dan Barry, Derry Fitzgerald, Eugene Coyle and Bob
Lawlor, ISSC 2005.

The principle is to exploit the broadband nature of noisy percussive
onsets by identifying only those frames in which the energy rise shows
a broadband profile.

The plugin takes a series of frequency domain frames, and examines
each frame to count the number of bins whose energy content has
increased by more than a certain threshold since the prior frame.
Frames in which this number is at a peak relative to prior and
following frames and also exceeds another threshold value are
classified as percussive onsets.

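The counting and peak-picking steps described above might look
something like the following Python sketch. It is an illustration
under stated assumptions, not the plugin's implementation: each frame
is taken to be a list of per-bin energies, the dB rise is treated as
an energy ratio (10 * log10), the sensitivity is read as the
proportion of bins that must rise, and the function names and default
values are invented for the example.

```python
def detection_function(frames, threshold_db=3.0):
    """Count, per frame, the bins whose energy rose by more than threshold_db."""
    ratio = 10.0 ** (threshold_db / 10.0)  # dB rise as an energy ratio
    counts = []
    prev = None
    for energies in frames:
        if prev is None:
            counts.append(0)  # no earlier frame to compare against
        else:
            counts.append(sum(1 for e, p in zip(energies, prev) if e > p * ratio))
        prev = energies
    return counts


def pick_onsets(counts, n_bins, sensitivity=40.0):
    """Return indices of frames whose count is a local peak above the minimum."""
    min_count = n_bins * sensitivity / 100.0
    onsets = []
    for i in range(1, len(counts) - 1):
        is_peak = counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]
        if is_peak and counts[i] > min_count:
            onsets.append(i)
    return onsets
```

A frame index can be converted to a time by multiplying it by the
step size and dividing by the sample rate.
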
Parameters
----------

Energy rise threshold (dB) - The rise in energy within a bin from one
frame to the next that is required for a bin to be counted toward the
detection function's bin count. This roughly corresponds to how
"loud" a percussive sound must be in order to be detected.

Sensitivity (%) - The proportion of bins that must exceed the energy
rise threshold in order for an onset to be detected (at frames in
which the detection function peaks). This roughly corresponds to how
"noisy" a percussive sound must be in order to be detected.

Outputs
-------

Onsets
~~~~~~

The estimated onset locations.

Detection Function
~~~~~~~~~~~~~~~~~~

The energy rise detection function whose peaks were used to estimate
onset locations.


Simple Power Spectrum
=====================

System identifier: vamp-example-plugins:powerspectrum
RDF URI: http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#powerspectrum

Simple Power Spectrum returns a power spectrum calculated from
windowed short-time Fourier transforms of the input audio. (The power
spectrum for a frame consists of a sequence of the squares of the
magnitudes of the complex values for each frequency bin in the result
of the Fourier transform.)

This very simple plugin is an illustration of the fact that if a
plugin requests frequency-domain input, its input will already be in
the form needed for a spectrum such as this. The plugin has no work
left to do except to calculate the squared magnitude from the
cartesian complex representation.

This plugin also illustrates how to return "grid-type" visualisation
data from a Vamp plugin.

Parameters
----------

None.

Outputs
-------

Power Spectrum
~~~~~~~~~~~~~~

The power spectrum calculated from the input frame. This output
returns a single feature per processing block, containing
blocksize/2+1 power values corresponding to the FFT bins from DC to
Nyquist inclusive. The DC bin is always returned.

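The squared-magnitude calculation is small enough to show in full.
This Python sketch assumes the frequency-domain frame arrives as a
flat sequence of interleaved real and imaginary parts, one pair per
bin; that layout and the function name are assumptions made for the
example.

```python
def power_spectrum(interleaved, block_size):
    """Squared magnitude per bin from interleaved (re, im) floats, DC to Nyquist."""
    n_bins = block_size // 2 + 1
    if len(interleaved) < 2 * n_bins:
        raise ValueError("expected at least 2 * (block_size / 2 + 1) floats")
    # |re + j*im|^2 = re^2 + im^2, so no square root is needed.
    return [interleaved[2 * i] ** 2 + interleaved[2 * i + 1] ** 2
            for i in range(n_bins)]
```

For a block size of 4 the result has 3 values, one each for DC, the
single intermediate bin and Nyquist.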

Spectral Centroid
=================

System identifier: vamp-example-plugins:spectralcentroid
RDF URI: http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#spectralcentroid

Spectral Centroid calculates the "centre of gravity" of the frequency
spectrum for each input frame.

Parameters
----------

None.

Outputs
-------

Log Frequency Centroid
~~~~~~~~~~~~~~~~~~~~~~

The centroid of the log-weighted frequency spectrum. That is, the sum
across Fourier transform output bins of the logarithm of the bin
frequency multiplied by the bin magnitude, divided by the sum of the
bin magnitudes, and the inverse logarithm taken so as to give the
result as a frequency in Hz.

Linear Frequency Centroid
~~~~~~~~~~~~~~~~~~~~~~~~~

The centroid of the linear-weighted frequency spectrum. That is, the
sum across Fourier transform output bins of the bin frequency
multiplied by the bin magnitude, divided by the sum of the bin
magnitudes. The result is a frequency in Hz.

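Both centroid definitions can be written out in a few lines of
Python. This is a sketch of the formulas as described above, not the
plugin's code. It assumes base-10 logarithms, skips the DC bin in both
sums (the logarithm of a zero frequency is undefined) and returns None
for a silent frame; the function name is invented for the example.

```python
import math


def centroids(magnitudes, sample_rate, block_size):
    """Return (log_centroid_hz, linear_centroid_hz) for one frame of bin magnitudes."""
    num_lin = num_log = denom = 0.0
    # Bin 0 (DC) is skipped: log(0) is undefined.
    for i in range(1, len(magnitudes)):
        freq = i * sample_rate / block_size  # centre frequency of bin i, in Hz
        mag = magnitudes[i]
        num_lin += freq * mag
        num_log += math.log10(freq) * mag
        denom += mag
    if denom == 0.0:
        return None  # silent frame: the centroid is undefined
    return 10.0 ** (num_log / denom), num_lin / denom
```

With equal magnitude at 1 Hz and 4 Hz, the linear centroid is the
arithmetic mean (2.5 Hz) and the log centroid is the geometric mean
(2 Hz).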

Zero Crossings
==============

System identifier: vamp-example-plugins:zerocrossing
RDF URI: http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#zerocrossing

Zero Crossings calculates the positions and density of "zero-crossing"
points in an audio waveform. For the purposes of this plugin, that
means those positions at which the sampled value switches from
zero-or-less to greater-than-zero, or vice versa.

Parameters
----------

None.

Outputs
-------

Zero Crossing Counts
~~~~~~~~~~~~~~~~~~~~

The number of zero-crossing points found in the current block of
samples, as a single-valued feature returned per processing block.

Zero Crossings
~~~~~~~~~~~~~~

The locations of zero-crossing points. One feature is returned for
each crossing point, timestamped to the zero-crossing location and
carrying no values.

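The crossing rule described above (a switch between zero-or-less and
greater-than-zero) can be sketched in Python as follows. The function
name is invented for the example, as is the prev_positive argument,
which stands for the state carried over from the last sample of the
previous block; it defaults to False here.

```python
def zero_crossings(samples, prev_positive=False):
    """Return (count, positions), where positions are sample indices in the block."""
    positions = []
    for i, x in enumerate(samples):
        positive = x > 0.0
        # A crossing is any change between "zero or less" and "greater than zero".
        if positive != prev_positive:
            positions.append(i)
        prev_positive = positive
    return len(positions), positions
```

Dividing a position by the sample rate, and adding the time at which
the block starts, gives the timestamp for that crossing.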