annotate README @ 60:400e09d69b8f

Add file: URI for file -- this is stripped out by the test script afterwards, I'm not sure how it got stripped from the source as well but obviously it's not parseable without it
author Chris Cannam
date Thu, 24 May 2012 11:35:54 +0100
parents 6c87f6443fe6
children 52e5e2c03792
rev   line source
Chris@0 1
Chris@0 2 Sonic Annotator
Chris@0 3 ===============
Chris@0 4
Chris@0 5 Sonic Annotator is a utility program for batch feature extraction from
Chris@0 6 audio files. It runs Vamp audio analysis plugins on audio files, and
Chris@0 7 can write the result features in a selection of formats.
Chris@0 8
Chris@2 9 For more information, see
Chris@2 10
Chris@2 11 http://www.omras2.org/SonicAnnotator
Chris@2 12
Chris@2 13 More documentation follows further down this README file, after the
Chris@2 14 credits.
Chris@2 15
Chris@2 16
Chris@2 17 Credits
Chris@2 18 -------
Chris@2 19
Chris@2 20 Sonic Annotator was developed at the Centre for Digital Music,
Chris@2 21 Queen Mary, University of London.
Chris@2 22
Chris@2 23 http://www.elec.qmul.ac.uk/digitalmusic/
Chris@2 24
Chris@2 25 The main program is by Mark Levy, Chris Cannam, and Chris Sutton.
Chris@2 26 Sonic Annotator incorporates library code from the Sonic Visualiser
Chris@2 27 application by Chris Cannam. Code copyright 2005-2007 Chris Cannam,
Chris@49 28 copyright 2006-2011 Queen Mary, University of London, except where
Chris@2 29 indicated in the individual source files.
Chris@2 30
Chris@2 31 This work was funded by the Engineering and Physical Sciences Research
Chris@2 32 Council through the OMRAS2 project EP/E017614/1.
Chris@2 33
Chris@2 34 Sonic Annotator is free software; you can redistribute it and/or
Chris@2 35 modify it under the terms of the GNU General Public License as
Chris@2 36 published by the Free Software Foundation; either version 2 of the
Chris@2 37 License, or (at your option) any later version. See the file COPYING
Chris@2 38 included with this distribution for more information.
Chris@2 39
Chris@2 40 Sonic Annotator may also make use of the following libraries:
Chris@2 41
Chris@2 42 * Qt4 -- Copyright Nokia Corporation, distributed under the GPL
Chris@2 43 * Ogg decoder -- Copyright CSIRO Australia, BSD license
Chris@2 44 * MAD mp3 decoder -- Copyright Underbit Technologies Inc, GPL
Chris@2 45 * libsamplerate -- Copyright Erik de Castro Lopo, GPL
Chris@2 46 * libsndfile -- Copyright Erik de Castro Lopo, LGPL
Chris@2 47 * FFTW3 -- Copyright Matteo Frigo and MIT, GPL
Chris@2 48 * Vamp plugin SDK -- Copyright Chris Cannam, BSD license
Chris@2 49 * Redland RDF libraries -- Copyright Dave Beckett and the University of Bristol, LGPL/Apache license
Chris@2 50
Chris@2 51 (Some distributions of Sonic Annotator may have one or more of these
Chris@2 52 libraries statically linked.) Many thanks to their authors.
Chris@2 53
Chris@2 54 Sonic Annotator can also use QuickTime for audio file import on OS X.
Chris@2 55 For licensing reasons, you may not distribute binaries of Sonic
Chris@2 56 Annotator with QuickTime support included for any platform that does
Chris@2 57 not include QuickTime as part of the platform itself (see section 3 of
Chris@2 58 version 2 of the GNU General Public License).
Chris@2 59
Chris@2 60
Chris@2 61 Compiling Sonic Annotator
Chris@2 62 --------------------------
Chris@2 63
Chris@2 64 If you are planning to compile Sonic Annotator from source code,
Chris@2 65 please read the file INSTALL.
Chris@2 66
Chris@0 67
Chris@0 68 A Quick Tutorial
Chris@2 69 ================
Chris@0 70
Chris@0 71 To use Sonic Annotator, you need to tell it three things: what audio
Chris@0 72 files to extract features from; what features to extract; and how and
Chris@0 73 where to write the results. You can also optionally tell it to
Chris@0 74 summarise the features.
Chris@0 75
Chris@0 76
Chris@0 77 1. What audio files to extract features from
Chris@0 78
Chris@0 79 Sonic Annotator accepts a list of audio files on the command line.
Chris@0 80 Any argument that is not understood as a supported command-line option
Chris@0 81 will be taken to be the name of an audio file. Any number of files
Chris@0 82 may be listed.
Chris@0 83
Chris@0 84 Several common audio file formats are supported, including MP3, Ogg,
Chris@0 85 and a number of PCM formats such as WAV and AIFF. AAC is supported on
Chris@0 86 OS X only, and only if not DRM protected. WMA is not supported.
Chris@0 87
Chris@0 88 File paths do not have to be local; you can also provide remote HTTP
Chris@0 89 or FTP URLs for Sonic Annotator to retrieve.
Chris@0 90
Chris@0 91 Sonic Annotator also accepts the names of playlist files (.m3u
Chris@0 92 extension) and will process every file found in the playlist.
Chris@0 93
Chris@0 94 Finally, you can provide a local directory path instead of a file,
Chris@0 95 together with the -r (recursive) option, for Sonic Annotator to
Chris@0 96 process every audio file found in that directory or any of its
Chris@0 97 subdirectories.
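
As a minimal sketch of the playlist and directory routes described
above, the following writes a small playlist and prints (rather than
runs) matching command lines. The file names, the /music directory and
the temporary directory are hypothetical, and the playlist layout of
one path per line is an assumption not stated in this README. The
transform name and options are those used in the examples later in
this file.

```shell
#!/bin/sh
# Sketch: hand Sonic Annotator a batch of files via a playlist or a
# directory. All paths here are hypothetical.
workdir=$(mktemp -d)
playlist="$workdir/batch.m3u"

# Assumption: a playlist is one audio file path per line.
printf '%s\n' "/music/one.wav" "/music/two.mp3" > "$playlist"

transform="vamp:vamp-example-plugins:fixedtempo:tempo"

# Print the commands instead of running them, so the sketch also works
# where sonic-annotator is not installed.
cmd_playlist="sonic-annotator -d $transform -w csv --csv-stdout $playlist"
cmd_directory="sonic-annotator -d $transform -w csv --csv-stdout -r /music"
echo "$cmd_playlist"
echo "$cmd_directory"
```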
Chris@0 98
Chris@0 99
Chris@0 100 2. What features to extract
Chris@0 101
Chris@0 102 Sonic Annotator applies "transforms" to its input audio files, where a
Chris@0 103 transform (in this terminology) consists of a Vamp plugin together
Chris@0 104 with a certain set of parameters and a specified execution context:
Chris@0 105 step and block size, sample rate, etc.
Chris@0 106
Chris@0 107 (See http://www.vamp-plugins.org/ for more information about Vamp
Chris@0 108 plugins.)
Chris@0 109
Chris@0 110 To use a particular transform, specify its filename on the command
Chris@0 111 line with the -t option.
Chris@0 112
Chris@0 113 Transforms are usually described in RDF, following the transform part
Chris@0 114 of the Vamp plugin ontology (http://purl.org/ontology/vamp/). A
Chris@0 115 transform may use any Vamp plugin that is currently installed and
Chris@0 116 available on the system. You can obtain a list of available plugin
Chris@0 117 outputs by running Sonic Annotator with the -l option, and you can
Chris@0 118 obtain a skeleton transform description for one of these plugins with
Chris@0 119 the -s option.
Chris@0 120
Chris@0 121 For example, if the example plugins from the Vamp plugin SDK are
Chris@0 122 available and no other plugins are installed, you might have an
Chris@0 123 exchange like this:
Chris@0 124
Chris@0 125 $ sonic-annotator -l
Chris@0 126 vamp:vamp-example-plugins:amplitudefollower:amplitude
Chris@0 127 vamp:vamp-example-plugins:fixedtempo:acf
Chris@0 128 vamp:vamp-example-plugins:fixedtempo:detectionfunction
Chris@0 129 vamp:vamp-example-plugins:fixedtempo:filtered_acf
Chris@0 130 vamp:vamp-example-plugins:fixedtempo:tempo
Chris@0 131 vamp:vamp-example-plugins:fixedtempo:candidates
Chris@0 132 vamp:vamp-example-plugins:percussiononsets:detectionfunction
Chris@0 133 vamp:vamp-example-plugins:percussiononsets:onsets
Chris@0 134 vamp:vamp-example-plugins:powerspectrum:powerspectrum
Chris@0 135 vamp:vamp-example-plugins:spectralcentroid:linearcentroid
Chris@0 136 vamp:vamp-example-plugins:spectralcentroid:logcentroid
Chris@0 137 vamp:vamp-example-plugins:zerocrossing:counts
Chris@0 138 vamp:vamp-example-plugins:zerocrossing:zerocrossings
Chris@0 139 $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo
Chris@0 140 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
Chris@0 141 @prefix vamp: <http://purl.org/ontology/vamp/> .
Chris@0 142 @prefix : <#> .
Chris@0 143
Chris@0 144 :transform a vamp:Transform ;
Chris@0 145 vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ;
Chris@0 146 vamp:step_size "64"^^xsd:int ;
Chris@0 147 vamp:block_size "256"^^xsd:int ;
Chris@0 148 vamp:parameter_binding [
Chris@0 149 vamp:parameter [ vamp:identifier "maxbpm" ] ;
Chris@0 150 vamp:value "190"^^xsd:float ;
Chris@0 151 ] ;
Chris@0 152 vamp:parameter_binding [
Chris@0 153 vamp:parameter [ vamp:identifier "maxdflen" ] ;
Chris@0 154 vamp:value "10"^^xsd:float ;
Chris@0 155 ] ;
Chris@0 156 vamp:parameter_binding [
Chris@0 157 vamp:parameter [ vamp:identifier "minbpm" ] ;
Chris@0 158 vamp:value "50"^^xsd:float ;
Chris@0 159 ] ;
Chris@0 160 vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> .
Chris@0 161 $
Chris@0 162
Chris@0 163 The output of -s is an RDF/Turtle document describing the default
Chris@0 164 settings for the Tempo output of the Fixed Tempo Estimator plugin in
Chris@0 165 the Vamp plugin SDK.
Chris@0 166
Chris@0 167 (The exact format of the RDF printed may differ -- e.g. if the
Chris@0 168 plugin's RDF description is not installed and so its "home" URI is not
Chris@0 169 known -- but the result should be functionally equivalent to this.)
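
The identifiers printed by -l above appear to consist of four
colon-separated parts. Reading them as type, plugin library, plugin
and output is an assumption based on the listing, not something this
README states. Under that assumption, a short sketch that counts the
outputs offered by each plugin, using a saved copy of the listing
rather than a live run:

```shell
#!/bin/sh
# Sketch: count outputs per plugin from the "sonic-annotator -l"
# listing shown above (copied here so no plugins need be installed).
listing='vamp:vamp-example-plugins:amplitudefollower:amplitude
vamp:vamp-example-plugins:fixedtempo:acf
vamp:vamp-example-plugins:fixedtempo:detectionfunction
vamp:vamp-example-plugins:fixedtempo:filtered_acf
vamp:vamp-example-plugins:fixedtempo:tempo
vamp:vamp-example-plugins:fixedtempo:candidates
vamp:vamp-example-plugins:percussiononsets:detectionfunction
vamp:vamp-example-plugins:percussiononsets:onsets
vamp:vamp-example-plugins:powerspectrum:powerspectrum
vamp:vamp-example-plugins:spectralcentroid:linearcentroid
vamp:vamp-example-plugins:spectralcentroid:logcentroid
vamp:vamp-example-plugins:zerocrossing:counts
vamp:vamp-example-plugins:zerocrossing:zerocrossings'

# Assumption: the third colon-separated field names the plugin.
counts=$(printf '%s\n' "$listing" |
    awk -F: '{ n[$3]++ } END { for (p in n) print p, n[p] }' |
    sort)
printf '%s\n' "$counts"
```

With a real installation, the listing variable could instead be filled
from the output of "sonic-annotator -l".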
Chris@0 170
Chris@0 171 You could run this transform by saving the RDF to a file and
Chris@0 172 specifying that file with -t:
Chris@0 173
Chris@0 174 $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo > test.n3
Chris@0 175 $ sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout
Chris@0 176 (... logging output on stderr, then ...)
Chris@0 177 "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
Chris@0 178 $
Chris@0 179
Chris@0 180 The single line of output above consists of the audio file name, the
Chris@0 181 timestamp and duration for a single feature, the value of that feature
Chris@0 182 (the estimated tempo of the given region of time from that file, in
Chris@0 183 bpm -- the plugin in question performs a single tempo estimation and
Chris@0 184 nothing else) and the feature's label.
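
As a rough illustration of that column layout, the sketch below picks
the fields out of the line shown above with awk. Splitting on every
comma is a simplification that is only safe here because neither
quoted field contains a comma. The variable names are illustrative.

```shell
#!/bin/sh
# Sketch: read the fields of one line of --csv-stdout output.
# The line is the one from the example above.
line='"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"'

# Columns: file name, timestamp, duration, value, label.
tempo=$(printf '%s\n' "$line" | awk -F, '{ print $4 }')

# End time of the feature: timestamp plus duration, to milliseconds.
end=$(printf '%s\n' "$line" | awk -F, '{ printf "%.3f", $2 + $3 }')

echo "tempo estimate: $tempo bpm, region ends at $end s"
```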
Chris@0 185
Chris@0 186 A quicker way to achieve the above is to use the -d (default) option,
Chris@0 187 which tells Sonic Annotator to use the default configuration for a
Chris@0 188 named transform directly:
Chris@0 189
Chris@0 190 $ sonic-annotator -d vamp:vamp-example-plugins:fixedtempo:tempo audio.wav -w csv --csv-stdout
Chris@0 191 (... some log output on stderr, then ...)
Chris@0 192 "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
Chris@0 193 $
Chris@0 194
Chris@0 195 Although handy for experimentation, the -d option is inadvisable in
Chris@0 196 any "production" situation because the plugin configuration is not
Chris@0 197 guaranteed to be the same each time (for example if an updated version
Chris@0 198 of a plugin changes some of its defaults). It's better to save a
Chris@0 199 well-defined transform to file and refer to that, even if it is simply
Chris@0 200 the transform created by the skeleton option.
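
One possible way to keep such a well-defined transform is sketched
below: store the skeleton in a file and change a single parameter with
sed. The skeleton text is copied from the -s example above. The
replacement value of 160 and the file names are hypothetical, and
whether 160 suits your material is for you to judge. The final command
is printed rather than run.

```shell
#!/bin/sh
# Sketch: pin a transform to a file and adjust one parameter.
workdir=$(mktemp -d)

# The skeleton as printed by -s in the example above.
cat > "$workdir/skeleton.n3" <<'EOF'
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix vamp: <http://purl.org/ontology/vamp/> .
@prefix : <#> .

:transform a vamp:Transform ;
    vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ;
    vamp:step_size "64"^^xsd:int ;
    vamp:block_size "256"^^xsd:int ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "maxbpm" ] ;
        vamp:value "190"^^xsd:float ;
    ] ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "maxdflen" ] ;
        vamp:value "10"^^xsd:float ;
    ] ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "minbpm" ] ;
        vamp:value "50"^^xsd:float ;
    ] ;
    vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> .
EOF

# On the line following the "maxbpm" identifier, replace the default
# value. 160 is an arbitrary, hypothetical choice.
sed -e '/"maxbpm"/{' -e 'n' -e 's/"190"/"160"/' -e '}' \
    "$workdir/skeleton.n3" > "$workdir/tempo-160.n3"

echo "sonic-annotator -t $workdir/tempo-160.n3 audio.wav -w csv --csv-stdout"
```

The sed expression relies on the value appearing on the line directly
after the identifier, as it does in the skeleton above. If the RDF is
laid out differently, edit the file by hand instead.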
Chris@0 201
Chris@0 202 To run more than one transform on the same audio files, just put more
Chris@0 203 than one set of transform RDF descriptions in the same file, or give
Chris@0 204 the -t option more than once with separate transform description
Chris@0 205 files. Remember that if you want to specify more than one transform
Chris@0 206 in the same file, they will need to have distinct URIs (that is, the
Chris@0 207 ":transform" part of the example above, which may be any arbitrary
Chris@0 208 name, must be distinct for each described transform).
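
The sketch below writes two transforms with distinct names into one
file and then looks for repeated names. The names :tempo and :onsets
and the file location are hypothetical. The check is a line-based
heuristic that assumes each transform starts on its own line in the
form "name a vamp:Transform". It does not parse RDF.

```shell
#!/bin/sh
# Sketch: two transforms in one file, plus a crude duplicate-name check.
workdir=$(mktemp -d)
transforms="$workdir/two-transforms.n3"

cat > "$transforms" <<'EOF'
@prefix vamp: <http://purl.org/ontology/vamp/> .
@prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#> .
@prefix : <#> .

:tempo a vamp:Transform ;
    vamp:plugin examples:fixedtempo ;
    vamp:output examples:fixedtempo_output_tempo .

:onsets a vamp:Transform ;
    vamp:plugin examples:percussiononsets ;
    vamp:output examples:percussiononsets_output_onsets .
EOF

# Collect the subject of every "NAME a vamp:Transform" line.
names=$(awk '$2 == "a" && $3 ~ /^vamp:Transform/ { print $1 }' "$transforms")

# Any name printed by uniq -d occurs more than once.
dupes=$(printf '%s\n' "$names" | sort | uniq -d)

if [ -n "$dupes" ]; then
    echo "duplicate transform names: $dupes"
else
    echo "transform names are distinct"
fi
```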
Chris@0 209
Chris@0 210
Chris@0 211 3. How and where to write the results
Chris@0 212
Chris@0 213 Sonic Annotator supports several output modules (and it is
Chris@0 214 fairly easy for the developer to add new ones). You have to choose at
Chris@0 215 least one output module; use the -w (writer) option to do so. Each
Chris@0 216 module has its own set of parameters which can be adjusted on the
Chris@0 217 command line, as well as its own default rules about where to write
Chris@0 218 the results.
Chris@0 219
Chris@0 220 The following writers are currently supported. (Others exist, but are
Chris@0 221 not properly implemented or not supported.)
Chris@0 222
Chris@0 223 * csv
Chris@0 224
Chris@0 225 Writes the results into comma-separated data files.
Chris@0 226
Chris@0 227 One file is created for each transform applied to each input audio
Chris@0 228 file, named after the input audio file and transform name with .csv
Chris@0 229 suffix and ":" replaced by "_" throughout, placed in the same
Chris@0 230 directory as the audio file.
Chris@0 231
Chris@0 232 To instruct Sonic Annotator to place the output files in another
Chris@0 233 location, use --csv-basedir with a directory name.
Chris@0 234
Chris@0 235 To write a single file with all data in it, use --csv-one-file.
Chris@0 236
Chris@0 237 To write all data to stdout instead of to a file, use --csv-stdout.
Chris@0 238
Chris@0 239 Sonic Annotator will not write to an output file that already
Chris@0 240 exists. To write to it anyway, use --csv-force to overwrite it
Chris@0 241 or --csv-append to append to it.
Chris@0 242
Chris@0 243 The data generated consists of one line for each result feature,
Chris@0 244 containing the feature timestamp, feature duration if present, all
Chris@0 245 of the feature's bin values in order, followed by the feature's
Chris@0 246 label if present. If the --csv-one-file or --csv-stdout option is
Chris@0 247 specified, then an additional column will appear before any of the
Chris@0 248 above, containing the audio file name from which the feature was
Chris@0 249 extracted, if it differs from that of the previous row.
Chris@0 250
Chris@0 251 The default column separator is a comma; you can specify a
Chris@0 252 different one with the --csv-separator option.
Chris@0 253
Chris@0 254 * rdf
Chris@0 255
Chris@0 256 Writes the results into RDF/Turtle documents following the Audio
Chris@0 257 Features ontology (http://purl.org/ontology/af/).
Chris@0 258
Chris@0 259 One file is created for each input audio file containing the
Chris@0 260 features extracted by all transforms applied to that file, named
Chris@0 261 after the input audio file with .n3 extension, placed in the same
Chris@0 262 directory as the audio file.
Chris@0 263
Chris@0 264 To instruct Sonic Annotator to place the output files in another
Chris@0 265 location, use --rdf-basedir with a directory name.
Chris@0 266
Chris@0 267 To write a single file with all data (from all input audio files)
Chris@0 268 in it, use --rdf-one-file.
Chris@0 269
Chris@0 270 To write one file for each transform applied to each input audio
Chris@0 271 file, named after the input audio file and transform name with .n3
Chris@0 272 suffix and ":" replaced by "_" throughout, use --rdf-many-files.
Chris@0 273
Chris@0 274 To write all data to stdout instead of to a file, use --rdf-stdout.
Chris@0 275
Chris@0 276 Sonic Annotator will not write to an output file that already
Chris@0 277 exists. To write to it anyway, use --rdf-force to overwrite it
Chris@0 278 or --rdf-append to append to it.
Chris@0 279
Chris@0 280 Sonic Annotator will use plugin description RDF if available to
Chris@0 281 enhance its output (for example identifying note onset times as
Chris@0 282 note onset times, if the plugin's RDF says that is what it
Chris@0 283 produces, rather than writing them as plain events). Best results
Chris@0 284 will be obtained if an RDF document is provided with your plugins
Chris@0 285 (for example, vamp-example-plugins.n3) and you have this installed
Chris@0 286 in the same location as the plugins. To override this enhanced
Chris@0 287 output and write plain events for all features, use --rdf-plain.
Chris@0 288
Chris@0 289 The output RDF will include an available_as property linking the
Chris@0 290 results to the original audio signal URI. By default, this will
Chris@0 291 point to the URI of the file or resource containing the audio that
Chris@0 292 Sonic Annotator processed, such as the file:/// location on disk.
Chris@0 293 To override this, for example to process a local copy of a file
Chris@0 294 while generating RDF that describes a copy of it available on a
Chris@0 295 network, you can use the --rdf-signal-uri option to specify an
Chris@0 296 alternative signal URI.
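
For the csv writer, the file name column written with --csv-one-file
or --csv-stdout is filled in only when the name changes, so later
processing may want the name repeated on every row. A minimal sketch
with awk follows. The sample rows are invented for illustration and
only follow the column order described above. They are not real
output.

```shell
#!/bin/sh
# Sketch: repeat the audio file name on every row of one-file CSV
# output. The rows below are hypothetical.
sample='"a.wav",0.0,1.0,0.5
,1.0,1.0,0.7
"b.wav",0.0,1.0,0.2
,1.0,1.0,0.9'

# Remember the most recent non-empty first column and write it back
# into rows where that column is empty.
filled=$(printf '%s\n' "$sample" |
    awk -F, 'BEGIN { OFS = "," } $1 != "" { name = $1 } { $1 = name; print }')
printf '%s\n' "$filled"
```

Because the rows are rejoined with the same separator they were split
on, a label that itself contains commas passes through unchanged.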
Chris@0 297
Chris@0 298
Chris@0 299 4. Optionally, how to summarise the features
Chris@0 300
Chris@0 301 Sonic Annotator can also calculate and write summaries of features,
Chris@0 302 such as mean and median values.
Chris@0 303
Chris@0 304 To obtain a summary as well as the feature results, just use the -S
Chris@0 305 option, naming the type of summary you want (min, max, mean, median,
Chris@0 306 mode, sum, variance, sd or count). You can also tell it to produce
Chris@0 307 only the summary, not the individual features, with --summary-only.
Chris@0 308
Chris@0 309 Alternatively, you can specify a summary in a transform description.
Chris@0 310 The following example tells Sonic Annotator to write both the times of
Chris@0 311 note onsets estimated by the simple percussion onset detector example
Chris@0 312 plugin, and the variance of the plugin's onset detection function.
Chris@0 313 (It will only process the audio file and run the plugin once.)
Chris@0 314
Chris@0 315 @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
Chris@0 316 @prefix vamp: <http://purl.org/ontology/vamp/>.
Chris@0 317 @prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#>.
Chris@0 318 @prefix : <#>.
Chris@0 319
Chris@0 320 :transform1 a vamp:Transform;
Chris@0 321 vamp:plugin examples:percussiononsets ;
Chris@0 322 vamp:output examples:percussiononsets_output_onsets .
Chris@0 323
Chris@0 324 :transform0 a vamp:Transform;
Chris@0 325 vamp:plugin examples:percussiononsets ;
Chris@0 326 vamp:output examples:percussiononsets_output_detectionfunction ;
Chris@0 327 vamp:summary_type "variance" .
Chris@0 328
Chris@0 329 Sonic Annotator can also summarise in segments -- if you provide a
Chris@0 330 comma-separated list of times as an argument to the --segments option,
Chris@0 331 it will calculate one summary for each segment bounded by the times
Chris@0 332 you provided. For example,
Chris@0 333
Chris@0 334 $ sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav
Chris@0 335 (... some log output on stderr, then ...)
Chris@0 336 ,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)"
Chris@0 337 ,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)"
Chris@0 338 ,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)"
Chris@0 339 ,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)"
Chris@0 340
Chris@0 341 Here the first row contains a summary covering the time period from 0
Chris@0 342 to 1 second, the second from 1 to 2 seconds, the third from 2 to 3
Chris@0 343 seconds and the fourth from 3 seconds to the end of the (short) audio
Chris@0 344 file.
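
For evenly spaced segments, the boundary list can be generated rather
than typed. The sketch below builds the list for whole-second steps
and prints the resulting command without running it. The step and
last values are hypothetical. Going by the example above, a list of N
boundaries yields N+1 summary rows.

```shell
#!/bin/sh
# Sketch: build a --segments argument with evenly spaced boundaries.
step=1   # segment length in seconds (hypothetical, whole numbers only)
last=3   # final boundary in seconds (hypothetical)

segments=$(awk -v step="$step" -v last="$last" 'BEGIN {
    sep = ""
    for (t = step; t <= last; t += step) {
        printf "%s%d", sep, t
        sep = ","
    }
    print ""
}')

echo "sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments $segments -w csv --csv-stdout audio.wav"
```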
Chris@0 345