annotate README.md @ 398:30c64a311d9c

Added tag sonic-annotator-1.6 for changeset a82030ec7b1f
author Chris Cannam
date Tue, 09 Jun 2020 17:33:22 +0100
parents e51776328158
children a3912193ce69
rev   line source
Chris@0 1
Chris@0 2 Sonic Annotator
Chris@0 3 ===============
Chris@0 4
Chris@0 5 Sonic Annotator is a utility program for batch feature extraction from
Chris@0 6 audio files. It runs Vamp audio analysis plugins on audio files, and
Chris@0 7 can write the result features in a selection of formats.
Chris@0 8
Chris@2 9 For more information, see
Chris@2 10
Chris@179 11 http://vamp-plugins.org/sonic-annotator
Chris@2 12
Chris@2 13 More documentation follows further down this README file, after the
Chris@2 14 credits.
Chris@2 15
Chris@2 16
Chris@309 17 ### Credits
Chris@2 18
Chris@2 19 Sonic Annotator was developed at the Centre for Digital Music,
Chris@2 20 Queen Mary, University of London.
Chris@2 21
Chris@87 22 http://c4dm.eecs.qmul.ac.uk/
Chris@2 23
Chris@2 24 The main program is by Mark Levy, Chris Cannam, and Chris Sutton.
Chris@2 25 Sonic Annotator incorporates library code from the Sonic Visualiser
Chris@2 26 application by Chris Cannam. Code copyright 2005-2007 Chris Cannam,
Chris@396 27 copyright 2006-2020 Queen Mary, University of London, except where
Chris@2 28 indicated in the individual source files.
Chris@2 29
Chris@2 30 This work was funded by the Engineering and Physical Sciences Research
Chris@2 31 Council through the OMRAS2 project EP/E017614/1.
Chris@2 32
Chris@2 33 Sonic Annotator is free software; you can redistribute it and/or
Chris@2 34 modify it under the terms of the GNU General Public License as
Chris@2 35 published by the Free Software Foundation; either version 2 of the
Chris@2 36 License, or (at your option) any later version. See the file COPYING
Chris@2 37 included with this distribution for more information.
Chris@2 38
Chris@2 39 Sonic Annotator may also make use of the following libraries:
Chris@2 40
Chris@87 41 * Qt5 -- Copyright Digia Oyj, distributed under the LGPL
Chris@2 42 * Ogg decoder -- Copyright CSIRO Australia, BSD license
Chris@2 43 * MAD mp3 decoder -- Copyright Underbit Technologies Inc, GPL
Chris@2 44 * libsamplerate -- Copyright Erik de Castro Lopo, GPL
Chris@2 45 * libsndfile -- Copyright Erik de Castro Lopo, LGPL
Chris@2 46 * FFTW3 -- Copyright Matteo Frigo and MIT, GPL
Chris@87 47 * Vamp plugin SDK -- Copyright Chris Cannam and QMUL, BSD license
Chris@87 48 * Dataquay -- Copyright Breakfast Quay, BSD license
Chris@87 49 * Sord and Serd -- Copyright David Robillard, BSD license
Chris@2 50
Chris@2 51 (Some distributions of Sonic Annotator may have one or more of these
Chris@2 52 libraries statically linked.) Many thanks to their authors.
Chris@2 53
Chris@0 54
Chris@0 55 A Quick Tutorial
Chris@309 56 ----------------
Chris@0 57
Chris@0 58 To use Sonic Annotator, you need to tell it three things: what audio
Chris@0 59 files to extract features from; what features to extract; and how and
Chris@0 60 where to write the results. You can also optionally tell it to
Chris@0 61 summarise the features.
Chris@0 62
Chris@0 63
Chris@309 64 ### 1. What audio files to extract features from
Chris@0 65
Chris@0 66 Sonic Annotator accepts a list of audio files on the command line.
Chris@0 67 Any argument that is not understood as a supported command-line option
Chris@0 68 will be taken to be the name of an audio file. Any number of files
Chris@0 69 may be listed.
Chris@0 70
Chris@0 71 Several common audio file formats are supported, including MP3, Ogg,
Chris@0 72 and a number of PCM formats such as WAV and AIFF. AAC is supported on
Chris@0 73 macOS only, and only if not DRM-protected. WMA is not supported.
Chris@0 74
Chris@0 75 File paths do not have to be local; you can also provide remote HTTP
Chris@0 76 or FTP URLs for Sonic Annotator to retrieve.
Chris@0 77
Chris@0 78 Sonic Annotator also accepts the names of playlist files (.m3u
Chris@0 79 extension) and will process every file found in the playlist.
Chris@0 80
Chris@0 81 Finally, you can provide a local directory path instead of a file,
Chris@0 82 together with the -r (recursive) option, for Sonic Annotator to
Chris@0 83 process every audio file found in that directory or any of its
Chris@0 84 subdirectories.
Chris@0 85
Chris@0 86
Chris@309 87 ### 2. What features to extract
Chris@0 88
Chris@0 89 Sonic Annotator applies "transforms" to its input audio files, where a
Chris@0 90 transform (in this terminology) consists of a Vamp plugin together
Chris@0 91 with a certain set of parameters and a specified execution context:
Chris@0 92 step and block size, sample rate, etc.
Chris@0 93
Chris@0 94 (See http://www.vamp-plugins.org/ for more information about Vamp
Chris@0 95 plugins.)
Chris@0 96
Chris@0 97 To use a particular transform, specify its filename on the command
Chris@0 98 line with the -t option.
Chris@0 99
Chris@0 100 Transforms are usually described in RDF, following the transform part
Chris@0 101 of the Vamp plugin ontology (http://purl.org/ontology/vamp/). A
Chris@0 102 Transform may use any Vamp plugin that is currently installed and
Chris@0 103 available on the system. You can obtain a list of available plugin
Chris@0 104 outputs by running Sonic Annotator with the -l option, and you can
Chris@0 105 obtain a skeleton transform description for one of these plugins with
Chris@0 106 the -s option.
Chris@0 107
Chris@0 108 For example, if the example plugins from the Vamp plugin SDK are
Chris@0 109 available and no other plugins are installed, you might have an
Chris@0 110 exchange like this:
Chris@0 111
```
$ sonic-annotator -l
vamp:vamp-example-plugins:amplitudefollower:amplitude
vamp:vamp-example-plugins:fixedtempo:acf
vamp:vamp-example-plugins:fixedtempo:detectionfunction
vamp:vamp-example-plugins:fixedtempo:filtered_acf
vamp:vamp-example-plugins:fixedtempo:tempo
vamp:vamp-example-plugins:fixedtempo:candidates
vamp:vamp-example-plugins:percussiononsets:detectionfunction
vamp:vamp-example-plugins:percussiononsets:onsets
vamp:vamp-example-plugins:powerspectrum:powerspectrum
vamp:vamp-example-plugins:spectralcentroid:linearcentroid
vamp:vamp-example-plugins:spectralcentroid:logcentroid
vamp:vamp-example-plugins:zerocrossing:counts
vamp:vamp-example-plugins:zerocrossing:zerocrossings
$ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix vamp: <http://purl.org/ontology/vamp/> .
@prefix : <#> .

:transform a vamp:Transform ;
    vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ;
    vamp:step_size "64"^^xsd:int ;
    vamp:block_size "256"^^xsd:int ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "maxbpm" ] ;
        vamp:value "190"^^xsd:float ;
    ] ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "maxdflen" ] ;
        vamp:value "10"^^xsd:float ;
    ] ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "minbpm" ] ;
        vamp:value "50"^^xsd:float ;
    ] ;
    vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> .
$
```
Chris@0 151
Chris@0 152 The output of -s is an RDF/Turtle document describing the default
Chris@0 153 settings for the Tempo output of the Fixed Tempo Estimator plugin in
Chris@0 154 the Vamp plugin SDK.
Chris@0 155
Chris@0 156 (The exact format of the RDF printed may differ -- e.g. if the
Chris@0 157 plugin's RDF description is not installed and so its "home" URI is not
Chris@0 158 known -- but the result should be functionally equivalent to this.)
Chris@0 159
Chris@0 160 You could run this transform by saving the RDF to a file and
Chris@0 161 specifying that file with -t:
Chris@0 162
```
$ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo > test.n3
$ sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout
(... logging output on stderr, then ...)
"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
$
```
Chris@0 170
Chris@0 171 The single line of output above consists of the audio file name, the
Chris@0 172 timestamp and duration for a single feature, the value of that feature
Chris@0 173 (the estimated tempo of the given region of time from that file, in
Chris@0 174 bpm -- the plugin in question performs a single tempo estimation and
Chris@0 175 nothing else) and the feature's label.
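
If you want to post-process rows like the one above in a script, a minimal Python sketch follows. It is not part of Sonic Annotator. The function name `parse_feature_rows` is made up for illustration, and it assumes the five-column layout shown above (file name, timestamp, duration, a single value, label). Features with no duration or with several bin values would need a different column mapping. The handling of an empty file name column follows the CSV writer description in this README, which says the name appears only when it differs from that of the previous row.

```python
import csv
import io


def parse_feature_rows(text):
    """Parse rows written by the csv writer with --csv-stdout.

    Assumed layout (hypothetical helper, not part of Sonic Annotator):
    file name, timestamp, duration, one value, label.
    """
    rows = []
    current_file = None
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue
        # The file name column is left empty when it is the same as
        # in the previous row, so carry the last seen name forward.
        if record[0]:
            current_file = record[0]
        rows.append({
            "file": current_file,
            "timestamp": float(record[1]),
            "duration": float(record[2]),
            "value": float(record[3]),
            "label": record[4],
        })
    return rows


sample = '"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"\n'
for row in parse_feature_rows(sample):
    print(row["file"], row["value"], row["label"])
```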
Chris@0 176
Chris@0 177 A quicker way to achieve the above is to use the -d (default) option
Chris@0 178 to tell Sonic Annotator to use directly the default configuration for
Chris@0 179 a named transform:
Chris@0 180
```
$ sonic-annotator -d vamp:vamp-example-plugins:fixedtempo:tempo audio.wav -w csv --csv-stdout
(... some log output on stderr, then ...)
"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
$
```
Chris@0 187
Chris@0 188 Although handy for experimentation, the -d option is inadvisable in
Chris@0 189 any "production" situation because the plugin configuration is not
Chris@0 190 guaranteed to be the same each time (for example if an updated version
Chris@0 191 of a plugin changes some of its defaults). It's better to save a
Chris@0 192 well-defined transform to file and refer to that, even if it is simply
Chris@0 193 the transform created by the skeleton option.
Chris@0 194
Chris@0 195 To run more than one transform on the same audio files, just put more
Chris@0 196 than one set of transform RDF descriptions in the same file, or give
Chris@0 197 the -t option more than once with separate transform description
Chris@0 198 files. Remember that if you want to specify more than one transform
Chris@0 199 in the same file, they will need to have distinct URIs (that is, the
Chris@0 200 ":transform" part of the example above, which may be any arbitrary
Chris@0 201 name, must be distinct for each described transform).
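
One way to keep the transform names distinct is to generate the file from a script. The Python sketch below is only an illustration: the helper `transforms_to_turtle` is hypothetical, it numbers the transforms `:transform0`, `:transform1`, and so on, and it assumes every plugin comes from the Vamp example plugin set, so that the `examples:` prefix and the `<plugin>_output_<output>` naming used elsewhere in this README apply. It writes no parameter bindings, so the plugin defaults would be used.

```python
PREFIXES = """\
@prefix vamp: <http://purl.org/ontology/vamp/> .
@prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#> .
@prefix : <#> .
"""


def transforms_to_turtle(specs):
    """Build one Turtle document describing several transforms.

    specs is a list of (plugin, output) identifier pairs, for example
    ("percussiononsets", "onsets"). Hypothetical helper for illustration.
    """
    blocks = [PREFIXES]
    for index, (plugin, output) in enumerate(specs):
        # Each transform gets its own name so that the URIs are distinct.
        blocks.append(
            f":transform{index} a vamp:Transform ;\n"
            f"    vamp:plugin examples:{plugin} ;\n"
            f"    vamp:output examples:{plugin}_output_{output} .\n"
        )
    return "\n".join(blocks)


print(transforms_to_turtle([
    ("percussiononsets", "onsets"),
    ("percussiononsets", "detectionfunction"),
]))
```

The printed text can be saved to a file and passed to Sonic Annotator with -t.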
Chris@0 202
Chris@0 203
Chris@309 204 ### 3. How and where to write the results
Chris@0 205
Chris@0 206 Sonic Annotator supports various different output modules (and it is
Chris@0 207 fairly easy for the developer to add new ones). You have to choose at
Chris@0 208 least one output module; use the -w (writer) option to do so. Each
Chris@0 209 module has its own set of parameters which can be adjusted on the
Chris@0 210 command line, as well as its own default rules about where to write
Chris@0 211 the results.
Chris@0 212
Chris@174 213 To get help on a specific writer, run Sonic Annotator with the -h
Chris@174 214 option followed by the writer name (e.g. "-h csv").
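
When driving Sonic Annotator from a script, it can help to assemble the argument list in one place. The Python sketch below only builds the list and does not run anything. `build_command` is a hypothetical helper, and it assumes the executable is called `sonic-annotator` and is on the PATH. It uses only the -t and -w options and the writer options described in this README.

```python
def build_command(transform_files, audio_paths, writer, writer_options=()):
    """Return an argument list for a Sonic Annotator run.

    Hypothetical helper: it checks only that a writer was named,
    since at least one output module has to be chosen with -w.
    """
    if not writer:
        raise ValueError("a writer name is required for the -w option")
    command = ["sonic-annotator"]
    for path in transform_files:
        # -t may be given more than once, one transform file each time.
        command += ["-t", path]
    command += ["-w", writer]
    command += list(writer_options)
    command += list(audio_paths)
    return command


print(build_command(["test.n3"], ["audio.wav"], "csv", ["--csv-stdout"]))
```

The resulting list can be handed to `subprocess.run` from the Python standard library if Sonic Annotator is installed.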
Chris@174 215
Chris@0 216 The following writers are currently supported. (Others exist, but are
Chris@0 217 not properly implemented or not supported.)
Chris@0 218
Chris@0 219 * csv
Chris@0 220
Chris@0 221 Writes the results into comma-separated data files.
Chris@0 222
Chris@0 223 One file is created for each transform applied to each input audio
Chris@0 224 file, named after the input audio file and transform name with .csv
Chris@0 225 suffix and ":" replaced by "_" throughout, placed in the same
Chris@0 226 directory as the audio file.
Chris@0 227
Chris@0 228 To instruct Sonic Annotator to place the output files in another
Chris@0 229 location, use --csv-basedir with a directory name.
Chris@0 230
Chris@0 231 To write a single file with all data in it, use --csv-one-file.
Chris@0 232
Chris@0 233 To write all data to stdout instead of to a file, use --csv-stdout.
Chris@0 234
Chris@0 235 Sonic Annotator will not write to an output file that already
Chris@0 236 exists. If you want to make it do this, use --csv-force to
Chris@0 237 overwrite or --csv-append to append to it.
Chris@0 238
Chris@0 239 The data generated consists of one line for each result feature,
Chris@0 240 containing the feature timestamp, feature duration if present, all
Chris@0 241 of the feature's bin values in order, followed by the feature's
Chris@0 242 label if present. If the --csv-one-file or --csv-stdout option is
Chris@0 243 specified, then an additional column will appear before any of the
Chris@0 244 above, containing the audio file name from which the feature was
Chris@174 245 extracted, if it differs from that of the previous row. To suppress
Chris@174 246 this additional column, use the --csv-omit-filenames option.
Chris@174 247
Chris@174 248 To make the CSV writer emit the end time instead of the duration
Chris@174 249 (for features with duration) use the --csv-end-times option.
Chris@174 250
Chris@174 251 To make the writer always emit end time or duration, even when the
Chris@174 252 feature lacks duration, by using the time of the following feature
Chris@174 253 as the end time, use the --csv-fill-ends option.
Chris@0 254
Chris@0 255 The default column separator is a comma; you can specify a
Chris@0 256 different one with the --csv-separator option.
Chris@0 257
Chris@174 258 * lab
Chris@174 259
Chris@174 260 Writes the results into a tab-separated label file (.lab).
Chris@174 261
Chris@174 262 This is equivalent to using the CSV writer with a tab separator and
Chris@174 263 the options --csv-end-times --csv-omit-filenames.
Chris@174 264
Chris@359 265 It supports the --lab-basedir, --lab-stdout, --lab-force,
Chris@359 266 --lab-append, and --lab-fill-ends options, which all behave
Chris@359 267 similarly to their CSV writer equivalents.
Chris@174 268
Chris@0 269 * rdf
Chris@0 270
Chris@0 271 Writes the results into RDF/Turtle documents following the Audio
Chris@0 272 Features ontology (http://purl.org/ontology/af/).
Chris@0 273
Chris@0 274 One file is created for each input audio file containing the
Chris@0 275 features extracted by all transforms applied to that file, named
Chris@0 276 after the input audio file with .n3 extension, placed in the same
Chris@0 277 directory as the audio file.
Chris@0 278
Chris@0 279 To instruct Sonic Annotator to place the output files in another
Chris@0 280 location, use --rdf-basedir with a directory name.
Chris@0 281
Chris@0 282 To write a single file with all data (from all input audio files)
Chris@0 283 in it, use --rdf-one-file.
Chris@0 284
Chris@0 285 To write one file for each transform applied to each input audio
Chris@0 286 file, named after the input audio file and transform name with .n3
Chris@0 287 suffix and ":" replaced by "_" throughout, use --rdf-many-files.
Chris@0 288
Chris@0 289 To write all data to stdout instead of to a file, use --rdf-stdout.
Chris@0 290
Chris@0 291 Sonic Annotator will not write to an output file that already
Chris@0 292 exists. If you want to make it do this, use --rdf-force to
Chris@0 293 overwrite or --rdf-append to append to it.
Chris@0 294
Chris@0 295 Sonic Annotator will use plugin description RDF if available to
Chris@0 296 enhance its output (for example identifying note onset times as
Chris@0 297 note onset times, if the plugin's RDF says that is what it
Chris@0 298 produces, rather than writing them as plain events). Best results
Chris@0 299 will be obtained if an RDF document is provided with your plugins
Chris@0 300 (for example, vamp-example-plugins.n3) and you have this installed
Chris@0 301 in the same location as the plugins. To override this enhanced
Chris@0 302 output and write plain events for all features, use --rdf-plain.
Chris@0 303
Chris@0 304 The output RDF will include an available_as property linking the
Chris@0 305 results to the original audio signal URI. By default, this will
Chris@0 306 point to the URI of the file or resource containing the audio that
Chris@0 307 Sonic Annotator processed, such as the file:/// location on disk.
Chris@0 308 To override this, for example to process a local copy of a file
Chris@0 309 while generating RDF that describes a copy of it available on a
Chris@0 310 network, you can use the --rdf-signal-uri option to specify an
Chris@0 311 alternative signal URI.
Chris@0 312
Chris@174 313 * json
Chris@174 314
Chris@174 315 Writes the results into JSON format following JAMS, the JSON
Chris@174 316 Annotated Music Specification. This writer is provisional as of
Chris@174 317 Sonic Annotator v1.1.
Chris@174 318
Chris@174 319 * midi
Chris@174 320
Chris@174 321 Writes the results to MIDI files. All features are written as MIDI
Chris@174 322 notes.
Chris@174 323
Chris@174 324 If a feature has at least one value, its first value will be used
Chris@174 325 as the note pitch, the second value (if present) for velocity. If a
Chris@174 326 feature has units of Hz, then its pitch will be converted from
Chris@174 327 frequency to an integer value in MIDI range, otherwise it will be
Chris@174 328 written directly.
Chris@174 329
Chris@174 330 Multiple (up to 16) transforms can be written to a single MIDI
Chris@174 331 file, where they will be given separate MIDI channel numbers.
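
The frequency-to-pitch conversion mentioned for the midi writer can be pictured with the usual equal-temperament formula. The Python sketch below is an assumption about what such a conversion looks like, not a copy of Sonic Annotator's code: it takes A4 = 440 Hz = MIDI note 69, rounds to the nearest integer and clamps to the range 0 to 127. The name `hz_to_midi_pitch` is made up for illustration.

```python
import math


def hz_to_midi_pitch(frequency_hz):
    """Convert a frequency in Hz to an integer MIDI note number.

    Assumes equal temperament with A4 = 440 Hz = note 69. This is a
    sketch, not necessarily the rule the midi writer applies.
    """
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    pitch = round(69 + 12 * math.log2(frequency_hz / 440.0))
    # Keep the result inside the valid MIDI note range.
    return max(0, min(127, pitch))


print(hz_to_midi_pitch(440.0))  # 69
print(hz_to_midi_pitch(880.0))  # 81
```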
Chris@174 332
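
The effect of the --csv-end-times and --csv-fill-ends options of the csv writer can be sketched in a few lines of Python. This is a reading of the descriptions above rather than the writer's actual code, and `rows_with_end_times` is a hypothetical name. What happens to a final feature that has no duration and no following feature is not stated in this README, so the sketch simply leaves its end time empty.

```python
def rows_with_end_times(features, fill_ends=False):
    """Turn (timestamp, duration) pairs into (timestamp, end time) pairs.

    features must be sorted by timestamp; duration may be None.
    With fill_ends, a feature without a duration ends where the
    following feature starts. Illustrative sketch only.
    """
    rows = []
    for index, (timestamp, duration) in enumerate(features):
        if duration is not None:
            end = timestamp + duration
        elif fill_ends and index + 1 < len(features):
            end = features[index + 1][0]
        else:
            # Assumption: no end time can be derived for this feature.
            end = None
        rows.append((timestamp, end))
    return rows


onsets = [(0.0, None), (1.5, None), (3.0, 0.5)]
print(rows_with_end_times(onsets))
print(rows_with_end_times(onsets, fill_ends=True))
```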
Chris@0 333
Chris@309 334 ### 4. Optionally, how to summarise the features
Chris@0 335
Chris@0 336 Sonic Annotator can also calculate and write summaries of features,
Chris@0 337 such as mean and median values.
Chris@0 338
Chris@0 339 To obtain a summary as well as the feature results, just use the -S
Chris@0 340 option, naming the type of summary you want (min, max, mean, median,
Chris@0 341 mode, sum, variance, sd or count). You can also tell it to produce
Chris@0 342 only the summary, not the individual features, with --summary-only.
Chris@0 343
Chris@0 344 Alternatively, you can specify a summary in a transform description.
Chris@0 345 The following example tells Sonic Annotator to write both the times of
Chris@0 346 note onsets estimated by the simple percussion onset detector example
Chris@0 347 plugin, and the variance of the plugin's onset detection function.
Chris@0 348 (It will only process the audio file and run the plugin once.)
Chris@0 349
```
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix vamp: <http://purl.org/ontology/vamp/>.
@prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#>.
@prefix : <#>.

:transform1 a vamp:Transform;
    vamp:plugin examples:percussiononsets ;
    vamp:output examples:percussiononsets_output_onsets .

:transform0 a vamp:Transform;
    vamp:plugin examples:percussiononsets ;
    vamp:output examples:percussiononsets_output_detectionfunction ;
    vamp:summary_type "variance" .
```
Chris@0 365
Chris@0 366 Sonic Annotator can also summarise in segments -- if you provide a
Chris@0 367 comma-separated list of times as an argument to the --segments option,
Chris@0 368 it will calculate one summary for each segment bounded by the times
Chris@0 369 you provided. For example,
Chris@0 370
```
$ sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav
(... some log output on stderr, then ...)
,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)"
,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)"
,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)"
,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)"
```
Chris@0 379
Chris@0 380 Here the first row contains a summary covering the time period from 0
Chris@0 381 to 1 second, the second from 1 to 2 seconds, the third from 2 to 3
Chris@0 382 seconds and the fourth from 3 seconds to the end of the (short) audio
Chris@0 383 file.
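
The way the boundary times divide the file can be sketched as follows. This Python snippet is an illustration, with `segment_bounds` a hypothetical name. It reproduces only the start time and duration columns of the output above, not the summary values. The end time of 10.031020407 seconds is inferred from the last row (3 seconds plus a duration of 7.031020407).

```python
def segment_bounds(boundaries, end_time):
    """Return (start, duration) for each segment of a file.

    boundaries are the times given to --segments, in ascending order.
    The first segment starts at 0 and the last one runs to end_time.
    """
    starts = [0.0] + [float(t) for t in boundaries]
    ends = [float(t) for t in boundaries] + [float(end_time)]
    return [(start, end - start) for start, end in zip(starts, ends)]


for start, duration in segment_bounds([1, 2, 3], 10.031020407):
    print(f"{start:.9f},{duration:.9f}")
```

Three boundary times give four segments, which is why the example output has four rows.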
Chris@0 384
Chris@309 385
Chris@309 386 Automated build reports
Chris@309 387 -----------------------
Chris@309 388
Chris@309 389 * Linux and macOS CI build: [![Build Status](https://travis-ci.org/sonic-visualiser/sonic-annotator.svg?branch=master)](https://travis-ci.org/sonic-visualiser/sonic-annotator)
Chris@309 390 * Windows CI build: [![Build status](https://ci.appveyor.com/api/projects/status/26pygienkigw39p7?svg=true)](https://ci.appveyor.com/project/cannam/sonic-annotator)