annotate README.md @ 399:a3912193ce69 tip

Default branch is now named default on git as well as hg, in case we ever want to switch to mirroring in the other direction
author Chris Cannam
date Thu, 27 Aug 2020 15:57:37 +0100
parents e51776328158
children
rev   line source
Chris@0 1
Chris@0 2 Sonic Annotator
Chris@0 3 ===============
Chris@0 4
Chris@0 5 Sonic Annotator is a utility program for batch feature extraction from
Chris@0 6 audio files. It runs Vamp audio analysis plugins on audio files, and
Chris@0 7 can write the result features in a selection of formats.
Chris@0 8
Chris@2 9 For more information, see
Chris@2 10
Chris@179 11 http://vamp-plugins.org/sonic-annotator
Chris@2 12
Chris@2 13 More documentation follows further down this README file, after the
Chris@2 14 credits.
Chris@2 15
Chris@2 16
Chris@309 17 ### Credits
Chris@2 18
Chris@2 19 Sonic Annotator was developed at the Centre for Digital Music,
Chris@2 20 Queen Mary, University of London.
Chris@2 21
Chris@87 22 http://c4dm.eecs.qmul.ac.uk/
Chris@2 23
Chris@2 24 The main program is by Mark Levy, Chris Cannam, and Chris Sutton.
Chris@2 25 Sonic Annotator incorporates library code from the Sonic Visualiser
Chris@2 26 application by Chris Cannam. Code copyright 2005-2007 Chris Cannam,
Chris@396 27 copyright 2006-2020 Queen Mary, University of London, except where
Chris@2 28 indicated in the individual source files.
Chris@2 29
Chris@2 30 This work was funded by the Engineering and Physical Sciences Research
Chris@2 31 Council through the OMRAS2 project EP/E017614/1.
Chris@2 32
Chris@2 33 Sonic Annotator is free software; you can redistribute it and/or
Chris@2 34 modify it under the terms of the GNU General Public License as
Chris@2 35 published by the Free Software Foundation; either version 2 of the
Chris@2 36 License, or (at your option) any later version. See the file COPYING
Chris@2 37 included with this distribution for more information.
Chris@2 38
Chris@2 39 Sonic Annotator may also make use of the following libraries:
Chris@2 40
Chris@87 41 * Qt5 -- Copyright Digia Oyj, distributed under the LGPL
Chris@2 42 * Ogg decoder -- Copyright CSIRO Australia, BSD license
Chris@2 43 * MAD mp3 decoder -- Copyright Underbit Technologies Inc, GPL
Chris@2 44 * libsamplerate -- Copyright Erik de Castro Lopo, GPL
Chris@2 45 * libsndfile -- Copyright Erik de Castro Lopo, LGPL
Chris@2 46 * FFTW3 -- Copyright Matteo Frigo and MIT, GPL
Chris@87 47 * Vamp plugin SDK -- Copyright Chris Cannam and QMUL, BSD license
Chris@87 48 * Dataquay -- Copyright Breakfast Quay, BSD license
Chris@87 49 * Sord and Serd -- Copyright David Robillard, BSD license
Chris@2 50
Chris@2 51 (Some distributions of Sonic Annotator may have one or more of these
Chris@2 52 libraries statically linked.) Many thanks to their authors.
Chris@2 53
Chris@0 54
Chris@0 55 A Quick Tutorial
Chris@309 56 ----------------
Chris@0 57
Chris@0 58 To use Sonic Annotator, you need to tell it three things: what audio
Chris@0 59 files to extract features from; what features to extract; and how and
Chris@0 60 where to write the results. You can also optionally tell it to
Chris@0 61 summarise the features.
Chris@0 62
Chris@0 63
Chris@309 64 ### 1. What audio files to extract features from
Chris@0 65
Chris@0 66 Sonic Annotator accepts a list of audio files on the command line.
Chris@0 67 Any argument that is not understood as a supported command-line option
Chris@0 68 will be taken to be the name of an audio file. Any number of files
Chris@0 69 may be listed.
Chris@0 70
Chris@0 71 Several common audio file formats are supported, including MP3, Ogg,
Chris@0 72 and a number of PCM formats such as WAV and AIFF. AAC is supported on
Chris@0 73 macOS only, and only if not DRM protected. WMA is not supported.
Chris@0 74
Chris@0 75 File paths do not have to be local; you can also provide remote HTTP
Chris@0 76 or FTP URLs for Sonic Annotator to retrieve.
Chris@0 77
Chris@0 78 Sonic Annotator also accepts the names of playlist files (.m3u
Chris@0 79 extension) and will process every file found in the playlist.
Chris@0 80
Chris@0 81 Finally, you can provide a local directory path instead of a file,
Chris@0 82 together with the -r (recursive) option, for Sonic Annotator to
Chris@0 83 process every audio file found in that directory or any of its
Chris@0 84 subdirectories.
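
To preview which files a playlist would hand to Sonic Annotator, a script can expand the .m3u file itself. The following is a minimal sketch, not Sonic Annotator's own parser: the function name `playlist_entries` is hypothetical, and resolving relative entries against the playlist's directory is an assumption about typical .m3u usage.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def playlist_entries(playlist_text, playlist_dir="."):
    """Return the audio locations named in the text of an .m3u playlist."""
    entries = []
    for raw in playlist_text.splitlines():
        line = raw.strip()
        # Blank lines and "#" lines (comments, #EXTM3U/#EXTINF) name no file.
        if not line or line.startswith("#"):
            continue
        if urlparse(line).scheme in ("http", "https", "ftp"):
            entries.append(line)  # remote URL: keep as written
        elif PurePosixPath(line).is_absolute():
            entries.append(line)  # absolute local path: keep as written
        else:
            # Assumption: relative entries are relative to the playlist file.
            entries.append(str(PurePosixPath(playlist_dir) / line))
    return entries


example = "#EXTM3U\n\n#EXTINF:10,Title\na.wav\n/abs/b.mp3\nhttp://example.org/c.ogg\n"
files = playlist_entries(example, "/music")
```

Here `files` holds one local relative entry resolved under `/music`, one absolute path, and one remote URL, matching the kinds of input described above.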
Chris@0 85
Chris@0 86
Chris@309 87 ### 2. What features to extract
Chris@0 88
Chris@0 89 Sonic Annotator applies "transforms" to its input audio files, where a
Chris@0 90 transform (in this terminology) consists of a Vamp plugin together
Chris@0 91 with a certain set of parameters and a specified execution context:
Chris@0 92 step and block size, sample rate, etc.
Chris@0 93
Chris@0 94 (See http://www.vamp-plugins.org/ for more information about Vamp
Chris@0 95 plugins.)
Chris@0 96
Chris@0 97 To use a particular transform, specify its filename on the command
Chris@0 98 line with the -t option.
Chris@0 99
Chris@0 100 Transforms are usually described in RDF, following the transform part
Chris@0 101 of the Vamp plugin ontology (http://purl.org/ontology/vamp/). A
Chris@0 102 transform may use any Vamp plugin that is currently installed and
Chris@0 103 available on the system. You can obtain a list of available plugin
Chris@0 104 outputs by running Sonic Annotator with the -l option, and you can
Chris@0 105 obtain a skeleton transform description for one of these plugins with
Chris@0 106 the -s option.
Chris@0 107
Chris@0 108 For example, if the example plugins from the Vamp plugin SDK are
Chris@0 109 available and no other plugins are installed, you might have an
Chris@0 110 exchange like this:
Chris@0 111
```
$ sonic-annotator -l
vamp:vamp-example-plugins:amplitudefollower:amplitude
vamp:vamp-example-plugins:fixedtempo:acf
vamp:vamp-example-plugins:fixedtempo:detectionfunction
vamp:vamp-example-plugins:fixedtempo:filtered_acf
vamp:vamp-example-plugins:fixedtempo:tempo
vamp:vamp-example-plugins:fixedtempo:candidates
vamp:vamp-example-plugins:percussiononsets:detectionfunction
vamp:vamp-example-plugins:percussiononsets:onsets
vamp:vamp-example-plugins:powerspectrum:powerspectrum
vamp:vamp-example-plugins:spectralcentroid:linearcentroid
vamp:vamp-example-plugins:spectralcentroid:logcentroid
vamp:vamp-example-plugins:zerocrossing:counts
vamp:vamp-example-plugins:zerocrossing:zerocrossings
$ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix vamp: <http://purl.org/ontology/vamp/> .
@prefix : <#> .

:transform a vamp:Transform ;
    vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ;
    vamp:step_size "64"^^xsd:int ;
    vamp:block_size "256"^^xsd:int ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "maxbpm" ] ;
        vamp:value "190"^^xsd:float ;
    ] ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "maxdflen" ] ;
        vamp:value "10"^^xsd:float ;
    ] ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "minbpm" ] ;
        vamp:value "50"^^xsd:float ;
    ] ;
    vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> .
$
```
Chris@0 151
Chris@0 152 The output of -s is an RDF/Turtle document describing the default
Chris@0 153 settings for the Tempo output of the Fixed Tempo Estimator plugin in
Chris@0 154 the Vamp plugin SDK.
Chris@0 155
Chris@0 156 (The exact format of the RDF printed may differ -- e.g. if the
Chris@0 157 plugin's RDF description is not installed and so its "home" URI is not
Chris@0 158 known -- but the result should be functionally equivalent to this.)
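
The identifiers printed by -l can be taken apart in a script, for instance to see which outputs each plugin offers. This sketch assumes the four-part, colon-separated shape seen in the listing above (type, plugin library, plugin, output); the function name `outputs_by_plugin` is hypothetical.

```python
from collections import defaultdict

# A few lines copied from the example "-l" listing.
listing = """\
vamp:vamp-example-plugins:fixedtempo:tempo
vamp:vamp-example-plugins:fixedtempo:candidates
vamp:vamp-example-plugins:zerocrossing:counts
"""


def outputs_by_plugin(listing_text):
    """Group output names by (library, plugin), keeping listing order."""
    grouped = defaultdict(list)
    for identifier in listing_text.split():
        # Assumption: every identifier has exactly four colon-separated parts.
        _kind, library, plugin, output = identifier.split(":")
        grouped[(library, plugin)].append(output)
    return dict(grouped)


plugins = outputs_by_plugin(listing)
```

Any one of the full identifiers can then be passed to -s to obtain a skeleton transform for that output.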
Chris@0 159
Chris@0 160 You could run this transform by saving the RDF to a file and
Chris@0 161 specifying that file with -t:
Chris@0 162
```
$ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo > test.n3
$ sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout
(... logging output on stderr, then ...)
"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
$
```
Chris@0 170
Chris@0 171 The single line of output above consists of the audio file name, the
Chris@0 172 timestamp and duration for a single feature, the value of that feature
Chris@0 173 (the estimated tempo of the given region of time from that file, in
Chris@0 174 bpm -- the plugin in question performs a single tempo estimation and
Chris@0 175 nothing else) and the feature's label.
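
A script consuming this output might read the row as follows. This is a sketch that assumes the five-column layout of this particular example (file name, timestamp, duration, one value, label); other transforms can produce more values per row or omit the duration or label, so the unpacking would need adjusting. The name `parse_row` is hypothetical.

```python
import csv
import io

# The example row printed by the command above.
output = '"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"\n'


def parse_row(row):
    """Turn one five-column CSV row into a dictionary with numeric fields."""
    filename, timestamp, duration, value, label = row
    return {
        "filename": filename,
        "timestamp": float(timestamp),  # seconds
        "duration": float(duration),    # seconds
        "value": float(value),          # estimated tempo in bpm
        "label": label,
    }


rows = [parse_row(row) for row in csv.reader(io.StringIO(output))]
```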
Chris@0 176
Chris@0 177 A quicker way to achieve the above is to use the -d (default) option,
Chris@0 178 which tells Sonic Annotator to use the default configuration for a
Chris@0 179 named transform directly:
Chris@0 180
```
$ sonic-annotator -d vamp:vamp-example-plugins:fixedtempo:tempo audio.wav -w csv --csv-stdout
(... some log output on stderr, then ...)
"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
$
```
Chris@0 187
Chris@0 188 Although handy for experimentation, the -d option is inadvisable in
Chris@0 189 any "production" situation because the plugin configuration is not
Chris@0 190 guaranteed to be the same each time (for example if an updated version
Chris@0 191 of a plugin changes some of its defaults). It's better to save a
Chris@0 192 well-defined transform to file and refer to that, even if it is simply
Chris@0 193 the transform created by the skeleton option.
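
In a batch script, pinning a saved transform file might look like the sketch below. It uses only the options shown above (-t, -w csv, --csv-stdout) and assumes that `sonic-annotator` is on the PATH; the function names are hypothetical.

```python
import subprocess


def annotator_command(transform_file, audio_files,
                      writer="csv", writer_options=("--csv-stdout",)):
    """Build the argument list for one run with a saved transform file."""
    return ["sonic-annotator", "-t", transform_file, *audio_files,
            "-w", writer, *writer_options]


def run_annotator(transform_file, audio_files):
    """Run the command and return whatever the writer printed on stdout."""
    result = subprocess.run(annotator_command(transform_file, audio_files),
                            capture_output=True, text=True, check=True)
    return result.stdout


command = annotator_command("test.n3", ["audio.wav"])
```

Because the transform file is kept alongside the script, the plugin parameters stay the same even if a later plugin version changes its defaults.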
Chris@0 194
Chris@0 195 To run more than one transform on the same audio files, just put more
Chris@0 196 than one set of transform RDF descriptions in the same file, or give
Chris@0 197 the -t option more than once with separate transform description
Chris@0 198 files. Remember that if you want to specify more than one transform
Chris@0 199 in the same file, they will need to have distinct URIs (that is, the
Chris@0 200 ":transform" part of the example above, which may be any arbitrary
Chris@0 201 name, must be distinct for each described transform).
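
One way to keep the names distinct is to generate them. The sketch below writes several transforms into one Turtle document and numbers them :transform0, :transform1, and so on. It assumes the example plugins' namespace and the `<plugin>_output_<output>` URI pattern visible in the skeleton above; other plugin libraries may name their outputs differently, and `transforms_document` is a hypothetical helper.

```python
PREFIXES = """\
@prefix vamp: <http://purl.org/ontology/vamp/> .
@prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#> .
@prefix : <#> .
"""


def transforms_document(plugin_outputs):
    """Return Turtle text with one uniquely named transform per (plugin, output)."""
    blocks = []
    for index, (plugin, output) in enumerate(plugin_outputs):
        blocks.append(
            f":transform{index} a vamp:Transform ;\n"
            f"    vamp:plugin examples:{plugin} ;\n"
            # Assumption: output URIs follow "<plugin>_output_<output>".
            f"    vamp:output examples:{plugin}_output_{output} .\n"
        )
    return PREFIXES + "\n" + "\n".join(blocks)


document = transforms_document([("percussiononsets", "onsets"),
                                ("fixedtempo", "tempo")])
```

These transforms carry no step size, block size, or parameter bindings, so they rely on whatever the plugin chooses for those settings; for reproducible runs, add those properties explicitly.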
Chris@0 202
Chris@0 203
Chris@309 204 ### 3. How and where to write the results
Chris@0 205
Chris@0 206 Sonic Annotator supports several output modules (and it is
Chris@0 207 fairly easy for a developer to add new ones). You have to choose at
Chris@0 208 least one output module; use the -w (writer) option to do so. Each
Chris@0 209 module has its own set of parameters which can be adjusted on the
Chris@0 210 command line, as well as its own default rules about where to write
Chris@0 211 the results.
Chris@0 212
Chris@174 213 To get help on a specific writer, run Sonic Annotator with the -h
Chris@174 214 option followed by the writer name (e.g. "-h csv").
Chris@174 215
Chris@0 216 The following writers are currently supported. (Others exist, but are
Chris@0 217 not properly implemented or not supported.)
Chris@0 218
Chris@0 219 * csv
Chris@0 220
Chris@0 221 Writes the results into comma-separated data files.
Chris@0 222
Chris@0 223 One file is created for each transform applied to each input audio
Chris@0 224 file, named after the input audio file and transform name with .csv
Chris@0 225 suffix and ":" replaced by "_" throughout, placed in the same
Chris@0 226 directory as the audio file.
Chris@0 227
Chris@0 228 To instruct Sonic Annotator to place the output files in another
Chris@0 229 location, use --csv-basedir with a directory name.
Chris@0 230
Chris@0 231 To write a single file with all data in it, use --csv-one-file.
Chris@0 232
Chris@0 233 To write all data to stdout instead of to a file, use --csv-stdout.
Chris@0 234
Chris@0 235 Sonic Annotator will not write to an output file that already
Chris@0 236 exists. If you want to make it do this, use --csv-force to
Chris@0 237 overwrite or --csv-append to append to it.
Chris@0 238
Chris@0 239 The data generated consists of one line for each result feature,
Chris@0 240 containing the feature timestamp, feature duration if present, all
Chris@0 241 of the feature's bin values in order, followed by the feature's
Chris@0 242 label if present. If the --csv-one-file or --csv-stdout option is
Chris@0 243 specified, then an additional column will appear before any of the
Chris@0 244 above, containing the audio file name from which the feature was
Chris@174 245 extracted, if it differs from that of the previous row. To suppress
Chris@174 246 this additional column, use the --csv-omit-filenames option.
Chris@174 247
Chris@174 248 To make the CSV writer emit the end time instead of the duration
Chris@174 249 (for features with duration) use the --csv-end-times option.
Chris@174 250
Chris@174 251 To make the writer always emit an end time or duration, even for
Chris@174 252 features that lack a duration, use the --csv-fill-ends option. The
Chris@174 253 time of the following feature is then used as the end time.
Chris@0 254
Chris@0 255 The default column separator is a comma; you can specify a
Chris@0 256 different one with the --csv-separator option.
Chris@0 257
Chris@174 258 * lab
Chris@174 259
Chris@174 260 Writes the results into a tab-separated label file (.lab).
Chris@174 261
Chris@174 262 This is equivalent to using the CSV writer with a tab separator and
Chris@174 263 the options --csv-end-times --csv-omit-filenames.
Chris@174 264
Chris@359 265 It supports the --lab-basedir, --lab-stdout, --lab-force,
Chris@359 266 --lab-append, and --lab-fill-ends options, which all behave
Chris@359 267 similarly to their CSV writer equivalents.
Chris@174 268
Chris@0 269 * rdf
Chris@0 270
Chris@0 271 Writes the results into RDF/Turtle documents following the Audio
Chris@0 272 Features ontology (http://purl.org/ontology/af/).
Chris@0 273
Chris@0 274 One file is created for each input audio file containing the
Chris@0 275 features extracted by all transforms applied to that file, named
Chris@0 276 after the input audio file with .n3 extension, placed in the same
Chris@0 277 directory as the audio file.
Chris@0 278
Chris@0 279 To instruct Sonic Annotator to place the output files in another
Chris@0 280 location, use --rdf-basedir with a directory name.
Chris@0 281
Chris@0 282 To write a single file with all data (from all input audio files)
Chris@0 283 in it, use --rdf-one-file.
Chris@0 284
Chris@0 285 To write one file for each transform applied to each input audio
Chris@0 286 file, named after the input audio file and transform name with .n3
Chris@0 287 suffix and ":" replaced by "_" throughout, use --rdf-many-files.
Chris@0 288
Chris@0 289 To write all data to stdout instead of to a file, use --rdf-stdout.
Chris@0 290
Chris@0 291 Sonic Annotator will not write to an output file that already
Chris@0 292 exists. If you want to make it do this, use --rdf-force to
Chris@0 293 overwrite or --rdf-append to append to it.
Chris@0 294
Chris@0 295 Sonic Annotator will use plugin description RDF if available to
Chris@0 296 enhance its output (for example identifying note onset times as
Chris@0 297 note onset times, if the plugin's RDF says that is what it
Chris@0 298 produces, rather than writing them as plain events). Best results
Chris@0 299 will be obtained if an RDF document is provided with your plugins
Chris@0 300 (for example, vamp-example-plugins.n3) and you have this installed
Chris@0 301 in the same location as the plugins. To override this enhanced
Chris@0 302 output and write plain events for all features, use --rdf-plain.
Chris@0 303
Chris@0 304 The output RDF will include an available_as property linking the
Chris@0 305 results to the original audio signal URI. By default, this will
Chris@0 306 point to the URI of the file or resource containing the audio that
Chris@0 307 Sonic Annotator processed, such as the file:/// location on disk.
Chris@0 308 To override this, for example to process a local copy of a file
Chris@0 309 while generating RDF that describes a copy of it available on a
Chris@0 310 network, you can use the --rdf-signal-uri option to specify an
Chris@0 311 alternative signal URI.
Chris@0 312
Chris@174 313 * json
Chris@174 314
Chris@174 315 Writes the results into JSON format following JAMS, the JSON
Chris@174 316 Annotated Music Specification. This writer is provisional as of
Chris@174 317 Sonic Annotator v1.1.
Chris@174 318
Chris@174 319 * midi
Chris@174 320
Chris@174 321 Writes the results to MIDI files. All features are written as MIDI
Chris@174 322 notes.
Chris@174 323
Chris@174 324 If a feature has at least one value, its first value will be used
Chris@174 325 as the note pitch and its second value (if present) as the
Chris@174 326 velocity. If a feature has units of Hz, its pitch will be converted
Chris@174 327 from frequency to an integer value in MIDI range; otherwise it will
Chris@174 328 be written directly.
Chris@174 329
Chris@174 330 Multiple (up to 16) transforms can be written to a single MIDI
Chris@174 331 file, where they will be given separate MIDI channel numbers.
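
The frequency-to-pitch conversion mentioned for the midi writer can be illustrated with the usual equal-temperament formula, in which A4 (440 Hz) is MIDI note 69. This is a sketch of that general convention only; the rounding and the clamping to 0-127 are assumptions, and the writer's exact behaviour is not specified here.

```python
import math


def hz_to_midi_pitch(frequency_hz):
    """Map a frequency in Hz to the nearest MIDI note number in 0..127."""
    # Twelve semitones per octave, measured relative to A4 = 440 Hz = note 69.
    pitch = round(69 + 12 * math.log2(frequency_hz / 440.0))
    # Assumption: values outside the MIDI range are clamped, not dropped.
    return max(0, min(127, pitch))
```

With this mapping, 440 Hz gives note 69 and 261.63 Hz (middle C) gives note 60.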
Chris@174 332
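The csv writer's default file naming, described above, can be approximated in a script that needs to find the output files afterwards. The description states only that the name combines the audio file name and the transform name, with a .csv suffix and ":" replaced by "_". Joining the two parts with an underscore, dropping the audio file's extension, and placing the file directly inside the --csv-basedir directory are assumptions in this sketch, and `csv_output_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def csv_output_path(audio_path, transform_id, basedir=None):
    """Guess the CSV file written for one transform applied to one audio file."""
    audio = PurePosixPath(audio_path)
    # Assumption: "<audio name without extension>_<transform id>".
    name = f"{audio.stem}_{transform_id}".replace(":", "_") + ".csv"
    # Default location is the audio file's own directory.
    directory = PurePosixPath(basedir) if basedir else audio.parent
    return str(directory / name)


path = csv_output_path("/music/audio.wav",
                       "vamp:vamp-example-plugins:fixedtempo:tempo")
```

Check the guessed name against a real run before relying on it.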
Chris@0 333
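The end-time filling performed by --csv-fill-ends (and --lab-fill-ends) can be sketched as below: a feature without a duration ends where the next feature starts. How the last feature is handled is not stated above, so using a caller-supplied end time for it is an assumption; `fill_ends` is a hypothetical helper.

```python
def fill_ends(features, file_end):
    """Return (start, end) pairs for features given as (start, duration or None).

    The features must be sorted by start time.
    """
    filled = []
    for index, (start, duration) in enumerate(features):
        if duration is not None:
            end = start + duration            # the feature has its own duration
        elif index + 1 < len(features):
            end = features[index + 1][0]      # borrow the next feature's start
        else:
            end = file_end                    # assumption for the final feature
        filled.append((start, end))
    return filled


spans = fill_ends([(0.0, None), (1.5, 0.5), (3.0, None)], file_end=4.0)
```
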
Chris@309 334 ### 4. Optionally, how to summarise the features
Chris@0 335
Chris@0 336 Sonic Annotator can also calculate and write summaries of features,
Chris@0 337 such as mean and median values.
Chris@0 338
Chris@0 339 To obtain a summary as well as the feature results, just use the -S
Chris@0 340 option, naming the type of summary you want (min, max, mean, median,
Chris@0 341 mode, sum, variance, sd or count). You can also tell it to produce
Chris@0 342 only the summary, not the individual features, with --summary-only.
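
The summary types correspond to familiar statistics. The sketch below computes them over a plain list of feature values using Python's standard library. It treats every value equally and uses the population variance, both of which are assumptions; the labels in Sonic Annotator's own output (such as "continuous-time average") suggest that its summaries may take feature timing into account, so results need not match exactly.

```python
import statistics

# Summary type names as accepted by -S, mapped to standard-library functions.
SUMMARY_FUNCTIONS = {
    "min": min,
    "max": max,
    "mean": statistics.fmean,
    "median": statistics.median,
    "mode": statistics.mode,
    "sum": sum,
    "variance": statistics.pvariance,  # assumption: population variance
    "sd": statistics.pstdev,           # assumption: population standard deviation
    "count": len,
}


def summarise(values, summary_type):
    """Compute one unweighted summary of a non-empty list of values."""
    return SUMMARY_FUNCTIONS[summary_type](values)


values = [1.0, 2.0, 2.0, 3.0]
variance = summarise(values, "variance")
```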
Chris@0 343
Chris@0 344 Alternatively, you can specify a summary in a transform description.
Chris@0 345 The following example tells Sonic Annotator to write both the times of
Chris@0 346 note onsets estimated by the simple percussion onset detector example
Chris@0 347 plugin, and the variance of the plugin's onset detection function.
Chris@0 348 (It will only process the audio file and run the plugin once.)
Chris@0 349
```
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix vamp: <http://purl.org/ontology/vamp/>.
@prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#>.
@prefix : <#>.

:transform1 a vamp:Transform;
    vamp:plugin examples:percussiononsets ;
    vamp:output examples:percussiononsets_output_onsets .

:transform0 a vamp:Transform;
    vamp:plugin examples:percussiononsets ;
    vamp:output examples:percussiononsets_output_detectionfunction ;
    vamp:summary_type "variance" .
```
Chris@0 365
Chris@0 366 Sonic Annotator can also summarise in segments -- if you provide a
Chris@0 367 comma-separated list of times as an argument to the --segments option,
Chris@0 368 it will calculate one summary for each segment bounded by the times
Chris@0 369 you provided. For example,
Chris@0 370
```
$ sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav
(... some log output on stderr, then ...)
,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)"
,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)"
,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)"
,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)"
```
Chris@0 379
Chris@0 380 Here the first row contains a summary covering the time period from 0
Chris@0 381 to 1 second, the second from 1 to 2 seconds, the third from 2 to 3
Chris@0 382 seconds and the fourth from 3 seconds to the end of the (short) audio
Chris@0 383 file.
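
The way a list of boundary times divides a file into segments can be sketched as follows. The helper names are hypothetical, and assigning a timestamp that falls exactly on a boundary to the later segment is an assumption.

```python
import bisect


def segment_bounds(boundaries, end_time):
    """Return (start, end) pairs for the segments delimited by the boundaries."""
    edges = [0.0, *boundaries, end_time]
    return list(zip(edges[:-1], edges[1:]))


def segment_index(boundaries, timestamp):
    """Return the index of the segment that contains the timestamp."""
    # Assumption: a timestamp equal to a boundary belongs to the later segment.
    return bisect.bisect_right(boundaries, timestamp)


segments = segment_bounds([1.0, 2.0, 3.0], end_time=10.0)
```

Three boundary times give four segments, matching the four rows in the example output above.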
Chris@0 384
Chris@309 385
Chris@309 386 Automated build reports
Chris@309 387 -----------------------
Chris@309 388
Chris@399 389 * Linux and macOS CI build: [![Build Status](https://travis-ci.org/sonic-visualiser/sonic-annotator.svg?branch=default)](https://travis-ci.org/sonic-visualiser/sonic-annotator)
Chris@309 390 * Windows CI build: [![Build status](https://ci.appveyor.com/api/projects/status/26pygienkigw39p7?svg=true)](https://ci.appveyor.com/project/cannam/sonic-annotator)