| Chris@0 | 1 | 
| Chris@0 | 2 Sonic Annotator | 
| Chris@0 | 3 =============== | 
| Chris@0 | 4 | 
| Chris@0 | 5 Sonic Annotator is a utility program for batch feature extraction from | 
| Chris@0 | 6 audio files.  It runs Vamp audio analysis plugins on audio files, and | 
| Chris@0 | 7 can write the result features in a selection of formats. | 
| Chris@0 | 8 | 
| Chris@2 | 9 For more information, see | 
| Chris@2 | 10 | 
| Chris@2 | 11   http://www.omras2.org/SonicAnnotator | 
| Chris@2 | 12 | 
| Chris@2 | 13 More documentation follows further down this README file, after the | 
| Chris@2 | 14 credits. | 
| Chris@2 | 15 | 
| Chris@2 | 16 | 
| Chris@2 | 17 Credits | 
| Chris@2 | 18 ------- | 
| Chris@2 | 19 | 
| Chris@2 | 20 Sonic Annotator was developed at the Centre for Digital Music, | 
| Chris@2 | 21 Queen Mary, University of London. | 
| Chris@2 | 22 | 
| Chris@87 | 23   http://c4dm.eecs.qmul.ac.uk/ | 
| Chris@2 | 24 | 
| Chris@2 | 25 The main program is by Mark Levy, Chris Cannam, and Chris Sutton. | 
| Chris@2 | 26 Sonic Annotator incorporates library code from the Sonic Visualiser | 
| Chris@2 | 27 application by Chris Cannam.  Code copyright 2005-2007 Chris Cannam, | 
| Chris@95 | 28 copyright 2006-2014 Queen Mary, University of London, except where | 
| Chris@2 | 29 indicated in the individual source files. | 
| Chris@2 | 30 | 
| Chris@2 | 31 This work was funded by the Engineering and Physical Sciences Research | 
| Chris@2 | 32 Council through the OMRAS2 project EP/E017614/1. | 
| Chris@2 | 33 | 
| Chris@2 | 34 Sonic Annotator is free software; you can redistribute it and/or | 
| Chris@2 | 35 modify it under the terms of the GNU General Public License as | 
| Chris@2 | 36 published by the Free Software Foundation; either version 2 of the | 
| Chris@2 | 37 License, or (at your option) any later version.  See the file COPYING | 
| Chris@2 | 38 included with this distribution for more information. | 
| Chris@2 | 39 | 
| Chris@2 | 40 Sonic Annotator may also make use of the following libraries: | 
| Chris@2 | 41 | 
| Chris@87 | 42  * Qt5 -- Copyright Digia Oyj, distributed under the LGPL | 
| Chris@2 | 43  * Ogg decoder -- Copyright CSIRO Australia, BSD license | 
| Chris@2 | 44  * MAD mp3 decoder -- Copyright Underbit Technologies Inc, GPL | 
| Chris@2 | 45  * libsamplerate -- Copyright Erik de Castro Lopo, GPL | 
| Chris@2 | 46  * libsndfile -- Copyright Erik de Castro Lopo, LGPL | 
| Chris@2 | 47  * FFTW3 -- Copyright Matteo Frigo and MIT, GPL | 
| Chris@87 | 48  * Vamp plugin SDK -- Copyright Chris Cannam and QMUL, BSD license | 
| Chris@87 | 49  * Dataquay -- Copyright Breakfast Quay, BSD license | 
| Chris@87 | 50  * Sord and Serd -- Copyright David Robillard, BSD license | 
| Chris@2 | 51 | 
| Chris@2 | 52 (Some distributions of Sonic Annotator may have one or more of these | 
| Chris@2 | 53 libraries statically linked.)  Many thanks to their authors. | 
| Chris@2 | 54 | 
| Chris@0 | 55 | 
| Chris@0 | 56 A Quick Tutorial | 
| Chris@2 | 57 ================ | 
| Chris@0 | 58 | 
| Chris@0 | 59 To use Sonic Annotator, you need to tell it three things: what audio | 
| Chris@0 | 60 files to extract features from; what features to extract; and how and | 
| Chris@0 | 61 where to write the results.  You can also optionally tell it to | 
| Chris@0 | 62 summarise the features. | 
| Chris@0 | 63 | 
| Chris@0 | 64 | 
| Chris@0 | 65 1. What audio files to extract features from | 
| Chris@0 | 66 | 
| Chris@0 | 67 Sonic Annotator accepts a list of audio files on the command line. | 
| Chris@0 | 68 Any argument that is not understood as a supported command-line option | 
| Chris@0 | 69 will be taken to be the name of an audio file.  Any number of files | 
| Chris@0 | 70 may be listed. | 
| Chris@0 | 71 | 
| Chris@0 | 72 Several common audio file formats are supported, including MP3, Ogg, | 
| Chris@0 | 73 and a number of PCM formats such as WAV and AIFF.  AAC is supported on | 
| Chris@0 | 74 OS X only, and only if not DRM protected.  WMA is not supported. |
| Chris@0 | 75 | 
| Chris@0 | 76 File paths do not have to be local; you can also provide remote HTTP | 
| Chris@0 | 77 or FTP URLs for Sonic Annotator to retrieve. | 
| Chris@0 | 78 | 
| Chris@0 | 79 Sonic Annotator also accepts the names of playlist files (.m3u | 
| Chris@0 | 80 extension) and will process every file found in the playlist. | 
| Chris@0 | 81 | 
| Chris@0 | 82 Finally, you can provide a local directory path instead of a file, | 
| Chris@0 | 83 together with the -r (recursive) option, for Sonic Annotator to | 
| Chris@0 | 84 process every audio file found in that directory or any of its | 
| Chris@0 | 85 subdirectories. | 
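
As a sketch of the playlist route, the shell function below writes a
plain .m3u file (one path per line) that can then be given to Sonic
Annotator in place of individual file names.  The function name
make_playlist and the file names are illustrative assumptions, not
part of Sonic Annotator.

```shell
# Hypothetical helper: the first argument is the playlist path, and
# every remaining argument is written to it as one line.
make_playlist() {
    playlist=$1
    shift
    printf '%s\n' "$@" > "$playlist"
}

# Example use (assumes test.n3 exists and sonic-annotator is installed):
#   make_playlist batch.m3u one.wav two.ogg
#   sonic-annotator -t test.n3 -w csv batch.m3u
```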
| Chris@0 | 86 | 
| Chris@0 | 87 | 
| Chris@0 | 88 2. What features to extract | 
| Chris@0 | 89 | 
| Chris@0 | 90 Sonic Annotator applies "transforms" to its input audio files, where a | 
| Chris@0 | 91 transform (in this terminology) consists of a Vamp plugin together | 
| Chris@0 | 92 with a certain set of parameters and a specified execution context: | 
| Chris@0 | 93 step and block size, sample rate, etc. | 
| Chris@0 | 94 | 
| Chris@0 | 95 (See http://www.vamp-plugins.org/ for more information about Vamp | 
| Chris@0 | 96 plugins.) | 
| Chris@0 | 97 | 
| Chris@0 | 98 To use a particular transform, specify its filename on the command | 
| Chris@0 | 99 line with the -t option. | 
| Chris@0 | 100 | 
| Chris@0 | 101 Transforms are usually described in RDF, following the transform part | 
| Chris@0 | 102 of the Vamp plugin ontology (http://purl.org/ontology/vamp/).  A | 
| Chris@0 | 103 transform may use any Vamp plugin that is currently installed and |
| Chris@0 | 104 available on the system.  You can obtain a list of available plugin | 
| Chris@0 | 105 outputs by running Sonic Annotator with the -l option, and you can | 
| Chris@0 | 106 obtain a skeleton transform description for one of these plugins with | 
| Chris@0 | 107 the -s option. | 
| Chris@0 | 108 | 
| Chris@0 | 109 For example, if the example plugins from the Vamp plugin SDK are | 
| Chris@0 | 110 available and no other plugins are installed, you might have an | 
| Chris@0 | 111 exchange like this: | 
| Chris@0 | 112 | 
| Chris@0 | 113   $ sonic-annotator -l | 
| Chris@0 | 114   vamp:vamp-example-plugins:amplitudefollower:amplitude | 
| Chris@0 | 115   vamp:vamp-example-plugins:fixedtempo:acf | 
| Chris@0 | 116   vamp:vamp-example-plugins:fixedtempo:detectionfunction | 
| Chris@0 | 117   vamp:vamp-example-plugins:fixedtempo:filtered_acf | 
| Chris@0 | 118   vamp:vamp-example-plugins:fixedtempo:tempo | 
| Chris@0 | 119   vamp:vamp-example-plugins:fixedtempo:candidates | 
| Chris@0 | 120   vamp:vamp-example-plugins:percussiononsets:detectionfunction | 
| Chris@0 | 121   vamp:vamp-example-plugins:percussiononsets:onsets | 
| Chris@0 | 122   vamp:vamp-example-plugins:powerspectrum:powerspectrum | 
| Chris@0 | 123   vamp:vamp-example-plugins:spectralcentroid:linearcentroid | 
| Chris@0 | 124   vamp:vamp-example-plugins:spectralcentroid:logcentroid | 
| Chris@0 | 125   vamp:vamp-example-plugins:zerocrossing:counts | 
| Chris@0 | 126   vamp:vamp-example-plugins:zerocrossing:zerocrossings | 
| Chris@0 | 127   $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo | 
| Chris@0 | 128   @prefix xsd:      <http://www.w3.org/2001/XMLSchema#> . | 
| Chris@0 | 129   @prefix vamp:     <http://purl.org/ontology/vamp/> . | 
| Chris@0 | 130   @prefix :         <#> . | 
| Chris@0 | 131 | 
| Chris@0 | 132   :transform a vamp:Transform ; | 
| Chris@0 | 133       vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ; | 
| Chris@0 | 134       vamp:step_size "64"^^xsd:int ; | 
| Chris@0 | 135       vamp:block_size "256"^^xsd:int ; | 
| Chris@0 | 136       vamp:parameter_binding [ | 
| Chris@0 | 137           vamp:parameter [ vamp:identifier "maxbpm" ] ; | 
| Chris@0 | 138           vamp:value "190"^^xsd:float ; | 
| Chris@0 | 139       ] ; | 
| Chris@0 | 140       vamp:parameter_binding [ | 
| Chris@0 | 141           vamp:parameter [ vamp:identifier "maxdflen" ] ; | 
| Chris@0 | 142           vamp:value "10"^^xsd:float ; | 
| Chris@0 | 143       ] ; | 
| Chris@0 | 144       vamp:parameter_binding [ | 
| Chris@0 | 145           vamp:parameter [ vamp:identifier "minbpm" ] ; | 
| Chris@0 | 146           vamp:value "50"^^xsd:float ; | 
| Chris@0 | 147       ] ; | 
| Chris@0 | 148       vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> . | 
| Chris@0 | 149   $ | 
| Chris@0 | 150 | 
| Chris@0 | 151 The output of -s is an RDF/Turtle document describing the default | 
| Chris@0 | 152 settings for the Tempo output of the Fixed Tempo Estimator plugin in | 
| Chris@0 | 153 the Vamp plugin SDK. | 
| Chris@0 | 154 | 
| Chris@0 | 155 (The exact format of the RDF printed may differ -- e.g. if the | 
| Chris@0 | 156 plugin's RDF description is not installed and so its "home" URI is not | 
| Chris@0 | 157 known -- but the result should be functionally equivalent to this.) | 
| Chris@0 | 158 | 
| Chris@0 | 159 You could run this transform by saving the RDF to a file and | 
| Chris@0 | 160 specifying that file with -t: | 
| Chris@0 | 161 | 
| Chris@0 | 162   $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo > test.n3 | 
| Chris@0 | 163   $ sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout | 
| Chris@0 | 164   (... logging output on stderr, then ...) | 
| Chris@0 | 165   "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm" | 
| Chris@0 | 166   $ | 
| Chris@0 | 167 | 
| Chris@0 | 168 The single line of output above consists of the audio file name, the |
| Chris@0 | 169 timestamp and duration of a single feature, the value of that feature |
| Chris@0 | 170 (the estimated tempo, in bpm, of that region of the file), and the |
| Chris@0 | 171 feature's label.  The plugin in question performs a single tempo |
| Chris@0 | 172 estimation and nothing else, so there is only one row. |
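
If only the feature value is wanted, the row can be picked apart with
standard tools.  The following is a minimal sketch, assuming the
column layout of the example row above and an audio file name that
contains no commas; feature_value is a made-up name.

```shell
# Hypothetical helper: read --csv-stdout rows on stdin and print the
# fourth column (the feature value in the example row above).
# Assumes the file name column contains no commas.
feature_value() {
    awk -F, '{ print $4 }'
}

# Example use (assumes sonic-annotator is installed):
#   sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout | feature_value
```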
| Chris@0 | 173 | 
| Chris@0 | 174 A quicker way to achieve the above is to use the -d (default) option, |
| Chris@0 | 175 which tells Sonic Annotator to use the default configuration for a |
| Chris@0 | 176 named transform directly: |
| Chris@0 | 177 | 
| Chris@0 | 178   $ sonic-annotator -d vamp:vamp-example-plugins:fixedtempo:tempo audio.wav -w csv --csv-stdout | 
| Chris@0 | 179   (... some log output on stderr, then ...) | 
| Chris@0 | 180   "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm" | 
| Chris@0 | 181   $ | 
| Chris@0 | 182 | 
| Chris@0 | 183 Although handy for experimentation, the -d option is inadvisable in | 
| Chris@0 | 184 any "production" situation because the plugin configuration is not | 
| Chris@0 | 185 guaranteed to be the same each time (for example if an updated version | 
| Chris@0 | 186 of a plugin changes some of its defaults).  It's better to save a | 
| Chris@0 | 187 well-defined transform to file and refer to that, even if it is simply | 
| Chris@0 | 188 the transform created by the skeleton option. | 
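
One way to follow this advice is sketched below: write the skeleton
to a file only if that file does not exist yet, then always run from
the saved file.  The function name run_pinned and the SONIC_ANNOTATOR
override variable are assumptions made for this sketch; neither is a
feature of Sonic Annotator itself.

```shell
# Hypothetical helper: pin a transform description to a file once,
# then run from that file on every call.
# SONIC_ANNOTATOR is an assumed override for the program name.
run_pinned() {
    transform_id=$1      # e.g. vamp:vamp-example-plugins:fixedtempo:tempo
    transform_file=$2    # e.g. tempo.n3
    shift 2
    sa=${SONIC_ANNOTATOR:-sonic-annotator}
    # Write the skeleton only if no saved description exists yet, and
    # remove the partial file if the skeleton could not be produced.
    if [ ! -e "$transform_file" ]; then
        "$sa" -s "$transform_id" > "$transform_file" || {
            rm -f "$transform_file"
            return 1
        }
    fi
    # Remaining arguments are passed through as audio file names.
    "$sa" -t "$transform_file" -w csv --csv-stdout "$@"
}

# Example use:
#   run_pinned vamp:vamp-example-plugins:fixedtempo:tempo tempo.n3 audio.wav
```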
| Chris@0 | 189 | 
| Chris@0 | 190 To run more than one transform on the same audio files, just put more | 
| Chris@0 | 191 than one set of transform RDF descriptions in the same file, or give | 
| Chris@0 | 192 the -t option more than once with separate transform description | 
| Chris@0 | 193 files.  Remember that if you want to specify more than one transform | 
| Chris@0 | 194 in the same file, they will need to have distinct URIs (that is, the | 
| Chris@0 | 195 ":transform" part of the example above, which may be any arbitrary | 
| Chris@0 | 196 name, must be distinct for each described transform). | 
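
When the transforms live in separate files, the repeated -t options
can be generated rather than typed.  This is only a sketch: the
function name transform_args and the file names are hypothetical, and
it assumes the file names contain no whitespace.

```shell
# Hypothetical helper: print "-t FILE " for every transform file given.
# Assumes file names contain no whitespace.
transform_args() {
    for f in "$@"; do
        printf '%s %s ' -t "$f"
    done
}

# Example use (the substitution is left unquoted on purpose, so that
# the shell splits it into separate words):
#   sonic-annotator $(transform_args tempo.n3 onsets.n3) -w csv audio.wav
```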
| Chris@0 | 197 | 
| Chris@0 | 198 | 
| Chris@0 | 199 3. How and where to write the results | 
| Chris@0 | 200 | 
| Chris@0 | 201 Sonic Annotator supports several output modules (and it is fairly |
| Chris@0 | 202 easy for a developer to add new ones).  You have to choose at |
| Chris@0 | 203 least one output module; use the -w (writer) option to do so.  Each |
| Chris@0 | 204 module has its own set of parameters which can be adjusted on the | 
| Chris@0 | 205 command line, as well as its own default rules about where to write | 
| Chris@0 | 206 the results. | 
| Chris@0 | 207 | 
| Chris@0 | 208 The following writers are currently supported.  (Others exist, but are | 
| Chris@0 | 209 not properly implemented or not supported.) | 
| Chris@0 | 210 | 
| Chris@0 | 211  * csv | 
| Chris@0 | 212 | 
| Chris@0 | 213    Writes the results into comma-separated data files. | 
| Chris@0 | 214 | 
| Chris@0 | 215    One file is created for each transform applied to each input audio | 
| Chris@0 | 216    file, named after the input audio file and transform name with .csv | 
| Chris@0 | 217    suffix and ":" replaced by "_" throughout, placed in the same | 
| Chris@0 | 218    directory as the audio file. | 
| Chris@0 | 219 | 
| Chris@0 | 220    To instruct Sonic Annotator to place the output files in another | 
| Chris@0 | 221    location, use --csv-basedir with a directory name. | 
| Chris@0 | 222 | 
| Chris@0 | 223    To write a single file with all data in it, use --csv-one-file. | 
| Chris@0 | 224 | 
| Chris@0 | 225    To write all data to stdout instead of to a file, use --csv-stdout. | 
| Chris@0 | 226 | 
| Chris@0 | 227    Sonic Annotator will not write to an output file that already | 
| Chris@0 | 228    exists.  If you want to make it do this, use --csv-force to | 
| Chris@0 | 229    overwrite or --csv-append to append to it. | 
| Chris@0 | 230 | 
| Chris@0 | 231    The data generated consists of one line for each result feature, | 
| Chris@0 | 232    containing the feature timestamp, feature duration if present, all | 
| Chris@0 | 233    of the feature's bin values in order, followed by the feature's | 
| Chris@0 | 234    label if present.  If the --csv-one-file or --csv-stdout option is |
| Chris@0 | 235    specified, then an additional column will appear before any of the |
| Chris@0 | 236    above, containing the name of the audio file from which the feature |
| Chris@0 | 237    was extracted; it is left empty if the same as in the previous row. |
| Chris@0 | 238 | 
| Chris@0 | 239    The default column separator is a comma; you can specify a | 
| Chris@0 | 240    different one with the --csv-separator option. | 
| Chris@0 | 241 | 
| Chris@0 | 242  * rdf | 
| Chris@0 | 243 | 
| Chris@0 | 244    Writes the results into RDF/Turtle documents following the Audio | 
| Chris@0 | 245    Features ontology (http://purl.org/ontology/af/). | 
| Chris@0 | 246 | 
| Chris@0 | 247    One file is created for each input audio file containing the | 
| Chris@0 | 248    features extracted by all transforms applied to that file, named | 
| Chris@0 | 249    after the input audio file with .n3 extension, placed in the same | 
| Chris@0 | 250    directory as the audio file. | 
| Chris@0 | 251 | 
| Chris@0 | 252    To instruct Sonic Annotator to place the output files in another | 
| Chris@0 | 253    location, use --rdf-basedir with a directory name. | 
| Chris@0 | 254 | 
| Chris@0 | 255    To write a single file with all data (from all input audio files) | 
| Chris@0 | 256    in it, use --rdf-one-file. | 
| Chris@0 | 257 | 
| Chris@0 | 258    To write one file for each transform applied to each input audio | 
| Chris@0 | 259    file, named after the input audio file and transform name with .n3 | 
| Chris@0 | 260    suffix and ":" replaced by "_" throughout, use --rdf-many-files. | 
| Chris@0 | 261 | 
| Chris@0 | 262    To write all data to stdout instead of to a file, use --rdf-stdout. | 
| Chris@0 | 263 | 
| Chris@0 | 264    Sonic Annotator will not write to an output file that already | 
| Chris@0 | 265    exists.  If you want to make it do this, use --rdf-force to | 
| Chris@0 | 266    overwrite or --rdf-append to append to it. | 
| Chris@0 | 267 | 
| Chris@0 | 268    Sonic Annotator will use plugin description RDF if available to | 
| Chris@0 | 269    enhance its output (for example identifying note onset times as | 
| Chris@0 | 270    note onset times, if the plugin's RDF says that is what it | 
| Chris@0 | 271    produces, rather than writing them as plain events).  Best results | 
| Chris@0 | 272    will be obtained if an RDF document is provided with your plugins | 
| Chris@0 | 273    (for example, vamp-example-plugins.n3) and you have this installed | 
| Chris@0 | 274    in the same location as the plugins.  To override this enhanced | 
| Chris@0 | 275    output and write plain events for all features, use --rdf-plain. | 
| Chris@0 | 276 | 
| Chris@0 | 277    The output RDF will include an available_as property linking the | 
| Chris@0 | 278    results to the original audio signal URI.  By default, this will | 
| Chris@0 | 279    point to the URI of the file or resource containing the audio that | 
| Chris@0 | 280    Sonic Annotator processed, such as the file:/// location on disk. | 
| Chris@0 | 281    To override this, for example to process a local copy of a file | 
| Chris@0 | 282    while generating RDF that describes a copy of it available on a | 
| Chris@0 | 283    network, you can use the --rdf-signal-uri option to specify an | 
| Chris@0 | 284    alternative signal URI. | 
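
The per-transform naming rule used by the csv writer and by
--rdf-many-files can be approximated in the shell, for example to
check in advance whether an output file already exists.  This is a
guess at the rule rather than a statement of it: the text above does
not say whether the audio file's extension is dropped or how the two
parts are joined, so both choices below are assumptions, and
output_name is a made-up name.  Compare the result against a real
output file before relying on it.

```shell
# Hypothetical helper: guess the per-transform output file name.
# Assumptions (not confirmed by the text above): the audio file's
# extension is dropped, and "_" joins the audio and transform parts.
output_name() {
    audio=$1       # e.g. music/audio.wav
    transform=$2   # e.g. vamp:vamp-example-plugins:fixedtempo:tempo
    suffix=$3      # csv or n3
    base=${audio%.*}
    # Replace every ":" in the transform name with "_".
    name=$(printf '%s' "$transform" | tr ':' '_')
    printf '%s_%s.%s\n' "$base" "$name" "$suffix"
}

# Example use:
#   output_name music/audio.wav vamp:vamp-example-plugins:fixedtempo:tempo csv
```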
| Chris@0 | 285 | 
| Chris@0 | 286 | 
| Chris@0 | 287 4. Optionally, how to summarise the features | 
| Chris@0 | 288 | 
| Chris@0 | 289 Sonic Annotator can also calculate and write summaries of features, | 
| Chris@0 | 290 such as mean and median values. | 
| Chris@0 | 291 | 
| Chris@0 | 292 To obtain a summary as well as the feature results, just use the -S | 
| Chris@0 | 293 option, naming the type of summary you want (min, max, mean, median, | 
| Chris@0 | 294 mode, sum, variance, sd or count).  You can also tell it to produce | 
| Chris@0 | 295 only the summary, not the individual features, with --summary-only. | 
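
In a wrapper script it can be useful to reject a mistyped summary
type before starting a long batch run.  The check below is a sketch
based only on the list of types given above; is_summary_type is a
hypothetical name.

```shell
# Hypothetical helper: succeed only for the summary types listed above.
is_summary_type() {
    case $1 in
        min|max|mean|median|mode|sum|variance|sd|count) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use (assumes test.n3 exists and sonic-annotator is installed):
#   is_summary_type "$kind" &&
#       sonic-annotator -t test.n3 -S "$kind" -w csv --csv-stdout audio.wav
```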
| Chris@0 | 296 | 
| Chris@0 | 297 Alternatively, you can specify a summary in a transform description. | 
| Chris@0 | 298 The following example tells Sonic Annotator to write both the times of | 
| Chris@0 | 299 note onsets estimated by the simple percussion onset detector example | 
| Chris@0 | 300 plugin, and the variance of the plugin's onset detection function. | 
| Chris@0 | 301 (It will only process the audio file and run the plugin once.) | 
| Chris@0 | 302 | 
| Chris@0 | 303   @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. | 
| Chris@0 | 304   @prefix vamp: <http://purl.org/ontology/vamp/>. | 
| Chris@0 | 305   @prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#>. | 
| Chris@0 | 306   @prefix : <#>. | 
| Chris@0 | 307 | 
| Chris@0 | 308   :transform1 a vamp:Transform; | 
| Chris@0 | 309      vamp:plugin examples:percussiononsets ; | 
| Chris@0 | 310      vamp:output examples:percussiononsets_output_onsets . | 
| Chris@0 | 311 | 
| Chris@0 | 312   :transform0 a vamp:Transform; | 
| Chris@0 | 313      vamp:plugin examples:percussiononsets ; | 
| Chris@0 | 314      vamp:output examples:percussiononsets_output_detectionfunction ; | 
| Chris@0 | 315      vamp:summary_type "variance" . | 
| Chris@0 | 316 | 
| Chris@0 | 317 Sonic Annotator can also summarise in segments -- if you provide a | 
| Chris@0 | 318 comma-separated list of times as an argument to the --segments option, | 
| Chris@0 | 319 it will calculate one summary for each segment bounded by the times | 
| Chris@0 | 320 you provided.  For example, | 
| Chris@0 | 321 | 
| Chris@0 | 322   $ sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav |
| Chris@0 | 323   (... some log output on stderr, then ...) | 
| Chris@0 | 324   ,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)" | 
| Chris@0 | 325   ,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)" | 
| Chris@0 | 326   ,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)" | 
| Chris@0 | 327   ,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)" | 
| Chris@0 | 328 | 
| Chris@0 | 329 Here the first row contains a summary covering the time period from 0 | 
| Chris@0 | 330 to 1 second, the second from 1 to 2 seconds, the third from 2 to 3 | 
| Chris@0 | 331 seconds and the fourth from 3 seconds to the end of the (short) audio | 
| Chris@0 | 332 file. | 
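
To make the segment boundaries explicit, each summary row can be
turned into a "start-end value" line by adding the duration column to
the timestamp column.  This sketch assumes the column layout of the
rows shown above; segment_ranges is a made-up name.

```shell
# Hypothetical helper: read summary rows on stdin and print each
# segment as "START-END VALUE", where END is timestamp + duration.
# Assumes the layout shown above: file name, timestamp, duration,
# summary type, value, label.
segment_ranges() {
    awk -F, '{ printf "%.3f-%.3f %s\n", $2, $2 + $3, $5 }'
}

# Example use (assumes sonic-annotator is installed):
#   sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction \
#       -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav \
#       | segment_ranges
```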
| Chris@0 | 333 |