Sonic Annotator
===============

Sonic Annotator is a utility program for batch feature extraction from
audio files.  It runs Vamp audio analysis plugins on audio files, and
can write the resulting features in a selection of formats.

For more information, see

  http://www.omras2.org/SonicAnnotator

More documentation follows further down this README file, after the
credits.


Credits
-------

Sonic Annotator was developed at the Centre for Digital Music,
Queen Mary, University of London.

  http://www.elec.qmul.ac.uk/digitalmusic/

The main program is by Mark Levy, Chris Cannam, and Chris Sutton.
Sonic Annotator incorporates library code from the Sonic Visualiser
application by Chris Cannam.  Code copyright 2005-2007 Chris Cannam,
copyright 2006-2008 Queen Mary, University of London, except where
indicated in the individual source files.

This work was funded by the Engineering and Physical Sciences Research
Council through the OMRAS2 project EP/E017614/1.

Sonic Annotator is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.  See the file COPYING
included with this distribution for more information.

Sonic Annotator may also make use of the following libraries:

 * Qt4 -- Copyright Nokia Corporation, distributed under the GPL
 * Ogg decoder -- Copyright CSIRO Australia, BSD license
 * MAD mp3 decoder -- Copyright Underbit Technologies Inc, GPL
 * libsamplerate -- Copyright Erik de Castro Lopo, GPL
 * libsndfile -- Copyright Erik de Castro Lopo, LGPL
 * FFTW3 -- Copyright Matteo Frigo and MIT, GPL
 * Vamp plugin SDK -- Copyright Chris Cannam, BSD license
 * Redland RDF libraries -- Copyright Dave Beckett and the University of Bristol, LGPL/Apache license

(Some distributions of Sonic Annotator may have one or more of these
libraries statically linked.)  Many thanks to their authors.

Sonic Annotator can also use QuickTime for audio file import on OS X.
For licensing reasons, you may not distribute binaries of Sonic
Annotator with QuickTime support included for any platform that does
not include QuickTime as part of the platform itself (see section 3 of
version 2 of the GNU General Public License).


Compiling Sonic Annotator
-------------------------

If you are planning to compile Sonic Annotator from source code,
please read the file INSTALL.


A Quick Tutorial
================

To use Sonic Annotator, you need to tell it three things: what audio
files to extract features from; what features to extract; and how and
where to write the results.  You can also optionally tell it to
summarise the features.


1. What audio files to extract features from

Sonic Annotator accepts a list of audio files on the command line.
Any argument that is not understood as a supported command-line option
will be taken to be the name of an audio file.  Any number of files
may be listed.

Several common audio file formats are supported, including MP3, Ogg,
and a number of PCM formats such as WAV and AIFF.  AAC is supported on
OS X only, and only if not DRM protected.  WMA is not supported.

File paths do not have to be local; you can also provide remote HTTP
or FTP URLs for Sonic Annotator to retrieve.

Sonic Annotator also accepts the names of playlist files (.m3u
extension) and will process every file found in the playlist.

Finally, you can provide a local directory path instead of a file,
together with the -r (recursive) option, for Sonic Annotator to
process every audio file found in that directory or any of its
subdirectories.
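
As a rough illustration of the playlist route, the following Python
sketch collects audio files under a directory and writes them to a
.m3u playlist, one absolute path per line.  The function names, the
extension list and the choice of absolute paths are assumptions made
for this example; none of them comes from Sonic Annotator itself.

```python
from pathlib import Path

# Assumed extension list, chosen from the formats the text above names
# as supported (MP3, Ogg, WAV, AIFF).  Adjust to suit your collection.
AUDIO_EXTENSIONS = {".mp3", ".ogg", ".wav", ".aiff", ".aif"}


def find_audio_files(root):
    """Return absolute paths of audio files under root, searched recursively."""
    root = Path(root).resolve()
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def write_playlist(paths, playlist_path):
    """Write one path per line, the plain .m3u layout."""
    text = "".join(f"{p}\n" for p in paths)
    Path(playlist_path).write_text(text, encoding="utf-8")
    return Path(playlist_path)
```

The playlist file written this way can then be named on the command
line in place of the individual audio files.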


2. What features to extract

Sonic Annotator applies "transforms" to its input audio files, where a
transform (in this terminology) consists of a Vamp plugin together
with a certain set of parameters and a specified execution context:
step and block size, sample rate, etc.

(See http://www.vamp-plugins.org/ for more information about Vamp
plugins.)

To use a particular transform, specify its filename on the command
line with the -t option.

Transforms are usually described in RDF, following the transform part
of the Vamp plugin ontology (http://purl.org/ontology/vamp/).  A
transform may use any Vamp plugin that is currently installed and
available on the system.  You can obtain a list of available plugin
outputs by running Sonic Annotator with the -l option, and you can
obtain a skeleton transform description for one of these outputs with
the -s option.

For example, if the example plugins from the Vamp plugin SDK are
available and no other plugins are installed, you might have an
exchange like this:

  $ sonic-annotator -l
  vamp:vamp-example-plugins:amplitudefollower:amplitude
  vamp:vamp-example-plugins:fixedtempo:acf
  vamp:vamp-example-plugins:fixedtempo:detectionfunction
  vamp:vamp-example-plugins:fixedtempo:filtered_acf
  vamp:vamp-example-plugins:fixedtempo:tempo
  vamp:vamp-example-plugins:fixedtempo:candidates
  vamp:vamp-example-plugins:percussiononsets:detectionfunction
  vamp:vamp-example-plugins:percussiononsets:onsets
  vamp:vamp-example-plugins:powerspectrum:powerspectrum
  vamp:vamp-example-plugins:spectralcentroid:linearcentroid
  vamp:vamp-example-plugins:spectralcentroid:logcentroid
  vamp:vamp-example-plugins:zerocrossing:counts
  vamp:vamp-example-plugins:zerocrossing:zerocrossings
  $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo
  @prefix xsd:      <http://www.w3.org/2001/XMLSchema#> .
  @prefix vamp:     <http://purl.org/ontology/vamp/> .
  @prefix :         <#> .

  :transform a vamp:Transform ;
      vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ;
      vamp:step_size "64"^^xsd:int ;
      vamp:block_size "256"^^xsd:int ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "maxbpm" ] ;
          vamp:value "190"^^xsd:float ;
      ] ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "maxdflen" ] ;
          vamp:value "10"^^xsd:float ;
      ] ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "minbpm" ] ;
          vamp:value "50"^^xsd:float ;
      ] ;
      vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> .
  $

The output of -s is an RDF/Turtle document describing the default
settings for the Tempo output of the Fixed Tempo Estimator plugin in
the Vamp plugin SDK.

(The exact format of the RDF printed may differ -- e.g. if the
plugin's RDF description is not installed and so its "home" URI is not
known -- but the result should be functionally equivalent to this.)
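
Each line printed by -l in the exchange above has four
colon-separated fields.  The Python sketch below splits such lines and
groups the outputs by plugin, which can be handy when choosing what to
pass to -s.  The field names (kind, library, plugin, output) are
labels invented for this example, and the code assumes that no field
contains a colon, as is the case in the listing shown.

```python
from collections import defaultdict
from typing import NamedTuple


class TransformId(NamedTuple):
    kind: str      # "vamp" in every line of the listing above
    library: str   # e.g. "vamp-example-plugins"
    plugin: str    # e.g. "fixedtempo"
    output: str    # e.g. "tempo"


def parse_transform_id(line):
    """Split one line of -l output into its four fields."""
    parts = line.strip().split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields: {line!r}")
    return TransformId(*parts)


def outputs_by_plugin(listing):
    """Map each plugin name to the list of outputs it offers."""
    grouped = defaultdict(list)
    for line in listing.splitlines():
        if line.strip():
            tid = parse_transform_id(line)
            grouped[tid.plugin].append(tid.output)
    return dict(grouped)
```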

You could run this transform by saving the RDF to a file and
specifying that file with -t:

  $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo > test.n3
  $ sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout
  (... logging output on stderr, then ...)
  "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
  $

The single line of output above consists of the audio file name, the
timestamp and duration for a single feature, the value of that feature
(the estimated tempo of the given region of time from that file, in
bpm -- the plugin in question performs a single tempo estimation and
nothing else) and the feature's label.
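
To show how that line breaks down, here is a small Python sketch that
parses it into named fields with the standard csv module.  The
TempoRow type and its field names are made up for this example, and
the fixed five-column layout is an assumption that fits this
particular output only; other plugin outputs may produce a different
number of columns.

```python
import csv
from typing import NamedTuple


class TempoRow(NamedTuple):
    audio_file: str
    timestamp: float   # seconds
    duration: float    # seconds
    value: float       # estimated tempo in bpm
    label: str


def parse_row(line):
    """Parse one CSV line of the form shown above into a TempoRow."""
    fields = next(csv.reader([line]))
    audio_file, timestamp, duration, value, label = fields
    return TempoRow(
        audio_file, float(timestamp), float(duration), float(value), label
    )


row = parse_row('"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"')
print(row.audio_file, row.value, row.label)
```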

A quicker way to achieve the above is to use the -d (default) option,
which tells Sonic Annotator to use the default configuration for a
named transform directly:

  $ sonic-annotator -d vamp:vamp-example-plugins:fixedtempo:tempo audio.wav -w csv --csv-stdout
  (... some log output on stderr, then ...)
  "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
  $

Although handy for experimentation, the -d option is inadvisable in
any "production" situation because the plugin configuration is not
guaranteed to be the same each time (for example if an updated version
of a plugin changes some of its defaults).  It's better to save a
well-defined transform to file and refer to that, even if it is simply
the transform created by the skeleton option.
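
If you drive Sonic Annotator from a script, the "save once, then
refer to the file" workflow might look like the following Python
sketch.  It uses only options described in this README (-s, -t, -w csv
and --csv-stdout); the function names are invented for the example,
and save_skeleton assumes that the sonic-annotator executable is on
your PATH.

```python
import subprocess


def skeleton_command(transform_id):
    """Command that prints a skeleton transform description."""
    return ["sonic-annotator", "-s", transform_id]


def run_command(transform_file, audio_files):
    """Command that runs a saved transform and writes CSV to stdout."""
    return ["sonic-annotator", "-t", transform_file, *audio_files,
            "-w", "csv", "--csv-stdout"]


def save_skeleton(transform_id, transform_file):
    """Run -s once and keep its output as a fixed transform file.

    Requires sonic-annotator to be installed; raises CalledProcessError
    if the program exits with an error.
    """
    result = subprocess.run(skeleton_command(transform_id),
                            check=True, capture_output=True, text=True)
    with open(transform_file, "w", encoding="utf-8") as f:
        f.write(result.stdout)
```

Keeping the saved file under version control alongside your results is
one way to make sure later runs use the same configuration.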

To run more than one transform on the same audio files, just put more
than one set of transform RDF descriptions in the same file, or give
the -t option more than once with separate transform description
files.  Remember that if you want to specify more than one transform
in the same file, they will need to have distinct URIs (that is, the
":transform" part of the example above, which may be any arbitrary
name, must be distinct for each described transform).
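
One way to keep the names distinct is to generate them.  The Python
sketch below writes a Turtle document with one numbered transform per
requested output.  It is only an illustration: it is limited to the
Vamp SDK example plugins, it assumes the plugin and output URIs follow
the pattern seen in this README (for instance "#fixedtempo" and
"#fixedtempo_output_tempo"), and it leaves out step size, block size
and parameter bindings, as the summary example later in this README
also does.  The function name and the "examples" prefix usage are
choices made for this sketch.

```python
PREFIXES = """\
@prefix vamp: <http://purl.org/ontology/vamp/> .
@prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#> .
@prefix : <#> .
"""


def transforms_document(outputs):
    """Build Turtle text with a distinct :transformN name per output.

    outputs is a list of (plugin, output) names such as
    ("fixedtempo", "tempo").
    """
    blocks = []
    for index, (plugin, output) in enumerate(outputs):
        blocks.append(
            f":transform{index} a vamp:Transform ;\n"
            f"    vamp:plugin examples:{plugin} ;\n"
            f"    vamp:output examples:{plugin}_output_{output} .\n"
        )
    return PREFIXES + "\n" + "\n".join(blocks)


print(transforms_document([("fixedtempo", "tempo"),
                           ("percussiononsets", "onsets")]))
```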


3. How and where to write the results

Sonic Annotator supports several output modules (and it is fairly
easy for a developer to add new ones).  You have to choose at least
one output module; use the -w (writer) option to do so.  Each module
has its own set of parameters which can be adjusted on the command
line, as well as its own default rules about where to write the
results.

The following writers are currently supported.  (Others exist, but are
not properly implemented or not supported.)

 * csv

   Writes the results into comma-separated data files.

   One file is created for each transform applied to each input audio
   file, named after the input audio file and transform name with .csv
   suffix and ":" replaced by "_" throughout, placed in the same
   directory as the audio file.

   To instruct Sonic Annotator to place the output files in another
   location, use --csv-basedir with a directory name.

   To write a single file with all data in it, use --csv-one-file.

   To write all data to stdout instead of to a file, use --csv-stdout.

   By default, Sonic Annotator will not write to an output file that
   already exists.  To allow this, use --csv-force to overwrite the
   file or --csv-append to append to it.

   The data generated consists of one line for each result feature,
   containing the feature timestamp, feature duration if present, all
   of the feature's bin values in order, followed by the feature's
   label if present.  If the --csv-one-file or --csv-stdout option is
   specified, then an additional column will appear before any of the
   above, containing the audio file name from which the feature was
   extracted, if it differs from that of the previous row.

   The default column separator is a comma; you can specify a
   different one with the --csv-separator option.

 * rdf

   Writes the results into RDF/Turtle documents following the Audio
   Features ontology (http://purl.org/ontology/af/).

   One file is created for each input audio file containing the
   features extracted by all transforms applied to that file, named
   after the input audio file with .n3 extension, placed in the same
   directory as the audio file.

   To instruct Sonic Annotator to place the output files in another
   location, use --rdf-basedir with a directory name.

   To write a single file with all data (from all input audio files)
   in it, use --rdf-one-file.

   To write one file for each transform applied to each input audio
   file, named after the input audio file and transform name with .n3
   suffix and ":" replaced by "_" throughout, use --rdf-many-files.

   To write all data to stdout instead of to a file, use --rdf-stdout.

   By default, Sonic Annotator will not write to an output file that
   already exists.  To allow this, use --rdf-force to overwrite the
   file or --rdf-append to append to it.

   Sonic Annotator will use plugin description RDF if available to
   enhance its output (for example identifying note onset times as
   note onset times, if the plugin's RDF says that is what it
   produces, rather than writing them as plain events).  Best results
   will be obtained if an RDF document is provided with your plugins
   (for example, vamp-example-plugins.n3) and you have this installed
   in the same location as the plugins.  To override this enhanced
   output and write plain events for all features, use --rdf-plain.

   The output RDF will include an available_as property linking the
   results to the original audio signal URI.  By default, this will
   point to the URI of the file or resource containing the audio that
   Sonic Annotator processed, such as the file:/// location on disk.
   To override this, for example to process a local copy of a file
   while generating RDF that describes a copy of it available on a
   network, you can use the --rdf-signal-uri option to specify an
   alternative signal URI.
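
When reading back csv output written with --csv-one-file or
--csv-stdout, the file name column described above is only filled in
when the name changes, so a reader has to carry the last name forward.
The Python sketch below does that.  The function name is invented for
this example, the sample rows are made-up data and not real plugin
output, and the code assumes the default comma separator.

```python
import csv
import io


def rows_with_file_names(text):
    """Yield (audio_file, remaining_fields) for each CSV row in text.

    An empty first column is taken to mean "same file as the previous
    row", following the description of the csv writer above.
    """
    current = ""
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip blank lines
        if fields[0]:
            current = fields[0]
        yield current, fields[1:]


# Made-up sample data: two rows for a.wav, then one row for b.wav.
sample = '"a.wav",0.0,1.5\n,1.0,2.5\n"b.wav",0.0,3.5\n'
for name, values in rows_with_file_names(sample):
    print(name, values)
```

The remaining fields are left as strings here because, as noted above,
the number of columns depends on whether the feature has a duration,
how many bin values it has and whether it carries a label.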


4. Optionally, how to summarise the features

Sonic Annotator can also calculate and write summaries of features,
such as mean and median values.

To obtain a summary as well as the feature results, just use the -S
option, naming the type of summary you want (min, max, mean, median,
mode, sum, variance, sd or count).  You can also tell it to produce
only the summary, not the individual features, with --summary-only.

Alternatively, you can specify a summary in a transform description.
The following example tells Sonic Annotator to write both the times of
note onsets estimated by the simple percussion onset detector example
plugin, and the variance of the plugin's onset detection function.
(It will only process the audio file and run the plugin once.)

  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
  @prefix vamp: <http://purl.org/ontology/vamp/>.
  @prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#>.
  @prefix : <#>.

  :transform1 a vamp:Transform;
     vamp:plugin examples:percussiononsets ;
     vamp:output examples:percussiononsets_output_onsets .

  :transform0 a vamp:Transform;
     vamp:plugin examples:percussiononsets ;
     vamp:output examples:percussiononsets_output_detectionfunction ;
     vamp:summary_type "variance" .

Sonic Annotator can also summarise in segments -- if you provide a
comma-separated list of times as an argument to the --segments option,
it will calculate one summary for each segment bounded by the times
you provided.  For example,

  $ sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav
  (... some log output on stderr, then ...)
  ,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)"
  ,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)"
  ,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)"
  ,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)"

Here the first row contains a summary covering the time period from 0
to 1 second, the second from 1 to 2 seconds, the third from 2 to 3
seconds and the fourth from 3 seconds to the end of the (short) audio
file.
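
The relationship between the boundary times and the start and duration
columns in that output can be sketched in Python as follows.  The
function names are invented for this example.  The code assumes that
the boundaries are sorted and lie strictly inside the file, and the
file length of 10.031020407 seconds used below is inferred from the
last output row (3 seconds plus a duration of 7.031020407); it is not
stated anywhere else.

```python
def segments_argument(boundaries):
    """Format boundary times as the comma-separated --segments value."""
    return ",".join(f"{t:g}" for t in boundaries)


def segment_spans(boundaries, file_duration):
    """Return (start, duration) pairs for the segments the boundaries define.

    The first segment starts at 0 and the last one runs to the end of
    the file, matching the four rows in the example above.
    """
    edges = [0.0, *boundaries, file_duration]
    return [(start, end - start) for start, end in zip(edges, edges[1:])]


print(segments_argument([1, 2, 3]))
for start, duration in segment_spans([1.0, 2.0, 3.0], 10.031020407):
    print(f"{start:.9f},{duration:.9f}")
```

Three boundary times therefore give four segments, which is why the
example output has four rows.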