Sonic Annotator
===============

Sonic Annotator is a utility program for batch feature extraction from
audio files.  It runs Vamp audio analysis plugins on audio files, and
can write the resulting features in a selection of formats.


A Quick Tutorial
----------------

To use Sonic Annotator, you need to tell it three things: what audio
files to extract features from; what features to extract; and how and
where to write the results.  You can also optionally tell it to
summarise the features.


1. What audio files to extract features from

Sonic Annotator accepts a list of audio files on the command line.
Any argument that is not understood as a supported command-line option
will be taken to be the name of an audio file.  Any number of files
may be listed.

Several common audio file formats are supported, including MP3, Ogg,
and a number of PCM formats such as WAV and AIFF.  AAC is supported on
OS X only, and only if not DRM protected.  WMA is not supported.

File paths do not have to be local; you can also provide remote HTTP
or FTP URLs for Sonic Annotator to retrieve.

Sonic Annotator also accepts the names of playlist files (.m3u
extension) and will process every file found in the playlist.

Finally, you can provide a local directory path instead of a file,
together with the -r (recursive) option, for Sonic Annotator to
process every audio file found in that directory or any of its
subdirectories.
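
When only some of the files in a directory tree should be processed,
a playlist can be generated ahead of time.  The following Python
sketch is one way to do that; it is not part of Sonic Annotator.  It
assumes that a plain playlist with one path per line is acceptable,
and the function name and the extension list are hypothetical.

```python
from pathlib import Path

# Hypothetical selection of extensions.  The formats that are actually
# supported depend on the platform and on how Sonic Annotator was built.
AUDIO_EXTENSIONS = {".wav", ".aiff", ".mp3", ".ogg"}


def write_playlist(audio_dir, playlist_path):
    """Write the audio files found under audio_dir to a simple playlist.

    The playlist contains one path per line.  Returns the list of paths
    that were written, in sorted order.
    """
    root = Path(audio_dir)
    paths = sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
    Path(playlist_path).write_text(
        "".join(f"{p}\n" for p in paths), encoding="utf-8"
    )
    return paths
```

The resulting .m3u file could then be named on the command line in
place of the individual audio files.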


2. What features to extract

Sonic Annotator applies "transforms" to its input audio files, where a
transform (in this terminology) consists of a Vamp plugin together
with a certain set of parameters and a specified execution context:
step and block size, sample rate, etc.

(See http://www.vamp-plugins.org/ for more information about Vamp
plugins.)

To use a particular transform, specify the name of the file that
describes it on the command line with the -t option.

Transforms are usually described in RDF, following the transform part
of the Vamp plugin ontology (http://purl.org/ontology/vamp/).  A
transform may use any Vamp plugin that is currently installed and
available on the system.  You can obtain a list of available plugin
outputs by running Sonic Annotator with the -l option, and you can
obtain a skeleton transform description for one of these outputs with
the -s option.

For example, if the example plugins from the Vamp plugin SDK are
available and no other plugins are installed, you might have an
exchange like this:

  $ sonic-annotator -l
  vamp:vamp-example-plugins:amplitudefollower:amplitude
  vamp:vamp-example-plugins:fixedtempo:acf
  vamp:vamp-example-plugins:fixedtempo:detectionfunction
  vamp:vamp-example-plugins:fixedtempo:filtered_acf
  vamp:vamp-example-plugins:fixedtempo:tempo
  vamp:vamp-example-plugins:fixedtempo:candidates
  vamp:vamp-example-plugins:percussiononsets:detectionfunction
  vamp:vamp-example-plugins:percussiononsets:onsets
  vamp:vamp-example-plugins:powerspectrum:powerspectrum
  vamp:vamp-example-plugins:spectralcentroid:linearcentroid
  vamp:vamp-example-plugins:spectralcentroid:logcentroid
  vamp:vamp-example-plugins:zerocrossing:counts
  vamp:vamp-example-plugins:zerocrossing:zerocrossings
  $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo
  @prefix xsd:      <http://www.w3.org/2001/XMLSchema#> .
  @prefix vamp:     <http://purl.org/ontology/vamp/> .
  @prefix :         <#> .

  :transform a vamp:Transform ;
      vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ;
      vamp:step_size "64"^^xsd:int ;
      vamp:block_size "256"^^xsd:int ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "maxbpm" ] ;
          vamp:value "190"^^xsd:float ;
      ] ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "maxdflen" ] ;
          vamp:value "10"^^xsd:float ;
      ] ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "minbpm" ] ;
          vamp:value "50"^^xsd:float ;
      ] ;
      vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> .
  $

The output of -s is an RDF/Turtle document describing the default
settings for the Tempo output of the Fixed Tempo Estimator plugin in
the Vamp plugin SDK.

(The exact format of the RDF printed may differ -- e.g. if the
plugin's RDF description is not installed and so its "home" URI is not
known -- but the result should be functionally equivalent to this.)
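
The identifiers printed by -l in the exchange above appear to have
four colon-separated parts: the literal "vamp", the plugin library,
the plugin and the output.  The Python sketch below splits such an
identifier and rebuilds the two URIs seen in the skeleton.  The URI
pattern is only inferred from that one example and is an assumption;
it will not hold for plugins whose RDF description uses a different
"home" URI.  All names in the code are hypothetical.

```python
from typing import NamedTuple

# Base URI taken from the fixedtempo skeleton shown above (assumption:
# other plugin libraries may be published under a different base).
PLUGIN_RDF_BASE = "http://vamp-plugins.org/rdf/plugins/"


class TransformId(NamedTuple):
    plugin_type: str   # always "vamp" in the listing above
    library: str       # e.g. "vamp-example-plugins"
    plugin: str        # e.g. "fixedtempo"
    output: str        # e.g. "tempo"


def parse_transform_id(identifier):
    """Split an identifier such as the ones printed by -l."""
    parts = identifier.strip().split(":")
    if len(parts) != 4 or parts[0] != "vamp":
        raise ValueError(f"unexpected transform identifier: {identifier!r}")
    return TransformId(*parts)


def plugin_uri(tid):
    """Plugin URI, following the pattern seen in the skeleton."""
    return f"{PLUGIN_RDF_BASE}{tid.library}#{tid.plugin}"


def output_uri(tid):
    """Output URI, following the pattern seen in the skeleton."""
    return f"{PLUGIN_RDF_BASE}{tid.library}#{tid.plugin}_output_{tid.output}"
```

Grouping the parsed identifiers by plugin is an easy way to see which
outputs can be requested from a single plugin.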

You could run this transform by saving the RDF to a file and
specifying that file with -t:

  $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo > test.n3
  $ sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout
  (... logging output on stderr, then ...)
  "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
  $

The single line of output above consists of the audio file name, the
timestamp and duration for a single feature, the value of that feature
(the estimated tempo of the given region of time from that file, in
bpm -- the plugin in question performs a single tempo estimation and
nothing else) and the feature's label.
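
A line in this layout can be read with any CSV parser.  The Python
sketch below is one possible reading, assuming that every row carries
a file name, a timestamp, a duration, one or more values and a label,
as the sample line does.  Rows without a duration or a label would
need different handling.  The type and function names are
hypothetical.

```python
import csv
import io
from typing import List, NamedTuple


class Feature(NamedTuple):
    filename: str
    timestamp: float      # seconds
    duration: float       # seconds
    values: List[float]   # one entry per bin
    label: str


def parse_feature_row(row):
    """Convert one already-split CSV row into a Feature.

    Assumes the layout: file name, timestamp, duration, values, label.
    """
    filename, timestamp, duration, *values, label = row
    return Feature(
        filename,
        float(timestamp),
        float(duration),
        [float(v) for v in values],
        label,
    )


def parse_csv_output(text):
    """Parse captured --csv-stdout text into a list of Feature tuples."""
    reader = csv.reader(io.StringIO(text))
    return [parse_feature_row(row) for row in reader if row]
```

Applied to the sample line above, this yields a single feature whose
only value is the estimated tempo.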

A quicker way to achieve the above is to use the -d (default) option,
which tells Sonic Annotator to use the default configuration for a
named transform directly:

  $ sonic-annotator -d vamp:vamp-example-plugins:fixedtempo:tempo audio.wav -w csv --csv-stdout
  (... some log output on stderr, then ...)
  "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
  $

Although handy for experimentation, the -d option is inadvisable in
any "production" situation because the plugin configuration is not
guaranteed to be the same each time (for example if an updated version
of a plugin changes some of its defaults).  It's better to save a
well-defined transform to a file and refer to that, even if it is
simply the transform created by the skeleton option.
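
For scripted use with saved transform files, the command line can be
assembled programmatically.  The following Python sketch uses only the
options shown in the examples above (-t, -w csv and --csv-stdout).  It
assumes that the sonic-annotator executable is on the PATH, and the
function names are hypothetical.

```python
import subprocess


def build_argv(transform_files, audio_files):
    """Build a command line that writes CSV results to stdout."""
    argv = ["sonic-annotator"]
    for transform_file in transform_files:
        # One -t option per saved transform description file.
        argv += ["-t", transform_file]
    argv += ["-w", "csv", "--csv-stdout"]
    argv += list(audio_files)
    return argv


def run_transforms(transform_files, audio_files):
    """Run Sonic Annotator and return the CSV text it prints on stdout.

    Log output on stderr is captured separately.  A non-zero exit
    status raises subprocess.CalledProcessError.
    """
    result = subprocess.run(
        build_argv(transform_files, audio_files),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with
file names that contain spaces.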

To run more than one transform on the same audio files, just put more
than one set of transform RDF descriptions in the same file, or give
the -t option more than once with separate transform description
files.  Remember that if you want to specify more than one transform
in the same file, they will need to have distinct URIs (that is, the
":transform" part of the example above, which may be any arbitrary
name, must be distinct for each described transform).


3. How and where to write the results

Sonic Annotator supports various different output modules (and it is
fairly easy for a developer to add new ones).  You have to choose at
least one output module; use the -w (writer) option to do so.  Each
module has its own set of parameters which can be adjusted on the
command line, as well as its own default rules about where to write
the results.

The following writers are currently supported.  (Others exist, but are
not properly implemented or not supported.)

 * csv

   Writes the results into comma-separated data files.

   One file is created for each transform applied to each input audio
   file, named after the input audio file and transform name with .csv
   suffix and ":" replaced by "_" throughout, placed in the same
   directory as the audio file.

   To instruct Sonic Annotator to place the output files in another
   location, use --csv-basedir with a directory name.

   To write a single file with all data in it, use --csv-one-file.

   To write all data to stdout instead of to a file, use --csv-stdout.

   By default, Sonic Annotator will not write to an output file that
   already exists.  To allow this, use --csv-force to overwrite the
   file or --csv-append to append to it.

   The data generated consists of one line for each result feature,
   containing the feature timestamp, feature duration if present, all
   of the feature's bin values in order, followed by the feature's
   label if present.  If the --csv-one-file or --csv-stdout option is
   specified, then an additional column will appear before any of the
   above, containing the audio file name from which the feature was
   extracted, if it differs from that of the previous row.

   The default column separator is a comma; you can specify a
   different one with the --csv-separator option.

 * rdf

   Writes the results into RDF/Turtle documents following the Audio
   Features ontology (http://purl.org/ontology/af/).

   One file is created for each input audio file containing the
   features extracted by all transforms applied to that file, named
   after the input audio file with .n3 extension, placed in the same
   directory as the audio file.

   To instruct Sonic Annotator to place the output files in another
   location, use --rdf-basedir with a directory name.

   To write a single file with all data (from all input audio files)
   in it, use --rdf-one-file.

   To write one file for each transform applied to each input audio
   file, named after the input audio file and transform name with .n3
   suffix and ":" replaced by "_" throughout, use --rdf-many-files.

   To write all data to stdout instead of to a file, use --rdf-stdout.

   By default, Sonic Annotator will not write to an output file that
   already exists.  To allow this, use --rdf-force to overwrite the
   file or --rdf-append to append to it.

   Sonic Annotator will use plugin description RDF if available to
   enhance its output (for example identifying note onset times as
   note onset times, if the plugin's RDF says that is what it
   produces, rather than writing them as plain events).  Best results
   will be obtained if an RDF document is provided with your plugins
   (for example, vamp-example-plugins.n3) and you have this installed
   in the same location as the plugins.  To override this enhanced
   output and write plain events for all features, use --rdf-plain.

   The output RDF will include an available_as property linking the
   results to the original audio signal URI.  By default, this will
   point to the URI of the file or resource containing the audio that
   Sonic Annotator processed, such as the file:/// location on disk.
   To override this, for example to process a local copy of a file
   while generating RDF that describes a copy of it available on a
   network, you can use the --rdf-signal-uri option to specify an
   alternative signal URI.
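
Because the csv writer gives the audio file name only when it differs
from that of the previous row, a reader of --csv-one-file or
--csv-stdout data has to carry the last seen name forward.  The Python
sketch below shows one way to do this.  It assumes that an omitted
name appears as an empty first field; the function name is
hypothetical.

```python
import csv
import io


def rows_with_filenames(text, separator=","):
    """Yield CSV rows with the audio file name filled in on every row.

    The first column is replaced by the most recent non-empty file
    name.  Rows that precede any file name get None in that column.
    """
    current = None
    for row in csv.reader(io.StringIO(text), delimiter=separator):
        if not row:
            continue  # skip blank lines
        if row[0] != "":
            current = row[0]
        yield [current] + row[1:]
```

The separator argument corresponds to whatever was given to
--csv-separator when the data was written.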


4. Optionally, how to summarise the features

Sonic Annotator can also calculate and write summaries of features,
such as mean and median values.

To obtain a summary as well as the feature results, just use the -S
option, naming the type of summary you want (min, max, mean, median,
mode, sum, variance, sd or count).  You can also tell it to produce
only the summary, not the individual features, with --summary-only.

Alternatively, you can specify a summary in a transform description.
The following example tells Sonic Annotator to write both the times of
note onsets estimated by the simple percussion onset detector example
plugin, and the variance of the plugin's onset detection function.
(It will only process the audio file and run the plugin once.)

  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
  @prefix vamp: <http://purl.org/ontology/vamp/>.
  @prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#>.
  @prefix : <#>.

  :transform1 a vamp:Transform;
     vamp:plugin examples:percussiononsets ;
     vamp:output examples:percussiononsets_output_onsets .

  :transform0 a vamp:Transform;
     vamp:plugin examples:percussiononsets ;
     vamp:output examples:percussiononsets_output_detectionfunction ;
     vamp:summary_type "variance" .

Sonic Annotator can also summarise in segments -- if you provide a
comma-separated list of times as an argument to the --segments option,
it will calculate one summary for each segment bounded by the times
you provided.  For example,

  $ sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav
  (... some log output on stderr, then ...)
  ,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)"
  ,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)"
  ,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)"
  ,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)"

Here the first row contains a summary covering the time period from 0
to 1 second, the second from 1 to 2 seconds, the third from 2 to 3
seconds and the fourth from 3 seconds to the end of the (short) audio
file.
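
The start times and durations in the second and third columns follow
directly from the boundary list.  The Python sketch below reproduces
that arithmetic, taking the total duration as 10.031020407 seconds
(3 seconds plus the final duration in the sample output).  It is an
illustration of the segmentation only, not Sonic Annotator's own code,
and the function name is hypothetical.

```python
def segment_bounds(boundaries, total_duration):
    """Return (start, duration) pairs for segments split at boundaries.

    Boundaries are times in seconds.  Values outside the open interval
    (0, total_duration) are ignored in this sketch, and duplicates are
    collapsed.
    """
    total = float(total_duration)
    inner = sorted({float(b) for b in boundaries if 0.0 < float(b) < total})
    edges = [0.0] + inner + [total]
    return [(start, end - start) for start, end in zip(edges, edges[1:])]
```

Three boundaries therefore give four segments, matching the four rows
in the output above.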