comparison README @ 0:581b1b150a4d
* copy to sonic-annotator
| author | Chris Cannam |
|---|---|
| date | Thu, 11 Dec 2008 10:22:33 +0000 |
| parents | |
| children | 475f4623feba |
| -1:000000000000 | 0:581b1b150a4d |
|---|---|
| 1 | |
| 2 Sonic Annotator | |
| 3 =============== | |
| 4 | |
| 5 Sonic Annotator is a utility program for batch feature extraction from | |
| 6 audio files. It runs Vamp audio analysis plugins on audio files, and | |
| 7 can write the result features in a selection of formats. | |
| 8 | |
| 9 | |
| 10 A Quick Tutorial | |
| 11 ---------------- | |
| 12 | |
| 13 To use Sonic Annotator, you need to tell it three things: what audio | |
| 14 files to extract features from; what features to extract; and how and | |
| 15 where to write the results. You can also optionally tell it to | |
| 16 summarise the features. | |
| 17 | |
| 18 | |
| 19 1. What audio files to extract features from | |
| 20 | |
| 21 Sonic Annotator accepts a list of audio files on the command line. | |
| 22 Any argument that is not understood as a supported command-line option | |
| 23 will be taken to be the name of an audio file. Any number of files | |
| 24 may be listed. | |
| 25 | |
| 26 Several common audio file formats are supported, including MP3, Ogg, | |
| 27 and a number of PCM formats such as WAV and AIFF. AAC is supported on | |
| 28 OS X only, and only if not DRM protected. WMA is not supported. | |
| 29 | |
| 30 File paths do not have to be local; you can also provide remote HTTP | |
| 31 or FTP URLs for Sonic Annotator to retrieve. | |
| 32 | |
| 33 Sonic Annotator also accepts the names of playlist files (.m3u | |
| 34 extension) and will process every file found in the playlist. | |
| 35 | |
| 36 Finally, you can provide a local directory path instead of a file, | |
| 37 together with the -r (recursive) option, for Sonic Annotator to | |
| 38 process every audio file found in that directory or any of its | |
| 39 subdirectories. | |
| 40 | |
| 41 | |
| 42 2. What features to extract | |
| 43 | |
| 44 Sonic Annotator applies "transforms" to its input audio files, where a | |
| 45 transform (in this terminology) consists of a Vamp plugin together | |
| 46 with a certain set of parameters and a specified execution context: | |
| 47 step and block size, sample rate, etc. | |
| 48 | |
| 49 (See http://www.vamp-plugins.org/ for more information about Vamp | |
| 50 plugins.) | |
| 51 | |
| 52 To use a particular transform, specify its filename on the command | |
| 53 line with the -t option. | |
| 54 | |
| 55 Transforms are usually described in RDF, following the transform part | |
| 56 of the Vamp plugin ontology (http://purl.org/ontology/vamp/). A | |
| 57 transform may use any Vamp plugin that is currently installed and | |
| 58 available on the system. You can obtain a list of available plugin | |
| 59 outputs by running Sonic Annotator with the -l option, and you can | |
| 60 obtain a skeleton transform description for one of these plugins with | |
| 61 the -s option. | |
| 62 | |
| 63 For example, if the example plugins from the Vamp plugin SDK are | |
| 64 available and no other plugins are installed, you might have an | |
| 65 exchange like this: | |
| 66 | |
| 67 $ sonic-annotator -l | |
| 68 vamp:vamp-example-plugins:amplitudefollower:amplitude | |
| 69 vamp:vamp-example-plugins:fixedtempo:acf | |
| 70 vamp:vamp-example-plugins:fixedtempo:detectionfunction | |
| 71 vamp:vamp-example-plugins:fixedtempo:filtered_acf | |
| 72 vamp:vamp-example-plugins:fixedtempo:tempo | |
| 73 vamp:vamp-example-plugins:fixedtempo:candidates | |
| 74 vamp:vamp-example-plugins:percussiononsets:detectionfunction | |
| 75 vamp:vamp-example-plugins:percussiononsets:onsets | |
| 76 vamp:vamp-example-plugins:powerspectrum:powerspectrum | |
| 77 vamp:vamp-example-plugins:spectralcentroid:linearcentroid | |
| 78 vamp:vamp-example-plugins:spectralcentroid:logcentroid | |
| 79 vamp:vamp-example-plugins:zerocrossing:counts | |
| 80 vamp:vamp-example-plugins:zerocrossing:zerocrossings | |
| 81 $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo | |
| 82 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> . | |
| 83 @prefix vamp: <http://purl.org/ontology/vamp/> . | |
| 84 @prefix : <#> . | |
| 85 | |
| 86 :transform a vamp:Transform ; | |
| 87 vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ; | |
| 88 vamp:step_size "64"^^xsd:int ; | |
| 89 vamp:block_size "256"^^xsd:int ; | |
| 90 vamp:parameter_binding [ | |
| 91 vamp:parameter [ vamp:identifier "maxbpm" ] ; | |
| 92 vamp:value "190"^^xsd:float ; | |
| 93 ] ; | |
| 94 vamp:parameter_binding [ | |
| 95 vamp:parameter [ vamp:identifier "maxdflen" ] ; | |
| 96 vamp:value "10"^^xsd:float ; | |
| 97 ] ; | |
| 98 vamp:parameter_binding [ | |
| 99 vamp:parameter [ vamp:identifier "minbpm" ] ; | |
| 100 vamp:value "50"^^xsd:float ; | |
| 101 ] ; | |
| 102 vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> . | |
| 103 $ | |
| 104 | |
| 105 The output of -s is an RDF/Turtle document describing the default | |
| 106 settings for the Tempo output of the Fixed Tempo Estimator plugin in | |
| 107 the Vamp plugin SDK. | |
| 108 | |
| 109 (The exact format of the RDF printed may differ -- e.g. if the | |
| 110 plugin's RDF description is not installed and so its "home" URI is not | |
| 111 known -- but the result should be functionally equivalent to this.) | |
| 112 | |
| 113 You could run this transform by saving the RDF to a file and | |
| 114 specifying that file with -t: | |
| 115 | |
| 116 $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo > test.n3 | |
| 117 $ sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout | |
| 118 (... logging output on stderr, then ...) | |
| 119 "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm" | |
| 120 $ | |
| 121 | |
| 122 The single line of output above consists of the audio file name, the | |
| 123 timestamp and duration for a single feature, the value of that feature | |
| 124 (the estimated tempo of the given region of time from that file, in | |
| 125 bpm -- the plugin in question performs a single tempo estimation and | |
| 126 nothing else) and the feature's label. | |
| 127 | |
| 128 A quicker way to achieve the above is to use the -d (default) option | |
| 129 to tell Sonic Annotator to use the default configuration for a named | |
| 130 transform directly: | |
| 131 | |
| 132 $ sonic-annotator -d vamp:vamp-example-plugins:fixedtempo:tempo audio.wav -w csv --csv-stdout | |
| 133 (... some log output on stderr, then ...) | |
| 134 "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm" | |
| 135 $ | |
| 136 | |
| 137 Although handy for experimentation, the -d option is inadvisable in | |
| 138 any "production" situation because the plugin configuration is not | |
| 139 guaranteed to be the same each time (for example if an updated version | |
| 140 of a plugin changes some of its defaults). It's better to save a | |
| 141 well-defined transform to file and refer to that, even if it is simply | |
| 142 the transform created by the skeleton option. | |
| 143 | |
| 144 To run more than one transform on the same audio files, just put more | |
| 145 than one set of transform RDF descriptions in the same file, or give | |
| 146 the -t option more than once with separate transform description | |
| 147 files. Remember that if you want to specify more than one transform | |
| 148 in the same file, they will need to have distinct URIs (that is, the | |
| 149 ":transform" part of the example above, which may be any arbitrary | |
| 150 name, must be distinct for each described transform). | |
| 151 | |
| 152 | |
| 153 3. How and where to write the results | |
| 154 | |
| 155 Sonic Annotator supports several output modules (and it is fairly | |
| 156 easy for a developer to add new ones). You have to choose at | |
| 157 least one output module; use the -w (writer) option to do so. Each | |
| 158 module has its own set of parameters which can be adjusted on the | |
| 159 command line, as well as its own default rules about where to write | |
| 160 the results. | |
| 161 | |
| 162 The following writers are currently supported. (Others exist, but are | |
| 163 not properly implemented or not supported.) | |
| 164 | |
| 165 * csv | |
| 166 | |
| 167 Writes the results into comma-separated data files. | |
| 168 | |
| 169 One file is created for each transform applied to each input audio | |
| 170 file, named after the input audio file and transform name with .csv | |
| 171 suffix and ":" replaced by "_" throughout, placed in the same | |
| 172 directory as the audio file. | |
| 173 | |
| 174 To instruct Sonic Annotator to place the output files in another | |
| 175 location, use --csv-basedir with a directory name. | |
| 176 | |
| 177 To write a single file with all data in it, use --csv-one-file. | |
| 178 | |
| 179 To write all data to stdout instead of to a file, use --csv-stdout. | |
| 180 | |
| 181 Sonic Annotator will not write to an output file that already | |
| 182 exists. If you want to make it do this, use --csv-force to | |
| 183 overwrite or --csv-append to append to it. | |
| 184 | |
| 185 The data generated consists of one line for each result feature, | |
| 186 containing the feature timestamp, feature duration if present, all | |
| 187 of the feature's bin values in order, followed by the feature's | |
| 188 label if present. If the --csv-one-file or --csv-stdout option is | |
| 189 specified, then an additional column will appear before any of the | |
| 190 above, containing the audio file name from which the feature was | |
| 191 extracted, if it differs from that of the previous row. | |
| 192 | |
| 193 The default column separator is a comma; you can specify a | |
| 194 different one with the --csv-separator option. | |
| 195 | |
| 196 * rdf | |
| 197 | |
| 198 Writes the results into RDF/Turtle documents following the Audio | |
| 199 Features ontology (http://purl.org/ontology/af/). | |
| 200 | |
| 201 One file is created for each input audio file containing the | |
| 202 features extracted by all transforms applied to that file, named | |
| 203 after the input audio file with .n3 extension, placed in the same | |
| 204 directory as the audio file. | |
| 205 | |
| 206 To instruct Sonic Annotator to place the output files in another | |
| 207 location, use --rdf-basedir with a directory name. | |
| 208 | |
| 209 To write a single file with all data (from all input audio files) | |
| 210 in it, use --rdf-one-file. | |
| 211 | |
| 212 To write one file for each transform applied to each input audio | |
| 213 file, named after the input audio file and transform name with .n3 | |
| 214 suffix and ":" replaced by "_" throughout, use --rdf-many-files. | |
| 215 | |
| 216 To write all data to stdout instead of to a file, use --rdf-stdout. | |
| 217 | |
| 218 Sonic Annotator will not write to an output file that already | |
| 219 exists. If you want to make it do this, use --rdf-force to | |
| 220 overwrite or --rdf-append to append to it. | |
| 221 | |
| 222 Sonic Annotator will use plugin description RDF if available to | |
| 223 enhance its output (for example identifying note onset times as | |
| 224 note onset times, if the plugin's RDF says that is what it | |
| 225 produces, rather than writing them as plain events). Best results | |
| 226 will be obtained if an RDF document is provided with your plugins | |
| 227 (for example, vamp-example-plugins.n3) and you have this installed | |
| 228 in the same location as the plugins. To override this enhanced | |
| 229 output and write plain events for all features, use --rdf-plain. | |
| 230 | |
| 231 The output RDF will include an available_as property linking the | |
| 232 results to the original audio signal URI. By default, this will | |
| 233 point to the URI of the file or resource containing the audio that | |
| 234 Sonic Annotator processed, such as the file:/// location on disk. | |
| 235 To override this, for example to process a local copy of a file | |
| 236 while generating RDF that describes a copy of it available on a | |
| 237 network, you can use the --rdf-signal-uri option to specify an | |
| 238 alternative signal URI. | |
| 239 | |
| 240 | |
| 241 4. Optionally, how to summarise the features | |
| 242 | |
| 243 Sonic Annotator can also calculate and write summaries of features, | |
| 244 such as mean and median values. | |
| 245 | |
| 246 To obtain a summary as well as the feature results, just use the -S | |
| 247 option, naming the type of summary you want (min, max, mean, median, | |
| 248 mode, sum, variance, sd or count). You can also tell it to produce | |
| 249 only the summary, not the individual features, with --summary-only. | |
| 250 | |
| 251 Alternatively, you can specify a summary in a transform description. | |
| 252 The following example tells Sonic Annotator to write both the times of | |
| 253 note onsets estimated by the simple percussion onset detector example | |
| 254 plugin, and the variance of the plugin's onset detection function. | |
| 255 (It will only process the audio file and run the plugin once.) | |
| 256 | |
| 257 @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>. | |
| 258 @prefix vamp: <http://purl.org/ontology/vamp/>. | |
| 259 @prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#>. | |
| 260 @prefix : <#>. | |
| 261 | |
| 262 :transform1 a vamp:Transform; | |
| 263 vamp:plugin examples:percussiononsets ; | |
| 264 vamp:output examples:percussiononsets_output_onsets . | |
| 265 | |
| 266 :transform0 a vamp:Transform; | |
| 267 vamp:plugin examples:percussiononsets ; | |
| 268 vamp:output examples:percussiononsets_output_detectionfunction ; | |
| 269 vamp:summary_type "variance" . | |
| 270 | |
| 271 Sonic Annotator can also summarise in segments -- if you provide a | |
| 272 comma-separated list of times as an argument to the --segments option, | |
| 273 it will calculate one summary for each segment bounded by the times | |
| 274 you provided. For example, | |
| 275 | |
| 276 $ sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav | |
| 277 (... some log output on stderr, then ...) | |
| 278 ,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)" | |
| 279 ,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)" | |
| 280 ,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)" | |
| 281 ,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)" | |
| 282 | |
| 283 Here the first row contains a summary covering the time period from 0 | |
| 284 to 1 second, the second from 1 to 2 seconds, the third from 2 to 3 | |
| 285 seconds and the fourth from 3 seconds to the end of the (short) audio | |
| 286 file. | |
| 287 |
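
The csv writer section says that with --csv-one-file or --csv-stdout, each row gains a leading audio file name column, filled in only when the name differs from the previous row. The Python sketch below is one possible way to read such output and carry the file name forward. It is not part of Sonic Annotator. The function name `read_features` is hypothetical, the sample input is assembled for illustration from rows shown in this README, and the code assumes the file name column is present and that the timestamp is the second column.

```python
import csv
import io


def read_features(text, separator=","):
    """Yield (audio_file, timestamp, remaining_columns) for each CSV row.

    Assumption: the leading file-name column is present (as with
    --csv-stdout or --csv-one-file) and is left empty when the name
    is unchanged from the row above.
    """
    current_file = None
    for row in csv.reader(io.StringIO(text), delimiter=separator):
        if not row:
            continue  # skip blank lines
        if row[0]:
            current_file = row[0]  # a new file name starts here
        # row[1] is taken to be the feature timestamp in seconds;
        # everything after it is returned unparsed.
        yield current_file, float(row[1]), row[2:]


# Illustrative input only: the first row is the tempo example from this
# README; the second is one of the summary rows, placed after it to show
# the empty file-name column being filled in.
sample = (
    '"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"\n'
    ',1.000000000,1.000000000,variance,1981.75,'
    '"(variance, continuous-time average)"\n'
)

for audio_file, timestamp, rest in read_features(sample):
    print(audio_file, timestamp, rest)
```

The remaining columns are left as strings because the number of bin values, and whether a duration or label is present, depends on the transform. If --csv-separator was used, pass the same character as `separator`.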
