Sonic Annotator
===============

Sonic Annotator is a utility program for batch feature extraction from
audio files. It runs Vamp audio analysis plugins on audio files, and
can write the result features in a selection of formats.

For more information, see

  http://www.omras2.org/SonicAnnotator

More documentation follows further down this README file, after the
credits.


Credits
-------

Sonic Annotator was developed at the Centre for Digital Music,
Queen Mary, University of London.

  http://c4dm.eecs.qmul.ac.uk/

The main program is by Mark Levy, Chris Cannam, and Chris Sutton.
Sonic Annotator incorporates library code from the Sonic Visualiser
application by Chris Cannam. Code copyright 2005-2007 Chris Cannam,
copyright 2006-2014 Queen Mary, University of London, except where
indicated in the individual source files.

This work was funded by the Engineering and Physical Sciences Research
Council through the OMRAS2 project EP/E017614/1.

Sonic Annotator is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version. See the file COPYING
included with this distribution for more information.

Sonic Annotator may also make use of the following libraries:

 * Qt5 -- Copyright Digia Oyj, distributed under the LGPL
 * Ogg decoder -- Copyright CSIRO Australia, BSD license
 * MAD mp3 decoder -- Copyright Underbit Technologies Inc, GPL
 * libsamplerate -- Copyright Erik de Castro Lopo, GPL
 * libsndfile -- Copyright Erik de Castro Lopo, LGPL
 * FFTW3 -- Copyright Matteo Frigo and MIT, GPL
 * Vamp plugin SDK -- Copyright Chris Cannam and QMUL, BSD license
 * Dataquay -- Copyright Breakfast Quay, BSD license
 * Sord and Serd -- Copyright David Robillard, BSD license

(Some distributions of Sonic Annotator may have one or more of these
libraries statically linked.) Many thanks to their authors.


A Quick Tutorial
================

To use Sonic Annotator, you need to tell it three things: what audio
files to extract features from; what features to extract; and how and
where to write the results. You can also optionally tell it to
summarise the features.


1. What audio files to extract features from

Sonic Annotator accepts a list of audio files on the command line.
Any argument that is not understood as a supported command-line option
will be taken to be the name of an audio file. Any number of files
may be listed.

Several common audio file formats are supported, including MP3, Ogg,
and a number of PCM formats such as WAV and AIFF. AAC is supported on
OS X only, and only if not DRM protected. WMA is not supported.

File paths do not have to be local; you can also provide remote HTTP
or FTP URLs for Sonic Annotator to retrieve.

Sonic Annotator also accepts the names of playlist files (.m3u
extension) and will process every file found in the playlist.

Finally, you can provide a local directory path instead of a file,
together with the -r (recursive) option, for Sonic Annotator to
process every audio file found in that directory or any of its
subdirectories.


2. What features to extract

Sonic Annotator applies "transforms" to its input audio files, where a
transform (in this terminology) consists of a Vamp plugin together
with a certain set of parameters and a specified execution context:
step and block size, sample rate, etc.

(See http://www.vamp-plugins.org/ for more information about Vamp
plugins.)

To use a particular transform, specify its filename on the command
line with the -t option.

Transforms are usually described in RDF, following the transform part
of the Vamp plugin ontology (http://purl.org/ontology/vamp/). A
Transform may use any Vamp plugin that is currently installed and
available on the system. You can obtain a list of available plugin
outputs by running Sonic Annotator with the -l option, and you can
obtain a skeleton transform description for one of these plugins with
the -s option.

For example, if the example plugins from the Vamp plugin SDK are
available and no other plugins are installed, you might have an
exchange like this:

  $ sonic-annotator -l
  vamp:vamp-example-plugins:amplitudefollower:amplitude
  vamp:vamp-example-plugins:fixedtempo:acf
  vamp:vamp-example-plugins:fixedtempo:detectionfunction
  vamp:vamp-example-plugins:fixedtempo:filtered_acf
  vamp:vamp-example-plugins:fixedtempo:tempo
  vamp:vamp-example-plugins:fixedtempo:candidates
  vamp:vamp-example-plugins:percussiononsets:detectionfunction
  vamp:vamp-example-plugins:percussiononsets:onsets
  vamp:vamp-example-plugins:powerspectrum:powerspectrum
  vamp:vamp-example-plugins:spectralcentroid:linearcentroid
  vamp:vamp-example-plugins:spectralcentroid:logcentroid
  vamp:vamp-example-plugins:zerocrossing:counts
  vamp:vamp-example-plugins:zerocrossing:zerocrossings
  $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
  @prefix vamp: <http://purl.org/ontology/vamp/> .
  @prefix : <#> .

  :transform a vamp:Transform ;
      vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ;
      vamp:step_size "64"^^xsd:int ;
      vamp:block_size "256"^^xsd:int ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "maxbpm" ] ;
          vamp:value "190"^^xsd:float ;
      ] ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "maxdflen" ] ;
          vamp:value "10"^^xsd:float ;
      ] ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "minbpm" ] ;
          vamp:value "50"^^xsd:float ;
      ] ;
      vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> .
  $

The output of -s is an RDF/Turtle document describing the default
settings for the Tempo output of the Fixed Tempo Estimator plugin in
the Vamp plugin SDK.

(The exact format of the RDF printed may differ -- e.g. if the
plugin's RDF description is not installed and so its "home" URI is not
known -- but the result should be functionally equivalent to this.)

You could run this transform by saving the RDF to a file and
specifying that file with -t:

  $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo > test.n3
  $ sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout
  (... logging output on stderr, then ...)
  "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
  $

The single line of output above consists of the audio file name, the
timestamp and duration for a single feature, the value of that feature
(the estimated tempo of the given region of time from that file, in
bpm -- the plugin in question performs a single tempo estimation and
nothing else) and the feature's label.
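If you want to consume such a row from a script, the following is a
minimal Python sketch that splits it into named fields. It assumes the
exact five-column layout of the row shown above (file name, timestamp,
duration, a single value, label); other transforms may emit a
different number of columns. The names Feature and parse_feature_row
are made up for this illustration and are not part of Sonic Annotator.

```python
import csv
import io
from typing import NamedTuple


class Feature(NamedTuple):
    """One feature row, in the column order of the example output."""
    filename: str
    timestamp: float  # seconds
    duration: float   # seconds
    value: float      # here: estimated tempo in bpm
    label: str


def parse_feature_row(line: str) -> Feature:
    """Parse one CSV row; assumes exactly five columns (hypothetical helper)."""
    # The csv module handles the double-quoted file name and label.
    row = next(csv.reader(io.StringIO(line)))
    filename, timestamp, duration, value, label = row
    return Feature(filename, float(timestamp), float(duration),
                   float(value), label)


feature = parse_feature_row(
    '"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"')
print(feature.filename, feature.value, feature.label)
```

Using the csv module rather than splitting on commas matters here
because labels are quoted and may themselves contain commas.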

A quicker way to achieve the above is to use the -d (default) option
to tell Sonic Annotator to use the default configuration for a named
transform directly:

  $ sonic-annotator -d vamp:vamp-example-plugins:fixedtempo:tempo audio.wav -w csv --csv-stdout
  (... some log output on stderr, then ...)
  "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
  $

Although handy for experimentation, the -d option is inadvisable in
any "production" situation because the plugin configuration is not
guaranteed to be the same each time (for example if an updated version
of a plugin changes some of its defaults). It's better to save a
well-defined transform to file and refer to that, even if it is simply
the transform created by the skeleton option.

To run more than one transform on the same audio files, just put more
than one set of transform RDF descriptions in the same file, or give
the -t option more than once with separate transform description
files. Remember that if you want to specify more than one transform
in the same file, they will need to have distinct URIs (that is, the
":transform" part of the example above, which may be any arbitrary
name, must be distinct for each described transform).
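When driving Sonic Annotator from a script, repeating -t for several
transform files can be sketched as below. This is only an illustration
in Python: the helper build_command and the file names tempo.n3 and
onsets.n3 are hypothetical, and the only options used are ones
described in this README (-t, -w and --csv-stdout).

```python
import shlex
from typing import List, Sequence


def build_command(transform_files: Sequence[str],
                  audio_files: Sequence[str],
                  writer: str = "csv",
                  writer_options: Sequence[str] = ("--csv-stdout",)) -> List[str]:
    """Assemble an argument list with one -t option per transform file."""
    if not transform_files:
        raise ValueError("at least one transform file is required")
    argv = ["sonic-annotator"]
    for path in transform_files:
        argv += ["-t", path]              # -t may be given more than once
    argv += ["-w", writer, *writer_options]
    argv += list(audio_files)             # remaining arguments are audio files
    return argv


# tempo.n3 and onsets.n3 are placeholder transform description files.
cmd = build_command(["tempo.n3", "onsets.n3"], ["audio.wav"])
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd,
capture_output=True, text=True), provided sonic-annotator is on the
PATH. Building a list instead of a single string avoids shell quoting
problems with file names that contain spaces.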


3. How and where to write the results

Sonic Annotator supports various different output modules (and it is
fairly easy for the developer to add new ones). You have to choose at
least one output module; use the -w (writer) option to do so. Each
module has its own set of parameters which can be adjusted on the
command line, as well as its own default rules about where to write
the results.

To get help on a specific writer, run Sonic Annotator with the -h
option followed by the writer name (e.g. "-h csv").

The following writers are currently supported. (Others exist, but are
not properly implemented or not supported.)

 * csv

    Writes the results into comma-separated data files.

    One file is created for each transform applied to each input audio
    file, named after the input audio file and transform name with .csv
    suffix and ":" replaced by "_" throughout, placed in the same
    directory as the audio file.

    To instruct Sonic Annotator to place the output files in another
    location, use --csv-basedir with a directory name.

    To write a single file with all data in it, use --csv-one-file.

    To write all data to stdout instead of to a file, use --csv-stdout.

    Sonic Annotator will not write to an output file that already
    exists. If you want to make it do this, use --csv-force to
    overwrite or --csv-append to append to it.

    The data generated consists of one line for each result feature,
    containing the feature timestamp, feature duration if present, all
    of the feature's bin values in order, followed by the feature's
    label if present. If the --csv-one-file or --csv-stdout option is
    specified, then an additional column will appear before any of the
    above, containing the audio file name from which the feature was
    extracted, if it differs from that of the previous row. To suppress
    this additional column, use the --csv-omit-filenames option.

    To make the CSV writer emit the end time instead of the duration
    (for features with duration) use the --csv-end-times option.

    To make the writer always emit an end time or duration, even for
    features that lack a duration, use the --csv-fill-ends option. The
    time of the following feature is then used as the end time.

    The default column separator is a comma; you can specify a
    different one with the --csv-separator option.

 * lab

    Writes the results into a tab-separated label file (.lab).

    This is equivalent to using the CSV writer with a tab separator and
    the options --csv-end-times --csv-omit-filenames.

    It supports the --lab-basedir, --lab-one-file, --lab-stdout,
    --lab-force, --lab-append, and --lab-fill-ends options, which all
    behave similarly to their CSV writer equivalents.

 * rdf

    Writes the results into RDF/Turtle documents following the Audio
    Features ontology (http://purl.org/ontology/af/).

    One file is created for each input audio file containing the
    features extracted by all transforms applied to that file, named
    after the input audio file with .n3 extension, placed in the same
    directory as the audio file.

    To instruct Sonic Annotator to place the output files in another
    location, use --rdf-basedir with a directory name.

    To write a single file with all data (from all input audio files)
    in it, use --rdf-one-file.

    To write one file for each transform applied to each input audio
    file, named after the input audio file and transform name with .n3
    suffix and ":" replaced by "_" throughout, use --rdf-many-files.

    To write all data to stdout instead of to a file, use --rdf-stdout.

    Sonic Annotator will not write to an output file that already
    exists. If you want to make it do this, use --rdf-force to
    overwrite or --rdf-append to append to it.

    Sonic Annotator will use plugin description RDF if available to
    enhance its output (for example identifying note onset times as
    note onset times, if the plugin's RDF says that is what it
    produces, rather than writing them as plain events). Best results
    will be obtained if an RDF document is provided with your plugins
    (for example, vamp-example-plugins.n3) and you have this installed
    in the same location as the plugins. To override this enhanced
    output and write plain events for all features, use --rdf-plain.

    The output RDF will include an available_as property linking the
    results to the original audio signal URI. By default, this will
    point to the URI of the file or resource containing the audio that
    Sonic Annotator processed, such as the file:/// location on disk.
    To override this, for example to process a local copy of a file
    while generating RDF that describes a copy of it available on a
    network, you can use the --rdf-signal-uri option to specify an
    alternative signal URI.

 * json

    Writes the results into JSON format following JAMS, the JSON
    Annotated Music Specification. This writer is provisional as of
    Sonic Annotator v1.1.

 * midi

    Writes the results to MIDI files. All features are written as MIDI
    notes.

    If a feature has at least one value, its first value will be used
    as the note pitch, the second value (if present) for velocity. If a
    feature has units of Hz, then its pitch will be converted from
    frequency to an integer value in MIDI range, otherwise it will be
    written directly.

    Multiple (up to 16) transforms can be written to a single MIDI
    file, where they will be given separate MIDI channel numbers.
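To give a feel for the frequency-to-pitch conversion that the midi
writer description mentions, here is a Python sketch of the standard
equal-temperament mapping with A4 = 440 Hz = MIDI note 69. This README
does not state the exact formula, rounding or range handling that
Sonic Annotator applies, so the rounding to the nearest note and the
clamping to 0..127 below are assumptions, and hz_to_midi_pitch is a
hypothetical name.

```python
import math


def hz_to_midi_pitch(frequency_hz: float) -> int:
    """Map a frequency in Hz to the nearest MIDI note number (0..127).

    Assumes 12-tone equal temperament with A4 = 440 Hz = note 69.
    This is an illustrative guess, not Sonic Annotator's own code.
    """
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    pitch = round(69 + 12 * math.log2(frequency_hz / 440.0))
    # Clamp to the valid MIDI note range (assumed behaviour).
    return max(0, min(127, pitch))


for hz in (261.63, 440.0, 880.0):
    print(hz, "Hz ->", hz_to_midi_pitch(hz))
```

A feature whose value is not in Hz would, according to the description
above, be written as the pitch directly, so no conversion of this kind
applies to it.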


4. Optionally, how to summarise the features

Sonic Annotator can also calculate and write summaries of features,
such as mean and median values.

To obtain a summary as well as the feature results, just use the -S
option, naming the type of summary you want (min, max, mean, median,
mode, sum, variance, sd or count). You can also tell it to produce
only the summary, not the individual features, with --summary-only.

Alternatively, you can specify a summary in a transform description.
The following example tells Sonic Annotator to write both the times of
note onsets estimated by the simple percussion onset detector example
plugin, and the variance of the plugin's onset detection function.
(It will only process the audio file and run the plugin once.)

  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
  @prefix vamp: <http://purl.org/ontology/vamp/>.
  @prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#>.
  @prefix : <#>.

  :transform1 a vamp:Transform;
      vamp:plugin examples:percussiononsets ;
      vamp:output examples:percussiononsets_output_onsets .

  :transform0 a vamp:Transform;
      vamp:plugin examples:percussiononsets ;
      vamp:output examples:percussiononsets_output_detectionfunction ;
      vamp:summary_type "variance" .

Sonic Annotator can also summarise in segments -- if you provide a
comma-separated list of times as an argument to the --segments option,
it will calculate one summary for each segment bounded by the times
you provided. For example,

  $ sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav
  (... some log output on stderr, then ...)
  ,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)"
  ,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)"
  ,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)"
  ,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)"

Here the first row contains a summary covering the time period from 0
to 1 second, the second from 1 to 2 seconds, the third from 2 to 3
seconds and the fourth from 3 seconds to the end of the (short) audio
file.
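As a rough illustration of reading these summary rows back into a
script, the Python sketch below parses the four lines shown above and
prints each segment's time span. It assumes the six-column layout of
this particular example (file name column, start time, duration,
summary type, value, label); the names SegmentSummary and
parse_summary_rows are invented for the example.

```python
import csv
import io
from typing import List, NamedTuple


class SegmentSummary(NamedTuple):
    start: float       # seconds
    duration: float    # seconds
    summary_type: str
    value: float
    label: str


def parse_summary_rows(text: str) -> List[SegmentSummary]:
    """Parse summary rows; assumes six columns per row (hypothetical helper)."""
    summaries = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        # The first column is the (possibly empty) audio file name.
        _filename, start, duration, summary_type, value, label = row
        summaries.append(SegmentSummary(float(start), float(duration),
                                        summary_type, float(value), label))
    return summaries


# The rows from the example above, copied verbatim.
OUTPUT = """\
,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)"
,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)"
,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)"
,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)"
"""

for s in parse_summary_rows(OUTPUT):
    end = s.start + s.duration
    print(f"{s.start:.3f}s to {end:.3f}s: {s.summary_type} = {s.value}")
```

The label contains a comma inside its quotes, which is why the sketch
relies on the csv module instead of a plain split on ",".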