comparison README.md @ 309:12eadc54e874

Start rejigging README
author Chris Cannam
date Tue, 11 Jul 2017 18:00:08 +0100
parents README@c1862b8712ca
children 99d361aa7ad7
308:535559475847 309:12eadc54e874
1
2 Sonic Annotator
3 ===============
4
5 Sonic Annotator is a utility program for batch feature extraction from
6 audio files. It runs Vamp audio analysis plugins on audio files, and
7 can write the result features in a selection of formats.
8
9 For more information, see
10
11 http://vamp-plugins.org/sonic-annotator
12
13 More documentation follows further down this README file, after the
14 credits.
15
16
17 ### Credits
18
19 Sonic Annotator was developed at the Centre for Digital Music,
20 Queen Mary, University of London.
21
22 http://c4dm.eecs.qmul.ac.uk/
23
24 The main program is by Mark Levy, Chris Cannam, and Chris Sutton.
25 Sonic Annotator incorporates library code from the Sonic Visualiser
26 application by Chris Cannam. Code copyright 2005-2007 Chris Cannam,
27 copyright 2006-2017 Queen Mary, University of London, except where
28 indicated in the individual source files.
29
30 This work was funded by the Engineering and Physical Sciences Research
31 Council through the OMRAS2 project EP/E017614/1.
32
33 Sonic Annotator is free software; you can redistribute it and/or
34 modify it under the terms of the GNU General Public License as
35 published by the Free Software Foundation; either version 2 of the
36 License, or (at your option) any later version. See the file COPYING
37 included with this distribution for more information.
38
39 Sonic Annotator may also make use of the following libraries:
40
41 * Qt5 -- Copyright Digia Oyj, distributed under the LGPL
42 * Ogg decoder -- Copyright CSIRO Australia, BSD license
43 * MAD mp3 decoder -- Copyright Underbit Technologies Inc, GPL
44 * libsamplerate -- Copyright Erik de Castro Lopo, GPL
45 * libsndfile -- Copyright Erik de Castro Lopo, LGPL
46 * FFTW3 -- Copyright Matteo Frigo and MIT, GPL
47 * Vamp plugin SDK -- Copyright Chris Cannam and QMUL, BSD license
48 * Dataquay -- Copyright Breakfast Quay, BSD license
49 * Sord and Serd -- Copyright David Robillard, BSD license
50
51 (Some distributions of Sonic Annotator may have one or more of these
52 libraries statically linked.) Many thanks to their authors.
53
54
55 A Quick Tutorial
56 ----------------
57
58 To use Sonic Annotator, you need to tell it three things: what audio
59 files to extract features from; what features to extract; and how and
60 where to write the results. You can also optionally tell it to
61 summarise the features.
62
63
64 ### 1. What audio files to extract features from
65
66 Sonic Annotator accepts a list of audio files on the command line.
67 Any argument that is not understood as a supported command-line option
68 will be taken to be the name of an audio file. Any number of files
69 may be listed.
70
71 Several common audio file formats are supported, including MP3, Ogg,
72 and a number of PCM formats such as WAV and AIFF. AAC is supported on
73 macOS only, and only if not DRM protected. WMA is not supported.
74
75 File paths do not have to be local; you can also provide remote HTTP
76 or FTP URLs for Sonic Annotator to retrieve.
77
78 Sonic Annotator also accepts the names of playlist files (.m3u
79 extension) and will process every file found in the playlist.
80
81 Finally, you can provide a local directory path instead of a file,
82 together with the -r (recursive) option, for Sonic Annotator to
83 process every audio file found in that directory or any of its
84 subdirectories.
85
86
87 ### 2. What features to extract
88
89 Sonic Annotator applies "transforms" to its input audio files, where a
90 transform (in this terminology) consists of a Vamp plugin together
91 with a certain set of parameters and a specified execution context:
92 step and block size, sample rate, etc.
93
94 (See http://www.vamp-plugins.org/ for more information about Vamp
95 plugins.)
96
97 To use a particular transform, specify its filename on the command
98 line with the -t option.
99
100 Transforms are usually described in RDF, following the transform part
101 of the Vamp plugin ontology (http://purl.org/ontology/vamp/). A
102 transform may use any Vamp plugin that is currently installed and
103 available on the system. You can obtain a list of available plugin
104 outputs by running Sonic Annotator with the -l option, and you can
105 obtain a skeleton transform description for one of these plugins with
106 the -s option.
107
108 For example, if the example plugins from the Vamp plugin SDK are
109 available and no other plugins are installed, you might have an
110 exchange like this:
111
112 $ sonic-annotator -l
113 vamp:vamp-example-plugins:amplitudefollower:amplitude
114 vamp:vamp-example-plugins:fixedtempo:acf
115 vamp:vamp-example-plugins:fixedtempo:detectionfunction
116 vamp:vamp-example-plugins:fixedtempo:filtered_acf
117 vamp:vamp-example-plugins:fixedtempo:tempo
118 vamp:vamp-example-plugins:fixedtempo:candidates
119 vamp:vamp-example-plugins:percussiononsets:detectionfunction
120 vamp:vamp-example-plugins:percussiononsets:onsets
121 vamp:vamp-example-plugins:powerspectrum:powerspectrum
122 vamp:vamp-example-plugins:spectralcentroid:linearcentroid
123 vamp:vamp-example-plugins:spectralcentroid:logcentroid
124 vamp:vamp-example-plugins:zerocrossing:counts
125 vamp:vamp-example-plugins:zerocrossing:zerocrossings
126 $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo
127 @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
128 @prefix vamp: <http://purl.org/ontology/vamp/> .
129 @prefix : <#> .
130
131 :transform a vamp:Transform ;
132 vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ;
133 vamp:step_size "64"^^xsd:int ;
134 vamp:block_size "256"^^xsd:int ;
135 vamp:parameter_binding [
136 vamp:parameter [ vamp:identifier "maxbpm" ] ;
137 vamp:value "190"^^xsd:float ;
138 ] ;
139 vamp:parameter_binding [
140 vamp:parameter [ vamp:identifier "maxdflen" ] ;
141 vamp:value "10"^^xsd:float ;
142 ] ;
143 vamp:parameter_binding [
144 vamp:parameter [ vamp:identifier "minbpm" ] ;
145 vamp:value "50"^^xsd:float ;
146 ] ;
147 vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> .
148 $
149
150 The output of -s is an RDF/Turtle document describing the default
151 settings for the Tempo output of the Fixed Tempo Estimator plugin in
152 the Vamp plugin SDK.
153
154 (The exact format of the RDF printed may differ -- e.g. if the
155 plugin's RDF description is not installed and so its "home" URI is not
156 known -- but the result should be functionally equivalent to this.)
157
158 You could run this transform by saving the RDF to a file and
159 specifying that file with -t:
160
161 $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo > test.n3
162 $ sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout
163 (... logging output on stderr, then ...)
164 "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
165 $
166
167 The single line of output above consists of the audio file name, the
168 timestamp and duration for a single feature, the value of that feature
169 (the estimated tempo of the given region of time from that file, in
170 bpm -- the plugin in question performs a single tempo estimation and
171 nothing else) and the feature's label.
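
The column layout described above can be read back with a few lines of Python. This is a minimal sketch, not part of Sonic Annotator: the function name `parse_feature_rows` is hypothetical, and it assumes exactly the five-column layout of the sample line (file name, timestamp, duration, a single value, label). Outputs with more bin values or no duration would need different indices.

```python
import csv
import io

def parse_feature_rows(text):
    """Parse CSV writer output that has a leading file name column.

    Assumes the layout of the sample line: file, time, duration,
    one value, label. Other plugin outputs may have a different shape.
    """
    features = []
    current_file = None
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        # The file name column may be empty when it repeats the previous row.
        if row[0]:
            current_file = row[0]
        features.append({
            "file": current_file,
            "time": float(row[1]),
            "duration": float(row[2]),
            "value": float(row[3]),
            "label": row[4],
        })
    return features

# The sample line from the exchange above.
sample = '"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"\n'
features = parse_feature_rows(sample)
print(features[0]["value"], features[0]["label"])
```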
172
173 A quicker way to achieve the above is to use the -d (default) option,
174 which tells Sonic Annotator to use the default configuration for a
175 named transform directly:
176
177 $ sonic-annotator -d vamp:vamp-example-plugins:fixedtempo:tempo audio.wav -w csv --csv-stdout
178 (... some log output on stderr, then ...)
179 "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
180 $
181
182 Although handy for experimentation, the -d option is inadvisable in
183 any "production" situation because the plugin configuration is not
184 guaranteed to be the same each time (for example if an updated version
185 of a plugin changes some of its defaults). It's better to save a
186 well-defined transform to file and refer to that, even if it is simply
187 the transform created by the skeleton option.
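
When Sonic Annotator is driven from a script, referring to a saved transform file might look like the following sketch. The helper names `annotator_command` and `run_annotator` are hypothetical; only the -t and -w options and --csv-stdout come from this README, and the sketch assumes a `sonic-annotator` executable on the PATH.

```python
import subprocess

def annotator_command(transform_files, audio_files,
                      writer="csv", writer_options=("--csv-stdout",)):
    """Build an argument list that refers to saved transform files with -t."""
    command = ["sonic-annotator"]
    for path in transform_files:
        command += ["-t", path]          # one -t per transform description file
    command += ["-w", writer]            # at least one writer is required
    command += list(writer_options)
    command += list(audio_files)         # remaining arguments are audio files
    return command

def run_annotator(command):
    """Run the command and return stdout; log output arrives on stderr."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout

command = annotator_command(["test.n3"], ["audio.wav"])
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.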
188
189 To run more than one transform on the same audio files, just put more
190 than one set of transform RDF descriptions in the same file, or give
191 the -t option more than once with separate transform description
192 files. Remember that if you want to specify more than one transform
193 in the same file, they will need to have distinct URIs (that is, the
194 ":transform" part of the example above, which may be any arbitrary
195 name, must be distinct for each described transform).
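
One way to keep the transform names distinct is to generate the file and number the transforms. The sketch below is illustrative only: `transform_document` is a hypothetical helper, it covers only the example plugins, and it relies on the `<plugin>_output_<output>` naming seen in the example plugin URIs in this README. It omits step size, block size and parameters, so the plugin defaults would apply.

```python
PREFIXES = """\
@prefix vamp: <http://purl.org/ontology/vamp/> .
@prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#> .
@prefix : <#> .
"""

def transform_document(outputs):
    """Return Turtle text with one distinctly named transform per output.

    outputs is a list of (plugin, output) identifier pairs taken from
    the vamp-example-plugins library.
    """
    blocks = []
    for index, (plugin, output) in enumerate(outputs):
        blocks.append(
            f":transform{index} a vamp:Transform ;\n"
            f"    vamp:plugin examples:{plugin} ;\n"
            f"    vamp:output examples:{plugin}_output_{output} .\n"
        )
    return PREFIXES + "\n" + "\n".join(blocks)

document = transform_document([
    ("percussiononsets", "onsets"),
    ("fixedtempo", "tempo"),
])
print(document)
```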
196
197
198 ### 3. How and where to write the results
199
200 Sonic Annotator supports several output modules (and it is
201 fairly easy for a developer to add new ones). You have to choose at
202 least one output module; use the -w (writer) option to do so. Each
203 module has its own set of parameters which can be adjusted on the
204 command line, as well as its own default rules about where to write
205 the results.
206
207 To get help on a specific writer, run Sonic Annotator with the -h
208 option followed by the writer name (e.g. "-h csv").
209
210 The following writers are currently supported. (Others exist, but are
211 not properly implemented or not supported.)
212
213 * csv
214
215 Writes the results into comma-separated data files.
216
217 One file is created for each transform applied to each input audio
218 file, named after the input audio file and transform name with .csv
219 suffix and ":" replaced by "_" throughout, placed in the same
220 directory as the audio file.
221
222 To instruct Sonic Annotator to place the output files in another
223 location, use --csv-basedir with a directory name.
224
225 To write a single file with all data in it, use --csv-one-file.
226
227 To write all data to stdout instead of to a file, use --csv-stdout.
228
229 Sonic Annotator will not write to an output file that already
230 exists. If you want to make it do this, use --csv-force to
231 overwrite or --csv-append to append to it.
232
233 The data generated consists of one line for each result feature,
234 containing the feature timestamp, feature duration if present, all
235 of the feature's bin values in order, followed by the feature's
236 label if present. If the --csv-one-file or --csv-stdout option is
237 specified, then an additional column will appear before any of the
238 above, containing the audio file name from which the feature was
239 extracted, if it differs from that of the previous row. To suppress
240 this additional column, use the --csv-omit-filenames option.
241
242 To make the CSV writer emit the end time instead of the duration
243 (for features with duration) use the --csv-end-times option.
244
245 To make the writer always emit an end time or duration, even for
246 features that lack a duration, use the --csv-fill-ends option. The
247 time of the following feature is then used as the end time.
248
249 The default column separator is a comma; you can specify a
250 different one with the --csv-separator option.
251
252 * lab
253
254 Writes the results into a tab-separated label file (.lab).
255
256 This is equivalent to using the CSV writer with a tab separator and
257 the options --csv-end-times --csv-omit-filenames.
258
259 It supports the --lab-basedir, --lab-one-file, --lab-stdout,
260 --lab-force, --lab-append, and --lab-fill-ends options, which all
261 behave similarly to their CSV writer equivalents.
262
263 * rdf
264
265 Writes the results into RDF/Turtle documents following the Audio
266 Features ontology (http://purl.org/ontology/af/).
267
268 One file is created for each input audio file containing the
269 features extracted by all transforms applied to that file, named
270 after the input audio file with .n3 extension, placed in the same
271 directory as the audio file.
272
273 To instruct Sonic Annotator to place the output files in another
274 location, use --rdf-basedir with a directory name.
275
276 To write a single file with all data (from all input audio files)
277 in it, use --rdf-one-file.
278
279 To write one file for each transform applied to each input audio
280 file, named after the input audio file and transform name with .n3
281 suffix and ":" replaced by "_" throughout, use --rdf-many-files.
282
283 To write all data to stdout instead of to a file, use --rdf-stdout.
284
285 Sonic Annotator will not write to an output file that already
286 exists. If you want to make it do this, use --rdf-force to
287 overwrite or --rdf-append to append to it.
288
289 Sonic Annotator will use plugin description RDF if available to
290 enhance its output (for example identifying note onset times as
291 note onset times, if the plugin's RDF says that is what it
292 produces, rather than writing them as plain events). Best results
293 will be obtained if an RDF document is provided with your plugins
294 (for example, vamp-example-plugins.n3) and you have this installed
295 in the same location as the plugins. To override this enhanced
296 output and write plain events for all features, use --rdf-plain.
297
298 The output RDF will include an available_as property linking the
299 results to the original audio signal URI. By default, this will
300 point to the URI of the file or resource containing the audio that
301 Sonic Annotator processed, such as the file:/// location on disk.
302 To override this, for example to process a local copy of a file
303 while generating RDF that describes a copy of it available on a
304 network, you can use the --rdf-signal-uri option to specify an
305 alternative signal URI.
306
307 * json
308
309 Writes the results into JSON format following JAMS, the JSON
310 Annotated Music Specification. This writer is provisional as of
311 Sonic Annotator v1.1.
312
313 * midi
314
315 Writes the results to MIDI files. All features are written as MIDI
316 notes.
317
318 If a feature has at least one value, its first value will be used
319 as the note pitch and its second value (if present) as the velocity.
320 If a feature has units of Hz, its pitch will be converted from
321 frequency to an integer value in MIDI range; otherwise it will be
322 written directly.
323
324 Multiple (up to 16) transforms can be written to a single MIDI
325 file, where they will be given separate MIDI channel numbers.
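
The frequency-to-pitch conversion mentioned for the midi writer can be illustrated with the standard equal-temperament formula (A4 = 440 Hz = MIDI note 69). This is a sketch under assumptions, not the writer's actual code: the README does not state the rounding or clamping rules, so rounding to the nearest note and clamping to 0..127 are guesses, and `midi_pitch` is a hypothetical name.

```python
import math

def midi_pitch(value, unit=""):
    """Map a feature value to a MIDI note number in the range 0..127.

    Values in Hz are converted with the equal-temperament formula;
    other values are used as they are. Rounding to the nearest note and
    clamping to the MIDI range are assumptions of this sketch.
    """
    if unit == "Hz":
        # math.log2 raises ValueError for non-positive frequencies.
        pitch = round(69 + 12 * math.log2(value / 440.0))
    else:
        pitch = round(value)
    return max(0, min(127, pitch))

print(midi_pitch(440.0, "Hz"))
```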
326
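The default output naming rule of the csv writer can be sketched as follows. The README says the file is named after the audio file and the transform, with ":" replaced by "_"; joining the two names with an underscore, dropping the audio file's extension, and placing files directly inside the --csv-basedir directory are assumptions of this sketch. `csv_output_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath

def csv_output_path(audio_path, transform_id, basedir=None):
    """Guess the CSV file path for one transform applied to one audio file."""
    audio = PurePosixPath(audio_path)
    # Assumption: audio base name and transform id are joined with "_".
    name = f"{audio.stem}_{transform_id}.csv".replace(":", "_")
    # By default the file sits next to the audio file.
    directory = PurePosixPath(basedir) if basedir else audio.parent
    return str(directory / name)

print(csv_output_path("music/audio.wav",
                      "vamp:vamp-example-plugins:fixedtempo:tempo"))
```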
327
328 ### 4. Optionally, how to summarise the features
329
330 Sonic Annotator can also calculate and write summaries of features,
331 such as mean and median values.
332
333 To obtain a summary as well as the feature results, just use the -S
334 option, naming the type of summary you want (min, max, mean, median,
335 mode, sum, variance, sd or count). You can also tell it to produce
336 only the summary, not the individual features, with --summary-only.
337
338 Alternatively, you can specify a summary in a transform description.
339 The following example tells Sonic Annotator to write both the times of
340 note onsets estimated by the simple percussion onset detector example
341 plugin, and the variance of the plugin's onset detection function.
342 (It will only process the audio file and run the plugin once.)
343
344 @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
345 @prefix vamp: <http://purl.org/ontology/vamp/>.
346 @prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#>.
347 @prefix : <#>.
348
349 :transform1 a vamp:Transform;
350 vamp:plugin examples:percussiononsets ;
351 vamp:output examples:percussiononsets_output_onsets .
352
353 :transform0 a vamp:Transform;
354 vamp:plugin examples:percussiononsets ;
355 vamp:output examples:percussiononsets_output_detectionfunction ;
356 vamp:summary_type "variance" .
357
358 Sonic Annotator can also summarise in segments -- if you provide a
359 comma-separated list of times as an argument to the --segments option,
360 it will calculate one summary for each segment bounded by the times
361 you provided. For example,
362
363 $ sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav
364 (... some log output on stderr, then ...)
365 ,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)"
366 ,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)"
367 ,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)"
368 ,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)"
369
370 Here the first row contains a summary covering the time period from 0
371 to 1 second, the second from 1 to 2 seconds, the third from 2 to 3
372 seconds and the fourth from 3 seconds to the end of the (short) audio
373 file.
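
The way the boundary times divide the file can be sketched as below. This only reproduces the start time and duration columns of the example rows; it does not compute the summaries themselves. `segment_spans` is a hypothetical helper, and the end time of about 10.03 seconds is inferred from the last row of the sample output (3 + 7.031020407).

```python
def segment_spans(boundaries, end_time):
    """Return (start, duration) pairs for the segments bounded by the times.

    Boundaries outside the open interval (0, end_time) are ignored,
    which is an assumption of this sketch.
    """
    inner = sorted(b for b in boundaries if 0.0 < b < end_time)
    edges = [0.0] + inner + [end_time]
    return [(start, stop - start) for start, stop in zip(edges, edges[1:])]

for start, duration in segment_spans([1.0, 2.0, 3.0], 10.031020407):
    print(f"{start:.9f},{duration:.9f}")
```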
374
375
376 Automated build reports
377 -----------------------
378
379 * Linux and macOS CI build: [![Build Status](https://travis-ci.org/sonic-visualiser/sonic-annotator.svg?branch=master)](https://travis-ci.org/sonic-visualiser/sonic-annotator)
380 * Windows CI build: [![Build status](https://ci.appveyor.com/api/projects/status/26pygienkigw39p7?svg=true)](https://ci.appveyor.com/project/cannam/sonic-annotator)