Sonic Annotator
===============

Sonic Annotator is a utility program for batch feature extraction from
audio files.  It runs Vamp audio analysis plugins on audio files, and
can write the result features in a selection of formats.

For more information, see

  http://vamp-plugins.org/sonic-annotator

More documentation follows further down this README file, after the
credits.


### Credits

Sonic Annotator was developed at the Centre for Digital Music,
Queen Mary, University of London.

  http://c4dm.eecs.qmul.ac.uk/

The main program is by Mark Levy, Chris Cannam, and Chris Sutton.
Sonic Annotator incorporates library code from the Sonic Visualiser
application by Chris Cannam.  Code copyright 2005-2007 Chris Cannam,
copyright 2006-2017 Queen Mary, University of London, except where
indicated in the individual source files.

This work was funded by the Engineering and Physical Sciences Research
Council through the OMRAS2 project EP/E017614/1.

Sonic Annotator is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.  See the file COPYING
included with this distribution for more information.

Sonic Annotator may also make use of the following libraries:

 * Qt5 -- Copyright Digia Oyj, distributed under the LGPL
 * Ogg decoder -- Copyright CSIRO Australia, BSD license
 * MAD mp3 decoder -- Copyright Underbit Technologies Inc, GPL
 * libsamplerate -- Copyright Erik de Castro Lopo, GPL
 * libsndfile -- Copyright Erik de Castro Lopo, LGPL
 * FFTW3 -- Copyright Matteo Frigo and MIT, GPL
 * Vamp plugin SDK -- Copyright Chris Cannam and QMUL, BSD license
 * Dataquay -- Copyright Breakfast Quay, BSD license
 * Sord and Serd -- Copyright David Robillard, BSD license

(Some distributions of Sonic Annotator may have one or more of these
libraries statically linked.)  Many thanks to their authors.


A Quick Tutorial
----------------

To use Sonic Annotator, you need to tell it three things: what audio
files to extract features from; what features to extract; and how and
where to write the results.  You can also optionally tell it to
summarise the features.


### 1. What audio files to extract features from

Sonic Annotator accepts a list of audio files on the command line.
Any argument that is not understood as a supported command-line option
will be taken to be the name of an audio file.  Any number of files
may be listed.

Several common audio file formats are supported, including MP3, Ogg,
and a number of PCM formats such as WAV and AIFF.  AAC is supported on
OS/X only, and only if not DRM protected.  WMA is not supported.

File paths do not have to be local; you can also provide remote HTTP
or FTP URLs for Sonic Annotator to retrieve.

Sonic Annotator also accepts the names of playlist files (.m3u
extension) and will process every file found in the playlist.

Finally, you can provide a local directory path instead of a file,
together with the -r (recursive) option, for Sonic Annotator to
process every audio file found in that directory or any of its
subdirectories.


### 2. What features to extract

Sonic Annotator applies "transforms" to its input audio files, where a
transform (in this terminology) consists of a Vamp plugin together
with a certain set of parameters and a specified execution context:
step and block size, sample rate, etc.

(See http://www.vamp-plugins.org/ for more information about Vamp
plugins.)

To use a particular transform, specify its filename on the command
line with the -t option.

Transforms are usually described in RDF, following the transform part
of the Vamp plugin ontology (http://purl.org/ontology/vamp/).  A
Transform may use any Vamp plugin that is currently installed and
available on the system.  You can obtain a list of available plugin
outputs by running Sonic Annotator with the -l option, and you can
obtain a skeleton transform description for one of these plugins with
the -s option.

For example, if the example plugins from the Vamp plugin SDK are
available and no other plugins are installed, you might have an
exchange like this:

```
$ sonic-annotator -l
vamp:vamp-example-plugins:amplitudefollower:amplitude
vamp:vamp-example-plugins:fixedtempo:acf
vamp:vamp-example-plugins:fixedtempo:detectionfunction
vamp:vamp-example-plugins:fixedtempo:filtered_acf
vamp:vamp-example-plugins:fixedtempo:tempo
vamp:vamp-example-plugins:fixedtempo:candidates
vamp:vamp-example-plugins:percussiononsets:detectionfunction
vamp:vamp-example-plugins:percussiononsets:onsets
vamp:vamp-example-plugins:powerspectrum:powerspectrum
vamp:vamp-example-plugins:spectralcentroid:linearcentroid
vamp:vamp-example-plugins:spectralcentroid:logcentroid
vamp:vamp-example-plugins:zerocrossing:counts
vamp:vamp-example-plugins:zerocrossing:zerocrossings
$ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo
@prefix xsd:      <http://www.w3.org/2001/XMLSchema#> .
@prefix vamp:     <http://purl.org/ontology/vamp/> .
@prefix :         <#> .

:transform a vamp:Transform ;
    vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ;
    vamp:step_size "64"^^xsd:int ;
    vamp:block_size "256"^^xsd:int ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "maxbpm" ] ;
        vamp:value "190"^^xsd:float ;
    ] ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "maxdflen" ] ;
        vamp:value "10"^^xsd:float ;
    ] ;
    vamp:parameter_binding [
        vamp:parameter [ vamp:identifier "minbpm" ] ;
        vamp:value "50"^^xsd:float ;
    ] ;
    vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> .
$
```

The output of -s is an RDF/Turtle document describing the default
settings for the Tempo output of the Fixed Tempo Estimator plugin in
the Vamp plugin SDK.

(The exact format of the RDF printed may differ -- e.g. if the
plugin's RDF description is not installed and so its "home" URI is not
known -- but the result should be functionally equivalent to this.)

You could run this transform by saving the RDF to a file and
specifying that file with -t:

```
$ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo > test.n3
$ sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout
(... logging output on stderr, then ...)
"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
$
```

The single line of output above consists of the audio file name, the
timestamp and duration for a single feature, the value of that feature
(the estimated tempo of the given region of time from that file, in
bpm -- the plugin in question performs a single tempo estimation and
nothing else) and the feature's label.

A quicker way to achieve the above is to use the -d (default) option
to tell Sonic Annotator to use directly the default configuration for
a named transform:

```
$ sonic-annotator -d vamp:vamp-example-plugins:fixedtempo:tempo audio.wav -w csv --csv-stdout
(... some log output on stderr, then ...)
"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
$
```

Although handy for experimentation, the -d option is inadvisable in
any "production" situation because the plugin configuration is not
guaranteed to be the same each time (for example if an updated version
of a plugin changes some of its defaults).  It's better to save a
well-defined transform to file and refer to that, even if it is simply
the transform created by the skeleton option.
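
If you drive Sonic Annotator from a script, the saved-transform
workflow above might be wrapped roughly as follows.  This is only a
sketch in Python: the helper names `build_command` and
`parse_tempo_row` and the `Feature` tuple are made up for this
illustration, and the row layout is assumed to match the single-valued
fixedtempo output shown above (file name, timestamp, duration, value,
label).

```python
import csv
import io
from typing import List, NamedTuple, Sequence


class Feature(NamedTuple):
    """One CSV row, laid out as in the fixedtempo example above."""
    filename: str
    timestamp: float
    duration: float
    value: float
    label: str


def build_command(transform_file: str, audio_files: Sequence[str]) -> List[str]:
    """Assemble the argument list for a run that uses a saved transform.

    Only options described in this README are used: -t, -w csv and
    --csv-stdout.
    """
    return ["sonic-annotator", "-t", transform_file,
            "-w", "csv", "--csv-stdout", *audio_files]


def parse_tempo_row(line: str) -> Feature:
    """Parse one row that has exactly five columns (an assumption that
    holds for the single-valued tempo feature in the example)."""
    fields = next(csv.reader(io.StringIO(line)))
    filename, timestamp, duration, value, label = fields
    return Feature(filename, float(timestamp), float(duration),
                   float(value), label)


# The sample row is copied from the example output above.
sample = '"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"'
print(build_command("test.n3", ["audio.wav"]))
print(parse_tempo_row(sample))
```

The list returned by `build_command` could be handed to Python's
`subprocess.run(..., capture_output=True, text=True)`, assuming the
`sonic-annotator` binary is on your PATH; features with more than one
bin value would need a more general parser than this one.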

To run more than one transform on the same audio files, just put more
than one set of transform RDF descriptions in the same file, or give
the -t option more than once with separate transform description
files.  Remember that if you want to specify more than one transform
in the same file, they will need to have distinct URIs (that is, the
":transform" part of the example above, which may be any arbitrary
name, must be distinct for each described transform).


### 3. How and where to write the results

Sonic Annotator supports various different output modules (and it is
fairly easy for the developer to add new ones).  You have to choose at
least one output module; use the -w (writer) option to do so.  Each
module has its own set of parameters which can be adjusted on the
command line, as well as its own default rules about where to write
the results.

To get help on a specific writer, run Sonic Annotator with the -h
option followed by the writer name (e.g. "-h csv").

The following writers are currently supported.  (Others exist, but are
not properly implemented or not supported.)

 * csv

   Writes the results into comma-separated data files.

   One file is created for each transform applied to each input audio
   file, named after the input audio file and transform name with .csv
   suffix and ":" replaced by "_" throughout, placed in the same
   directory as the audio file.

   To instruct Sonic Annotator to place the output files in another
   location, use --csv-basedir with a directory name.

   To write a single file with all data in it, use --csv-one-file.

   To write all data to stdout instead of to a file, use --csv-stdout.

   Sonic Annotator will not write to an output file that already
   exists.  If you want to make it do this, use --csv-force to
   overwrite or --csv-append to append to it.

   The data generated consists of one line for each result feature,
   containing the feature timestamp, feature duration if present, all
   of the feature's bin values in order, followed by the feature's
   label if present.  If the --csv-one-file or --csv-stdout option is
   specified, then an additional column will appear before any of the
   above, containing the audio file name from which the feature was
   extracted, if it differs from that of the previous row.  To suppress
   this additional column, use the --csv-omit-filenames option.

   To make the CSV writer emit the end time instead of the duration
   (for features with duration) use the --csv-end-times option.

   To make the writer always emit end time or duration, even when the
   feature lacks duration, by using the time of the following feature
   as the end time, use the --csv-fill-ends option.

   The default column separator is a comma; you can specify a
   different one with the --csv-separator option.

 * lab

   Writes the results into a tab-separated label file (.lab).

   This is equivalent to using the CSV writer with a tab separator and
   the options --csv-end-times --csv-omit-filenames.

   It supports the --lab-basedir, --lab-one-file, --lab-stdout,
   --lab-force, --lab-append, and --lab-fill-ends options, which all
   behave similarly to their CSV writer equivalents.

 * rdf

   Writes the results into RDF/Turtle documents following the Audio
   Features ontology (http://purl.org/ontology/af/).

   One file is created for each input audio file containing the
   features extracted by all transforms applied to that file, named
   after the input audio file with .n3 extension, placed in the same
   directory as the audio file.

   To instruct Sonic Annotator to place the output files in another
   location, use --rdf-basedir with a directory name.

   To write a single file with all data (from all input audio files)
   in it, use --rdf-one-file.

   To write one file for each transform applied to each input audio
   file, named after the input audio file and transform name with .n3
   suffix and ":" replaced by "_" throughout, use --rdf-many-files.

   To write all data to stdout instead of to a file, use --rdf-stdout.

   Sonic Annotator will not write to an output file that already
   exists.  If you want to make it do this, use --rdf-force to
   overwrite or --rdf-append to append to it.

   Sonic Annotator will use plugin description RDF if available to
   enhance its output (for example identifying note onset times as
   note onset times, if the plugin's RDF says that is what it
   produces, rather than writing them as plain events).  Best results
   will be obtained if an RDF document is provided with your plugins
   (for example, vamp-example-plugins.n3) and you have this installed
   in the same location as the plugins.  To override this enhanced
   output and write plain events for all features, use --rdf-plain.

   The output RDF will include an available_as property linking the
   results to the original audio signal URI.  By default, this will
   point to the URI of the file or resource containing the audio that
   Sonic Annotator processed, such as the file:/// location on disk.
   To override this, for example to process a local copy of a file
   while generating RDF that describes a copy of it available on a
   network, you can use the --rdf-signal-uri option to specify an
   alternative signal URI.

 * json

   Writes the results into JSON format following JAMS, the JSON
   Annotated Music Specification.  This writer is provisional as of
   Sonic Annotator v1.1.

 * midi

   Writes the results to MIDI files.  All features are written as MIDI
   notes.

   If a feature has at least one value, its first value will be used
   as the note pitch, the second value (if present) for velocity.  If a
   feature has units of Hz, then its pitch will be converted from
   frequency to an integer value in MIDI range, otherwise it will be
   written directly.

   Multiple (up to 16) transforms can be written to a single MIDI
   file, where they will be given separate MIDI channel numbers.


### 4. Optionally, how to summarise the features

Sonic Annotator can also calculate and write summaries of features,
such as mean and median values.

To obtain a summary as well as the feature results, just use the -S
option, naming the type of summary you want (min, max, mean, median,
mode, sum, variance, sd or count).  You can also tell it to produce
only the summary, not the individual features, with --summary-only.

Alternatively, you can specify a summary in a transform description.
The following example tells Sonic Annotator to write both the times of
note onsets estimated by the simple percussion onset detector example
plugin, and the variance of the plugin's onset detection function.
(It will only process the audio file and run the plugin once.)

```
@prefix rdf:      <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix vamp:     <http://purl.org/ontology/vamp/>.
@prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#>.
@prefix :         <#>.

:transform1 a vamp:Transform;
     vamp:plugin examples:percussiononsets ;
     vamp:output examples:percussiononsets_output_onsets .

:transform0 a vamp:Transform;
     vamp:plugin examples:percussiononsets ;
     vamp:output examples:percussiononsets_output_detectionfunction ;
     vamp:summary_type "variance" .
```

Sonic Annotator can also summarise in segments -- if you provide a
comma-separated list of times as an argument to the --segments option,
it will calculate one summary for each segment bounded by the times
you provided.  For example,

```
$ sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav
(... some log output on stderr, then ...)
,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)"
,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)"
,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)"
,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)"
```

Here the first row contains a summary covering the time period from 0
to 1 second, the second from 1 to 2 seconds, the third from 2 to 3
seconds and the fourth from 3 seconds to the end of the (short) audio
file.
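
To make the relationship between the rows and the segment boundaries
concrete, here is a small Python sketch that reads the four summary
rows shown above and turns each start time and duration into a
(start, end, value) triple.  The function name `segment_summaries` is
made up for this illustration, and the six-column layout (file name,
start, duration, summary type, value, label) is assumed from the
sample output rather than taken from a formal specification.

```python
import csv
import io
from typing import List, Tuple

# Rows copied from the example output above.
SAMPLE = '''\
,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)"
,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)"
,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)"
,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)"
'''


def segment_summaries(text: str) -> List[Tuple[float, float, float]]:
    """Return (start, end, value) in seconds for each summary row.

    Assumes six columns per row, as in the sample; the first column
    (the audio file name) may be empty.
    """
    result = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        _filename, start, duration, _summary_type, value, _label = row
        start_s = float(start)
        result.append((start_s, start_s + float(duration), float(value)))
    return result


for start, end, value in segment_summaries(SAMPLE):
    print(f"{start:.3f} to {end:.3f} s: variance {value}")
```

The three times given to --segments produce four rows because the
final segment runs from the last boundary to the end of the file.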


Automated build reports
-----------------------

 * Linux and macOS CI build: [![Build Status](https://travis-ci.org/sonic-visualiser/sonic-annotator.svg?branch=master)](https://travis-ci.org/sonic-visualiser/sonic-annotator)
 * Windows CI build: [![Build status](https://ci.appveyor.com/api/projects/status/26pygienkigw39p7?svg=true)](https://ci.appveyor.com/project/cannam/sonic-annotator)