sonic-annotator: README annotate

annotate README @ 1:92911f967a16

* some reorganisation

author	Chris Cannam
date	Thu, 11 Dec 2008 10:26:12 +0000
parents	581b1b150a4d
children	475f4623feba

Sonic Annotator
===============

Sonic Annotator is a utility program for batch feature extraction from
audio files. It runs Vamp audio analysis plugins on audio files, and
can write the result features in a selection of formats.


A Quick Tutorial
----------------

To use Sonic Annotator, you need to tell it three things: what audio
files to extract features from; what features to extract; and how and
where to write the results. You can also optionally tell it to
summarise the features.


1. What audio files to extract features from

Sonic Annotator accepts a list of audio files on the command line.
Any argument that is not understood as a supported command-line option
will be taken to be the name of an audio file. Any number of files
may be listed.

Several common audio file formats are supported, including MP3, Ogg,
and a number of PCM formats such as WAV and AIFF. AAC is supported on
OS X only, and only if not DRM-protected. WMA is not supported.

File paths do not have to be local; you can also provide remote HTTP
or FTP URLs for Sonic Annotator to retrieve.

Sonic Annotator also accepts the names of playlist files (.m3u
extension) and will process every file found in the playlist.

Finally, you can provide a local directory path instead of a file,
together with the -r (recursive) option, for Sonic Annotator to
process every audio file found in that directory or any of its
subdirectories.


2. What features to extract

Sonic Annotator applies "transforms" to its input audio files, where a
transform (in this terminology) consists of a Vamp plugin together
with a certain set of parameters and a specified execution context:
step and block size, sample rate, etc.

(See http://www.vamp-plugins.org/ for more information about Vamp
plugins.)

To use a particular transform, specify the name of the file that
describes it on the command line with the -t option.

Transforms are usually described in RDF, following the transform part
of the Vamp plugin ontology (http://purl.org/ontology/vamp/). A
transform may use any Vamp plugin that is currently installed and
available on the system. You can obtain a list of available plugin
outputs by running Sonic Annotator with the -l option, and you can
obtain a skeleton transform description for one of these plugins with
the -s option.

For example, if the example plugins from the Vamp plugin SDK are
available and no other plugins are installed, you might have an
exchange like this:

  $ sonic-annotator -l
  vamp:vamp-example-plugins:amplitudefollower:amplitude
  vamp:vamp-example-plugins:fixedtempo:acf
  vamp:vamp-example-plugins:fixedtempo:detectionfunction
  vamp:vamp-example-plugins:fixedtempo:filtered_acf
  vamp:vamp-example-plugins:fixedtempo:tempo
  vamp:vamp-example-plugins:fixedtempo:candidates
  vamp:vamp-example-plugins:percussiononsets:detectionfunction
  vamp:vamp-example-plugins:percussiononsets:onsets
  vamp:vamp-example-plugins:powerspectrum:powerspectrum
  vamp:vamp-example-plugins:spectralcentroid:linearcentroid
  vamp:vamp-example-plugins:spectralcentroid:logcentroid
  vamp:vamp-example-plugins:zerocrossing:counts
  vamp:vamp-example-plugins:zerocrossing:zerocrossings
  $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo
  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
  @prefix vamp: <http://purl.org/ontology/vamp/> .
  @prefix : <#> .

  :transform a vamp:Transform ;
      vamp:plugin <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo> ;
      vamp:step_size "64"^^xsd:int ;
      vamp:block_size "256"^^xsd:int ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "maxbpm" ] ;
          vamp:value "190"^^xsd:float ;
      ] ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "maxdflen" ] ;
          vamp:value "10"^^xsd:float ;
      ] ;
      vamp:parameter_binding [
          vamp:parameter [ vamp:identifier "minbpm" ] ;
          vamp:value "50"^^xsd:float ;
      ] ;
      vamp:output <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo_output_tempo> .
  $

The output of -s is an RDF/Turtle document describing the default
settings for the Tempo output of the Fixed Tempo Estimator plugin in
the Vamp plugin SDK.

(The exact format of the RDF printed may differ -- e.g. if the
plugin's RDF description is not installed and so its "home" URI is not
known -- but the result should be functionally equivalent to this.)

You could run this transform by saving the RDF to a file and
specifying that file with -t:

  $ sonic-annotator -s vamp:vamp-example-plugins:fixedtempo:tempo > test.n3
  $ sonic-annotator -t test.n3 audio.wav -w csv --csv-stdout
  (... logging output on stderr, then ...)
  "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
  $

The single line of output above consists of the audio file name, the
timestamp and duration for a single feature, the value of that feature
(the estimated tempo of the given region of time from that file, in
bpm -- the plugin in question performs a single tempo estimation and
nothing else) and the feature's label.

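As a rough illustration of how a script might consume such a line, the
following Python sketch splits it into named fields using the standard
csv module. The names FeatureRow and parse_feature_line, and the field
names, are labels chosen here for illustration; they are not defined by
Sonic Annotator. The sketch assumes a single-valued feature with both a
duration and a label present, as in the line above.

```python
import csv
from typing import NamedTuple


class FeatureRow(NamedTuple):
    """One CSV row for a single-valued feature (field names are illustrative)."""
    filename: str
    timestamp: float  # seconds
    duration: float   # seconds
    value: float
    label: str


def parse_feature_line(line: str) -> FeatureRow:
    # csv.reader takes care of the double-quoted file name and label fields.
    fields = next(csv.reader([line]))
    filename, timestamp, duration, value, label = fields
    return FeatureRow(filename, float(timestamp), float(duration),
                      float(value), label)


# The sample line is the one shown in the tutorial output above.
row = parse_feature_line(
    '"audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"')
print(row.filename, row.value, row.label)
```

Features with several bin values, or without a duration or label, would
produce a different number of columns and need a more flexible parser.
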
A quicker way to achieve the above is to use the -d (default) option,
which tells Sonic Annotator to use the default configuration for a
named transform directly:

  $ sonic-annotator -d vamp:vamp-example-plugins:fixedtempo:tempo audio.wav -w csv --csv-stdout
  (... some log output on stderr, then ...)
  "audio.wav",0.002902494,5.196916099,68.7916,"68.8 bpm"
  $

Although handy for experimentation, the -d option is inadvisable in
any "production" situation because the plugin configuration is not
guaranteed to be the same each time (for example if an updated version
of a plugin changes some of its defaults). It's better to save a
well-defined transform to a file and refer to that, even if it is
simply the transform created by the skeleton option.

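One way to follow this advice from a script is sketched below in
Python: the skeleton is written to a transform file only if that file
does not exist yet, and every later run refers to the saved file with
-t. The helper names (skeleton_command, run_command, extract) are
hypothetical, and the sketch assumes that sonic-annotator is on the
PATH; only the -s, -t, -w and --csv-stdout options described above are
used.

```python
import subprocess
from pathlib import Path
from typing import List

TRANSFORM_ID = "vamp:vamp-example-plugins:fixedtempo:tempo"


def skeleton_command(transform_id: str) -> List[str]:
    # -s prints a skeleton transform description on stdout.
    return ["sonic-annotator", "-s", transform_id]


def run_command(transform_file: Path, audio_files: List[str]) -> List[str]:
    # -t names the saved transform; -w csv --csv-stdout writes CSV to stdout.
    return ["sonic-annotator", "-t", str(transform_file), *audio_files,
            "-w", "csv", "--csv-stdout"]


def extract(transform_file: Path, audio_files: List[str]) -> str:
    """Create the transform file once if missing, then run it (hypothetical helper)."""
    if not transform_file.exists():
        skeleton = subprocess.run(skeleton_command(TRANSFORM_ID), check=True,
                                  capture_output=True, text=True).stdout
        transform_file.write_text(skeleton)
    result = subprocess.run(run_command(transform_file, audio_files),
                            check=True, capture_output=True, text=True)
    return result.stdout
```

Calling extract(Path("test.n3"), ["audio.wav"]) would then reuse the
same test.n3 on every run, so a later change in a plugin's defaults
does not silently change the configuration.
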
To run more than one transform on the same audio files, just put more
than one set of transform RDF descriptions in the same file, or give
the -t option more than once with separate transform description
files. Remember that if you want to specify more than one transform
in the same file, they will need to have distinct URIs (that is, the
":transform" part of the example above, which may be any arbitrary
name, must be distinct for each described transform).


3. How and where to write the results

Sonic Annotator supports various output modules (and it is fairly
easy for a developer to add new ones). You have to choose at least
one output module; use the -w (writer) option to do so. Each module
has its own set of parameters which can be adjusted on the command
line, as well as its own default rules about where to write the
results.

The following writers are currently supported. (Others exist, but are
not properly implemented or not supported.)

 * csv

   Writes the results into comma-separated data files.

   One file is created for each transform applied to each input audio
   file, named after the input audio file and transform name with .csv
   suffix and ":" replaced by "_" throughout, placed in the same
   directory as the audio file.

   To instruct Sonic Annotator to place the output files in another
   location, use --csv-basedir with a directory name.

   To write a single file with all data in it, use --csv-one-file.

   To write all data to stdout instead of to a file, use --csv-stdout.

   Sonic Annotator will not write to an output file that already
   exists. If you want to make it do this, use --csv-force to
   overwrite or --csv-append to append to it.

   The data generated consists of one line for each result feature,
   containing the feature timestamp, feature duration if present, all
   of the feature's bin values in order, followed by the feature's
   label if present. If the --csv-one-file or --csv-stdout option is
   specified, then an additional column will appear before any of the
   above, containing the audio file name from which the feature was
   extracted, if it differs from that of the previous row.

   The default column separator is a comma; you can specify a
   different one with the --csv-separator option.

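Because the file name column is only filled in when the file changes, a
script reading --csv-one-file or --csv-stdout output may want to repeat
the name on every row. The Python sketch below does that. It assumes
that an omitted file name appears as an empty first field, and the
function name fill_file_names and the sample rows are made up for
illustration.

```python
import csv
import io
from typing import Iterator, List


def fill_file_names(csv_text: str, delimiter: str = ",") -> Iterator[List[str]]:
    """Yield rows with the leading file-name column repeated on every row."""
    current = ""
    for row in csv.reader(io.StringIO(csv_text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        if row[0]:
            current = row[0]  # a new audio file starts at this row
        yield [current] + row[1:]


# Made-up rows (file name, timestamp, value), for illustration only.
sample = '"a.wav",0.0,1.5\n,0.5,2.5\n"b.wav",0.0,3.5\n'
rows = list(fill_file_names(sample))
for r in rows:
    print(r)
```

The delimiter argument is there for output written with a different
--csv-separator.
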
 * rdf

   Writes the results into RDF/Turtle documents following the Audio
   Features ontology (http://purl.org/ontology/af/).

   One file is created for each input audio file containing the
   features extracted by all transforms applied to that file, named
   after the input audio file with .n3 extension, placed in the same
   directory as the audio file.

   To instruct Sonic Annotator to place the output files in another
   location, use --rdf-basedir with a directory name.

   To write a single file with all data (from all input audio files)
   in it, use --rdf-one-file.

   To write one file for each transform applied to each input audio
   file, named after the input audio file and transform name with .n3
   suffix and ":" replaced by "_" throughout, use --rdf-many-files.

   To write all data to stdout instead of to a file, use --rdf-stdout.

   Sonic Annotator will not write to an output file that already
   exists. If you want to make it do this, use --rdf-force to
   overwrite or --rdf-append to append to it.

   Sonic Annotator will use plugin description RDF if available to
   enhance its output (for example identifying note onset times as
   note onset times, if the plugin's RDF says that is what it
   produces, rather than writing them as plain events). Best results
   will be obtained if an RDF document is provided with your plugins
   (for example, vamp-example-plugins.n3) and you have this installed
   in the same location as the plugins. To override this enhanced
   output and write plain events for all features, use --rdf-plain.

   The output RDF will include an available_as property linking the
   results to the original audio signal URI. By default, this will
   point to the URI of the file or resource containing the audio that
   Sonic Annotator processed, such as the file:/// location on disk.
   To override this, for example to process a local copy of a file
   while generating RDF that describes a copy of it available on a
   network, you can use the --rdf-signal-uri option to specify an
   alternative signal URI.


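The default naming rules for the csv and rdf writers can be
approximated in a few lines of Python, for example to predict which
files a run will create. This is only a sketch: the text above does not
say how the audio file name and transform name are joined, so the code
assumes that the audio file's extension is dropped and that the two
parts are joined with "_". The function name output_path is
hypothetical. Check the result against real output before relying on
it.

```python
from pathlib import PurePosixPath


def output_path(audio_path: str, transform_id: str, suffix: str,
                per_transform: bool = True) -> str:
    """Guess the default output path for one audio file (assumptions above)."""
    audio = PurePosixPath(audio_path)
    name = audio.stem  # assumption: the audio extension is dropped
    if per_transform:
        # csv default and --rdf-many-files: one file per transform
        name += "_" + transform_id  # assumption: joined with "_"
    name = name.replace(":", "_")  # ":" replaced by "_" throughout
    # Same directory as the audio file, with the writer's suffix.
    return str(audio.with_name(name + "." + suffix))


tid = "vamp:vamp-example-plugins:fixedtempo:tempo"
print(output_path("music/audio.wav", tid, "csv"))
print(output_path("music/audio.wav", tid, "n3", per_transform=False))
```

Here per_transform=False corresponds to the rdf writer's default of one
.n3 file per input audio file.
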
4. Optionally, how to summarise the features

Sonic Annotator can also calculate and write summaries of features,
such as mean and median values.

To obtain a summary as well as the feature results, just use the -S
option, naming the type of summary you want (min, max, mean, median,
mode, sum, variance, sd or count). You can also tell it to produce
only the summary, not the individual features, with --summary-only.

Alternatively, you can specify a summary in a transform description.
The following example tells Sonic Annotator to write both the times of
note onsets estimated by the simple percussion onset detector example
plugin, and the variance of the plugin's onset detection function.
(It will only process the audio file and run the plugin once.)

  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
  @prefix vamp: <http://purl.org/ontology/vamp/>.
  @prefix examples: <http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#>.
  @prefix : <#>.

  :transform1 a vamp:Transform;
      vamp:plugin examples:percussiononsets ;
      vamp:output examples:percussiononsets_output_onsets .

  :transform0 a vamp:Transform;
      vamp:plugin examples:percussiononsets ;
      vamp:output examples:percussiononsets_output_detectionfunction ;
      vamp:summary_type "variance" .

Sonic Annotator can also summarise in segments -- if you provide a
comma-separated list of times as an argument to the --segments option,
it will calculate one summary for each segment bounded by the times
you provided. For example,

  $ sonic-annotator -d vamp:vamp-example-plugins:percussiononsets:detectionfunction -S variance --summary-only --segments 1,2,3 -w csv --csv-stdout audio.wav
  (... some log output on stderr, then ...)
  ,0.000000000,1.000000000,variance,1723.99,"(variance, continuous-time average)"
  ,1.000000000,1.000000000,variance,1981.75,"(variance, continuous-time average)"
  ,2.000000000,1.000000000,variance,1248.79,"(variance, continuous-time average)"
  ,3.000000000,7.031020407,variance,1030.06,"(variance, continuous-time average)"

Here the first row contains a summary covering the time period from 0
to 1 second, the second from 1 to 2 seconds, the third from 2 to 3
seconds and the fourth from 3 seconds to the end of the (short) audio
file.

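The way a list of boundary times divides a file into segments can be
sketched in Python as follows. The function names are hypothetical. The
end time of about 10.03 seconds is inferred from the last row of the
example output (start 3 plus duration 7.031020407), and the sketch
assumes that every boundary lies strictly between 0 and the end of the
file.

```python
from typing import List, Tuple


def parse_segments_option(arg: str) -> List[float]:
    """Turn a --segments style argument such as "1,2,3" into a list of times."""
    return [float(t) for t in arg.split(",")]


def segment_ranges(boundaries: List[float],
                   end_time: float) -> List[Tuple[float, float]]:
    """Return the (start, end) pair of each segment bounded by the given times."""
    # Segments run from 0 to the first boundary, between consecutive
    # boundaries, and from the last boundary to the end of the file.
    times = [0.0] + sorted(boundaries) + [float(end_time)]
    return list(zip(times[:-1], times[1:]))


ranges = segment_ranges(parse_segments_option("1,2,3"), 10.031020407)
for start, end in ranges:
    print(start, end)
```

Three boundary times therefore give four segments, matching the four
summary rows in the example output.
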
