auditok: README.md annotate

annotate README.md @ 23:2beb3fb562f3

doc update

author	Amine Sehili <amine.sehili@gmail.com>
date	Sun, 29 Nov 2015 11:52:56 +0100
parents	8c164d41bbbf
children	9699fc1478a5 4e62d1463588

rev	line source
amsehili@11	1 [![Build Status](https://travis-ci.org/amsehili/auditok.svg?branch=master)](https://travis-ci.org/amsehili/auditok)
amsehili@11	2 AUDIo TOKenizer
amine@2	3 ===============
amine@2	4
amsehili@20	5 `auditok` is an Audio Activity Detection tool that can process online data (read from an audio device or from standard input) as well as audio files. It can be used as a command line program and offers an easy-to-use API.
amsehili@20	6
amsehili@20	7 The following two figures illustrate the detector output when:
amsehili@20	8
amsehili@20	9 1. the detector tolerates phases of silence of up to 0.3 second (300 ms) within an audio activity (also referred to as acoustic event):
amsehili@20	10 ![](doc/figures/figure_1.png)
amsehili@20	11
amsehili@20	12 2. the detector splits an audio activity into several activities if the silence within it exceeds 0.2 second (200 ms):
amsehili@20	13 ![](doc/figures/figure_2.png)
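
The effect of the silence tolerance in the two cases above can be sketched in plain Python. This is only an illustration, not `auditok`'s implementation: the function name `group_frames`, the 10 ms frame length, and the choice to drop trailing silence from each event are all assumptions made for the example.

```python
def group_frames(valid, max_silence):
    """Group per-frame validity flags into (start, end) events.

    valid: list of bools, one per analysis frame (True = activity).
    max_silence: longest run of silent frames tolerated inside an event.
    Returns inclusive (start, end) frame indices; trailing silence is dropped.
    """
    events = []
    start = None       # first frame of the event being built
    last_valid = None  # most recent active frame of that event
    for i, is_valid in enumerate(valid):
        if is_valid:
            if start is None:
                start = i
            last_valid = i
        elif start is not None and i - last_valid > max_silence:
            # The silent run is longer than the tolerance: close the event.
            events.append((start, last_valid))
            start = None
    if start is not None:
        events.append((start, last_valid))
    return events


# Hypothetical 10 ms frames: 250 ms of silence between two bursts of activity.
flags = [True] * 10 + [False] * 25 + [True] * 10
print(group_frames(flags, max_silence=30))  # 300 ms tolerance: one event
print(group_frames(flags, max_silence=20))  # 200 ms tolerance: two events
```

With a 300 ms tolerance the 250 ms gap is absorbed into a single activity. With a 200 ms tolerance the same signal is split in two.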
amsehili@20	14
amine@2	15
amine@2	16 Requirements
amine@2	17 ------------
amsehili@20	18 `auditok` can be used with standard Python!
amsehili@20	19 However, if you want more features, the following packages are needed:
amsehili@20	20 - [pydub](https://github.com/jiaaro/pydub): read audio files of popular audio formats (ogg, mp3, etc.) or extract audio from a video file
amsehili@20	21 - [PyAudio](http://people.csail.mit.edu/hubert/pyaudio/): read audio data from the microphone and play back detections
amine@21	22 - `matplotlib`: plot audio signal and detections (see figures above)
amine@21	23 - `numpy`: required by `matplotlib`. Also used for math operations instead of standard Python, if available
amsehili@20	24 - Optionally, you can use `sox` or `parecord` for data acquisition and feed `auditok` using a pipe.
amsehili@20	25
amine@2	26
amine@2	27 Installation
amine@2	28 ------------
amine@4	29 python setup.py install
amine@2	30
amine@21	31 Command line usage
amine@21	32 ------------------
amine@21	33
amine@21	34 The first thing you want to check is perhaps how `auditok` detects your voice. If you have installed `PyAudio`, just run (`Ctrl-C` to stop):
amine@21	35
amine@21	36 auditok -D -E
amine@21	37
amine@21	38 Option `-D` means debug, whereas `-E` stands for echo, so `auditok` plays back whatever it detects.
amine@21	39
amine@21	40 If there are too many detections, raise the energy threshold with option `-e` (default: 45). The current version only implements a `validator` based on an energy threshold; using spectral information is also desirable and might be part of future releases. For example:
amine@21	41
amine@21	42 auditok -D -E -e 55
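
To give a feel for what an energy threshold does, here is a minimal sketch of a frame-level energy test. It assumes the energy is measured as 10·log10 of the mean squared amplitude of signed 16-bit samples. That formula, and the names `log_energy` and `is_active`, are assumptions for illustration and may differ from what `auditok` computes internally.

```python
import array
import math


def log_energy(frame_bytes):
    """Return 10*log10 of the mean squared amplitude of 16-bit samples."""
    samples = array.array("h", frame_bytes)  # native-endian signed 16-bit
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    # Floor at a tiny value so an all-zero frame does not hit log10(0).
    return 10.0 * math.log10(max(mean_square, 1e-10))


def is_active(frame_bytes, threshold=45.0):
    """A frame counts as activity when its log energy reaches the threshold."""
    return log_energy(frame_bytes) >= threshold


quiet = array.array("h", [10] * 160).tobytes()   # constant amplitude 10
loud = array.array("h", [1000] * 160).tobytes()  # constant amplitude 1000
print(is_active(quiet), is_active(loud))
print(is_active(loud, threshold=55), is_active(loud, threshold=65))
```

Raising the threshold makes the test stricter, so fewer frames qualify as activity and fewer detections are produced.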
amine@21	43
amine@21	44 If you don't have `PyAudio`, you can use `sox` for data acquisition (`sudo apt-get install sox`):
amine@21	45
amine@21	46 rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -r 16000 -i -
amine@21	47
amine@21	48 With `-i -`, `auditok` reads data from standard input.
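
As a rough picture of what consuming raw audio from a pipe involves, the sketch below cuts a binary stream into fixed-size analysis windows. The helper `read_windows` and the 10 ms window length are assumptions chosen for the example; an in-memory buffer stands in for standard input (in a real script the stream would be `sys.stdin.buffer`).

```python
import io


def read_windows(stream, window_bytes):
    """Yield fixed-size byte windows from a binary stream until it ends."""
    while True:
        block = stream.read(window_bytes)
        if len(block) < window_bytes:
            break  # end of stream; a partial final window is dropped here
        yield block


# 16000 samples/s * 2 bytes/sample * 0.01 s = 320 bytes per window
# (mono, 16-bit, matching the `rec` options used above).
WINDOW_BYTES = 16000 * 2 // 100

fake_stdin = io.BytesIO(bytes(1000))  # stand-in for sys.stdin.buffer
print(sum(1 for _ in read_windows(fake_stdin, WINDOW_BYTES)), "windows read")
```

The sampling rate passed with `-r` matters because raw data carries no header: the reader has to be told how many bytes make up a given duration.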
amine@21	49
amine@21	50 `rec` and `play` are just aliases for `sox`. With this setup you won't be able to play back detections, because `-E` requires `PyAudio`. Fortunately, `auditok` can call any command each time it detects an activity, passing the activity as a file to the user-supplied command:
amine@21	51
amine@21	52 rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i - -r 16000 -C "play -q -t raw -r 16000 -c 1 -b 16 -e signed $"
amine@21	53
amine@21	54 The `-C` option tells `auditok` to interpret its argument as a command that is run whenever `auditok` detects an audio activity, replacing `$` with the name of a temporary file into which the activity is saved as raw audio. Here we use `play` to play the activity, giving the necessary `play` arguments for raw data.
amine@21	55
amine@21	56 The `-C` option can be useful in many cases. Imagine a command that sends audio data over a network only if there is an audio activity and saves bandwidth during silence.
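
The mechanism behind `-C` can be pictured with the following sketch: save the detected audio to a temporary file, substitute its name for `$`, and run the resulting command. This is a hypothetical reimplementation for illustration, not `auditok`'s own code. The function name `run_on_detection` is an assumption, and the demo command simply prints the file size using the current Python interpreter so that it runs without `sox`.

```python
import os
import shlex
import subprocess
import sys
import tempfile


def run_on_detection(command_template, raw_audio):
    """Save raw_audio to a temporary file and run the command on it."""
    with tempfile.NamedTemporaryFile(suffix=".raw", delete=False) as tmp:
        tmp.write(raw_audio)
        path = tmp.name
    try:
        # Split the template into arguments, then replace the placeholder.
        args = [arg.replace("$", path) for arg in shlex.split(command_template)]
        return subprocess.call(args)
    finally:
        os.remove(path)  # the temporary file is only needed for one call


# Demo command: print the size in bytes of the file passed in place of `$`.
show_size = (
    shlex.quote(sys.executable)
    + ' -c "import os, sys; print(os.path.getsize(sys.argv[1]))" $'
)
status = run_on_detection(show_size, bytes(320))
print("exit status:", status)
```

Any program that accepts a file name could take the place of the demo command, for example an uploader that sends each detection over the network.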
amine@21	57
amine@21	58 ### Plot signal and detections
amine@21	59
amine@21	60 Use option `-p`. This requires `matplotlib` and `numpy`.
amine@21	61
amine@21	62 ### Read data from file
amine@21	63
amine@21	64 auditok -i input.wav ...
amine@21	65
amine@21	66 Install `pydub` for other audio formats.
amine@21	67
amine@21	68 ### Limit the length of acquired data
amine@21	69
amine@21	70 auditok -M 12 ...
amine@21	71
amine@21	72 Time is in seconds.
amine@21	73
amine@21	74 ### Save the whole acquired audio signal
amine@21	75
amine@21	76 auditok -O output.wav ...
amine@21	77
amine@21	78 Install `pydub` for other audio formats.
amine@21	79
amine@21	80
amine@21	81 ### Save each detection into a separate audio file
amine@21	82
amine@21	83 auditok -o det_{N}_{start}_{end}.wav ...
amine@21	84
amine@21	85 You can use free text and place `{N}`, `{start}` and `{end}` wherever you want. They are replaced by the detection number, start time and end time respectively. Another example:
amine@21	86
amine@21	87 auditok -o {start}-{end}.wav ...
amine@21	88
amine@21	89 Install `pydub` for more audio formats.
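
The placeholder substitution used by `-o` behaves much like Python's `str.format`. The sketch below only illustrates the idea: the helper `detection_filename` is hypothetical, and how `auditok` actually formats the start and end times (units, number of decimals) is not specified here, so the values shown are assumptions.

```python
def detection_filename(template, number, start, end):
    """Fill {N}, {start} and {end} in a file name template."""
    return template.format(N=number, start=start, end=end)


# Hypothetical detections with start and end times in seconds.
print(detection_filename("det_{N}_{start}_{end}.wav", 1, 1.25, 2.8))
print(detection_filename("{start}-{end}.wav", 2, 3.0, 4.5))
```

Including `{N}` or both time placeholders keeps each detection's file name unique, so one detection does not overwrite another.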
amine@21	90
amine@2	91 Demos
amine@2	92 -----
amine@2	93 This code reads data from the microphone and plays back whatever it detects.
amine@3	94
amine@2	95 python demos/echo.py
amine@2	96
amine@2	97 `echo.py` accepts two arguments: energy threshold (default=45) and duration in seconds (default=10):
amine@2	98
amine@2	99 python demos/echo.py 50 15
amine@2	100
amine@4	101 If only one argument is given, it is used as the energy threshold.
amine@4	102
amine@4	103 Try out this demo with an audio file (no argument is required):
amine@4	104
amine@4	105 python demos/audio_tokenize_demo.py
amine@4	106
amsehili@6	107 Finally, in this demo `auditok` is used to remove trailing and leading silence from an audio file:
amine@4	108
amine@4	109 python demos/audio_trim_demo.py
amine@2	110
amine@2	111 Documentation
amine@2	112 -------------
amine@2	113
amsehili@6	114 Check out this [quick start](https://github.com/amsehili/auditok/blob/master/quickstart.rst) or the [API documentation](http://amsehili.github.io/auditok/pdoc/).
amsehili@6	115
amine@2	116
amine@2	117 Contribution
amine@2	118 ------------
amine@2	119 Contributions are very much appreciated!
amine@2	120
amine@2	121 License
amine@2	122 -------
amine@2	123 `auditok` is published under the GNU General Public License Version 3.
amine@2	124
amine@2	125 Author
amine@2	126 ------
amine@2	127 Amine Sehili (<amine.sehili@gmail.com>)
amine@21	128
