annotate README.rst @ 353:5edf3f0ad2bc

Update plots with matplotlib 3.2.1
author Amine Sehili <amine.sehili@gmail.com>
date Tue, 31 Mar 2020 22:46:30 +0200
parents 1076056833c5
children 0cd0210a0dac
rev   line source
amsehili@349 1
amsehili@343 2
amine@344 3 .. image:: doc/figures/auditok-logo.png
amsehili@343 4 :align: center
amsehili@343 5
amine@336 6 .. image:: https://travis-ci.org/amsehili/auditok.svg?branch=master
amine@336 7 :target: https://travis-ci.org/amsehili/auditok
amine@336 8
amine@336 9 .. image:: https://readthedocs.org/projects/auditok/badge/?version=latest
amine@336 10 :target: http://auditok.readthedocs.org/en/latest/?badge=latest
amine@336 11 :alt: Documentation Status
amine@336 12
amsehili@349 13 ``auditok`` is an **Audio Activity Detection** tool that can process online data (read from an audio device or from standard input) as well as audio files. It can be used as a command line program or by calling its API.
amsehili@349 14
amsehili@349 15 A basic version of ``auditok`` will run with standard Python (>=3.4). Without installing additional dependencies, ``auditok`` can only deal with audio files in *wav* or *raw* formats. If you want more features, the following packages are needed:
amsehili@349 16
amsehili@349 17 - `pydub <https://github.com/jiaaro/pydub>`_ : read audio files in popular audio formats (ogg, mp3, etc.) or extract audio from a video file.
amsehili@349 18
amsehili@349 19 - `pyaudio <http://people.csail.mit.edu/hubert/pyaudio/>`_ : read audio data from the microphone and play back detections.
amsehili@349 20
amsehili@349 21 - `tqdm <https://github.com/tqdm/tqdm>`_ : show progress bar while playing audio clips.
amsehili@349 22
amsehili@349 23 - `matplotlib <http://matplotlib.org/>`_ : plot audio signal and detections (see the figure below).
amsehili@349 24
amsehili@349 25 - `numpy <http://www.numpy.org>`_ : required by matplotlib. When available, it is also used instead of standard Python for some math operations.
amsehili@349 26
amsehili@349 27 Installation
amsehili@349 28 ------------
amsehili@349 29
amsehili@349 30 .. code:: bash
amsehili@349 31
amsehili@349 32 git clone https://github.com/amsehili/auditok.git
amsehili@349 33 cd auditok
amsehili@349 34 python setup.py install
amsehili@349 35
amsehili@343 36 Basic example
amsehili@343 37 -------------
amsehili@343 38
amine@336 39 .. code:: python
amine@336 40
amine@336 41 from auditok import split
amsehili@343 42
amsehili@343 43 # split returns a generator of AudioRegion objects
amine@336 44 audio_regions = split("audio.wav")
amine@336 45 for region in audio_regions:
amine@336 46     region.play(progress_bar=True)
amine@336 47     filename = region.save("/tmp/region_{meta.start:.3f}.wav")
amine@336 48     print("region saved as: {}".format(filename))
amine@336 49
amsehili@343 50 Example using `AudioRegion`
amsehili@349 51 ---------------------------
amine@336 52
amine@336 53 .. code:: python
amine@336 54
amine@336 55 from auditok import AudioRegion
amine@336 56 region = AudioRegion.load("audio.wav")
amsehili@343 57 regions = region.split_and_plot() # or just region.splitp()
amine@336 58
amsehili@349 59 Output figure:
amine@336 60
amine@336 61 .. image:: doc/figures/example_1.png
amsehili@349 62
amsehili@349 63 Working with AudioRegions
amsehili@349 64 -------------------------
amsehili@349 65
amsehili@349 66 Beyond splitting, there are several useful operations you can perform on ``AudioRegion`` objects.
amsehili@349 67
amsehili@349 68 Concatenate regions
amsehili@349 69 ===================
amsehili@349 70
amsehili@349 71 .. code:: python
amsehili@349 72
amsehili@349 73 from auditok import AudioRegion
amsehili@349 74 region_1 = AudioRegion.load("audio_1.wav")
amsehili@349 75 region_2 = AudioRegion.load("audio_2.wav")
amsehili@349 76 region_3 = region_1 + region_2
amsehili@349 77
amsehili@349 78 This is particularly useful if you want to join regions returned by ``split``:
amsehili@349 79
amsehili@349 80 .. code:: python
amsehili@349 81
amsehili@349 82 from auditok import AudioRegion
amsehili@349 83 regions = AudioRegion.load("audio.wav").split()
amsehili@349 84 gapless_region = sum(regions)
amsehili@349 85
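The built-in ``sum`` starts from the integer ``0``, so the first addition it performs is ``0 + region``. A minimal sketch of one way a class can support this, using a hypothetical ``Clip`` class (a stand-in for illustration, not auditok's ``AudioRegion`` implementation):

```python
class Clip:
    """Hypothetical stand-in for an audio region; NOT auditok's AudioRegion."""

    def __init__(self, samples):
        self.samples = list(samples)

    def __add__(self, other):
        # Concatenation: the samples of `other` follow those of `self`
        return Clip(self.samples + other.samples)

    def __radd__(self, other):
        # sum() begins with 0 + first_item, so treat 0 as the neutral element
        if other == 0:
            return self
        return NotImplemented

    def __eq__(self, other):
        return isinstance(other, Clip) and self.samples == other.samples


clips = [Clip([1, 2]), Clip([3]), Clip([4, 5])]
joined = sum(clips)  # same as clips[0] + clips[1] + clips[2]
```

Without ``__radd__``, ``sum(clips)`` would raise a ``TypeError`` on the initial ``0 + clip``.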
amsehili@349 86 Repeat a region
amsehili@349 87 ===============
amsehili@349 88
amsehili@349 89 Multiply by a positive integer:
amsehili@349 90
amsehili@349 91 .. code:: python
amsehili@349 92
amsehili@349 93 from auditok import AudioRegion
amsehili@349 94 region = AudioRegion.load("audio.wav")
amsehili@349 95 region_x3 = region * 3
amsehili@349 96
amsehili@349 97 Make slices of equal size out of a region
amsehili@349 98 =========================================
amsehili@349 99
amsehili@349 100 Divide by a positive integer:
amsehili@349 101
amsehili@349 102 .. code:: python
amsehili@349 103
amsehili@349 104 from auditok import AudioRegion
amsehili@349 105 region = AudioRegion.load("audio.wav")
amsehili@349 106 regions = region / 5
amsehili@349 107 assert sum(regions) == region
amsehili@349 108
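To illustrate the idea on plain data, here is a rough sketch that cuts a list of samples into ``n`` contiguous slices and checks that joining them gives back the original. The function name ``equal_slices`` is hypothetical. How leftover samples are distributed when the length is not a multiple of ``n`` is an assumption of this sketch, and ``auditok`` may handle that case differently:

```python
def equal_slices(samples, n):
    """Split `samples` into `n` contiguous slices whose lengths differ by at most one."""
    if not isinstance(n, int) or n <= 0:
        raise ValueError("n must be a positive integer")
    base, extra = divmod(len(samples), n)
    slices, start = [], 0
    for i in range(n):
        # Assumption: the first `extra` slices each take one leftover sample
        stop = start + base + (1 if i < extra else 0)
        slices.append(samples[start:stop])
        start = stop
    return slices


samples = list(range(10))
parts = equal_slices(samples, 3)
# Joining the slices in order gives back the original sequence
rejoined = [s for part in parts for s in part]
```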
amsehili@349 109 Make audio slices of arbitrary size
amsehili@349 110 ===================================
amsehili@349 111
amsehili@349 112 Slicing an ``AudioRegion`` is useful in many situations. For example, you can remove a fixed-size portion of audio data from the beginning or the end of a region, or crop a region by an arbitrary amount as a data augmentation strategy.
amsehili@349 113
amsehili@349 114 The most accurate way to slice an ``AudioRegion`` is to use indices that directly refer to raw audio samples. In the following example, assuming that the sampling rate of the audio data is 16000, you can extract a 5-second region from the main region, starting at the 20th second, as follows:
amsehili@349 115
amsehili@349 116 .. code:: python
amsehili@349 117
amsehili@349 118 from auditok import AudioRegion
amsehili@349 119 region = AudioRegion.load("audio.wav")
amsehili@349 120 start = 20 * 16000
amsehili@349 121 stop = 25 * 16000
amsehili@349 122 five_second_region = region[start:stop]
amsehili@349 123
amsehili@349 124 This allows you to start and stop at practically any sample within the region. Just as with a ``list``, you can omit ``start``, ``stop``, or both. You can also use negative indices:
amsehili@349 125
amsehili@349 126 .. code:: python
amsehili@349 127
amsehili@349 128 from auditok import AudioRegion
amsehili@349 129 region = AudioRegion.load("audio.wav")
amsehili@349 130 start = -3 * region.sr # `sr` is an alias of `sampling_rate`
amsehili@349 131 three_last_seconds = region[start:]
amsehili@349 132
amsehili@349 133 While slicing by raw samples is accurate, slicing with temporal indices is more intuitive. You can do so by accessing the ``millis`` or ``seconds`` views of ``AudioRegion`` (or their shortcut aliases ``ms`` and ``sec``/``s``).
amsehili@349 134
amsehili@349 135 With the ``millis`` view:
amsehili@349 136
amsehili@349 137 .. code:: python
amsehili@349 138
amsehili@349 139 from auditok import AudioRegion
amsehili@349 140 region = AudioRegion.load("audio.wav")
amsehili@349 141 five_second_region = region.millis[5000:10000]
amsehili@349 142
amsehili@349 143 or with the ``seconds`` view:
amsehili@349 144
amsehili@349 145 .. code:: python
amsehili@349 146
amsehili@349 147 from auditok import AudioRegion
amsehili@349 148 region = AudioRegion.load("audio.wav")
amsehili@349 149 five_second_region = region.seconds[5:10]
amsehili@349 150
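Conceptually, a temporal slice is a sample slice whose bounds are multiplied by the sampling rate. The sketch below shows that conversion on a plain list. The helper ``time_slice`` and its ``unit`` parameter are hypothetical, and rounding to the nearest sample is an assumption of this sketch rather than documented ``auditok`` behavior:

```python
def time_slice(samples, sampling_rate, start=None, stop=None, unit=1.0):
    """Slice `samples` with time bounds.

    `unit` is the number of seconds per time unit: 1.0 for seconds, 0.001 for
    milliseconds. Omitted (None) and negative bounds behave as in list slicing.
    """

    def to_index(t):
        if t is None:
            return None
        # Assumption: round to the nearest sample
        return int(round(t * unit * sampling_rate))

    return samples[to_index(start):to_index(stop)]


sr = 8  # tiny sampling rate to keep the numbers readable
samples = list(range(10 * sr))  # 10 seconds of fake audio

by_seconds = time_slice(samples, sr, 5, 10)                  # like region.seconds[5:10]
by_millis = time_slice(samples, sr, 5000, 10000, unit=0.001)  # like region.millis[5000:10000]
last_three = time_slice(samples, sr, start=-3)               # last 3 seconds
```

Both ``by_seconds`` and ``by_millis`` cover samples 40 to 79 here, and ``last_three`` holds the final 24 samples.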
amsehili@349 151 Get an array of audio samples
amsehili@349 152 =============================
amsehili@349 153
amsehili@349 154 .. code:: python
amsehili@349 155
amsehili@349 156 from auditok import AudioRegion
amsehili@349 157 region = AudioRegion.load("audio.wav")
amsehili@349 158 samples = region.samples
amsehili@349 159
amsehili@349 160 If ``numpy`` is installed, this will return a ``numpy.ndarray``. If audio data is mono, the returned array is 1D; otherwise it is 2D. If ``numpy`` is not installed, this will return a standard ``array.array`` for mono data, and a list of ``array.array`` for multichannel data.
amsehili@349 161
amsehili@349 162 Alternatively, you can use:
amsehili@349 163
amsehili@349 164 .. code:: python
amsehili@349 165
amsehili@349 166 import numpy as np
from auditok import AudioRegion
amsehili@349 167 region = AudioRegion.load("audio.wav")
amsehili@349 168 samples = np.asarray(region)
amsehili@349 169
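When ``numpy`` is not installed, code that consumes ``region.samples`` has to cope with the two layouts described above: a single ``array.array`` for mono data, or a list of ``array.array`` for multichannel data. A small sketch of handling both uniformly, using hand-made stand-in data instead of a real region (the helper ``peak_per_channel`` and the 16-bit ``"h"`` type code are assumptions made for illustration):

```python
from array import array


def peak_per_channel(samples):
    """Return the peak absolute amplitude of each channel as a list."""
    # Multichannel data: a list with one array.array per channel.
    # Mono data: a single array.array, wrapped here so both cases look alike.
    channels = samples if isinstance(samples, list) else [samples]
    return [max(abs(s) for s in channel) for channel in channels]


# Stand-ins for what `region.samples` is described to return without numpy
mono = array("h", [0, -120, 75])
stereo = [array("h", [10, -20, 30]), array("h", [-5, 50, -40])]

mono_peaks = peak_per_channel(mono)      # one value
stereo_peaks = peak_per_channel(stereo)  # one value per channel
```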
amsehili@349 170 License
amsehili@349 171 -------
amsehili@349 172 MIT.
amsehili@349 173