auditok: README.rst annotate

annotate README.rst @ 368:683c98b7f5a6

Put documentation in numpy style

author	Amine Sehili <amine.sehili@gmail.com>
date	Sun, 10 Jan 2021 17:11:07 +0100
parents	0cd0210a0dac
children	2e26a7c5f300

rev	line source
amsehili@349	1
amsehili@343	2
amine@344	3 .. image:: doc/figures/auditok-logo.png
amsehili@343	4 :align: center
amsehili@343	5
amine@336	6 .. image:: https://travis-ci.org/amsehili/auditok.svg?branch=master
amine@336	7 :target: https://travis-ci.org/amsehili/auditok
amine@336	8
amine@336	9 .. image:: https://readthedocs.org/projects/auditok/badge/?version=latest
amine@336	10 :target: http://auditok.readthedocs.org/en/latest/?badge=latest
amine@336	11 :alt: Documentation Status
amine@336	12
amsehili@349	13 ``auditok`` is an Audio Activity Detection tool that can process online data (read from an audio device or from standard input) as well as audio files. It can be used as a command line program or by calling its API.
amsehili@349	14
amsehili@349	15 A basic version of ``auditok`` will run with standard Python (>=3.4). Without installing additional dependencies, ``auditok`` can only deal with audio files in wav or raw formats. If you want more features, the following packages are needed:
amsehili@349	16
amsehili@349	17 - `pydub <https://github.com/jiaaro/pydub>`_ : read audio files in popular audio formats (ogg, mp3, etc.) or extract audio from a video file.
amsehili@349	18
amsehili@349	19 - `pyaudio <http://people.csail.mit.edu/hubert/pyaudio/>`_ : read audio data from the microphone and play back detections.
amsehili@349	20
amsehili@349	21 - `tqdm <https://github.com/tqdm/tqdm>`_ : show progress bar while playing audio clips.
amsehili@349	22
amsehili@354	23 - `matplotlib <http://matplotlib.org/>`_ : plot audio signal and detections.
amsehili@349	24
amsehili@349	25 - `numpy <http://www.numpy.org>`_ : required by matplotlib. Also used, if available, for some math operations instead of standard Python.
amsehili@349	26
amsehili@349	27 Installation
amsehili@349	28 ------------
amsehili@349	29
amsehili@349	30 .. code:: bash
amsehili@349	31
amsehili@354	32 pip install git+https://github.com/amsehili/auditok
amsehili@354	33
amsehili@349	34
amsehili@343	35 Basic example
amsehili@343	36 -------------
amsehili@343	37
amine@336	38 .. code:: python
amine@336	39
amine@336	40 from auditok import split
amsehili@343	41
amsehili@343	42 # split returns a generator of AudioRegion objects
amine@336	43 audio_regions = split("audio.wav")
amine@336	44 for region in audio_regions:
amine@336	45     region.play(progress_bar=True)
amine@336	46     filename = region.save("/tmp/region_{meta.start:.3f}.wav")
amine@336	47     print("region saved as: {}".format(filename))
amine@336	48
amsehili@343	49 Example using `AudioRegion`
amsehili@349	50 ---------------------------
amine@336	51
amine@336	52 .. code:: python
amine@336	53
amine@336	54 from auditok import AudioRegion
amine@336	55 region = AudioRegion.load("audio.wav")
amsehili@343	56 regions = region.split_and_plot() # or just region.splitp()
amine@336	57
amsehili@349	58 Output figure:
amine@336	59
amine@336	60 .. image:: doc/figures/example_1.png
amsehili@349	61
amsehili@349	62 Working with AudioRegions
amsehili@349	63 -------------------------
amsehili@349	64
amsehili@349	65 Beyond splitting, there are a couple of interesting operations you can do with ``AudioRegion`` objects.
amsehili@349	66
amsehili@349	67 Concatenate regions
amsehili@349	68 ===================
amsehili@349	69
amsehili@349	70 .. code:: python
amsehili@349	71
amsehili@349	72 from auditok import AudioRegion
amsehili@349	73 region_1 = AudioRegion.load("audio_1.wav")
amsehili@349	74 region_2 = AudioRegion.load("audio_2.wav")
amsehili@349	75 region_3 = region_1 + region_2
amsehili@349	76
amsehili@349	77 Particularly useful if you want to join regions returned by ``split``:
amsehili@349	78
amsehili@349	79 .. code:: python
amsehili@349	80
amsehili@349	81 from auditok import AudioRegion
amsehili@349	82 regions = AudioRegion.load("audio.wav").split()
amsehili@349	83 gapless_region = sum(regions)
amsehili@349	84
amsehili@349	85 Repeat a region
amsehili@349	86 ===============
amsehili@349	87
amsehili@349	88 Multiply by a positive integer:
amsehili@349	89
amsehili@349	90 .. code:: python
amsehili@349	91
amsehili@349	92 from auditok import AudioRegion
amsehili@349	93 region = AudioRegion.load("audio.wav")
amsehili@349	94 region_x3 = region * 3
amsehili@349	95
amsehili@349	96 Make slices of equal size out of a region
amsehili@349	97 =========================================
amsehili@349	98
amsehili@349	99 Divide by a positive integer:
amsehili@349	100
amsehili@349	101 .. code:: python
amsehili@349	102
amsehili@349	103 from auditok import AudioRegion
amsehili@349	104 region = AudioRegion.load("audio.wav")
amsehili@349	105 regions = region / 5
amsehili@349	106 assert sum(regions) == region
amsehili@349	107
amsehili@349	108 Make audio slices of arbitrary size
amsehili@349	109 ===================================
amsehili@349	110
amsehili@349	111 Slicing an ``AudioRegion`` can be interesting in many situations. You can for example remove a fixed-size portion of audio data from the beginning or the end of a region or crop a region by an arbitrary amount as a data augmentation strategy, etc.
amsehili@349	112
amsehili@349	113 The most accurate way to slice an ``AudioRegion`` is to use indices that refer directly to raw audio samples. In the following example, assuming that the sampling rate of the audio data is 16000, you can extract a 5-second region from the main region, starting at the 20-second mark, as follows:
amsehili@349	114
amsehili@349	115 .. code:: python
amsehili@349	116
amsehili@349	117 from auditok import AudioRegion
amsehili@349	118 region = AudioRegion.load("audio.wav")
amsehili@349	119 start = 20 * 16000
amsehili@349	120 stop = 25 * 16000
amsehili@349	121 five_second_region = region[start:stop]
amsehili@349	122
amsehili@349	123 This lets you start and stop at practically any sample within the region. Just as with a ``list``, you can omit ``start``, ``stop``, or both. You can also use negative indices:
amsehili@349	124
amsehili@349	125 .. code:: python
amsehili@349	126
amsehili@349	127 from auditok import AudioRegion
amsehili@349	128 region = AudioRegion.load("audio.wav")
amsehili@349	129 start = -3 * region.sr # `sr` is an alias of `sampling_rate`
amsehili@349	130 three_last_seconds = region[start:]
amsehili@349	131
amsehili@349	132 While slicing by raw samples is accurate, slicing with temporal indices is more intuitive. You can do so by accessing the ``millis`` or ``seconds`` views of ``AudioRegion`` (or their shortcut aliases ``ms`` and ``sec``/``s``).
amsehili@349	133
amsehili@349	134 With the ``millis`` view:
amsehili@349	135
amsehili@349	136 .. code:: python
amsehili@349	137
amsehili@349	138 from auditok import AudioRegion
amsehili@349	139 region = AudioRegion.load("audio.wav")
amsehili@349	140 five_second_region = region.millis[5000:10000]
amsehili@349	141
amsehili@349	142 or with the ``seconds`` view:
amsehili@349	143
amsehili@349	144 .. code:: python
amsehili@349	145
amsehili@349	146 from auditok import AudioRegion
amsehili@349	147 region = AudioRegion.load("audio.wav")
amsehili@349	148 five_second_region = region.seconds[5:10]
amsehili@349	149
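The relationship between temporal and sample indices described above is simple arithmetic. The following is a minimal sketch of it, not ``auditok``'s own implementation: ``seconds_to_samples`` and ``millis_to_samples`` are hypothetical helpers introduced only for illustration, and the sampling rate of 16000 is the same assumption used earlier in this section.

.. code:: python

    def seconds_to_samples(seconds, sampling_rate):
        """Hypothetical helper: convert an offset in seconds to a sample index."""
        return int(round(seconds * sampling_rate))


    def millis_to_samples(millis, sampling_rate):
        """Hypothetical helper: convert an offset in milliseconds to a sample index."""
        return int(round(millis * sampling_rate / 1000))


    sampling_rate = 16000  # assumed, as in the sample-based slicing example
    start = seconds_to_samples(5, sampling_rate)
    stop = millis_to_samples(10000, sampling_rate)
    print(start, stop)

With these values, ``region[start:stop]`` should cover approximately the same span as ``region.seconds[5:10]`` or ``region.millis[5000:10000]`` for audio sampled at 16000 Hz.
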
amsehili@349	150 Get an array of audio samples
amsehili@349	151 =============================
amsehili@349	152
amsehili@349	153 .. code:: python
amsehili@349	154
amsehili@349	155 from auditok import AudioRegion
amsehili@349	156 region = AudioRegion.load("audio.wav")
amsehili@349	157 samples = region.samples
amsehili@349	158
amsehili@349	159 If ``numpy`` is installed, this will return a ``numpy.ndarray``. If audio data is mono the returned array is 1D, otherwise it's 2D. If ``numpy`` is not installed this will return a standard ``array.array`` for mono data, and a list of ``array.array`` for multichannel data.
amsehili@349	160
amsehili@349	161 Alternatively you can use:
amsehili@349	162
amsehili@349	163 .. code:: python
amsehili@349	164
amsehili@349	165 import numpy as np
amsehili@349	166 region = AudioRegion.load("audio.wav")
amsehili@349	167 samples = np.asarray(region)
amsehili@349	168
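To make the non-``numpy`` return types described above more concrete, here is a small standard-library sketch of the data layout: one ``array.array`` for mono data, and a list with one ``array.array`` per channel for multichannel data. It is only an illustration, not ``auditok``'s code: ``deinterleave`` is a hypothetical helper, and the sketch assumes 16-bit samples stored in interleaved order.

.. code:: python

    from array import array


    def deinterleave(raw_bytes, channels):
        """Hypothetical helper: split interleaved 16-bit PCM bytes by channel."""
        samples = array("h")  # "h" = signed 16-bit integers (assumed sample width)
        samples.frombytes(raw_bytes)
        if channels == 1:
            # Mono: a single array.array
            return samples
        # Multichannel: every `channels`-th sample belongs to the same channel
        return [samples[ch::channels] for ch in range(channels)]


    # Three stereo frames: (left, right) pairs stored one after another
    stereo_bytes = array("h", [1, -1, 2, -2, 3, -3]).tobytes()
    left, right = deinterleave(stereo_bytes, channels=2)
    print(list(left), list(right))
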
amsehili@349	169 License
amsehili@349	170 -------
amsehili@349	171 MIT.
amsehili@349	172
