auditok: README.rst annotate

annotate README.rst @ 387:bd242e80455f

Update documentation and configuration

author	Amine Sehili <amine.sehili@gmail.com>
date	Tue, 02 Mar 2021 20:10:50 +0100
parents	c030134b7870
children	5fd9b6b7ff0d 30f276d1bddf

.. image:: doc/figures/auditok-logo.png
    :align: center

.. image:: https://travis-ci.org/amsehili/auditok.svg?branch=master
    :target: https://travis-ci.org/amsehili/auditok

.. image:: https://readthedocs.org/projects/auditok/badge/?version=latest
    :target: http://auditok.readthedocs.org/en/latest/?badge=latest
    :alt: Documentation Status

``auditok`` is an Audio Activity Detection tool that can process online data
(read from an audio device or from standard input) as well as audio files.
It can be used as a command-line program or by calling its API.

The latest version of the documentation can be found on
`readthedocs <http://auditok.readthedocs.org/en/latest/>`_.


Installation
------------

A basic version of ``auditok`` will run with standard Python (>=3.4). However,
without installing additional dependencies, ``auditok`` can only deal with audio
files in wav or raw formats. If you want more features, the following
packages are needed:

- `pydub <https://github.com/jiaaro/pydub>`_ : read audio files in popular audio formats (ogg, mp3, etc.) or extract audio from a video file.
- `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_ : read audio data from the microphone and play audio back.
- `tqdm <https://github.com/tqdm/tqdm>`_ : show a progress bar while playing audio clips.
- `matplotlib <https://matplotlib.org/stable/index.html>`_ : plot the audio signal and detections.
- `numpy <https://numpy.org/>`_ : required by matplotlib. Also used, if available, for some math operations instead of standard Python.

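To see which of these optional packages are available in the current
environment, a quick check along the following lines may help. This is a minimal
sketch that uses only the standard library; the helper name
``missing_optional_packages`` is hypothetical and is not part of ``auditok``.

.. code:: python

    import importlib.util

    # Import names of the optional packages listed above.
    OPTIONAL_PACKAGES = ["pydub", "pyaudio", "tqdm", "matplotlib", "numpy"]


    def missing_optional_packages(names=OPTIONAL_PACKAGES):
        """Return the names that cannot be found as importable modules.

        Hypothetical helper for illustration, not an auditok function.
        """
        return [name for name in names if importlib.util.find_spec(name) is None]


    if __name__ == "__main__":
        missing = missing_optional_packages()
        if missing:
            print("Missing optional packages: " + ", ".join(missing))
        else:
            print("All optional packages are installed.")

Any package reported as missing can then be installed with pip.
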
Install the latest stable version with pip:

.. code:: bash

    sudo pip install auditok

Install the latest development version from GitHub:

.. code:: bash

    pip install git+https://github.com/amsehili/auditok

or

.. code:: bash

    git clone https://github.com/amsehili/auditok.git
    cd auditok
    python setup.py install


Basic example
-------------

.. code:: python

    import auditok

    # split returns a generator of AudioRegion objects
    audio_regions = auditok.split(
        "audio.wav",
        min_dur=0.2,     # minimum duration of a valid audio event in seconds
        max_dur=4,       # maximum duration of an event
        max_silence=0.3, # maximum duration of tolerated continuous silence within an event
        energy_threshold=55 # threshold of detection
    )

    for i, r in enumerate(audio_regions):

        # Regions returned by `split` have 'start' and 'end' metadata fields
        print("Region {i}: {r.meta.start:.3f}s -- {r.meta.end:.3f}s".format(i=i, r=r))

        # play detection
        # r.play(progress_bar=True)

        # region's metadata can also be used with the `save` method
        # (no need to explicitly specify region's object and `format` arguments)
        filename = r.save("region_{meta.start:.3f}-{meta.end:.3f}.wav")
        print("region saved as: {}".format(filename))

Output example:

.. code:: bash

    Region 0: 0.700s -- 1.400s
    region saved as: region_0.700-1.400.wav
    Region 1: 3.800s -- 4.500s
    region saved as: region_3.800-4.500.wav
    Region 2: 8.750s -- 9.950s
    region saved as: region_8.750-9.950.wav
    Region 3: 11.700s -- 12.400s
    region saved as: region_11.700-12.400.wav
    Region 4: 15.050s -- 15.850s
    region saved as: region_15.050-15.850.wav

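To build some intuition for how ``min_dur`` and ``max_silence`` shape the
detected regions, the toy sketch below groups per-frame activity flags into
events. It is not ``auditok``'s tokenizer: the function ``group_active_frames``
and its frame-count parameters are hypothetical, ``max_dur`` is left out, and
trailing silence is always trimmed, which is a simplification made here.
Durations in seconds would first be converted to frame counts by dividing by the
frame duration.

.. code:: python

    def group_active_frames(flags, min_frames, max_silence_frames):
        """Group per-frame activity flags into (first, last) frame index pairs.

        An event is closed once more than `max_silence_frames` consecutive
        silent frames follow it. Trailing silence is not included in the
        event, and events shorter than `min_frames` are discarded.
        """
        events = []
        start = None        # first frame of the event being built
        last_active = None  # most recent active frame of that event

        def close_event():
            # Keep the event only if it is long enough.
            if last_active - start + 1 >= min_frames:
                events.append((start, last_active))

        for i, active in enumerate(flags):
            if active:
                if start is None:
                    start = i
                last_active = i
            elif start is not None and i - last_active > max_silence_frames:
                close_event()
                start = None

        if start is not None:
            close_event()
        return events


    # 1 = frame above the energy threshold, 0 = frame below it
    flags = [0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1]
    print(group_active_frames(flags, min_frames=2, max_silence_frames=2))
    # prints [(1, 6), (16, 19)]

The single silent frame at index 4 is tolerated and stays inside the first
event, while the isolated active frame at index 11 is dropped because it is
shorter than ``min_frames``.
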

Split and plot
--------------

Visualize the audio signal and detections:

.. code:: python

    import auditok
    region = auditok.load("audio.wav") # returns an AudioRegion object
    regions = region.split_and_plot(...) # or just region.splitp()

Output figure:

.. image:: doc/figures/example_1.png


Limitations
-----------

Currently, the core detection algorithm is based on the energy of the audio
signal. While this is fast and works very well for audio streams with low
background noise (e.g., podcasts with few people talking, language lessons,
audio recorded in a rather quiet environment), performance can drop as the
noise level increases. Furthermore, the algorithm makes no distinction between
speech and other kinds of sounds, so you shouldn't use it for Voice Activity
Detection if your audio data also contains non-speech events.

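The idea of an energy-based decision can be illustrated with the short sketch
below. It is an assumption-laden illustration, not ``auditok``'s implementation:
the helpers ``frame_energy_db`` and ``is_active`` are hypothetical, and the
energy is taken here to be 20 times the base-10 logarithm of the frame's RMS
amplitude.

.. code:: python

    import math


    def frame_energy_db(samples):
        """Log energy of one frame: 20 * log10 of the RMS amplitude."""
        rms = math.sqrt(sum(s * s for s in samples) / len(samples))
        # Floor the RMS so that an all-zero frame does not hit log10(0).
        return 20 * math.log10(max(rms, 1e-20))


    def is_active(samples, energy_threshold=55):
        """Flag a frame as active when its log energy reaches the threshold."""
        return frame_energy_db(samples) >= energy_threshold


    quiet = [30, -25, 40, -35] * 100         # low-amplitude 16-bit samples
    loud = [3000, -2500, 4000, -3500] * 100  # high-amplitude samples
    print(is_active(quiet), is_active(loud))
    # prints False True

Because the decision depends only on signal level, any sufficiently loud sound
(a door slam, music, background chatter) is flagged in the same way as speech.
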

License
-------
MIT.
