.. image:: doc/figures/auditok-logo.png
    :align: center

.. image:: https://github.com/amsehili/auditok/actions/workflows/ci.yml/badge.svg
    :target: https://github.com/amsehili/auditok/actions/workflows/ci.yml/
    :alt: Build Status

.. image:: https://codecov.io/github/amsehili/auditok/graph/badge.svg?token=0rwAqYBdkf
    :target: https://codecov.io/github/amsehili/auditok

.. image:: https://readthedocs.org/projects/auditok/badge/?version=latest
    :target: http://auditok.readthedocs.org/en/latest/?badge=latest
    :alt: Documentation Status

``auditok`` is an **Audio Activity Detection** tool that processes online data
(from an audio device or standard input) and audio files. It can be used via
the command line or through its API.

Full documentation is available on `Read the Docs <https://auditok.readthedocs.io/en/latest/>`_.

Installation
------------

``auditok`` requires Python 3.7 or higher.

To install the latest stable version, use pip:

.. code:: bash

    pip install auditok

To install the latest development version from GitHub:

.. code:: bash

    pip install git+https://github.com/amsehili/auditok

Alternatively, clone the repository and install it manually:

.. code:: bash

    git clone https://github.com/amsehili/auditok.git
    cd auditok
    python setup.py install

Basic example
-------------

Here's a simple example of using ``auditok`` to detect audio events:

.. code:: python

    import auditok

    # `split` returns a generator of AudioRegion objects
    audio_events = auditok.split(
        "audio.wav",
        min_dur=0.2,          # Minimum duration of a valid audio event, in seconds
        max_dur=4,            # Maximum duration of an event, in seconds
        max_silence=0.3,      # Maximum tolerated silence within an event, in seconds
        energy_threshold=55,  # Detection threshold
    )

    for i, r in enumerate(audio_events):
        # AudioRegions returned by `split` have defined 'start' and 'end' attributes
        print(f"Event {i}: {r.start:.3f}s -- {r.end:.3f}s")

        # Play the audio event
        r.play(progress_bar=True)

        # Save the event with start and end times in the filename
        filename = r.save("event_{start:.3f}-{end:.3f}.wav")
        print(f"Event saved as: {filename}")

Example output:

.. code:: bash

    Event 0: 0.700s -- 1.400s
    Event saved as: event_0.700-1.400.wav
    Event 1: 3.800s -- 4.500s
    Event saved as: event_3.800-4.500.wav
    Event 2: 8.750s -- 9.950s
    Event saved as: event_8.750-9.950.wav
    Event 3: 11.700s -- 12.400s
    Event saved as: event_11.700-12.400.wav
    Event 4: 15.050s -- 15.850s
    Event saved as: event_15.050-15.850.wav

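To make the roles of ``min_dur``, ``max_dur`` and ``max_silence`` more concrete,
here is a simplified, self-contained sketch of how per-frame activity flags
could be grouped into events. It is an illustration only and not ``auditok``'s
actual implementation: the ``group_frames`` function, its ``frame_dur``
parameter, and the choice to drop trailing silence are assumptions made for
this example.

.. code:: python

    def group_frames(active, frame_dur=0.05, min_dur=0.2, max_dur=4, max_silence=0.3):
        """Group per-frame activity flags into (start, end) events, in seconds.

        Hypothetical helper: trailing silence is dropped, and an event that
        reaches max_dur is cut so that a new one can start right after it.
        """
        # Work in whole frames to avoid accumulating floating-point error
        min_frames = round(min_dur / frame_dur)
        max_frames = round(max_dur / frame_dur)
        max_silent_frames = round(max_silence / frame_dur)

        events = []
        start = None        # index of the first frame of the current event
        last_active = None  # index of the most recent active frame

        def flush():
            # Keep the event only if it spans at least min_dur
            if last_active - start + 1 >= min_frames:
                events.append((round(start * frame_dur, 3),
                               round((last_active + 1) * frame_dur, 3)))

        for i, is_active in enumerate(active):
            if is_active:
                if start is None:
                    start = i
                last_active = i
            if start is None:
                continue  # still waiting for the first active frame
            too_silent = i - last_active > max_silent_frames
            too_long = i - start + 1 >= max_frames
            if too_silent or too_long:
                flush()
                start = last_active = None

        if start is not None:
            flush()  # the stream ended inside an event
        return events

    # 100 ms frames: two bursts separated by a tolerated 200 ms gap,
    # then a single active frame that is too short to be kept
    flags = [0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
    print(group_frames(flags, frame_dur=0.1, min_dur=0.2, max_dur=1.0, max_silence=0.2))
    # [(0.1, 0.7)]
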
Split and plot
--------------

Visualize the audio signal with detected events:

.. code:: python

    import auditok
    region = auditok.load("audio.wav") # Returns an AudioRegion object
    regions = region.split_and_plot(...) # Or simply use `region.splitp()`

Example output:

.. image:: doc/figures/example_1.png

Split an audio stream and re-join (glue) audio events with silence
------------------------------------------------------------------

The following code detects audio events within an audio stream, then inserts
1 second of silence between them to create an audio stream with pauses:

.. code:: python

    from auditok import split, make_silence

    # Create a 1-second silent audio region
    # Audio parameters must match the original stream
    silence = make_silence(duration=1,
                           sampling_rate=16000,
                           sample_width=2,
                           channels=1)
    events = split("audio.wav")
    audio_with_pauses = silence.join(events)

Alternatively, use ``split_and_join_with_silence``:

.. code:: python

    from auditok import split_and_join_with_silence
    audio_with_pauses = split_and_join_with_silence(silence_duration=1, input="audio.wav")

Export an ``AudioRegion`` as a ``numpy`` array
----------------------------------------------

.. code:: python

    from auditok import load, AudioRegion
    audio = load("audio.wav") # or use `AudioRegion.load("audio.wav")`
    x = audio.numpy()
    assert x.shape[0] == audio.channels
    assert x.shape[1] == len(audio)

Limitations
-----------

The detection algorithm is based on audio signal energy. While it performs well
in low-noise environments (e.g., podcasts, language lessons, or quiet recordings),
performance may drop in noisy settings. Additionally, the algorithm does not
distinguish between speech and other sounds, so it is not suitable for Voice
Activity Detection in multi-sound environments.

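As a rough illustration of what energy-based detection means, the sketch below
computes the log energy of fixed-size frames and flags those at or above a
threshold. It is not ``auditok``'s actual code: the ``frame_energy_db`` and
``active_frames`` helpers, the ``10 * log10(mean square)`` formula, and the
frame size are assumptions made for this example. Any sufficiently loud sound
is flagged, whether or not it is speech, which is the limitation described
above.

.. code:: python

    import math

    def frame_energy_db(samples):
        """Log energy (dB) of a frame of integer PCM samples (assumed formula)."""
        mean_square = sum(s * s for s in samples) / len(samples)
        # The floor avoids log10(0) on all-zero (digital silence) frames
        return 10 * math.log10(max(mean_square, 1e-10))

    def active_frames(samples, sampling_rate, frame_dur=0.05, energy_threshold=55):
        """Return one boolean per full frame: True if its energy reaches the threshold."""
        frame_len = round(sampling_rate * frame_dur)
        n_frames = len(samples) // frame_len  # a trailing partial frame is ignored
        return [
            frame_energy_db(samples[i * frame_len:(i + 1) * frame_len]) >= energy_threshold
            for i in range(n_frames)
        ]

    # Synthetic signal sampled at 1 kHz: 1 s of silence, 1 s loud, 1 s quiet
    silence = [0] * 1000
    loud = [1000, -1000] * 500   # 60 dB with the formula above
    quiet = [100, -100] * 500    # 40 dB, below the threshold of 55
    flags = active_frames(silence + loud + quiet, sampling_rate=1000, frame_dur=0.5)
    print(flags)
    # [False, False, True, True, False, False]
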
License
-------

MIT.