.. image:: doc/figures/auditok-logo.png
    :align: center

.. image:: https://github.com/amsehili/auditok/actions/workflows/ci.yml/badge.svg
    :target: https://github.com/amsehili/auditok/actions/workflows/ci.yml/
    :alt: Build Status

.. image:: https://codecov.io/github/amsehili/auditok/graph/badge.svg?token=0rwAqYBdkf
    :target: https://codecov.io/github/amsehili/auditok

.. image:: https://readthedocs.org/projects/auditok/badge/?version=latest
    :target: http://auditok.readthedocs.org/en/latest/?badge=latest
    :alt: Documentation Status

``auditok`` is an **Audio Activity Detection** tool that processes online data
(from an audio device or standard input) and audio files. It can be used via
the command line or through its API.

Full documentation is available on `Read the Docs <http://auditok.readthedocs.org/en/latest/>`_.

Installation
------------

``auditok`` requires Python 3.7 or higher.

To install the latest stable version, use pip:

.. code:: bash

    sudo pip install auditok

To install the latest development version from GitHub:

.. code:: bash

    pip install git+https://github.com/amsehili/auditok

Alternatively, clone the repository and install it manually:

.. code:: bash

    git clone https://github.com/amsehili/auditok.git
    cd auditok
    python setup.py install

Basic example
-------------

Here's a simple example of using ``auditok`` to detect audio events:

.. code:: python

    import auditok

    # `split` returns a generator of AudioRegion objects
    audio_events = auditok.split(
        "audio.wav",
        min_dur=0.2,          # Minimum duration of a valid audio event in seconds
        max_dur=4,            # Maximum duration of an event
        max_silence=0.3,      # Maximum tolerated silence duration within an event
        energy_threshold=55   # Detection threshold
    )

    for i, r in enumerate(audio_events):
        # AudioRegions returned by `split` have defined 'start' and 'end' attributes
        print(f"Event {i}: {r.start:.3f}s -- {r.end:.3f}s")

        # Play the audio event
        r.play(progress_bar=True)

        # Save the event with start and end times in the filename
        filename = r.save("event_{start:.3f}-{end:.3f}.wav")
        print(f"Event saved as: {filename}")

Example output:

.. code:: bash

    Event 0: 0.700s -- 1.400s
    Event saved as: event_0.700-1.400.wav
    Event 1: 3.800s -- 4.500s
    Event saved as: event_3.800-4.500.wav
    Event 2: 8.750s -- 9.950s
    Event saved as: event_8.750-9.950.wav
    Event 3: 11.700s -- 12.400s
    Event saved as: event_11.700-12.400.wav
    Event 4: 15.050s -- 15.850s
    Event saved as: event_15.050-15.850.wav

Split and plot
--------------

Visualize the audio signal with detected events:

.. code:: python

    import auditok

    region = auditok.load("audio.wav")    # Returns an AudioRegion object
    regions = region.split_and_plot(...)  # Or simply use `region.splitp()`

Example output:

.. image:: doc/figures/example_1.png

Split an audio stream and re-join (glue) audio events with silence
------------------------------------------------------------------

The following code detects audio events within an audio stream, then inserts
1 second of silence between them to create an audio stream with pauses:

.. code:: python

    from auditok import split, make_silence

    # Create a 1-second silent audio region
    # Audio parameters must match the original stream
    silence = make_silence(duration=1,
                           sampling_rate=16000,
                           sample_width=2,
                           channels=1)
    events = split("audio.wav")
    audio_with_pauses = silence.join(events)

Alternatively, use ``split_and_join_with_silence``:

.. code:: python

    from auditok import split_and_join_with_silence

    audio_with_pauses = split_and_join_with_silence(silence_duration=1, input="audio.wav")

Export an ``AudioRegion`` as a ``numpy`` array
----------------------------------------------

.. code:: python

    from auditok import load, AudioRegion

    audio = load("audio.wav")  # or use `AudioRegion.load("audio.wav")`
    x = audio.numpy()
    assert x.shape[0] == audio.channels
    assert x.shape[1] == len(audio)

Limitations
-----------

The detection algorithm is based on audio signal energy.
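
To make "energy-based" concrete, here is a minimal, standalone sketch of
frame-level energy thresholding. It is not ``auditok``'s implementation: the
helper names (``frame_energy_db``, ``active_frames``), the 10 ms frame size,
and the log-energy formula (``10 * log10`` of the mean squared sample value,
for 16-bit mono PCM) are assumptions made for illustration. The threshold of
55 simply echoes the ``energy_threshold=55`` value from the basic example;
that both use the same scale is also an assumption.

.. code:: python

    import math
    import struct
    from typing import List

    RATE = 16000          # samples per second (assumed)
    FRAME_SAMPLES = 160   # 10 ms frames at 16 kHz (assumed)


    def frame_energy_db(frame: bytes) -> float:
        """Log energy of a non-empty frame of 16-bit little-endian mono PCM."""
        n = len(frame) // 2
        samples = struct.unpack(f"<{n}h", frame[: 2 * n])
        mean_square = sum(s * s for s in samples) / n
        # Floor the value so that digital silence does not hit log10(0).
        return 10.0 * math.log10(max(mean_square, 1e-10))


    def active_frames(pcm: bytes, frame_bytes: int, threshold_db: float) -> List[bool]:
        """Flag each full frame whose log energy reaches the threshold."""
        starts = range(0, len(pcm) - frame_bytes + 1, frame_bytes)
        return [frame_energy_db(pcm[i:i + frame_bytes]) >= threshold_db for i in starts]


    # Synthetic signal: 2 silent frames, 3 frames of a 440 Hz tone, 2 silent frames.
    silence = [0] * (2 * FRAME_SAMPLES)
    burst = [int(10000 * math.sin(2 * math.pi * 440 * t / RATE))
             for t in range(3 * FRAME_SAMPLES)]
    samples = silence + burst + silence
    pcm = struct.pack(f"<{len(samples)}h", *samples)

    flags = active_frames(pcm, frame_bytes=2 * FRAME_SAMPLES, threshold_db=55.0)
    print(flags)  # [False, False, True, True, True, False, False]
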
This approach performs well in low-noise environments (e.g., podcasts,
language lessons, or quiet recordings), but its performance may drop in noisy
settings. Additionally, the algorithm does not distinguish between speech and
other sounds, so it is not suitable for Voice Activity Detection in
multi-sound environments.

License
-------

MIT.