diff README.rst @ 428:1baa80ec22c3 (Update README)
author:   Amine Sehili <amine.sehili@gmail.com>
date:     Wed, 30 Oct 2024 10:47:56 +0100
parents:  c89c0977db47
children: 97eff033c8f8

.. image:: doc/figures/auditok-logo.png
   :align: center

.. image:: https://github.com/amsehili/auditok/actions/workflows/ci.yml/badge.svg
   :target: https://github.com/amsehili/auditok/actions/workflows/ci.yml/
   :alt: Build Status

.. image:: https://readthedocs.org/projects/auditok/badge/?version=latest
   :target: http://auditok.readthedocs.org/en/latest/?badge=latest
   :alt: Documentation Status

``auditok`` is an **Audio Activity Detection** tool that processes online data
(from an audio device or standard input) and audio files. It can be used via
the command line or through its API.

Full documentation is available on `Read the Docs <https://auditok.readthedocs.io/en/latest/>`_.

Installation
------------

``auditok`` requires Python 3.7+.
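The examples in this README read a file named ``audio.wav``. If you just want something to experiment with, Python's standard-library ``wave`` module can generate a file with the same audio parameters the examples assume (16 kHz, 16-bit, mono); this is only a convenience sketch, and note that pure silence yields no detectable audio events, so use a real recording for meaningful output:

```python
import wave

# Parameters matching those used throughout the examples below
sampling_rate = 16000  # samples per second
sample_width = 2       # bytes per sample (16 bit)
channels = 1           # mono

# Write 1 second of silence as "audio.wav"
with wave.open("audio.wav", "wb") as out:
    out.setnchannels(channels)
    out.setsampwidth(sample_width)
    out.setframerate(sampling_rate)
    out.writeframes(b"\x00\x00" * sampling_rate)

# Read the parameters back to confirm the file is well-formed
with wave.open("audio.wav", "rb") as inp:
    print(inp.getframerate(), inp.getsampwidth(), inp.getnchannels())
```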

To install the latest stable version, use pip:

.. code:: bash

    sudo pip install auditok

To install the latest development version from GitHub:

.. code:: bash

    pip install git+https://github.com/amsehili/auditok

Alternatively, clone the repository and install it manually:

.. code:: bash

    git clone https://github.com/amsehili/auditok.git
    cd auditok
    python setup.py install

Basic example
-------------

Here's a simple example of using ``auditok`` to detect audio events:

.. code:: python

    import auditok

    # `split` returns a generator of AudioRegion objects
    audio_events = auditok.split(
        "audio.wav",
        min_dur=0.2,          # Minimum duration of a valid audio event in seconds
        max_dur=4,            # Maximum duration of an event
        max_silence=0.3,      # Maximum tolerated silence duration within an event
        energy_threshold=55,  # Detection threshold
    )

    for i, r in enumerate(audio_events):
        # AudioRegions returned by `split` have defined 'start' and 'end' attributes
        print(f"Event {i}: {r.start:.3f}s -- {r.end:.3f}s")

        # Play the audio event
        r.play(progress_bar=True)

        # Save the event with start and end times in the filename
        filename = r.save("event_{start:.3f}-{end:.3f}.wav")
        print(f"Event saved as: {filename}")

Example output:

.. code:: bash

    Event 0: 0.700s -- 1.400s
    Event saved as: event_0.700-1.400.wav
    Event 1: 3.800s -- 4.500s
    Event saved as: event_3.800-4.500.wav
    Event 2: 8.750s -- 9.950s
    Event saved as: event_8.750-9.950.wav
    Event 3: 11.700s -- 12.400s
    Event saved as: event_11.700-12.400.wav
    Event 4: 15.050s -- 15.850s
    Event saved as: event_15.050-15.850.wav

Split and plot
--------------

Visualize the audio signal with detected events:

.. code:: python

    import auditok

    region = auditok.load("audio.wav")    # Returns an AudioRegion object
    regions = region.split_and_plot(...)  # Or simply use `region.splitp()`

Example output:

.. image:: doc/figures/example_1.png

Split an audio stream and re-join (glue) audio events with silence
------------------------------------------------------------------

The following example detects audio events within an audio stream, then inserts
1 second of silence between them to produce audio with pauses:

.. code:: python

    # Create a 1-second silent audio region
    # Audio parameters must match the original stream
    from auditok import split, make_silence

    silence = make_silence(duration=1,
                           sampling_rate=16000,
                           sample_width=2,
                           channels=1)
    events = split("audio.wav")
    audio_with_pauses = silence.join(events)

Alternatively, use ``split_and_join_with_silence``:

.. code:: python

    from auditok import split_and_join_with_silence

    audio_with_pauses = split_and_join_with_silence(silence_duration=1,
                                                    input="audio.wav")

Limitations
-----------

The detection algorithm is based on audio signal energy. While it performs well
in low-noise environments (e.g., podcasts, language lessons, or quiet recordings),
performance may drop in noisy settings. Additionally, the algorithm does not
distinguish between speech and other sounds, so it is not suitable for Voice
Activity Detection in multi-sound environments.

License
-------

MIT.
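As a concrete footnote to the Limitations section: energy-based detection ultimately comes down to comparing each analysis frame's log energy against a threshold. The sketch below is illustrative only and is not auditok's actual implementation; the frame size, the dB formula, and the reuse of the threshold value 55 from the basic example are assumptions made for the demonstration.

```python
import math

def frame_energy_db(samples):
    """Log energy in dB of a frame of 16-bit PCM samples (illustrative)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)

# A loud frame and a near-silent frame (160 samples = 10 ms at 16 kHz)
loud = [8000] * 160
quiet = [5] * 160

threshold = 55  # same value as `energy_threshold` in the basic example
print(frame_energy_db(loud) > threshold)   # True: frame counts as activity
print(frame_energy_db(quiet) > threshold)  # False: frame counts as silence
```

A full detector additionally applies duration constraints (``min_dur``, ``max_dur``, ``max_silence``) over runs of consecutive frames; this snippet shows only the per-frame energy test, which is why background noise with high energy defeats it.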