--- a/README.rst	Wed Oct 30 13:21:58 2024 +0100
+++ b/README.rst	Wed Oct 30 17:21:30 2024 +0100
@@ -17,7 +17,7 @@
 Installation
 ------------

-``auditok`` requires Python 3.7+.
+``auditok`` requires Python 3.7 or higher.

 To install the latest stable version, use pip:
--- a/doc/command_line_usage.rst	Wed Oct 30 13:21:58 2024 +0100
+++ b/doc/command_line_usage.rst	Wed Oct 30 17:21:30 2024 +0100
@@ -1,53 +1,56 @@
-``auditok`` can also be used from the command-line. For more information about
-parameters and their description type:
+Command-line guide
+==================

+``auditok`` can also be used from the command line. For information
+about available parameters and descriptions, type:

 .. code:: bash

     auditok -h

-In the following we'll a few examples that covers most use-cases.
+Below, we provide several examples covering the most common use cases.


-Read and split audio data online
---------------------------------
+Read audio data and detect audio events online
+----------------------------------------------

-To try ``auditok`` from the command line with you voice, you should either
-install `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_ so that ``auditok``
-can directly read data from the microphone, or record data with an external program
-(e.g., `sox`) and redirect its output to ``auditok``.
+To try ``auditok`` from the command line with your own voice, you’ll need to
+either install `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_ so
+that ``auditok`` can read directly from the microphone, or record audio with
+an external program (e.g., `sox`) and redirect its output to ``auditok``.

-Read data from the microphone (`pyaudio` installed):
+To read data directly from the microphone and use default parameters for audio
+data and tokenization, simply type:

 .. code:: bash

     auditok

-This will print the *id*, *start time* and *end time* of each detected audio
-event. Note that we didn't pass any additional arguments to the previous command,
-so ``auditok`` will use default values. The most important arguments are:
+This will print the **id**, **start time**, and **end time** of each detected
+audio event. As mentioned above, no additional arguments were passed in the
+previous command, so ``auditok`` will use its default values. The most important
+arguments are:


-- ``-n``, ``--min-duration`` : minimum duration of a valid audio event in seconds, default: 0.2
-- ``-m``, ``--max-duration`` : maximum duration of a valid audio event in seconds, default: 5
-- ``-s``, ``--max-silence`` : maximum duration of a consecutive silence within a valid audio event in seconds, default: 0.3
-- ``-e``, ``--energy-threshold`` : energy threshold for detection, default: 50
+- ``-n``, ``--min-duration``: minimum duration of a valid audio event in seconds, default: 0.2
+- ``-m``, ``--max-duration``: maximum duration of a valid audio event in seconds, default: 5
+- ``-s``, ``--max-silence``: maximum duration of a continuous silence within a valid audio event in seconds, default: 0.3
+- ``-e``, ``--energy-threshold``: energy threshold for detection, default: 50


 Read audio data with an external program
 ----------------------------------------
-
-If you don't have `pyaudio`, you can use `sox` for data acquisition
-(`sudo apt-get install sox`) and make ``auditok`` read data from standard input:
+You can use an external program, such as `sox` (``sudo apt-get install sox``),
+to record audio data in real time, redirect it, and have ``auditok`` read the data
+from standard input:

 .. code:: bash

     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok - -r 16000 -w 2 -c 1

-Note that when data is read from standard input, the same audio parameters must
-be used for both `sox` (or any other data generation/acquisition tool) and
-``auditok``. The following table summarizes audio parameters.
-
+Note that when reading data from standard input, the same audio parameters must
+be set for both `sox` (or any other data generation/acquisition tool) and ``auditok``.
+The following table provides a summary of the audio parameters:

 +-----------------+------------+------------------+-----------------------+
 | Audio parameter | sox option | `auditok` option | `auditok` default     |
@@ -61,17 +64,17 @@
 | Encoding        | -e         | NA               | always a signed int   |
 +-----------------+------------+------------------+-----------------------+

-According to this table, the previous command can be run with the default
-parameters as:
+Based on the table, the previous command can be run with the default parameters as:

 .. code:: bash

-    rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i -
+    rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -
+

 Play back audio detections
 --------------------------

-Use the ``-E`` option (for echo):
+Use the ``-E`` (or ``--echo``) option:

 .. code:: bash

@@ -79,11 +82,6 @@
     # or
     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok - -E

-The second command works without further argument because data is recorded with
-``auditok``'s default audio parameters . If one of the parameters is not at the
-default value you should specify it alongside ``-E``.
-
-

 Using ``-E`` requires `pyaudio`, if it's not installed you can use the ``-C``
 (used to run an external command with detected audio event as argument):
@@ -101,10 +99,10 @@
 Print out detection information
 -------------------------------

-By default ``auditok`` prints out the **id**, the **start** and the **end** of
-each detected audio event. The latter two values represent the absolute position
-of the event within input stream (file or microphone) in seconds. The following
-listing is an example output with the default format:
+By default, ``auditok`` outputs the **id**, **start**, and **end** times for each
+detected audio event. The start and end values indicate the beginning and end of
+the event within the input stream (file or microphone) in seconds. Below is an
+example of the output in the default format:

 .. code:: bash

@@ -123,7 +121,7 @@

     auditok audio.wav  --printf "{id}: [{timestamp}] start:{start}, end:{end}, dur: {duration}"

-the output would be something like:
+the output will look like:

 .. code:: bash

@@ -145,7 +143,7 @@
 ---------------

 You can save audio events to disk as they're detected using ``-o`` or
-``--save-detections-as``. To get a uniq file name for each event, you can use
+``--save-detections-as``. To create a unique file name for each event, you can use
 ``{id}``, ``{start}``, ``{end}`` and ``{duration}`` placeholders. Example:


@@ -153,9 +151,9 @@

     auditok --save-detections-as "{id}_{start}_{end}.wav"

-When using ``{start}``, ``{end}`` and ``{duration}`` placeholders, it's
-recommended that the number of decimals of the corresponding values be limited
-to 3. You can use something like:
+When using ``{start}``, ``{end}``, and ``{duration}`` placeholders, it is
+recommended to limit the number of decimal places for these values to 3. You
+can do this with a format like:

 .. code:: bash

@@ -165,22 +163,39 @@
 Save whole audio stream
 -----------------------

-When reading audio data from the microphone, you most certainly want to save it
-to disk. For this you can use the ``-O`` or ``--save-stream`` option.
+When reading audio data from the microphone, you may want to save it to disk.
+To do this, use the ``-O`` or ``--save-stream`` option:

 .. code:: bash

-    auditok --save-stream "stream.wav"
+    auditok --save-stream output.wav

-Note this will work even if you read data from another file on disk.
+Note that this will work even if you read data from a file on disk.


+Join detected audio events with a silence of a given duration
+-------------------------------------------------------------
+
+Sometimes, you may want to detect audio events while also
+creating a file that contains the same events with modified
+pause durations.
+
+To do this, use the ``-j`` or ``--join-detections`` option together
+with the ``-O`` / ``--save-stream`` option. In the example below, we
+read data from `input.wav` and save audio events to `output.wav`, adding
+1-second pauses between them:
+
+
+.. code:: bash
+
+    auditok input.wav --join-detections 1 -O output.wav
+
 Plot detections
 ---------------

 Audio signal and detections can be plotted using the ``-p`` or ``--plot`` option.
 You can also save plot to disk using ``--save-image``. The following example
-does both:
+demonstrates both:

 .. code:: bash
--- a/doc/examples.rst	Wed Oct 30 13:21:58 2024 +0100
+++ b/doc/examples.rst	Wed Oct 30 17:21:30 2024 +0100
@@ -1,38 +1,54 @@
 Load audio data
 ---------------

-Audio data is loaded with the :func:`load` function which can read from audio
-files, the microphone or use raw audio data.
+Audio data is loaded using the :func:`load` function, which can read from
+audio files, capture from the microphone, or accept raw audio data
+(as a ``bytes`` object).

 From a file
 ===========

-If the first argument of :func:`load` is a string, it should be a path to an
-audio file.
+If the first argument of :func:`load` is a string or a `Path`, it should
+refer to an existing audio file.

 .. code:: python

     import auditok
     region = auditok.load("audio.ogg")

-If input file contains raw (headerless) audio data, passing `audio_format="raw"`
-and other audio parameters (`sampling_rate`, `sample_width` and `channels`) is
-mandatory. In the following example we pass audio parameters with their short
-names:
+If the input file contains raw (headerless) audio data, specifying audio
+parameters (``sampling_rate``, ``sample_width``, and ``channels``) is required.
+Additionally, if the file name does not end with 'raw', you should explicitly
+pass `audio_format="raw"` to the function.
+
+In the example below, we provide audio parameters using their abbreviated names:

 .. code:: python

     region = auditok.load("audio.dat",
                           audio_format="raw",
                           sr=44100, # alias for `sampling_rate`
-                          sw=2      # alias for `sample_width`
+                          sw=2,      # alias for `sample_width`
                           ch=1      # alias for `channels`
                           )

+Alternatively, you can use :class:`AudioRegion` to load audio data:
+
+.. code:: python
+
+    from auditok import AudioRegion
+    region = AudioRegion.load("audio.dat",
+                              audio_format="raw",
+                              sr=44100, # alias for `sampling_rate`
+                              sw=2,      # alias for `sample_width`
+                              ch=1      # alias for `channels`
+                              )
+
+
 From a `bytes` object
 =====================

-If the type of the first argument `bytes`, it's interpreted as raw audio data:
+If the first argument is of type `bytes`, it is interpreted as raw audio data:

 .. code:: python

@@ -43,7 +59,7 @@
     region = auditok.load(data, sr=sr, sw=sw, ch=ch)
     print(region)
     # alternatively you can use
-    #region = auditok.AudioRegion(data, sr, sw, ch)
+    region = auditok.AudioRegion(data, sr, sw, ch)

 output:

@@ -54,9 +70,9 @@
 From the microphone
 ===================

-If the first argument is `None`, :func:`load` will try to read data from the
-microphone. Audio parameters, as well as the `max_read` parameter are mandatory:
-
+If the first argument is `None`, :func:`load` will attempt to read data from the
+microphone. In this case, audio parameters, along with the `max_read` parameter,
+are required.

 .. code:: python

@@ -76,8 +92,8 @@
 Skip part of audio data
 =======================

-If the `skip` parameter is > 0, :func:`load` will skip that amount  in seconds
-of leading audio data:
+If the ``skip`` parameter is greater than 0, :func:`load` will skip that amount
+of leading audio data, measured in seconds:

 .. code:: python

@@ -90,7 +106,7 @@
 Limit the amount of read audio
 ==============================

-If the `max_read` parameter is > 0, :func:`load` will read at most that amount
+If the ``max_read`` parameter is > 0, :func:`load` will read at most that amount
 in seconds of audio data:

 .. code:: python
@@ -99,67 +115,64 @@
     region = auditok.load("audio.ogg", max_read=5)
     assert region.duration <= 5

-This argument is mandatory when reading data from the microphone.
+This argument is required when reading data from the microphone.


 Basic split example
 -------------------

-In the following we'll use the :func:`split` function to tokenize an audio file,
-requiring that valid audio events be at least 0.2 second long, at most 4 seconds
-long and contain a maximum of 0.3 second of continuous silence. Limiting the size
-of detected events to 4 seconds means that an event of, say, 9.5 seconds will
-be returned as two 4-second events plus a third 1.5-second event. Moreover, a
-valid event might contain many *silences* as far as none of them exceeds 0.3
-second.
+In the following example, we'll use the :func:`split` function to tokenize an
+audio file. We’ll specify that valid audio events must be at least 0.2 seconds
+long, no longer than 4 seconds, and contain no more than 0.3 seconds of continuous
+silence. By setting a 4-second limit, an event lasting 9.5 seconds, for instance,
+will be returned as two 4-second events plus a final 1.5-second event. Additionally,
+a valid event may contain multiple silences, as long as none exceed 0.3 seconds.

-:func:`split` returns a generator of :class:`AudioRegion`. An :class:`AudioRegion`
-can be played, saved, repeated (i.e., multiplied by an integer) and concatenated
-with another region (see examples below). Notice that :class:`AudioRegion` objects
-returned by :func:`split` have a ``start`` a ``stop`` information stored in
-their meta data that can be accessed like `object.meta.start`.
+:func:`split` returns a generator of :class:`AudioRegion` objects. Each
+:class:`AudioRegion` can be played, saved, repeated (multiplied by an integer),
+and concatenated with another region (see examples below). Note that
+:class:`AudioRegion` objects returned by :func:`split` include `start` and `end`
+attributes, which mark the beginning and end of the audio event relative to the
+input audio stream.

 .. code:: python

     import auditok

-    # split returns a generator of AudioRegion objects
-    audio_regions = auditok.split(
+    # `split` returns a generator of AudioRegion objects
+    audio_events = auditok.split(
         "audio.wav",
-        min_dur=0.2,     # minimum duration of a valid audio event in seconds
-        max_dur=4,       # maximum duration of an event
-        max_silence=0.3, # maximum duration of tolerated continuous silence within an event
-        energy_threshold=55 # threshold of detection
+        min_dur=0.2,     # Minimum duration of a valid audio event in seconds
+        max_dur=4,       # Maximum duration of an event
+        max_silence=0.3, # Maximum tolerated silence duration within an event
+        energy_threshold=55 # Detection threshold
     )

-    for i, r in enumerate(audio_regions):
+    for i, r in enumerate(audio_events):
+        # AudioRegions returned by `split` have defined 'start' and 'end' attributes
+        print(f"Event {i}: {r.start:.3f}s -- {r.end:.3f}s")

-        # Regions returned by `split` have 'start' and 'end' metadata fields
-        print("Region {i}: {r.meta.start:.3f}s -- {r.meta.end:.3f}s".format(i=i, r=r))
+        # Play the audio event
+        r.play(progress_bar=True)

-        # play detection
-        # r.play(progress_bar=True)
+        # Save the event with start and end times in the filename
+        filename = r.save("event_{start:.3f}-{end:.3f}.wav")
+        print(f"Event saved as: {filename}")

-        # region's metadata can also be used with the `save` method
-        # (no need to explicitly specify region's object and `format` arguments)
-        filename = r.save("region_{meta.start:.3f}-{meta.end:.3f}.wav")
-        print("region saved as: {}".format(filename))
-
-output example:
+Example output:

 .. code:: bash

-    Region 0: 0.700s -- 1.400s
-    region saved as: region_0.700-1.400.wav
-    Region 1: 3.800s -- 4.500s
-    region saved as: region_3.800-4.500.wav
-    Region 2: 8.750s -- 9.950s
-    region saved as: region_8.750-9.950.wav
-    Region 3: 11.700s -- 12.400s
-    region saved as: region_11.700-12.400.wav
-    Region 4: 15.050s -- 15.850s
-    region saved as: region_15.050-15.850.wav
-
+    Event 0: 0.700s -- 1.400s
+    Event saved as: event_0.700-1.400.wav
+    Event 1: 3.800s -- 4.500s
+    Event saved as: event_3.800-4.500.wav
+    Event 2: 8.750s -- 9.950s
+    Event saved as: event_8.750-9.950.wav
+    Event 3: 11.700s -- 12.400s
+    Event saved as: event_11.700-12.400.wav
+    Event 4: 15.050s -- 15.850s
+    Event saved as: event_15.050-15.850.wav

 Split and plot
 --------------
@@ -176,11 +189,36 @@

 .. image:: figures/example_1.png

+Split an audio stream and re-join (glue) audio events with silence
+------------------------------------------------------------------
+
+The following code detects audio events within an audio stream, then inserts
+1 second of silence between them to create an audio stream with pauses:
+
+.. code:: python
+
+    # Create a 1-second silent audio region
+    # Audio parameters must match the original stream
+    from auditok import split, make_silence
+    silence = make_silence(duration=1,
+                           sampling_rate=16000,
+                           sample_width=2,
+                           channels=1)
+    events = split("audio.wav")
+    audio_with_pauses = silence.join(events)
+
+Alternatively, use ``split_and_join_with_silence``:
+
+.. code:: python
+
+    from auditok import split_and_join_with_silence
+    audio_with_pauses = split_and_join_with_silence(silence_duration=1, input="audio.wav")
+

 Read and split data from the microphone
 ---------------------------------------

-If the first argument of :func:`split` is None, audio data is read from the
+If the first argument of :func:`split` is ``None``, audio data is read from the
 microphone (requires `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_):

 .. code:: python
@@ -200,15 +238,16 @@
          pass


-:func:`split` will continue reading audio data until you press ``Ctrl-C``. If
-you want to read a specific amount of audio data, pass the desired number of
-seconds with the `max_read` argument.
+:func:`split` will continue reading audio data until you press ``Ctrl-C``. To read
+a specific amount of audio data, pass the desired number of seconds using the
+`max_read` argument.


 Access recorded data after split
 --------------------------------

-Using a :class:`Recorder` object you can get hold of acquired audio data:
+Using a :class:`Recorder` object, you can access audio data read from a file
+or from the microphone. With the following code, press ``Ctrl-C`` to stop recording:


 .. code:: python
@@ -221,11 +260,13 @@
     eth = 55 # alias for energy_threshold, default value is 50

     rec = auditok.Recorder(input=None, sr=sr, sw=sw, ch=ch)
+    events = []

     try:
         for region in auditok.split(rec, sr=sr, sw=sw, ch=ch, eth=eth):
             print(region)
-            region.play(progress_bar=True) # progress bar requires `tqdm`
+            region.play(progress_bar=True)
+            events.append(region)
     except KeyboardInterrupt:
          pass

@@ -233,6 +274,7 @@
     full_audio = load(rec.data, sr=sr, sw=sw, ch=ch)
     # alternatively you can use
     full_audio = auditok.AudioRegion(rec.data, sr, sw, ch)
+    full_audio.play(progress_bar=True)


 :class:`Recorder` also accepts a `max_read` argument.
@@ -240,9 +282,8 @@
 Working with AudioRegions
 -------------------------

-The following are a couple of interesting operations you can do with
-:class:`AudioRegion` objects.
-
+In the following sections, we will review several operations
+that can be performed with :class:`AudioRegion` objects.

 Basic region information
 ========================
@@ -257,6 +298,9 @@
     region.sample_width # alias `sw`
     region.channels # alias `ch`

+When an audio region is returned by the :func:`split` function, it includes defined
+``start`` and ``end`` attributes that refer to the beginning and end of the audio
+event relative to the input audio stream.

 Concatenate regions
 ===================
@@ -268,7 +312,8 @@
     region_2 = auditok.load("audio_2.wav")
     region_3 = region_1 + region_2

-Particularly useful if you want to join regions returned by :func:`split`:
+This is particularly useful when you want to join regions returned by the
+:func:`split` function:

 .. code:: python

@@ -290,8 +335,7 @@
 Split one region into N regions of equal size
 =============================================

-Divide by a positive integer (this has nothing to do with silence-based
-tokenization):
+Divide by a positive integer (this is unrelated to silence-based tokenization!):

 .. code:: python

@@ -300,21 +344,21 @@
     regions = regions / 5
     assert sum(regions) == region

-Note that if no perfect division is possible, the last region might be a bit
-shorter than the previous N-1 regions.
+Note that if an exact split is not possible, the last region may be shorter
+than the preceding N-1 regions.

 Slice a region by samples, seconds or milliseconds
 ==================================================

-Slicing an :class:`AudioRegion` can be interesting in many situations. You can for
-example remove a fixed-size portion of audio data from the beginning or from the
-end of a region or crop a region by an arbitrary amount as a data augmentation
-strategy.
+Slicing an :class:`AudioRegion` can be useful in various situations.
+For example, you can remove a fixed-length portion of audio data from
+the beginning or end of a region, or crop a region by an arbitrary amount
+as a data augmentation strategy.

-The most accurate way to slice an `AudioRegion` is to use indices that
-directly refer to raw audio samples. In the following example, assuming that the
-sampling rate of audio data is 16000, you can extract a 5-second region from
-main region, starting from the 20th second as follows:
+The most accurate way to slice an `AudioRegion` is by using indices that
+directly refer to raw audio samples. In the following example, assuming
+the audio data has a sampling rate of 16000, you can extract a 5-second
+segment from the main region, starting at the 20th second, as follows:

 .. code:: python

@@ -324,9 +368,9 @@
     stop = 25 * 16000
     five_second_region = region[start:stop]

-This allows you to practically start and stop at any audio sample within the region.
-Just as with a `list` you can omit one of `start` and `stop`, or both. You can
-also use negative indices:
+This allows you to start and stop at any audio sample within the region. Similar
+to a ``list``, you can omit either ``start`` or ``stop``, or both. Negative
+indices are also supported:

 .. code:: python

@@ -335,9 +379,9 @@
     start = -3 * region.sr # `sr` is an alias of `sampling_rate`
     three_last_seconds = region[start:]

-While slicing by raw samples is flexible, slicing with temporal indices is more
-intuitive. You can do so by accessing the ``millis`` or ``seconds`` views of an
-`AudioRegion` (or their shortcut alias `ms` and `sec` or `s`).
+While slicing by raw samples offers flexibility, using temporal indices is
+often more intuitive. You can achieve this by accessing the ``millis`` or ``seconds``
+*views* of an :class:`AudioRegion` (or using their shortcut aliases ``ms``, ``sec``, or ``s``).

 With the ``millis`` view:

@@ -346,6 +390,8 @@
     import auditok
     region = auditok.load("audio.wav")
     five_second_region = region.millis[5000:10000]
+    # or
+    five_second_region = region.ms[5000:10000]

 or with the ``seconds`` view:

@@ -354,6 +400,10 @@
     import auditok
     region = auditok.load("audio.wav")
     five_second_region = region.seconds[5:10]
+    # or
+    five_second_region = region.sec[5:10]
+    # or
+    five_second_region = region.s[5:10]

 ``seconds`` indices can also be floats:

@@ -363,27 +413,13 @@
     region = auditok.load("audio.wav")
     five_second_region = region.seconds[2.5:7.5]

-Get arrays of audio samples
-===========================
-
-If `numpy` is not installed, the `samples` attributes is a list of audio samples
-arrays (standard `array.array` objects), one per channels. If numpy is installed,
-`samples` is a 2-D `numpy.ndarray` where the fist dimension is the channel
-and the second is the the sample.
+Export an ``AudioRegion`` as a ``numpy`` array
+==============================================

 .. code:: python

-    import auditok
-    region = auditok.load("audio.wav")
-    samples = region.samples
-    assert len(samples) == region.channels
-
-
-If `numpy` is installed you can use:
-
-.. code:: python
-
-    import numpy as np
-    region = auditok.load("audio.wav")
-    samples = np.asarray(region)
-    assert len(samples.shape) == 2
+    from auditok import load, AudioRegion
+    audio = load("audio.wav") # or use `AudioRegion.load("audio.wav")`
+    x = audio.numpy()
+    assert x.shape[0] == audio.channels
+    assert x.shape[1] == len(audio)
--- a/doc/index.rst	Wed Oct 30 13:21:58 2024 +0100
+++ b/doc/index.rst	Wed Oct 30 17:21:30 2024 +0100
@@ -1,8 +1,8 @@
 auditok, an AUDIo TOKenization tool
 ===================================

-.. image:: https://travis-ci.org/amsehili/auditok.svg?branch=master
-    :target: https://travis-ci.org/amsehili/auditok
+.. image:: https://github.com/amsehili/auditok/actions/workflows/ci.yml/badge.svg
+    :target: https://github.com/amsehili/auditok/actions/workflows/ci.yml/
     :alt: Build Status

 .. image:: https://readthedocs.org/projects/auditok/badge/?version=latest
@@ -11,9 +11,10 @@


-``auditok`` is an **Audio Activity Detection** tool that can process online data
-(read from an audio device or from standard input) as well as audio files. It
-can be used as a command line program or by calling its API.
+``auditok`` is an **Audio Activity Detection** tool that processes online data
+(from an audio device or standard input) and audio files. It can be used via the command line or through its API.
+
+Full documentation is available on `Read the Docs <https://auditok.readthedocs.io/en/latest/>`_.


 .. toctree::
@@ -39,8 +40,8 @@
     util
     io
     signal
-    dataset

 License
 -------
+
 MIT.
--- a/doc/installation.rst	Wed Oct 30 13:21:58 2024 +0100
+++ b/doc/installation.rst	Wed Oct 30 17:21:30 2024 +0100
@@ -1,31 +1,31 @@
 Installation
 ------------

-A basic version of ``auditok`` will run with standard Python (>=3.4). However,
-without installing additional dependencies, ``auditok`` can only deal with audio
-files in *wav* or *raw* formats. if you want more features, the following
-packages are needed:
+**Dependencies**

-- `pydub <https://github.com/jiaaro/pydub>`_ : read audio files in popular audio formats (ogg, mp3, etc.) or extract audio from a video file.
-- `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_ : read audio data from the microphone and play audio back.
-- `tqdm <https://github.com/tqdm/tqdm>`_ : show progress bar while playing audio clips.
-- `matplotlib <https://matplotlib.org/stable/index.html>`_ : plot audio signal and detections.
-- `numpy <https://numpy.org/>`_ : required by matplotlib. Also used for some math operations instead of standard python if available.
+The following dependencies are required by ``auditok`` and will be installed automatically:

+- `numpy <https://numpy.org/>`_: for signal processing.
+- `pydub <https://github.com/jiaaro/pydub>`_: to read audio files in popular formats (e.g., ogg, mp3) or extract audio from video files.
+- `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_: to read audio data from the microphone and play audio back.
+- `tqdm <https://github.com/tqdm/tqdm>`_: to display a progress bar while playing audio clips.
+- `matplotlib <https://matplotlib.org/stable/index.html>`_: to plot audio signal and detections.

-Install the latest stable version with pip:
+``auditok`` requires Python 3.7 or higher.
+
+To install the latest stable version, use pip:

 .. code:: bash

     sudo pip install auditok

-Install with the latest development version from github:
+To install the latest development version from GitHub:

 .. code:: bash

     pip install git+https://github.com/amsehili/auditok

-or
+Alternatively, clone the repository and install it manually:

 .. code:: bash