changeset 441:6cf3ea23fadb
Update documentation
author   | Amine Sehili <amine.sehili@gmail.com>
date     | Thu, 31 Oct 2024 08:26:18 +0100
parents  | 439463d9cdda
children | 44dcd2c4d860
files    | doc/command_line_usage.rst doc/examples.rst
diffstat | 2 files changed, 39 insertions(+), 36 deletions(-)
--- a/doc/command_line_usage.rst	Thu Oct 31 08:09:28 2024 +0100
+++ b/doc/command_line_usage.rst	Thu Oct 31 08:26:18 2024 +0100
@@ -11,8 +11,8 @@
 Below, we provide several examples covering the most common use cases.

-Read audio data and detect audio events online
-----------------------------------------------
+Real-Time audio acquisition and event detection
+-----------------------------------------------

 To try ``auditok`` from the command line with your own voice, you’ll need to
 either install `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_ so
@@ -96,8 +96,8 @@
 program but you can use it to run any other command.

-Print out detection information
--------------------------------
+Output detection details
+------------------------

 By default, ``auditok`` outputs the **id**, **start**, and **end** times for each
 detected audio event. The start and end values indicate the beginning and end of
@@ -139,12 +139,14 @@
 To completely disable printing detection information use ``-q``.

+
 Save detections
 ---------------

 You can save audio events to disk as they're detected using ``-o`` or
-``--save-detections-as``. To create a uniq file name for each event, you can use
-``{id}``, ``{start}``, ``{end}`` and ``{duration}`` placeholders. Example:
+``--save-detections-as`` followed by a file name with placeholders. To create
+a uniq file name for each event, you can use ``{id}``, ``{start}``, ``{end}``
+and ``{duration}`` placeholders as in this example:

 .. code:: bash
@@ -160,8 +162,8 @@

    auditok -o "{id}_{start:.3f}_{end:.3f}.wav"

-Record the full audio stream
-----------------------------
+Save the full audio stream
+--------------------------

 When reading audio data from the microphone, you may want to save it to disk.
 To do this, use the ``-O`` or ``--save-stream`` option:
@@ -176,25 +178,26 @@

 Join detected audio events, inserting a silence between them
 ------------------------------------------------------------

-Sometimes, you may want to detect audio events while also
-creating a file that contains the same events with modified
-pause durations.
+Sometimes, you may want to detect audio events and create a new file containing
+these events with pauses of a specific duration between them. This is useful if
+you wish to preserve your original audio data while adjusting the length of pauses
+(either shortening or extending them).

 To achieve this, use the ``-j`` or ``--join-detections`` option together
 with the ``-O`` / ``--save-stream`` option. In the example below, we
-read data from `input.wav` and save audio events to `output.wav`, adding
+read data from ``input.wav`` and save audio events to ``output.wav``, adding
 1-second pauses between them:

-
 .. code:: bash

    auditok input.wav --join-detections 1 -O output.wav

+
 Plot detections
 ---------------

 Audio signal and detections can be plotted using the ``-p`` or ``--plot`` option.
-You can also save plot to disk using ``--save-image``. The following example
+You can also save the plot to disk using ``--save-image``. The following example
 demonstrates both:

 .. code:: bash
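For illustration only (not part of the changeset), the options documented in the hunks above can be exercised as follows; ``input.wav`` and the output file names are placeholder values:

.. code:: bash

   # Save each detected event to its own file; {id}, {start} and {end} are
   # the placeholders described above and give each file a unique name.
   auditok input.wav -o "event_{id}_{start:.3f}_{end:.3f}.wav"

   # Join the detected events with 1-second pauses and save the result.
   auditok input.wav --join-detections 1 -O output_with_pauses.wav

   # Plot the signal and detections, saving the figure to disk.
   auditok input.wav --plot --save-image detections.png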
--- a/doc/examples.rst	Thu Oct 31 08:09:28 2024 +0100
+++ b/doc/examples.rst	Thu Oct 31 08:26:18 2024 +0100
@@ -8,7 +8,7 @@
 From a file
 ===========

-If the first argument of :func:`load` is a string or a `Path`, it should
+If the first argument of :func:`load` is a string or a ``Path``, it should
 refer to an existing audio file.

 .. code:: python
@@ -19,7 +19,7 @@
 If the input file contains raw (headerless) audio data, specifying audio
 parameters (``sampling_rate``, ``sample_width``, and ``channels``) is required.
 Additionally, if the file name does not end with 'raw', you should explicitly
-pass `audio_format="raw"` to the function.
+pass ``audio_format="raw"`` to the function.

 In the example below, we provide audio parameters using their abbreviated names:
@@ -40,15 +40,15 @@
    region = AudioRegion.load("audio.dat",
                              audio_format="raw",
                              sr=44100, # alias for `sampling_rate`
-                             sw=2, # alias for `sample_width`
+                             sw=2,     # alias for `sample_width`
                              ch=1 # alias for `channels`
                              )

-From a `bytes` object
-=====================
+From a ``bytes`` object
+=======================

-If the first argument is of type `bytes`, it is interpreted as raw audio data:
+If the first argument is of type ``bytes``, it is interpreted as raw audio data:

 .. code:: python
@@ -70,8 +70,8 @@
 From the microphone
 ===================

-If the first argument is `None`, :func:`load` will attempt to read data from the
-microphone. In this case, audio parameters, along with the `max_read` parameter,
+If the first argument is ``None``, :func:`load` will attempt to read data from the
+microphone. In this case, audio parameters, along with the ``max_read`` parameter,
 are required.

 .. code:: python
@@ -131,7 +131,7 @@
 :func:`split` returns a generator of :class:`AudioRegion` objects. Each
 :class:`AudioRegion` can be played, saved, repeated (multiplied by an integer),
 and concatenated with another region (see examples below). Note that
-:class:`AudioRegion` objects returned by :func:`split` include `start` and `stop`
+:class:`AudioRegion` objects returned by :func:`split` include ``start`` and ``stop``
 attributes, which mark the beginning and end of the audio event relative to the
 input audio stream.
@@ -157,22 +157,22 @@
        # Save the event with start and end times in the filename
        filename = r.save("event_{start:.3f}-{end:.3f}.wav")
-       print(f"Event saved as: {filename}")
+       print(f"event saved as: {filename}")

 Example output:

 .. code:: bash

    Event 0: 0.700s -- 1.400s
-   Event saved as: event_0.700-1.400.wav
+   event saved as: event_0.700-1.400.wav

    Event 1: 3.800s -- 4.500s
-   Event saved as: event_3.800-4.500.wav
+   event saved as: event_3.800-4.500.wav

    Event 2: 8.750s -- 9.950s
-   Event saved as: event_8.750-9.950.wav
+   event saved as: event_8.750-9.950.wav

    Event 3: 11.700s -- 12.400s
-   Event saved as: event_11.700-12.400.wav
+   event saved as: event_11.700-12.400.wav

    Event 4: 15.050s -- 15.850s
-   Event saved as: event_15.050-15.850.wav
+   event saved as: event_15.050-15.850.wav

 Split and plot
 --------------
@@ -215,8 +215,8 @@
    audio_with_pauses = split_and_join_with_silence(silence_duration=1, input="audio.wav")

-Read and split data from the microphone
----------------------------------------
+Read audio data from the microphone and perform real-time event detection
+--------------------------------------------------------------------------

 If the first argument of :func:`split` is ``None``, audio data is read from the
 microphone (requires `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_):
@@ -240,7 +240,7 @@
 :func:`split` will continue reading audio data until you press ``Ctrl-C``. To read
 a specific amount of audio data, pass the desired number of seconds using the
-`max_read` argument.
+``max_read`` argument.


 Access recorded data after split
@@ -277,13 +277,13 @@
    full_audio.play(progress_bar=True)

-:class:`Recorder` also accepts a `max_read` argument.
+:class:`Recorder` also accepts a ``max_read`` argument.

 Working with AudioRegions
 -------------------------

 In the following sections, we will review several operations
-that can be performed with :class:AudioRegion objects.
+that can be performed with :class:`AudioRegion` objects.

 Basic region information
 ========================
@@ -355,8 +355,8 @@
 the beginning or end of a region, or crop a region by an arbitrary amount
 as a data augmentation strategy.

-The most accurate way to slice an `AudioRegion` is by using indices that
-directly refer to raw audio samples. In the following example, assuming
+The most accurate way to slice an :class:`AudioRegion` is by using indices
+that directly refer to raw audio samples. In the following example, assuming
 the audio data has a sampling rate of 16000, you can extract a 5-second
 segment from the main region, starting at the 20th second, as follows:
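As a companion to the API documentation above, the following is a minimal sketch of the documented :func:`split` workflow; it is illustrative only and not part of the changeset, and ``audio.wav`` is an assumed input file:

.. code:: python

   # Minimal sketch of the split-and-save loop documented above.
   # "audio.wav" is an illustrative input file.
   from auditok import split

   # split() yields AudioRegion objects positioned relative to the input stream.
   for i, region in enumerate(split("audio.wav")):
       # The {start} and {end} placeholders are filled from the region's
       # position, giving each saved event a unique file name.
       filename = region.save("event_{start:.3f}-{end:.3f}.wav")
       print(f"Event {i}: saved as {filename}")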