changeset 441:6cf3ea23fadb
Update documentation
author   | Amine Sehili <amine.sehili@gmail.com>
date     | Thu, 31 Oct 2024 08:26:18 +0100
parents  | 439463d9cdda
children | 44dcd2c4d860
files    | doc/command_line_usage.rst doc/examples.rst
diffstat | 2 files changed, 39 insertions(+), 36 deletions(-)
--- a/doc/command_line_usage.rst	Thu Oct 31 08:09:28 2024 +0100
+++ b/doc/command_line_usage.rst	Thu Oct 31 08:26:18 2024 +0100
@@ -11,8 +11,8 @@
 Below, we provide several examples covering the most common use cases.

-Read audio data and detect audio events online
-----------------------------------------------
+Real-Time audio acquisition and event detection
+-----------------------------------------------

 To try ``auditok`` from the command line with your own voice, you’ll need to
 either install `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_ so
@@ -96,8 +96,8 @@
 program but you can use it to run any other command.

-Print out detection information
--------------------------------
+Output detection details
+------------------------

 By default, ``auditok`` outputs the **id**, **start**, and **end** times for each
 detected audio event. The start and end values indicate the beginning and end of
@@ -139,12 +139,14 @@
 To completely disable printing detection information use ``-q``.

+
 Save detections
 ---------------

 You can save audio events to disk as they're detected using ``-o`` or
-``--save-detections-as``. To create a uniq file name for each event, you can use
-``{id}``, ``{start}``, ``{end}`` and ``{duration}`` placeholders. Example:
+``--save-detections-as`` followed by a file name with placeholders. To create
+a uniq file name for each event, you can use ``{id}``, ``{start}``, ``{end}``
+and ``{duration}`` placeholders as in this example:

 .. code:: bash
@@ -160,8 +162,8 @@

    auditok -o "{id}_{start:.3f}_{end:.3f}.wav"

-Record the full audio stream
-----------------------------
+Save the full audio stream
+--------------------------

 When reading audio data from the microphone, you may want to save it to disk.
 To do this, use the ``-O`` or ``--save-stream`` option:
@@ -176,25 +178,26 @@

 Join detected audio events, inserting a silence between them
 ------------------------------------------------------------

-Sometimes, you may want to detect audio events while also
-creating a file that contains the same events with modified
-pause durations.
+Sometimes, you may want to detect audio events and create a new file containing
+these events with pauses of a specific duration between them. This is useful if
+you wish to preserve your original audio data while adjusting the length of pauses
+(either shortening or extending them).

 To achieve this, use the ``-j`` or ``--join-detections`` option together
 with the ``-O`` / ``--save-stream`` option. In the example below, we
-read data from `input.wav` and save audio events to `output.wav`, adding
+read data from ``input.wav`` and save audio events to ``output.wav``, adding
 1-second pauses between them:

-
 .. code:: bash

    auditok input.wav --join-detections 1 -O output.wav

+
 Plot detections
 ---------------

 Audio signal and detections can be plotted using the ``-p`` or ``--plot`` option.
-You can also save plot to disk using ``--save-image``. The following example
+You can also save the plot to disk using ``--save-image``. The following example
 demonstrates both:

 .. code:: bash
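For illustration only (not part of the changeset), the options documented in the hunks above can be exercised as follows; ``input.wav`` and the output file names are placeholder values:

.. code:: bash

   # Save each detected event to its own file; {id}, {start} and {end} are
   # the placeholders described above and give each file a unique name.
   auditok input.wav -o "event_{id}_{start:.3f}_{end:.3f}.wav"

   # Join the detected events with 1-second pauses and save the result.
   auditok input.wav --join-detections 1 -O output_with_pauses.wav

   # Plot the signal and detections, saving the figure to disk.
   auditok input.wav --plot --save-image detections.png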
--- a/doc/examples.rst	Thu Oct 31 08:09:28 2024 +0100
+++ b/doc/examples.rst	Thu Oct 31 08:26:18 2024 +0100
@@ -8,7 +8,7 @@
 From a file
 ===========

-If the first argument of :func:`load` is a string or a `Path`, it should
+If the first argument of :func:`load` is a string or a ``Path``, it should
 refer to an existing audio file.

 .. code:: python
@@ -19,7 +19,7 @@
 If the input file contains raw (headerless) audio data, specifying audio
 parameters (``sampling_rate``, ``sample_width``, and ``channels``) is required.
 Additionally, if the file name does not end with 'raw', you should explicitly
-pass `audio_format="raw"` to the function.
+pass ``audio_format="raw"`` to the function.

 In the example below, we provide audio parameters using their abbreviated names:
@@ -40,15 +40,15 @@
    region = AudioRegion.load("audio.dat",
                              audio_format="raw",
                              sr=44100, # alias for `sampling_rate`
-                             sw=2, # alias for `sample_width`
+                             sw=2,     # alias for `sample_width`
                              ch=1 # alias for `channels`
                              )

-From a `bytes` object
-=====================
+From a ``bytes`` object
+=======================

-If the first argument is of type `bytes`, it is interpreted as raw audio data:
+If the first argument is of type ``bytes``, it is interpreted as raw audio data:

 .. code:: python
@@ -70,8 +70,8 @@
 From the microphone
 ===================

-If the first argument is `None`, :func:`load` will attempt to read data from the
-microphone. In this case, audio parameters, along with the `max_read` parameter,
+If the first argument is ``None``, :func:`load` will attempt to read data from the
+microphone. In this case, audio parameters, along with the ``max_read`` parameter,
 are required.

 .. code:: python
@@ -131,7 +131,7 @@
 :func:`split` returns a generator of :class:`AudioRegion` objects. Each
 :class:`AudioRegion` can be played, saved, repeated (multiplied by an integer),
 and concatenated with another region (see examples below). Note that
-:class:`AudioRegion` objects returned by :func:`split` include `start` and `stop`
+:class:`AudioRegion` objects returned by :func:`split` include ``start`` and ``stop``
 attributes, which mark the beginning and end of the audio event relative to the
 input audio stream.
@@ -157,22 +157,22 @@
        # Save the event with start and end times in the filename
        filename = r.save("event_{start:.3f}-{end:.3f}.wav")
-       print(f"Event saved as: {filename}")
+       print(f"event saved as: {filename}")

 Example output:

 .. code:: bash

    Event 0: 0.700s -- 1.400s
-   Event saved as: event_0.700-1.400.wav
+   event saved as: event_0.700-1.400.wav

    Event 1: 3.800s -- 4.500s
-   Event saved as: event_3.800-4.500.wav
+   event saved as: event_3.800-4.500.wav

    Event 2: 8.750s -- 9.950s
-   Event saved as: event_8.750-9.950.wav
+   event saved as: event_8.750-9.950.wav

    Event 3: 11.700s -- 12.400s
-   Event saved as: event_11.700-12.400.wav
+   event saved as: event_11.700-12.400.wav

    Event 4: 15.050s -- 15.850s
-   Event saved as: event_15.050-15.850.wav
+   event saved as: event_15.050-15.850.wav

 Split and plot
 --------------
@@ -215,8 +215,8 @@
    audio_with_pauses = split_and_join_with_silence(silence_duration=1, input="audio.wav")

-Read and split data from the microphone
----------------------------------------
+Read audio data from the microphone and perform real-time event detection
+--------------------------------------------------------------------------

 If the first argument of :func:`split` is ``None``, audio data is read from the
 microphone (requires `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_):
@@ -240,7 +240,7 @@
 :func:`split` will continue reading audio data until you press ``Ctrl-C``. To read
 a specific amount of audio data, pass the desired number of seconds using the
-`max_read` argument.
+``max_read`` argument.


 Access recorded data after split
@@ -277,13 +277,13 @@
    full_audio.play(progress_bar=True)

-:class:`Recorder` also accepts a `max_read` argument.
+:class:`Recorder` also accepts a ``max_read`` argument.

 Working with AudioRegions
 -------------------------

 In the following sections, we will review several operations
-that can be performed with :class:AudioRegion objects.
+that can be performed with :class:`AudioRegion` objects.

 Basic region information
 ========================
@@ -355,8 +355,8 @@
 the beginning or end of a region, or crop a region by an arbitrary amount
 as a data augmentation strategy.

-The most accurate way to slice an `AudioRegion` is by using indices that
-directly refer to raw audio samples. In the following example, assuming
+The most accurate way to slice an :class:`AudioRegion` is by using indices
+that directly refer to raw audio samples. In the following example, assuming
 the audio data has a sampling rate of 16000, you can extract a 5-second
 segment from the main region, starting at the 20th second, as follows:
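As a companion to the API documentation above, the following is a minimal sketch of the documented :func:`split` workflow; it is illustrative only and not part of the changeset, and ``audio.wav`` is an assumed input file:

.. code:: python

   # Minimal sketch of the split-and-save loop documented above.
   # "audio.wav" is an illustrative input file.
   from auditok import split

   # split() yields AudioRegion objects positioned relative to the input stream.
   for i, region in enumerate(split("audio.wav")):
       # The {start} and {end} placeholders are filled from the region's
       # position, giving each saved event a unique file name.
       filename = region.save("event_{start:.3f}-{end:.3f}.wav")
       print(f"Event {i}: saved as {filename}")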