annotate doc/command_line_usage.rst @ 448:3911ff1d719d

Merge branch 'master' of https://github.com/amsehili/auditok
author www-data <www-data@c4dm-xenserv-virt2.eecs.qmul.ac.uk>
date Thu, 31 Oct 2024 09:17:59 +0000
Command-line guide
==================

``auditok`` can also be used from the command line. For information
about available parameters and descriptions, type:

.. code:: bash

    auditok -h

Below, we provide several examples covering the most common use cases.


Real-time audio acquisition and event detection
-----------------------------------------------

To try ``auditok`` from the command line with your own voice, you’ll need to
either install `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_ so
that ``auditok`` can read directly from the microphone, or record audio with
an external program (e.g., ``sox``) and redirect its output to ``auditok``.

To read data directly from the microphone and use default parameters for audio
data and tokenization, simply type:

.. code:: bash

    auditok

This will print the **id**, **start time**, and **end time** of each detected
audio event. Since no additional arguments were passed in the previous command,
``auditok`` uses its default values. The most important arguments are:

- ``-n``, ``--min-duration``: minimum duration of a valid audio event in seconds, default: 0.2
- ``-m``, ``--max-duration``: maximum duration of a valid audio event in seconds, default: 5
- ``-s``, ``--max-silence``: maximum duration of a continuous silence within a valid audio event in seconds, default: 0.3
- ``-e``, ``--energy-threshold``: energy threshold for detection, default: 50


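
As an illustration of how these options combine, the sketch below wraps
``auditok`` in a small shell function. The function name ``detect_long_events``
and the chosen values (0.5, 10, 1 and 55) are assumptions made for this example,
not recommended settings.

.. code:: bash

    # Hypothetical helper: the name and the values are illustrative only.
    # $1 is the input: an audio file, or "-" to read from standard input.
    detect_long_events() {
        # -n 0.5 : minimum duration of a valid event, in seconds
        # -m 10  : maximum duration of a valid event, in seconds
        # -s 1   : maximum continuous silence allowed within an event
        # -e 55  : energy threshold, slightly above the default of 50
        auditok "$1" -n 0.5 -m 10 -s 1 -e 55
    }

Calling ``detect_long_events audio.wav`` then analyzes a file with these
settings, and any option left out keeps the default listed above.
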
Read audio data with an external program
----------------------------------------

You can use an external program, such as ``sox`` (``sudo apt-get install sox``),
to record audio data in real time, redirect it, and have ``auditok`` read the data
from standard input:

.. code:: bash

    rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok - -r 16000 -w 2 -c 1

Note that when reading data from standard input, the same audio parameters must
be set for both ``sox`` (or any other data generation/acquisition tool) and ``auditok``.
The following table provides a summary of the audio parameters:

+-----------------+------------+------------------+-----------------------+
| Audio parameter | sox option | `auditok` option | `auditok` default     |
+=================+============+==================+=======================+
| Sampling rate   | -r         | -r               | 16000                 |
+-----------------+------------+------------------+-----------------------+
| Sample width    | -b (bits)  | -w (bytes)       | 2                     |
+-----------------+------------+------------------+-----------------------+
| Channels        | -c         | -c               | 1                     |
+-----------------+------------+------------------+-----------------------+
| Encoding        | -e         | NA               | always a signed int   |
+-----------------+------------+------------------+-----------------------+

Because the values used above match the ``auditok`` defaults listed in the table,
the previous command can be shortened to:

.. code:: bash

    rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -


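
Because ``sox`` expresses the sample width in bits while ``auditok`` expects
bytes, it can help to derive both values from a single set of variables so that
the two sides of the pipe cannot drift apart. The following is a minimal sketch;
the function names ``bits_to_bytes`` and ``record_and_detect`` are hypothetical.

.. code:: bash

    # Convert a sample width in bits (sox -b) to bytes (auditok -w).
    bits_to_bytes() {
        echo $(( $1 / 8 ))
    }

    # Record from the microphone with sox and pipe the raw stream to auditok,
    # using the same rate, channel count and sample width on both sides.
    # Arguments: sampling rate, number of channels, sample width in bits.
    record_and_detect() {
        rate=$1; channels=$2; bits=$3
        width=$(bits_to_bytes "$bits")
        rec -q -t raw -r "$rate" -c "$channels" -b "$bits" -e signed - |
            auditok - -r "$rate" -w "$width" -c "$channels"
    }

With these definitions, ``record_and_detect 16000 1 16`` runs the same pipeline
as the first command in this section.
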
Play back audio detections
--------------------------

Use the ``-E`` (or ``--echo``) option:

.. code:: bash

    auditok -E
    # or
    rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok - -E

Using ``-E`` requires ``pyaudio``. If it is not installed, you can use the ``-C``
option instead, which runs an external command with each detected audio event as
its argument:

.. code:: bash

    rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok - -C "play -q {file}"

With the ``-C`` option, ``auditok`` saves each detected event to a temporary WAV
file, replaces the ``{file}`` placeholder with the temporary file name, and runs
the command. In the example above, ``-C`` is used to play audio data with an
external program, but you can use it to run any other command.


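
As a sketch of using ``-C`` with a command other than a player, the function
below runs ``soxi -D`` on each temporary file. ``soxi`` ships with ``sox`` and
its ``-D`` option prints a file's duration in seconds. The function name
``print_event_durations`` is hypothetical, and the example assumes that the
output of the external command is passed through to the terminal.

.. code:: bash

    # Hypothetical helper: run "soxi -D" on every detected event.
    # $1 is the input: an audio file, or "-" to read from standard input.
    print_event_durations() {
        # {file} is replaced by auditok with the temporary WAV file name.
        auditok "$1" -C "soxi -D {file}"
    }

For example, ``print_event_durations audio.wav`` processes a file on disk.
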
Output detection details
------------------------

By default, ``auditok`` outputs the **id**, **start**, and **end** times for each
detected audio event. The start and end values indicate the beginning and end of
the event within the input stream (file or microphone) in seconds. Below is an
example of the output in the default format:

.. code:: bash

    1 1.160 2.390
    2 3.420 4.330
    3 5.010 5.720
    4 7.230 7.800

The format of the output is controlled by the ``--printf`` option. Alongside the
``{id}``, ``{start}`` and ``{end}`` placeholders, you can use the ``{duration}`` and
``{timestamp}`` (system timestamp of the detected event) placeholders.

For example, the following format:

.. code:: bash

    auditok audio.wav --printf "{id}: [{timestamp}] start:{start}, end:{end}, dur: {duration}"

produces output like:

.. code:: bash

    1: [2021/02/17 20:16:02] start:1.160, end:2.390, dur: 1.230
    2: [2021/02/17 20:16:04] start:3.420, end:4.330, dur: 0.910
    3: [2021/02/17 20:16:06] start:5.010, end:5.720, dur: 0.710
    4: [2021/02/17 20:16:08] start:7.230, end:7.800, dur: 0.570

The format of ``{timestamp}`` is controlled by ``--timestamp-format`` (default:
``"%Y/%m/%d %H:%M:%S"``), whereas that of ``{start}``, ``{end}`` and ``{duration}``
is controlled by ``--time-format`` (default: ``%S``, absolute number of seconds).
For a more detailed format, ``--time-format`` accepts the ``%h`` (hours), ``%m``
(minutes), ``%s`` (seconds) and ``%i`` (milliseconds) directives
(e.g., ``"%h:%m:%s.%i"``).

To completely disable printing detection information, use ``-q``.


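
Because the default output is a plain three-column listing, it is easy to
post-process with standard tools. As a sketch, the ``awk`` filter below sums the
durations of the four sample detections shown earlier in this section. The
``printf`` call stands in for real ``auditok`` output, and the function name
``total_duration`` is only illustrative.

.. code:: bash

    # Sum (end - start) over lines of the form "id start end".
    total_duration() {
        awk '{ total += $3 - $2 } END { printf "%.3f\n", total }'
    }

    # Sample detections copied from the default-format example above.
    printf '%s\n' '1 1.160 2.390' '2 3.420 4.330' '3 5.010 5.720' '4 7.230 7.800' |
        total_duration

This prints ``3.420``, the sum of the four durations (1.230, 0.910, 0.710 and
0.570). With a real input, the same filter can be used as
``auditok audio.wav | total_duration``.
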
Save detections
---------------

You can save audio events to disk as they're detected using ``-o`` or
``--save-detections-as`` followed by a file name with placeholders. To create
a unique file name for each event, you can use the ``{id}``, ``{start}``,
``{end}`` and ``{duration}`` placeholders, as in this example:

.. code:: bash

    auditok --save-detections-as "{id}_{start}_{end}.wav"

When using the ``{start}``, ``{end}``, and ``{duration}`` placeholders, it is
recommended to limit the number of decimal places for these values to 3. You
can do this with a format like:

.. code:: bash

    auditok -o "{id}_{start:.3f}_{end:.3f}.wav"


Save the full audio stream
--------------------------

When reading audio data from the microphone, you may want to save it to disk.
To do this, use the ``-O`` or ``--save-stream`` option:

.. code:: bash

    auditok --save-stream output.wav

Note that this will work even if you read data from a file on disk.


Join detected audio events, inserting a silence between them
------------------------------------------------------------

Sometimes, you may want to detect audio events and create a new file containing
these events with pauses of a specific duration between them. This is useful if
you wish to preserve your original audio data while adjusting the length of pauses
(either shortening or extending them).

To achieve this, use the ``-j`` or ``--join-detections`` option together
with the ``-O`` / ``--save-stream`` option. In the example below, we
read data from ``input.wav`` and save audio events to ``output.wav``, adding
1-second pauses between them:

.. code:: bash

    auditok input.wav --join-detections 1 -O output.wav


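
The same command can be applied to several recordings in a loop. The sketch
below is one way to do this; the function name ``join_events`` and the
``_joined.wav`` suffix are assumptions made for the example.

.. code:: bash

    # Hypothetical helper: for each WAV file given after the pause duration,
    # write "<name>_joined.wav" with detected events separated by pauses of
    # the requested length. The input files are left untouched.
    # Arguments: pause duration in seconds, then one or more input files.
    join_events() {
        pause=$1
        shift
        for input in "$@"; do
            output="${input%.wav}_joined.wav"
            auditok "$input" --join-detections "$pause" -O "$output"
        done
    }

For example, ``join_events 1 first.wav second.wav`` creates
``first_joined.wav`` and ``second_joined.wav`` with 1-second pauses.
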
Plot detections
---------------

The audio signal and detections can be plotted using the ``-p`` or ``--plot`` option.
You can also save the plot to disk using ``--save-image``. The following example
demonstrates both:

.. code:: bash

    auditok -p --save-image "plot.png" # can also be 'pdf' or another image format

Example output:

.. image:: figures/example_1.png

Plotting requires `matplotlib <https://matplotlib.org/stable/index.html>`_.