comparison doc/examples.rst @ 441:6cf3ea23fadb
Update documentation
author | Amine Sehili <amine.sehili@gmail.com> |
---|---|
date | Thu, 31 Oct 2024 08:26:18 +0100 |
parents | 81bc2375354f |
children | f9d5eb9387d2 |
440:439463d9cdda | 441:6cf3ea23fadb |
---|---|
6 (as a ``bytes`` object). | 6 (as a ``bytes`` object). |
7 | 7 |
8 From a file | 8 From a file |
9 =========== | 9 =========== |
10 | 10 |
11 If the first argument of :func:`load` is a string or a `Path`, it should | 11 If the first argument of :func:`load` is a string or a ``Path``, it should |
12 refer to an existing audio file. | 12 refer to an existing audio file. |
13 | 13 |
14 .. code:: python | 14 .. code:: python |
15 | 15 |
16 import auditok | 16 import auditok |
17 region = auditok.load("audio.ogg") | 17 region = auditok.load("audio.ogg") |
18 | 18 |
19 If the input file contains raw (headerless) audio data, specifying audio | 19 If the input file contains raw (headerless) audio data, specifying audio |
20 parameters (``sampling_rate``, ``sample_width``, and ``channels``) is required. | 20 parameters (``sampling_rate``, ``sample_width``, and ``channels``) is required. |
21 Additionally, if the file name does not end with 'raw', you should explicitly | 21 Additionally, if the file name does not end with 'raw', you should explicitly |
22 pass `audio_format="raw"` to the function. | 22 pass ``audio_format="raw"`` to the function. |
23 | 23 |
24 In the example below, we provide audio parameters using their abbreviated names: | 24 In the example below, we provide audio parameters using their abbreviated names: |
25 | 25 |
26 .. code:: python | 26 .. code:: python |
27 | 27 |
38 | 38 |
39 from auditok import AudioRegion | 39 from auditok import AudioRegion |
40 region = AudioRegion.load("audio.dat", | 40 region = AudioRegion.load("audio.dat", |
41 audio_format="raw", | 41 audio_format="raw", |
42 sr=44100, # alias for `sampling_rate` | 42 sr=44100, # alias for `sampling_rate` |
43 sw=2, # alias for `sample_width` | 43 sw=2, # alias for `sample_width` |
44 ch=1 # alias for `channels` | 44 ch=1 # alias for `channels` |
45 ) | 45 ) |
46 | 46 |
47 | 47 |
48 From a `bytes` object | 48 From a ``bytes`` object |
49 ===================== | 49 ======================= |
50 | 50 |
51 If the first argument is of type `bytes`, it is interpreted as raw audio data: | 51 If the first argument is of type ``bytes``, it is interpreted as raw audio data: |
52 | 52 |
53 .. code:: python | 53 .. code:: python |
54 | 54 |
55 sr = 16000 | 55 sr = 16000 |
56 sw = 2 | 56 sw = 2 |
68 AudioRegion(duration=1.000, sampling_rate=16000, sample_width=2, channels=1) | 68 AudioRegion(duration=1.000, sampling_rate=16000, sample_width=2, channels=1) |
69 | 69 |
70 From the microphone | 70 From the microphone |
71 =================== | 71 =================== |
72 | 72 |
73 If the first argument is `None`, :func:`load` will attempt to read data from the | 73 If the first argument is ``None``, :func:`load` will attempt to read data from the |
74 microphone. In this case, audio parameters, along with the `max_read` parameter, | 74 microphone. In this case, audio parameters, along with the ``max_read`` parameter, |
75 are required. | 75 are required. |
76 | 76 |
77 .. code:: python | 77 .. code:: python |
78 | 78 |
79 sr = 16000 | 79 sr = 16000 |
129 a valid event may contain multiple silences, as long as none exceed 0.3 seconds. | 129 a valid event may contain multiple silences, as long as none exceed 0.3 seconds. |
130 | 130 |
131 :func:`split` returns a generator of :class:`AudioRegion` objects. Each | 131 :func:`split` returns a generator of :class:`AudioRegion` objects. Each |
132 :class:`AudioRegion` can be played, saved, repeated (multiplied by an integer), | 132 :class:`AudioRegion` can be played, saved, repeated (multiplied by an integer), |
133 and concatenated with another region (see examples below). Note that | 133 and concatenated with another region (see examples below). Note that |
134 :class:`AudioRegion` objects returned by :func:`split` include `start` and `stop` | 134 :class:`AudioRegion` objects returned by :func:`split` include ``start`` and ``stop`` |
135 attributes, which mark the beginning and end of the audio event relative to the | 135 attributes, which mark the beginning and end of the audio event relative to the |
136 input audio stream. | 136 input audio stream. |
137 | 137 |
138 .. code:: python | 138 .. code:: python |
139 | 139 |
155 # Play the audio event | 155 # Play the audio event |
156 r.play(progress_bar=True) | 156 r.play(progress_bar=True) |
157 | 157 |
158 # Save the event with start and end times in the filename | 158 # Save the event with start and end times in the filename |
159 filename = r.save("event_{start:.3f}-{end:.3f}.wav") | 159 filename = r.save("event_{start:.3f}-{end:.3f}.wav") |
160 print(f"Event saved as: {filename}") | 160 print(f"event saved as: {filename}") |
161 | 161 |
162 Example output: | 162 Example output: |
163 | 163 |
164 .. code:: bash | 164 .. code:: bash |
165 | 165 |
166 Event 0: 0.700s -- 1.400s | 166 Event 0: 0.700s -- 1.400s |
167 Event saved as: event_0.700-1.400.wav | 167 event saved as: event_0.700-1.400.wav |
168 Event 1: 3.800s -- 4.500s | 168 Event 1: 3.800s -- 4.500s |
169 Event saved as: event_3.800-4.500.wav | 169 event saved as: event_3.800-4.500.wav |
170 Event 2: 8.750s -- 9.950s | 170 Event 2: 8.750s -- 9.950s |
171 Event saved as: event_8.750-9.950.wav | 171 event saved as: event_8.750-9.950.wav |
172 Event 3: 11.700s -- 12.400s | 172 Event 3: 11.700s -- 12.400s |
173 Event saved as: event_11.700-12.400.wav | 173 event saved as: event_11.700-12.400.wav |
174 Event 4: 15.050s -- 15.850s | 174 Event 4: 15.050s -- 15.850s |
175 Event saved as: event_15.050-15.850.wav | 175 event saved as: event_15.050-15.850.wav |
176 | 176 |
177 Split and plot | 177 Split and plot |
178 -------------- | 178 -------------- |
179 | 179 |
180 Visualize audio signal and detections: | 180 Visualize audio signal and detections: |
213 | 213 |
214 from auditok import split_and_join_with_silence | 214 from auditok import split_and_join_with_silence |
215 audio_with_pauses = split_and_join_with_silence(silence_duration=1, input="audio.wav") | 215 audio_with_pauses = split_and_join_with_silence(silence_duration=1, input="audio.wav") |
216 | 216 |
217 | 217 |
218 Read and split data from the microphone | 218 Read audio data from the microphone and perform real-time event detection |
219 --------------------------------------- | 219 ------------------------------------------------------------------------- |
220 | 220 |
221 If the first argument of :func:`split` is ``None``, audio data is read from the | 221 If the first argument of :func:`split` is ``None``, audio data is read from the |
222 microphone (requires `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_): | 222 microphone (requires `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_): |
223 | 223 |
224 .. code:: python | 224 .. code:: python |
238 pass | 238 pass |
239 | 239 |
240 | 240 |
241 :func:`split` will continue reading audio data until you press ``Ctrl-C``. To read | 241 :func:`split` will continue reading audio data until you press ``Ctrl-C``. To read |
242 a specific amount of audio data, pass the desired number of seconds using the | 242 a specific amount of audio data, pass the desired number of seconds using the |
243 `max_read` argument. | 243 ``max_read`` argument. |
244 | 244 |
245 | 245 |
246 Access recorded data after split | 246 Access recorded data after split |
247 -------------------------------- | 247 -------------------------------- |
248 | 248 |
275 # alternatively you can use | 275 # alternatively you can use |
276 full_audio = auditok.AudioRegion(rec.data, sr, sw, ch) | 276 full_audio = auditok.AudioRegion(rec.data, sr, sw, ch) |
277 full_audio.play(progress_bar=True) | 277 full_audio.play(progress_bar=True) |
278 | 278 |
279 | 279 |
280 :class:`Recorder` also accepts a `max_read` argument. | 280 :class:`Recorder` also accepts a ``max_read`` argument. |
281 | 281 |
282 Working with AudioRegions | 282 Working with AudioRegions |
283 ------------------------- | 283 ------------------------- |
284 | 284 |
285 In the following sections, we will review several operations | 285 In the following sections, we will review several operations |
286 that can be performed with :class:AudioRegion objects. | 286 that can be performed with :class:`AudioRegion` objects. |
287 | 287 |
288 Basic region information | 288 Basic region information |
289 ======================== | 289 ======================== |
290 | 290 |
291 .. code:: python | 291 .. code:: python |
353 Slicing an :class:`AudioRegion` can be useful in various situations. | 353 Slicing an :class:`AudioRegion` can be useful in various situations. |
354 For example, you can remove a fixed-length portion of audio data from | 354 For example, you can remove a fixed-length portion of audio data from |
355 the beginning or end of a region, or crop a region by an arbitrary amount | 355 the beginning or end of a region, or crop a region by an arbitrary amount |
356 as a data augmentation strategy. | 356 as a data augmentation strategy. |
357 | 357 |
358 The most accurate way to slice an `AudioRegion` is by using indices that | 358 The most accurate way to slice an :class:`AudioRegion` is by using indices |
359 directly refer to raw audio samples. In the following example, assuming | 359 that directly refer to raw audio samples. In the following example, assuming |
360 the audio data has a sampling rate of 16000, you can extract a 5-second | 360 the audio data has a sampling rate of 16000, you can extract a 5-second |
361 segment from the main region, starting at the 20th second, as follows: | 361 segment from the main region, starting at the 20th second, as follows: |
362 | 362 |
363 .. code:: python | 363 .. code:: python |
364 | 364 |