Basic example
-------------

.. code:: python

    from auditok import split

    # split returns a generator of AudioRegion objects
    audio_regions = split("audio.wav")
    for region in audio_regions:
        region.play(progress_bar=True)
        filename = region.save("/tmp/region_{meta.start:.3f}.wav")
        print("region saved as: {}".format(filename))

Example using `AudioRegion`
---------------------------

.. code:: python

    from auditok import AudioRegion
    region = AudioRegion.load("audio.wav")
    regions = region.split_and_plot()  # or just region.splitp()

Output figure:

.. image:: figures/example_1.png

Working with AudioRegions
-------------------------

Beyond splitting, ``AudioRegion`` objects support several other useful
operations.

Concatenate regions
===================

.. code:: python

    from auditok import AudioRegion
    region_1 = AudioRegion.load("audio_1.wav")
    region_2 = AudioRegion.load("audio_2.wav")
    region_3 = region_1 + region_2

This is particularly useful if you want to join regions returned by ``split``:

.. code:: python

    from auditok import AudioRegion
    regions = AudioRegion.load("audio.wav").split()
    gapless_region = sum(regions)

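The built-in ``sum`` starts from the integer ``0``, so objects can only be
summed this way if adding them to ``0`` is defined. As a minimal sketch of that
mechanism in plain Python, the hypothetical ``Clip`` class below (not part of
``auditok``, and not a description of how ``AudioRegion`` is implemented)
supports both ``+`` and ``sum``:

.. code:: python

    class Clip:
        """Hypothetical stand-in for an audio region; holds raw bytes only."""

        def __init__(self, data):
            self.data = bytes(data)

        def __add__(self, other):
            # Concatenate the raw data of two clips.
            return Clip(self.data + other.data)

        def __radd__(self, other):
            # sum() first evaluates 0 + first_item, which ends up here.
            if other == 0:
                return self
            return NotImplemented


    clips = [Clip(b"ab"), Clip(b"cd"), Clip(b"ef")]
    joined = sum(clips)  # joined.data holds the three buffers back to back
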
Repeat a region
===============

Multiply by a positive integer:

.. code:: python

    from auditok import AudioRegion
    region = AudioRegion.load("audio.wav")
    region_x3 = region * 3

Make slices of equal size out of a region
=========================================

Divide by a positive integer:

.. code:: python

    from auditok import AudioRegion
    region = AudioRegion.load("audio.wav")
    regions = region / 5
    assert sum(regions) == region

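The invariant checked by the ``assert`` above (the slices, joined back
together, give the original data) can be illustrated on a plain list. The
``split_evenly`` helper below is hypothetical, and the way it spreads leftover
samples over the first chunks is a choice made for this sketch, not a statement
about how ``auditok`` handles lengths that are not divisible by ``n``:

.. code:: python

    def split_evenly(samples, n):
        """Split a sequence into n contiguous chunks of near-equal length."""
        if not isinstance(n, int) or n <= 0:
            raise ValueError("n must be a positive integer")
        size, extra = divmod(len(samples), n)
        chunks, start = [], 0
        for i in range(n):
            # The first `extra` chunks take one additional sample.
            stop = start + size + (1 if i < extra else 0)
            chunks.append(samples[start:stop])
            start = stop
        return chunks


    samples = list(range(10))
    chunks = split_evenly(samples, 3)
    # Joining the chunks in order restores the original sequence.
    rejoined = [s for chunk in chunks for s in chunk]
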
Make audio slices of arbitrary size
===================================

Slicing an ``AudioRegion`` is useful in many situations. For example, you can
remove a fixed-size portion of audio data from the beginning or the end of a
region, or crop a region by an arbitrary amount as a data augmentation
strategy.

The most accurate way to slice an ``AudioRegion`` is to use indices that
directly refer to raw audio samples. In the following example, assuming that the
sampling rate of the audio data is 16000 Hz, you can extract a 5-second region
from the main region, starting at the 20th second, as follows:

.. code:: python

    from auditok import AudioRegion
    region = AudioRegion.load("audio.wav")
    start = 20 * 16000
    stop = 25 * 16000
    five_second_region = region[start:stop]

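The index arithmetic used for sample-based slicing can be wrapped in a small
helper. This is only a sketch: ``seconds_to_samples`` is a hypothetical name,
and rounding to the nearest sample is an assumption made here for fractional
offsets.

.. code:: python

    def seconds_to_samples(seconds, sampling_rate):
        """Convert a time offset in seconds to a sample index."""
        return int(round(seconds * sampling_rate))


    sampling_rate = 16000
    start = seconds_to_samples(20, sampling_rate)  # 320000
    stop = seconds_to_samples(25, sampling_rate)   # 400000
    # Number of seconds covered by the half-open range [start, stop).
    duration = (stop - start) / sampling_rate
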
This allows you to start and stop at practically any sample within the region.
Just as with a ``list``, you can omit ``start``, ``stop``, or both. You can
also use negative indices:

.. code:: python

    from auditok import AudioRegion
    region = AudioRegion.load("audio.wav")
    start = -3 * region.sr  # `sr` is an alias of `sampling_rate`
    three_last_seconds = region[start:]

While slicing by raw samples is accurate, slicing with temporal indices is more
intuitive. You can do so by accessing the ``millis`` or ``seconds`` views of an
``AudioRegion`` (or their shortcut aliases ``ms`` and ``sec``/``s``).

With the ``millis`` view:

.. code:: python

    from auditok import AudioRegion
    region = AudioRegion.load("audio.wav")
    five_second_region = region.millis[5000:10000]

or with the ``seconds`` view:

.. code:: python

    from auditok import AudioRegion
    region = AudioRegion.load("audio.wav")
    five_second_region = region.seconds[5:10]

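To give an idea of how a time-based view can sit on top of sample-based
slicing, here is a minimal sketch over a plain list. The ``MillisView`` class
is hypothetical and is not ``auditok``'s implementation; it only shows the
conversion from milliseconds to sample indices, truncating fractional indices.

.. code:: python

    class MillisView:
        """Hypothetical view translating millisecond slices into sample slices."""

        def __init__(self, samples, sampling_rate):
            self._samples = samples
            self._sampling_rate = sampling_rate

        def _to_index(self, millis):
            # None keeps its usual "open-ended slice" meaning.
            if millis is None:
                return None
            return int(millis * self._sampling_rate / 1000)

        def __getitem__(self, key):
            if not isinstance(key, slice) or key.step is not None:
                raise TypeError("only slices without a step are supported")
            start = self._to_index(key.start)
            stop = self._to_index(key.stop)
            return self._samples[start:stop]


    sampling_rate = 8000
    samples = list(range(2 * sampling_rate))  # two seconds of dummy samples
    view = MillisView(samples, sampling_rate)
    chunk = view[500:1500]  # one second of samples, starting at 0.5 s
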
Get an array of audio samples
=============================

.. code:: python

    from auditok import AudioRegion
    region = AudioRegion.load("audio.wav")
    samples = region.samples

If ``numpy`` is installed, this returns a ``numpy.ndarray``. If the audio data
is mono, the returned array is 1D; otherwise it is 2D. If ``numpy`` is not
installed, this returns a standard ``array.array`` for mono data and a list of
``array.array`` objects for multichannel data.

Alternatively, you can use:

.. code:: python

    import numpy as np
    from auditok import AudioRegion
    region = AudioRegion.load("audio.wav")
    samples = np.asarray(region)

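To make the "list of ``array.array``" layout for multichannel data more
concrete, the sketch below separates interleaved 16-bit PCM bytes into one
array per channel using only the standard library. The ``split_channels_16bit``
helper is hypothetical and only illustrates the layout; it assumes signed
16-bit samples in native byte order and is not taken from ``auditok``.

.. code:: python

    from array import array


    def split_channels_16bit(data, channels):
        """Return one array.array per channel from interleaved 16-bit PCM bytes."""
        interleaved = array("h", data)  # "h" is a signed 16-bit integer
        # Samples of channel `ch` sit at positions ch, ch + channels, ...
        return [interleaved[ch::channels] for ch in range(channels)]


    # Three stereo frames: (left, right) = (1, -1), (2, -2), (3, -3)
    stereo = array("h", [1, -1, 2, -2, 3, -3]).tobytes()
    left, right = split_channels_16bit(stereo, channels=2)
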