auditok: doc/examples.rst comparison

comparison doc/examples.rst @ 377:c6308873f239

Improve documentation, add more examples

author	Amine Sehili <amine.sehili@gmail.com>
date	Wed, 17 Feb 2021 21:18:05 +0100
parents	0106c4799906
children	df2a320e10d5


-:d83cba0f8072
+:c6308873f239
-Basic example
+Loading audio data
--------------
+------------------
-.. code:: python
+From a file
+===========
-from auditok import split
+If the first argument of `load` is a string, it should be a path to an audio
+file.
+.. code:: python
+import auditok
+region = auditok.load("audio.ogg")
+If the input file contains raw (headerless) audio data, passing `audio_format="raw"`
+and other audio parameters (`sampling_rate`, `sample_width` and `channels`) is
+mandatory. In the following example we pass audio parameters with their short
+names:
+.. code:: python
+region = auditok.load("audio.dat",
+                      audio_format="raw",
+                      sr=44100,
+                      sw=2,
+                      ch=1)
+From a `bytes` object
+=====================
+If the first argument is of type `bytes` it's interpreted as raw audio data:
+.. code:: python
+import auditok
+sr = 16000
+sw = 2
+ch = 1
+data = b"\0" * sr * sw * ch  # one second of silence
+region = auditok.load(data, sr=sr, sw=sw, ch=ch)
+print(region)
+output:
+.. code:: bash
+AudioRegion(duration=1.000, sampling_rate=16000, sample_width=2, channels=1)
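The 1-second duration reported above follows from simple size arithmetic: raw data holds `sr * sw * ch` bytes per second. The following is a minimal sketch of that calculation; `raw_duration` is a hypothetical helper written for illustration and is not part of the auditok API.

```python
def raw_duration(data: bytes, sr: int, sw: int, ch: int) -> float:
    """Duration in seconds of raw PCM data (hypothetical helper, not auditok API)."""
    bytes_per_second = sr * sw * ch  # sampling rate * sample width * channels
    return len(data) / bytes_per_second


sr, sw, ch = 16000, 2, 1
data = b"\0" * sr * sw * ch  # 32000 bytes of silence
print("{:.3f}".format(raw_duration(data, sr, sw, ch)))  # prints 1.000
```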
+From the microphone
+===================
+If the first argument is `None`, `load` will try to read data from the microphone.
+Audio parameters, as well as the `max_read` parameter, are mandatory:
+.. code:: python
+import auditok
+sr = 16000
+sw = 2
+ch = 1
+five_sec_audio = auditok.load(None, sr=sr, sw=sw, ch=ch, max_read=5)
+print(five_sec_audio)
+output:
+.. code:: bash
+AudioRegion(duration=5.000, sampling_rate=16000, sample_width=2, channels=1)
+Skip part of audio data
+=======================
+If the `skip` parameter is > 0, `load` will skip that leading amount of audio
+data:
+.. code:: python
+import auditok
+region = auditok.load("audio.ogg", skip=2) # skip the first 2 seconds
+This argument must be 0 when reading from the microphone.
+Basic split example
+-------------------
+.. code:: python
+import auditok
 # split returns a generator of AudioRegion objects
-audio_regions = split("audio.wav")
+audio_regions = auditok.split(
-for region in audio_regions:
+"audio.wav",
-region.play(progress_bar=True)
+min_dur=0.2,     # minimum duration of a valid audio event in seconds
-filename = region.save("/tmp/region_{meta.start:.3f}.wav")
+max_dur=4,       # maximum duration of an event
+max_silence=0.3, # maximum duration of tolerated continuous silence within an event
+energy_threshold=55 # threshold of detection
+)
+for i, r in enumerate(audio_regions):
+    # Regions returned by `split` have 'start' and 'end' metadata fields
+    print("Region {i}: {r.meta.start:.3f}s -- {r.meta.end:.3f}s".format(i=i, r=r))
+    # play detection
+    # r.play(progress_bar=True)
+    # region's metadata can also be used with the `save` method
+    # (no need to explicitly specify region's object and `format` arguments)
+    filename = r.save("region_{meta.start:.3f}-{meta.end:.3f}.wav")
     print("region saved as: {}".format(filename))
-Example using `AudioRegion`
+output example:
----------------------------
+.. code:: bash
-.. code:: python
+Region 0: 0.700s -- 1.400s
-from auditok import AudioRegion
+region saved as: region_0.700-1.400.wav
-region = AudioRegion.load("audio.wav")
+Region 1: 3.800s -- 4.500s
-regions = region.split_and_plot() # or just region.splitp()
+region saved as: region_3.800-4.500.wav
+Region 2: 8.750s -- 9.950s
+region saved as: region_8.750-9.950.wav
+Region 3: 11.700s -- 12.400s
+region saved as: region_11.700-12.400.wav
+Region 4: 15.050s -- 15.850s
+region saved as: region_15.050-15.850.wav
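The file names in this output come from ordinary `str.format` field syntax, which can read attributes of a named argument. Here is a small sketch of that mechanism in isolation; the `meta` object below is a stand-in built with `types.SimpleNamespace` for illustration, not the metadata class auditok actually uses.

```python
from types import SimpleNamespace

# Stand-in for a region's metadata (assumption: only `start` and `end` matter here)
meta = SimpleNamespace(start=0.7, end=1.4)

template = "region_{meta.start:.3f}-{meta.end:.3f}.wav"
filename = template.format(meta=meta)
print(filename)  # prints region_0.700-1.400.wav
```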
+Split and plot
+--------------
+Visualize audio signal and detections:
+.. code:: python
+import auditok
+region = auditok.load("audio.wav") # returns an AudioRegion object
+regions = region.split_and_plot(...) # or just region.splitp()
 output figure:
 .. image:: figures/example_1.png
+Read and split data from the microphone
+---------------------------------------
+If the first argument of `split` is None, audio data is read from the microphone
+(requires `pyaudio <https://people.csail.mit.edu/hubert/pyaudio>`_):
+.. code:: python
+import auditok
+sr = 16000
+sw = 2
+ch = 1
+eth = 55 # alias for energy_threshold, default value is 50
+try:
+    for region in auditok.split(input=None, sr=sr, sw=sw, ch=ch, eth=eth):
+        print(region)
+        region.play(progress_bar=True) # progress bar requires `tqdm`
+except KeyboardInterrupt:
+    pass
+`split` will continue reading audio data until you press ``Ctrl-C``. If you want
+to read a specific amount of audio data, pass the desired number of seconds with
+the `max_read` argument.
+Accessing recorded data after split
+-----------------------------------
+Using a `Recorder` object you can get hold of acquired audio:
+.. code:: python
+import auditok
+sr = 16000
+sw = 2
+ch = 1
+eth = 55 # alias for energy_threshold, default value is 50
+rec = auditok.Recorder(input=None, sr=sr, sw=sw, ch=ch)
+try:
+    for region in auditok.split(rec, sr=sr, sw=sw, ch=ch, eth=eth):
+        print(region)
+        region.play(progress_bar=True) # progress bar requires `tqdm`
+except KeyboardInterrupt:
+    pass
+rec.rewind()
+full_audio = auditok.load(rec.data, sr=sr, sw=sw, ch=ch)
+`Recorder` also accepts a `max_read` argument.
 Working with AudioRegions
 -------------------------
 Beyond splitting, there are a couple of interesting operations you can do with
 `AudioRegion` objects.
+Basic region information
+========================
+.. code:: python
+import auditok
+region = auditok.load("audio.wav")
+len(region) # number of audio samples in the region, one channel considered
+region.duration # duration in seconds
+region.sampling_rate # alias `sr`
+region.sample_width # alias `sw`
+region.channels # alias `ch`
 Concatenate regions
 ===================
 .. code:: python
-from auditok import AudioRegion
+import auditok
-region_1 = AudioRegion.load("audio_1.wav")
+region_1 = auditok.load("audio_1.wav")
-region_2 = AudioRegion.load("audio_2.wav")
+region_2 = auditok.load("audio_2.wav")
 region_3 = region_1 + region_2
-Particularly useful if you want to join regions returned by ``split``:
+Particularly useful if you want to join regions returned by `split`:
 .. code:: python
-from auditok import AudioRegion
+import auditok
-regions = AudioRegion.load("audio.wav").split()
+regions = auditok.load("audio.wav").split()
 gapless_region = sum(regions)
 Repeat a region
 ===============
 Multiply by a positive integer:
 .. code:: python
-from auditok import AudioRegion
+import auditok
-region = AudioRegion.load("audio.wav")
+region = auditok.load("audio.wav")
 region_x3 = region * 3
-Make slices of equal size out of a region
+Split one region into N regions of equal size
-=========================================
+=============================================
 Divide by a positive integer:
 .. code:: python
-from auditok import AudioRegion
+import auditok
-region = AudioRegion.load("audio.wav")
+region = auditok.load("audio.wav")
 regions = region / 5
 assert sum(regions) == region
-Make audio slices of arbitrary size
+Note that if perfect division is not possible, the last region might be a bit shorter
-===================================
+than the previous N-1 regions.
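When the number of samples is not a multiple of N, the parts cannot all have the same length. The sketch below shows one possible scheme on a plain list, in which every part except the last has the same (rounded-up) size. `split_equal` is a hypothetical helper and only an assumption about how such a division could work; it is not taken from auditok's source.

```python
def split_equal(samples, n):
    """Split `samples` into chunks of ceil(len / n) items (hypothetical helper).

    The last chunk holds whatever remains, so it may be shorter. For lengths
    that are small relative to n, this scheme can return fewer than n chunks.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    size = -(-len(samples) // n)  # ceiling division
    return [samples[i:i + size] for i in range(0, len(samples), size)]


parts = split_equal(list(range(10)), 3)
print([len(p) for p in parts])  # prints [4, 4, 2]
```

Concatenating the parts in order gives back the original sequence, which mirrors the `sum(regions) == region` assertion.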
-Slicing an ``AudioRegion`` can be interesting in many situations. You can for
+Slice a region by samples, seconds or milliseconds
-example remove a fixed-size portion of audio data from the beginning or the end
+==================================================
-of a region or crop a region by an arbitrary amount as a data augmentation
+Slicing an `AudioRegion` can be interesting in many situations. You can for
+example remove a fixed-size portion of audio data from the beginning or from the
+end of a region or crop a region by an arbitrary amount as a data augmentation
 strategy, etc.
-The most accurate way to slice an ``AudioRegion`` is to use indices that
+The most accurate way to slice an `AudioRegion` is to use indices that
 directly refer to raw audio samples. In the following example, assuming that the
 sampling rate of audio data is 16000, you can extract a 5-second region from
 the main region, starting from the 20th second, as follows:
 .. code:: python
-from auditok import AudioRegion
+import auditok
-region = AudioRegion.load("audio.wav")
+region = auditok.load("audio.wav")
 start = 20 * 16000
 stop = 25 * 16000
 five_second_region = region[start:stop]
-This allows you to practically start and stop at any sample within the region.
+This allows you to practically start and stop at any audio sample of the region.
 Just as with a `list` you can omit one of `start` and `stop`, or both. You can
 also use negative indices:
 .. code:: python
-from auditok import AudioRegion
+import auditok
-region = AudioRegion.load("audio.wav")
+region = auditok.load("audio.wav")
 start = -3 * region.sr # `sr` is an alias of `sampling_rate`
 three_last_seconds = region[start:]
 While slicing by raw samples is accurate, slicing with temporal indices is more
-intuitive. You can do so by accessing the ``millis`` or ``seconds`` views of
+intuitive. You can do so by accessing the `millis` or `seconds` views of an
-``AudioRegion`` (or their shortcut alias ``ms`` and ``sec``/``s``).
+`AudioRegion` (or their shortcut aliases `ms` and `sec` or `s`).
-With the ``millis`` view:
+With the `millis` view:
 .. code:: python
-from auditok import AudioRegion
+import auditok
-region = AudioRegion.load("audio.wav")
+region = auditok.load("audio.wav")
 five_second_region = region.millis[5000:10000]
-or with the ``seconds`` view:
+or with the `seconds` view:
 .. code:: python
-from auditok import AudioRegion
+import auditok
-region = AudioRegion.load("audio.wav")
+region = auditok.load("audio.wav")
 five_second_region = region.seconds[5:10]
-Get an array of audio samples
+`seconds` indices can also be floats:
-=============================
+.. code:: python
-.. code:: python
+import auditok
-from auditok import AudioRegion
+region = auditok.load("audio.wav")
-region = AudioRegion.load("audio.wav")
+five_second_region = region.seconds[2.5:7.5]
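Time-based slicing can be understood as converting seconds to sample indices using the sampling rate. The sketch below illustrates the idea on a plain list; `seconds_slice` is a hypothetical helper, and truncating with `int()` is an assumption, since the rounding auditok applies is not described here.

```python
def seconds_slice(samples, sr, start_sec, stop_sec):
    """Slice `samples` between two times in seconds (hypothetical helper)."""
    start = int(start_sec * sr)  # assumption: truncate to a whole sample index
    stop = int(stop_sec * sr)
    return samples[start:stop]


sr = 16000
samples = [0] * (10 * sr)  # 10 seconds of mono silence
chunk = seconds_slice(samples, sr, 2.5, 7.5)
print(len(chunk) / sr)  # prints 5.0
```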
+Get arrays of audio samples
+===========================
+If `numpy` is not installed, the `samples` attribute is a list of audio sample
+arrays (standard `array.array` objects), one per channel. If `numpy` is installed,
+`samples` is a 2-D `numpy.ndarray` where the first dimension is the channel
+and the second is the sample.
+.. code:: python
+import auditok
+region = auditok.load("audio.wav")
 samples = region.samples
-If ``numpy`` is installed, this will return a ``numpy.ndarray``. If audio data
-is mono the returned array is 1D, otherwise it's 2D. If ``numpy`` is not
+If `numpy` is installed you can also use:
-installed this will return a standard ``array.array`` for mono data, and a list
-of ``array.array`` for multichannel data.
-Alternatively you can use:
 .. code:: python
+import auditok
 import numpy as np
-region = AudioRegion.load("audio.wav")
+region = auditok.load("audio.wav")
 samples = np.asarray(region)
+assert len(samples.shape) == 2
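To illustrate the "one array per channel" layout, the following sketch de-interleaves raw multichannel data using only the standard library. It assumes 16-bit samples (`sw=2`) in native byte order with channels interleaved sample by sample; `split_channels` is a hypothetical helper, not an auditok function.

```python
from array import array


def split_channels(data: bytes, ch: int):
    """Return one array.array per channel from interleaved 16-bit PCM bytes."""
    interleaved = array("h")  # "h": signed 16-bit integers (assumes sw=2)
    interleaved.frombytes(data)
    # Samples of channel c sit at positions c, c + ch, c + 2*ch, ...
    return [interleaved[c::ch] for c in range(ch)]


stereo = array("h", [1, -1, 2, -2, 3, -3]).tobytes()
left, right = split_channels(stereo, ch=2)
print(list(left), list(right))  # prints [1, 2, 3] [-1, -2, -3]
```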
