changeset 351:3d6e4d8f6903

Update docstring
author Amine Sehili <amine.sehili@gmail.com>
date Tue, 31 Mar 2020 22:21:13 +0200
parents 1076056833c5
children 02f4aa16598a
files auditok/core.py
diffstat 1 files changed, 415 insertions(+), 257 deletions(-) [+]
line wrap: on
line diff
--- a/auditok/core.py	Wed Jan 22 23:21:44 2020 +0100
+++ b/auditok/core.py	Tue Mar 31 22:21:13 2020 +0200
@@ -36,61 +36,99 @@
     strict_min_dur=False,
     **kwargs
 ):
-    """Splits audio data and returns a generator of `AudioRegion`s
+    """
+    Split audio data and return a generator of `AudioRegion`s
 
-    :Parameters:
+    Parameters
+    ----------
+    input : str, bytes, AudioSource, AudioReader, AudioRegion or None
+        input audio data. If str, it should be a path to an existing audio file.
+        If bytes, input is considered as raw audio data. If None, audio is read
+        from the microphone.
+        Every object that is not an `AudioReader` is transformed into an
+        `AudioReader` before processing. If `input` is a str that refers to a
+        raw audio file, a bytes object or None, audio parameters should be
+        provided using kwargs (i.e., `sampling_rate`, `sample_width` and
+        `channels` or their aliases).
+        If `input` is a str then the audio format is guessed from the file
+        extension. The `audio_format` (alias `fmt`) kwarg can also be given to
+        specify the audio format explicitly. If none of these options is
+        available, rely on the backend (currently only pydub is supported) to
+        load data.
+    min_dur : float, default: 0.2
+        minimum duration in seconds of a detected audio event. Using large
+        values for `min_dur`, very short audio events (e.g., very short 1-word
+        utterances like 'yes' or 'no') may be missed. Using very short values
+        might result in a high number of short, useless audio events.
+    max_dur : float, default: 5
+        maximum duration in seconds of a detected audio event. If an audio event
+        lasts more than `max_dur` it will be truncated. If the continuation of a
+        truncated audio event is shorter than `min_dur` then this continuation
+        is accepted as a valid audio event if `strict_min_dur` is False.
+        Otherwise it is rejected.
+    max_silence : float, default: 0.3
+        maximum duration of continuous silence within an audio event. There
+        might be many silent gaps of this duration within one audio event. If
+        the continuous silence happens at the end of the event, it is kept as
+        part of the event if `drop_trailing_silence` is False (default).
+    drop_trailing_silence : bool, default: False
+        Whether to remove trailing silence from detected events. To avoid
+        abrupt cuts in speech, trailing silence should be kept, therefore
+        `drop_trailing_silence` should be False (default).
+    strict_min_dur : bool, default: False
+        strict minimum duration. Do not accept an audio event if it is shorter
+        than `min_dur` even if it is contiguous to the latest valid event. This
+        happens if the latest detected event has reached `max_dur`.
 
-    input: str, bytes, AudioSource, AudioRegion, AudioReader
-        input audio data. If str, it should be a path to an existing audio
-        file. If bytes, input is considered as raw audio data.
-    min_dur: float
-        minimun duration in seconds of a detected audio event. Default: 0.2.
-        Using large values, very short audio events (e.g., very short 1-word
-        utterances like 'yes' or 'no') can be missed.
-        Using very short values might result in a high number of short,
-        unuseful audio events.
-    max_dur: float
-        maximum duration in seconds of a detected audio event. Default: 5.
-    max_silence: float
-        maximum duration of consecutive silence within an audio event. There
-        might be many silent gaps of this duration within an audio event.
-    drop_trailing_silence: bool
-        drop trailing silence from detected events
-    strict_min_dur: bool
-        strict minimum duration. Drop an event if it is shorter than ´min_dur´
-        even if it is continguous to the latest valid event. This happens if
-        the the latest event had reached ´max_dur´.
-    analysis_window, aw: float
-        duration of analysis window in seconds. Default: 0.05 second (50 ms).
-        A value up to 0.1 second (100 ms) should be good for most use-cases.
-        You might need a different value, especially if you use a custom
-        validator.
-    audio_format, fmt: str
-        type of audio date (e.g., wav, ogg, raw, etc.). This will only be used
-        if ´input´ is a string path to audio file. If not given, audio type
-        will be guessed from file name extension or from file header.
-    sampling_rate, sr: int
-        sampling rate of audio data. Only needed for raw audio files/data.
-    sample_width, sw: int
-        number of bytes used to encode an audio sample, typically 1, 2 or 4.
-        Only needed for raw audio files/data.
-    channels, ch: int
-        nuumber of channels of audio data. Only needed for raw audio files.
-    use_channel, uc: int, str
-        which channel to use if input has multichannel audio data. Can be an
-        int (0 being the first channel), or one of the following values:
-            - None, "any": a valid frame from one any given channel makes
-              parallel frames from all other channels automatically valid.
-            - 'mix': compute average channel (i.e. mix down all channels)
-    max_read, mr: float
-        maximum data to read in seconds. Default: `None`, read until there is
-        no more data to read.
-    validator, val: DataValidator
+    Kwargs
+    ------
+    analysis_window, aw : float, default: 0.05 (50 ms)
+        duration of analysis window in seconds. A value between 0.01 (10 ms)
+        and 0.1 (100 ms) should be good for most use-cases.
+    audio_format, fmt : str
+        type of audio data (e.g., wav, ogg, flac, raw, etc.). This will only be
+        used if `input` is a string path to an audio file. If not given, audio
+        type will be guessed from file name extension or from file header.
+    sampling_rate, sr : int
+        sampling rate of audio data. Required if `input` is a raw audio file,
+        a bytes object or None (i.e., read from microphone).
+    sample_width, sw : int
+        number of bytes used to encode one audio sample, typically 1, 2 or 4.
+        Required for raw data, see `sampling_rate`.
+    channels, ch : int
+        number of channels of audio data. Required for raw data, see
+        `sampling_rate`.
+    use_channel, uc : {None, "mix"} or int
+        which channel to use for split if `input` has multiple audio channels.
+        Regardless of which channel is used for splitting, returned audio events
+        contain data from *all* channels, just as `input`.
+        The following values are accepted:
+            - None (alias "any"): accept audio activity from any channel, even
+              if other channels are silent. This is the default behavior.
+            - "mix" (alias "avg" or "average"): mix down all channels (i.e.,
+              compute the average channel) and split the resulting channel.
+            - int (>= 0, < `channels`): use one channel, specified by its
+              integer id, for split.
+    large_file : bool, default: False
+        If True, AND if `input` is a path to a *wav* or a *raw* audio file
+        (and only these two formats), audio data is lazily loaded to memory
+        (i.e., one analysis window at a time). Otherwise the whole file is
+        loaded to memory before split. Set to True if the size of the file is
+        larger than available memory.
+    max_read, mr : float, default: None (read until end of stream)
+        maximum data to read from source in seconds.
+    validator, val : callable, DataValidator
+        custom data validator. If `None` (default), an `AudioEnergyValidator` is
-        used with the given energy threshold.
-    energy_threshold, eth: float
-        energy threshlod for audio activity detection, default: 50. If a custom
-        validator is given, this argumemt will be ignored.
+        used with the given energy threshold. Can be a callable or an instance
+        of `DataValidator` that implements `is_valid`. In either case, it will
+        be called with a window of audio data as the first argument.
+    energy_threshold, eth : float, default: 50
+        energy threshold for audio activity detection. Audio regions that have
+        enough windows with a signal energy equal to or above this threshold
+        are considered valid audio events. Here we refer to this quantity as
+        the energy of the signal but, to be more accurate, it is the log
+        energy of the signal computed as: `10 * log10(dot(x, x) / |x|)`, where
+        `|x|` is the number of samples in the window.
+        If `validator` is given, this argument is ignored.
     """
     if min_dur <= 0:
         raise ValueError("'min_dur' ({}) must be > 0".format(min_dur))
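For intuition, the log-energy quantity mentioned under `energy_threshold` can be sketched in plain Python. This is a rough sketch of the formula as stated, not auditok's actual implementation:

```python
import math

def log_energy(samples):
    # Log energy of a window of samples: 10 * log10(dot(x, x) / |x|),
    # where |x| is taken here as the number of samples (an assumption,
    # mirroring the formula quoted in the docstring above).
    dot = sum(s * s for s in samples)
    if dot == 0:
        return -float("inf")  # all-zero window: no energy
    return 10 * math.log10(dot / len(samples))

print(log_energy([100] * 10))  # 40.0
```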
@@ -210,22 +248,23 @@
     `duration` and `analysis_window` can be in seconds or milliseconds but
     must be in the same unit.
 
-    :Parameters:
+    Parameters
+    ----------
 
-    duration: float
+    duration : float
         a given duration in seconds or ms.
     analysis_window: float
         size of analysis window, in the same unit as `duration`.
-    round_fn: callable
+    round_fn : callable
         function called to round the result. Default: `round`.
-    epsilon: float
+    epsilon : float
         small value to add to the division result before rounding.
         E.g., `0.3 / 0.1 = 2.9999999999999996`, when called with
         `round_fn=math.floor` returns `2` instead of `3`. Adding a small value
         to `0.3 / 0.1` avoids this error.
 
-    Returns:
-    --------
+    Returns
+    -------
     nb_windows: int
         minimum number of `analysis_window`'s to cover `duration`. That means
         that `analysis_window * nb_windows >= duration`.
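The floating-point pitfall that `epsilon` guards against can be shown directly. The sketch below mirrors the signature described above but is not necessarily auditok's exact implementation:

```python
import math

def nb_windows(duration, analysis_window, round_fn=round, epsilon=0):
    # Number of analysis windows needed to cover `duration`; `epsilon`
    # compensates for floating-point error before rounding.
    return int(round_fn(duration / analysis_window + epsilon))

# 0.3 / 0.1 == 2.9999999999999996, so flooring without epsilon loses a window
print(nb_windows(0.3, 0.1, math.floor))        # 2
print(nb_windows(0.3, 0.1, math.floor, 1e-6))  # 3
```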
@@ -246,23 +285,28 @@
     sample_width,
     channels,
 ):
-    """Create and return an `AudioRegion`.
+    """
+    Helper function to create an `AudioRegion` from parameters returned by the
+    tokenization object. It takes care of setting up the region's `start` and
+    `end` in metadata.
 
-    :Parameters:
+    Parameters
+    ----------
 
     frame_duration: float
         duration of analysis window in seconds
-    start_frame: int
+    start_frame : int
        index of the first analysis window
-    samling_rate: int
+    sampling_rate : int
         sampling rate of audio data
-    sample_width: int
+    sample_width : int
         number of bytes of one audio sample
-    channels: int
+    channels : int
         number of channels of audio data
 
-    Returns:
-    audio_region: AudioRegion
+    Returns
+    -------
+    audio_region : AudioRegion
        AudioRegion whose start time is calculated as:
         `1000 * start_frame * frame_duration`
     """
@@ -274,6 +318,24 @@
 
 
 def _read_chunks_online(max_read, **kwargs):
+    """
+    Helper function to read audio data from an online blocking source
+    (i.e., microphone). Used to build an `AudioRegion` and can intercept
+    KeyboardInterrupt so that reading stops as soon as this exception is
+    raised. Makes building `AudioRegion`s in [i]python sessions and Jupyter
+    notebooks more user-friendly.
+
+    Parameters
+    ----------
+    max_read : float
+        maximum amount of data to read in seconds.
+    kwargs :
+        audio parameters (sampling_rate, sample_width and channels).
+
+    See also
+    --------
+    `AudioRegion.build`
+    """
     reader = AudioReader(None, block_dur=0.5, max_read=max_read, **kwargs)
     reader.open()
     data = []
@@ -297,6 +359,28 @@
 
 
 def _read_offline(input, skip=0, max_read=None, **kwargs):
+    """
+    Helper function to read audio data from an offline source (i.e., a file).
+    Used to build `AudioRegion`s.
+
+    Parameters
+    ----------
+    input : str, bytes
+        path to audio file (if str), or a bytes object representing raw audio
+        data.
+    skip : float, default: 0
+        amount of data to skip from the beginning of audio source.
+    max_read : float, default: None
+        maximum amount of audio data to read. If None (default), read until
+        the end of stream.
+    kwargs :
+        audio parameters (sampling_rate, sample_width and channels).
+
+    See also
+    --------
+    `AudioRegion.build`
+
+    """
     audio_source = get_audio_source(input, **kwargs)
     audio_source.open()
     if skip is not None and skip > 0:
@@ -329,6 +413,10 @@
 
 
 class _SecondsView:
+    """A class to create a view of `AudioRegion` that can be sliced using
+    indices in seconds.
+    """
+
     def __init__(self, region):
         self._region = region
 
@@ -350,6 +438,10 @@
 
 
 class _MillisView(_SecondsView):
+    """A class to create a view of `AudioRegion` that can be sliced using
+    indices in milliseconds.
+    """
+
     def __getitem__(self, index):
         err_msg = (
             "Slicing AudioRegion by milliseconds requires indices of type "
@@ -376,6 +468,9 @@
 
 
 class _AudioRegionMetadata(dict):
+    """A class to store `AudioRegion`'s metadata.
+    """
+
     def __getattr__(self, name):
         if name in self:
             return self[name]
@@ -396,18 +491,28 @@
 class AudioRegion(object):
     def __init__(self, data, sampling_rate, sample_width, channels, meta=None):
         """
-        A class for detected audio events.
+        AudioRegion encapsulates raw audio data and provides an interface to
+        perform simple operations on it. Use `AudioRegion.load` to build an
+        `AudioRegion` from different types of objects.
 
-        :Parameters:
+        Parameters
+        ----------
+        data : bytes
+            raw audio data as a bytes object
+        sampling_rate : int
+            sampling rate of audio data
+        sample_width : int
+            number of bytes of one audio sample
+        channels : int
+            number of channels of audio data
+        meta : dict, default: None
+            any collection of <key:value> elements used to build metadata for
+            this `AudioRegion`. Metadata can be accessed via `region.meta.key`
+            if `key` is a valid python attribute name, or via
+            `region.meta[key]` if not.
 
-            data: bytes
-                audio data
-            samling_rate: int
-                sampling rate of audio data
-            sample_width: int
-                number of bytes of one audio sample
-            channels: int
-                number of channels of audio data
+        See also
+        --------
+        AudioRegion.load
         """
         check_audio_data(data, sample_width, channels)
         self._data = data
@@ -423,7 +528,8 @@
             self._meta = None
 
         self._seconds_view = _SecondsView(self)
-        self.s = self.sec
+        self.sec = self.seconds
+        self.s = self.seconds
 
         self._millis_view = _MillisView(self)
         self.ms = self.millis
@@ -438,16 +544,60 @@
 
     @classmethod
     def load(cls, input, skip=0, max_read=None, **kwargs):
+        """
+        Create an `AudioRegion` by loading data from `input`.
+
+        Parameters
+        ----------
+        input : None, str, bytes, AudioSource
+            source to load data from. If None, load data from the microphone.
+            If bytes, create a region from raw data. If str, load data from
+            file. Input can also be an `AudioSource` object.
+        skip : float, default: 0
+            amount, in seconds, of audio data to skip from source. If read
+            from microphone, `skip` must be 0, otherwise a ValueError is
+            raised.
+        max_read : float, default: None
+            amount, in seconds, of audio data to read from source. If read from
+            microphone, `max_read` should not be None, otherwise a ValueError is
+            raised.
+
+        audio_format, fmt : str
+            type of audio data (e.g., wav, ogg, flac, raw, etc.). This will only
+            be used if `input` is a string path to an audio file. If not given,
+            audio type will be guessed from file name extension or from file
+            header.
+        sampling_rate, sr : int
+            sampling rate of audio data. Required if `input` is a raw audio
+            file, a bytes object or None (i.e., read from microphone).
+        sample_width, sw : int
+            number of bytes used to encode one audio sample, typically 1, 2 or
+            4. Required for raw data, see `sampling_rate`.
+        channels, ch : int
+            number of channels of audio data. Required for raw data, see
+            `sampling_rate`.
+        large_file : bool, default: False
+            If True, AND if `input` is a path to a *wav* or a *raw* audio file
+            (and only these two formats), the audio file is not fully loaded
+            to memory. Set to True to only load `max_read` data from file.
+
+        Returns
+        -------
+        region: AudioRegion
+
+        Raises
+        ------
+        ValueError
+            if `input` is None and `skip` != 0, or if `max_read` is None.
+        """
         if input is None:
+            if skip > 0:
+                raise ValueError(
+                    "'skip' should be 0 when reading from microphone"
+                )
             if max_read is None or max_read < 0:
                 raise ValueError(
                     "'max_read' should not be None when reading from "
                     "microphone"
                 )
-            if skip > 0:
-                raise ValueError(
-                    "'skip' should be 0 when reading from microphone"
-                )
             data, sampling_rate, sample_width, channels = _read_chunks_online(
                 max_read, **kwargs
             )
@@ -459,7 +609,7 @@
         return cls(data, sampling_rate, sample_width, channels)
 
     @property
-    def sec(self):
+    def seconds(self):
         return self._seconds_view
 
     @property
@@ -500,20 +650,21 @@
         return self._channels
 
     def play(self, progress_bar=False, player=None, **progress_bar_kwargs):
-        """Play audio region
+        """
+        Play audio region.
 
-        :Parameters:
-
-        player: AudioPalyer, default: None
-            audio player to use. if None (default), use `player_for(self)`
+        Parameters
+        ----------
+        progress_bar : bool, default: False
+            whether to use a progress bar while playing audio. `progress_bar`
+            requires `tqdm`; if not installed, no progress bar will be shown.
+        player : AudioPlayer, default: None
+            audio player to use. If None (default), use `player_for()`
             to get a new audio player.
-
-        progress_bar bool, default: False
-            whether to use a progress bar while playing audio. Default: False.
-
-        progress_bar_kwargs: kwargs
-            keyword arguments to pass to progress_bar object. Currently only
-            `tqdm` is supported.
+        progress_bar_kwargs : kwargs
+            keyword arguments to pass to the `tqdm` progress bar builder (e.g.,
+            use `leave=False` to clean up the screen when playing finishes).
         """
         if player is None:
             player = player_for(self)
@@ -521,47 +672,51 @@
             self._data, progress_bar=progress_bar, **progress_bar_kwargs
         )
 
-    def save(self, file, format=None, exists_ok=True, **audio_parameters):
-        """Save audio region to file.
+    def save(
+        self, file, audio_format=None, exists_ok=True, **audio_parameters
+    ):
+        """
+        Save audio region to file.
 
-        :Parameters:
+        Parameters
+        ----------
+        file : str
+            path to output audio file. May contain a `{duration}` placeholder
+            as well as any placeholder that this region's metadata might
+            contain (e.g., regions returned by `split` contain metadata with
+            `start` and `end` attributes that can be used to build the output
+            file name, as `{meta.start}` and `{meta.end}`). See examples using
+            placeholders with formatting.
 
-        file: str, file-like object
-            path to output file or a file-like object. If ´str´, it may contain
-            and ´{duration}´ place holders as well as any place holder that
-            this region's metadata might contain (e.g., ´{meta.start}´).
+        audio_format : str
+            format used to save audio data. If None (default), format is
+            guessed from file name's extension. If the file name has no
+            extension, audio data is saved as a raw (headerless) audio file.
+        exists_ok : bool, default: True
+            If True, overwrite `file` if a file with the same name exists.
+            If False, raise an `IOError` if `file` exists.
+        audio_parameters: dict
+            any keyword arguments to be passed to audio saving backend.
+            FIXME: this is not yet implemented!
 
+        Returns
+        -------
+        file : str
+            name of output file with replaced placeholders.
 
-        format: str
-            type of audio file. If None (default), file type is guessed from
-            `file`'s extension. If `file` is not a ´str´ or does not have
-            an extension, audio data is as a raw (headerless) audio file.
-        exists_ok: bool, default: True
-            If True, overwrite ´file´ if a file with the same name exists.
-            If False, raise an ´IOError´ if the file exists.
-        audio_parameters: dict
-            any keyword arguments to be passed to audio saving backend
-            (e.g. bitrate, etc.)
+        Raises
+        ------
+        IOError
+            if `file` exists and `exists_ok` is False.
 
-        :Returns:
-
-        file: str, file-like object
-            name of the file of file-like object to which audio data was
-            written. If parameter ´file´ was a ´str´ with at least one {start},
-            {end} or {duration} place holders.
-
-        :Raises:
-
-        IOError if ´file´ exists and ´exists_ok´ is False.
-
-        Example:
+        Example
+        -------
 
         .. code:: python
             region = AudioRegion(b'\0' * 2 * 24000,
                                     sampling_rate=16000,
                                     sample_width=2,
                                     channels=1)
-            region.meta = {"start": 2.25, "end": 2.25 + region.duration}
+            region.meta.start = 2.25
+            region.meta.end = 2.25 + region.duration
             region.save('audio_{meta.start}-{meta.end}.wav')
             audio_2.25-3.75.wav
             region.save('region_{meta.start:.3f}_{duration:.3f}.wav')
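Under the hood, such placeholders behave like Python `str.format` fields; a self-contained sketch with a stand-in metadata object (hypothetical names, not auditok code):

```python
class Meta:
    # Stand-in for the region's metadata object (hypothetical helper).
    def __init__(self, start, end):
        self.start = start
        self.end = end

meta = Meta(2.25, 3.75)
name = "region_{meta.start:.3f}_{duration:.3f}.wav".format(
    meta=meta, duration=meta.end - meta.start
)
print(name)  # region_2.250_1.500.wav
```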
@@ -574,7 +729,7 @@
         to_file(
             self._data,
             file,
-            format,
+            audio_format,
             sr=self.sr,
             sw=self.sw,
             ch=self.ch,
@@ -591,7 +746,8 @@
         strict_min_dur=False,
         **kwargs
     ):
-        """Split region. See :auditok.split() for split parameters description.
+        """Split audio region. See `auditok.split()` for split parameters
+        description.
         """
         if kwargs.get("max_read", kwargs.get("mr")) is not None:
             warn_msg = "'max_read' (or 'mr') should not be used with "
@@ -827,165 +983,167 @@
     Class for stream tokenizers. It implements a 4-state automaton scheme
     to extract sub-sequences of interest on the fly.
 
-    :Parameters:
+    Parameters
+    ----------
 
-        `validator` :
-            Callable or an instance of DataValidator that implements
-            `is_valid` method.
+    validator : callable, DataValidator (must implement `is_valid`)
+        called with each data frame read from source. Should take one positional
+        argument and return True or False for valid and invalid frames
+        respectively.
 
-        `min_length` : *(int)*
-            Minimum number of frames of a valid token. This includes all
-            tolerated non valid frames within the token.
+    min_length : int
+        Minimum number of frames of a valid token. This includes all
+        tolerated non valid frames within the token.
 
-        `max_length` : *(int)*
-            Maximum number of frames of a valid token. This includes all
-            tolerated non valid frames within the token.
+    max_length : int
+        Maximum number of frames of a valid token. This includes all
+        tolerated non valid frames within the token.
 
-        `max_continuous_silence` : *(int)*
-            Maximum number of consecutive non-valid frames within a token.
-            Note that, within a valid token, there may be many tolerated
-            *silent* regions that contain each a number of non valid frames up
-            to `max_continuous_silence`
+    max_continuous_silence : int
+        Maximum number of consecutive non-valid frames within a token.
+        Note that, within a valid token, there may be many tolerated
+        *silent* regions that contain each a number of non valid frames up
+        to `max_continuous_silence`.
 
-        `init_min` : *(int, default=0)*
-            Minimum number of consecutive valid frames that must be
-            **initially** gathered before any sequence of non valid frames can
-            be tolerated. This option is not always needed, it can be used to
-            drop non-valid tokens as early as possible. **Default = 0** means
-            that the option is by default ineffective.
+    init_min : int, default: 0
+        Minimum number of consecutive valid frames that must be
+        **initially** gathered before any sequence of non valid frames can
+        be tolerated. This option is not always needed; it can be used to
+        drop non-valid tokens as early as possible. **Default = 0** means
+        that the option is by default ineffective.
 
-        `init_max_silence` : *(int, default=0)*
-            Maximum number of tolerated consecutive non-valid frames if the
-            number already gathered valid frames has not yet reached
-            'init_min'.This argument is normally used if `init_min` is used.
-            **Default = 0**, by default this argument is not taken into
-            consideration.
+    init_max_silence : int, default: 0
+        Maximum number of tolerated consecutive non-valid frames if the
+        number of already gathered valid frames has not yet reached
+        `init_min`. This argument is normally used if `init_min` is used.
+        **Default = 0**, by default this argument is not taken into
+        consideration.
 
-        `mode` : *(int, default=0)*
-            `mode` can be:
+    mode : int, default: 0
+        `mode` can be:
 
-        1. `StreamTokenizer.NORMAL`:
-        Do not drop trailing silence, and accept a token shorter than
-        `min_length` if it is the continuation of the latest delivered token.
+    1. `StreamTokenizer.NORMAL`:
+    Do not drop trailing silence, and accept a token shorter than
+    `min_length` if it is the continuation of the latest delivered token.
 
-        2. `StreamTokenizer.STRICT_MIN_LENGTH`:
-        if token *i* is delivered because `max_length`
-        is reached, and token *i+1* is immediately adjacent to
-        token *i* (i.e. token *i* ends at frame *k* and token *i+1* starts
-        at frame *k+1*) then accept token *i+1* only of it has a size of at
-        least `min_length`. The default behavior is to accept token *i+1*
-        event if it is shorter than `min_length` (given that the above
-        conditions are fulfilled of course).
+    2. `StreamTokenizer.STRICT_MIN_LENGTH`:
+    if token *i* is delivered because `max_length`
+    is reached, and token *i+1* is immediately adjacent to
+    token *i* (i.e. token *i* ends at frame *k* and token *i+1* starts
+    at frame *k+1*) then accept token *i+1* only if it has a size of at
+    least `min_length`. The default behavior is to accept token *i+1*
+    even if it is shorter than `min_length` (given that the above
+    conditions are fulfilled of course).
 
-        :Examples:
+    Examples
+    --------
 
-        In the following code, without `STRICT_MIN_LENGTH`, the 'BB' token is
-        accepted although it is shorter than `min_length` (3), because it
-        immediately follows the latest delivered token:
+    In the following code, without `STRICT_MIN_LENGTH`, the 'BB' token is
+    accepted although it is shorter than `min_length` (3), because it
+    immediately follows the latest delivered token:
+
+    .. code:: python
+
+        from auditok import (StreamTokenizer,
+                             StringDataSource,
+                             DataValidator)
+
+        class UpperCaseChecker(DataValidator):
+            def is_valid(self, frame):
+                return frame.isupper()
+
+
+        dsource = StringDataSource("aaaAAAABBbbb")
+        tokenizer = StreamTokenizer(validator=UpperCaseChecker(),
+                                    min_length=3,
+                                    max_length=4,
+                                    max_continuous_silence=0)
+
+        tokenizer.tokenize(dsource)
+
+    :output:
 
         .. code:: python
 
-            from auditok import (StreamTokenizer,
-                                 StringDataSource,
-                                 DataValidator)
+            [(['A', 'A', 'A', 'A'], 3, 6), (['B', 'B'], 7, 8)]
 
-            class UpperCaseChecker(DataValidator):
-                def is_valid(self, frame):
-                    return frame.isupper()
 
+    The following tokenizer will however reject the 'BB' token:
 
-            dsource = StringDataSource("aaaAAAABBbbb")
-            tokenizer = StreamTokenizer(validator=UpperCaseChecker(),
-                                        min_length=3,
-                                        max_length=4,
-                                        max_continuous_silence=0)
+    .. code:: python
 
+        dsource = StringDataSource("aaaAAAABBbbb")
+        tokenizer = StreamTokenizer(validator=UpperCaseChecker(),
+                                    min_length=3, max_length=4,
+                                    max_continuous_silence=0,
+                                    mode=StreamTokenizer.STRICT_MIN_LENGTH)
+        tokenizer.tokenize(dsource)
+
+    :output:
+
+    .. code:: python
+
+        [(['A', 'A', 'A', 'A'], 3, 6)]
+
+
+    3. `StreamTokenizer.DROP_TRAILING_SILENCE`: drop all trailing non-valid
+    frames from a token to be delivered if and only if it is not
+    **truncated**. This can be a bit tricky. A token is actually delivered
+    if:
+
+    - a. `max_continuous_silence` is reached
+
+    :or:
+
+    - b. Its length reaches `max_length`. This is called a **truncated**
+    token
+
+    In the current implementation, a `StreamTokenizer`'s decision is only
+    based on already seen data and on incoming data. Thus, if a token is
+    truncated at a non-valid but tolerated frame (`max_length` is reached
+    but `max_continuous_silence` not yet) any trailing silence will be kept
+    because it can potentially be part of a valid token (if `max_length` was
+    bigger). But if `max_continuous_silence` is reached before
+    `max_length`, the delivered token will not be considered truncated
+    but as a result of *normal* end of detection (i.e. no more valid data).
+    In that case the trailing silence can be removed if you use the
+    `StreamTokenizer.DROP_TRAILING_SILENCE` mode.
+
+    :Example:
+
+    .. code:: python
+
+            tokenizer = StreamTokenizer(
+                            validator=UpperCaseChecker(),
+                            min_length=3,
+                            max_length=6,
+                            max_continuous_silence=3,
+                            mode=StreamTokenizer.DROP_TRAILING_SILENCE
+                            )
+
+            dsource = StringDataSource("aaaAAAaaaBBbbbb")
             tokenizer.tokenize(dsource)
 
-        :output:
+    :output:
 
-         .. code:: python
+    .. code:: python
 
-            [(['A', 'A', 'A', 'A'], 3, 6), (['B', 'B'], 7, 8)]
+        [(['A', 'A', 'A', 'a', 'a', 'a'], 3, 8), (['B', 'B'], 9, 10)]
 
+    The first token is delivered with its trailing silence because it is
+    truncated, while the second one has its trailing frames removed.
 
-        The following tokenizer will however reject the 'BB' token:
+    Without `StreamTokenizer.DROP_TRAILING_SILENCE` the output would be:
 
-        .. code:: python
+    .. code:: python
 
-            dsource = StringDataSource("aaaAAAABBbbb")
-            tokenizer = StreamTokenizer(validator=UpperCaseChecker(),
-                                        min_length=3, max_length=4,
-                                        max_continuous_silence=0,
-                                        mode=StreamTokenizer.STRICT_MIN_LENGTH)
-            tokenizer.tokenize(dsource)
+        [
+            (['A', 'A', 'A', 'a', 'a', 'a'], 3, 8),
+            (['B', 'B', 'b', 'b', 'b'], 9, 13)
+        ]
 
-        :output:
 
-        .. code:: python
-
-            [(['A', 'A', 'A', 'A'], 3, 6)]
-
-
-        3. `StreamTokenizer.DROP_TRAILING_SILENCE`: drop all tailing non-valid
-        frames from a token to be delivered if and only if it is not
-        **truncated**. This can be a bit tricky. A token is actually delivered
-        if: - a. `max_continuous_silence` is reached
-
-        :or:
-
-        - b. Its length reaches `max_length`. This is called a **truncated**
-        token
-
-        In the current implementation, a `StreamTokenizer`'s decision is only
-        based on already seen data and on incoming data. Thus, if a token is
-        truncated at a non-valid but tolerated frame (`max_length` is reached
-        but `max_continuous_silence` not yet) any tailing silence will be kept
-        because it can potentially be part of valid token (if `max_length` was
-        bigger). But if `max_continuous_silence` is reached before
-        `max_length`, the delivered token will not be considered as truncated
-        but a result of *normal* end of detection (i.e. no more valid data).
-        In that case the tariling silence can be removed if you use the
-        `StreamTokenizer.DROP_TRAILING_SILENCE` mode.
-
-        :Example:
-
-        .. code:: python
-
-             tokenizer = StreamTokenizer(
-                                validator=UpperCaseChecker(),
-                                min_length=3,
-                                max_length=6,
-                                max_continuous_silence=3,
-                                mode=StreamTokenizer.DROP_TRAILING_SILENCE
-                                )
-
-             dsource = StringDataSource("aaaAAAaaaBBbbbb")
-             tokenizer.tokenize(dsource)
-
-        :output:
-
-        .. code:: python
-
-            [(['A', 'A', 'A', 'a', 'a', 'a'], 3, 8), (['B', 'B'], 9, 10)]
-
-        The first token is delivered with its tailing silence because it is
-        truncated while the second one has its tailing frames removed.
-
-        Without `StreamTokenizer.DROP_TRAILING_SILENCE` the output would be:
-
-        .. code:: python
-
-            [
-                (['A', 'A', 'A', 'a', 'a', 'a'], 3, 8),
-                (['B', 'B', 'b', 'b', 'b'], 9, 13)
-            ]
-
-
-        4. `(StreamTokenizer.STRICT_MIN_LENGTH |
-             StreamTokenizer.DROP_TRAILING_SILENCE)`:
-        use both options. That means: first remove tailing silence, then ckeck
-        if the token still has at least a length of `min_length`.
+    4. `(StreamTokenizer.STRICT_MIN_LENGTH |
+            StreamTokenizer.DROP_TRAILING_SILENCE)`:
+    use both options. That means: first remove trailing silence, then check
+    whether the token still has a length of at least `min_length`.
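+
+    As a sketch, the previous example could be run with both modes combined
+    (the expected behavior below is derived from the rules stated above,
+    not from an actual run):
+
+    .. code:: python
+
+        tokenizer = StreamTokenizer(
+                        validator=UpperCaseChecker(),
+                        min_length=3,
+                        max_length=6,
+                        max_continuous_silence=3,
+                        mode=(StreamTokenizer.STRICT_MIN_LENGTH
+                              | StreamTokenizer.DROP_TRAILING_SILENCE)
+                        )
+
+        dsource = StringDataSource("aaaAAAaaaBBbbbb")
+        tokenizer.tokenize(dsource)
+
+    The first token is truncated (it reaches `max_length`), so its trailing
+    silence is kept. The second token has its trailing frames dropped,
+    leaving only 'BB', which is shorter than `min_length` and should
+    therefore be rejected.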
     """
 
     SILENCE = 0