
--- a/README.md	Wed Dec 02 11:16:27 2015 +0100
+++ b/README.md	Wed Dec 02 23:17:35 2015 +0100
@@ -4,6 +4,8 @@

 `auditok` is an **Audio Activity Detection** tool that can process online data (read from an audio device or from standard input) as well as audio files. It can be used as a command line program and offers an easy to use API.

+A more detailed version of this user guide, as well as an API tutorial and API reference, can be found at [Read the Docs](http://auditok.readthedocs.org/en/latest/).
+
 - [Two-figure explanation](https://github.com/amsehili/auditok#two-figure-explanation)
 - [Requirements](https://github.com/amsehili/auditok#requirements)
 - [Installation](https://github.com/amsehili/auditok#installation)
@@ -33,6 +35,8 @@
 2. the detector splits an audio activity event into many activities if the within activity silence is over 0.2 second:
 ![](doc/figures/figure_2.png)

+Beyond plotting the signal and detections, you can play back audio activities as they are detected, save them, or run a user command each time an activity is detected,
+optionally passing the file name of the audio activity as an argument to the command.

 Requirements
 ------------
@@ -58,18 +62,18 @@

     auditok

-This will print **id** **start time** and **end time** for each detected activity. If you don't have `PyAudio`, you can use `sox` for data acquisition (`sudo apt-get install sox`) and tell `auditok` to read data from standard input:
+This will print `id`, `start-time` and `end-time` for each detected activity. If you don't have `PyAudio`, you can use `sox` for data acquisition (`sudo apt-get install sox`) and tell `auditok` to read data from standard input:

     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i - -r 16000 -w 2 -c 1

 Note that when data is read from standard input the same audio parameters must be used for both `sox` (or any other data generation/acquisition tool) and `auditok`. The following table summarizes audio parameters.

-| Audio parameter | sox	option | auditok option | `auditok` default     |
-| --------------- |------------|----------------|-----------------------|
-| Sampling rate   |     -r     |       -r       |      16000            |
-| Sample width    |  -b (bits) |     -w (bytes) |      2                |
-| Channels        |  -c        |     -c         |      1                |
-| Encoding        |  -e        |     None       | always signed integer |
+| Audio parameter | sox	option | `auditok` option | `auditok` default     |
+| --------------- |------------|------------------|-----------------------|
+| Sampling rate   |     -r     |       -r         |      16000            |
+| Sample width    |  -b (bits) |     -w (bytes)   |      2                |
+| Channels        |  -c        |     -c           |      1                |
+| Encoding        |  -e        |     None         | always signed integer |

 According to this table, the previous command can be run as:

@@ -79,7 +83,7 @@

     auditok -E

-OR
+**or**

     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i - -E

@@ -99,7 +103,7 @@

     auditok -E -e 55

-OR
+**or**

     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i - -e 55 -C "play -q -t raw -r 16000 -c 1 -b 16 -e signed $"

@@ -107,14 +111,14 @@

 ### Set format for printed detections information

-By default, `auditok` prints the `id` `start time` `end time` of each detected activity:
+By default, `auditok` prints the `id`, `start-time` and `end-time` of each detected activity:

     1 1.87 2.67
     2 3.05 3.73
     3 3.97 4.49
     ...

-If you want to personalize the output format, use `--printf` option:
+If you want to customize the output format, use the `--printf` option:

     auditok -e 55 --printf "[{id}]: {start} to {end}"

@@ -204,7 +208,7 @@

     auditok -o det_{N}_{start}_{end}.wav ...

-You can use a free text and place `{N}`, `{start}` and `{end}` wherever you want, they will be replaced by detection number, start time and end time respectively. Another example:
+You can use free text and place `{N}`, `{start}` and `{end}` wherever you want; they will be replaced by the detection number, `start-time` and `end-time` respectively. Another example:

     auditok -o {start}-{end}.wav ...
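Reviewer note on the `-o` template described in the README hunk above: a minimal sketch of how placeholders such as `{N}`, `{start}` and `{end}` could be expanded with Python's `str.format`. The helper name `detection_filename` and the two-decimal rendering of times are assumptions made for illustration; they are not taken from the `auditok` sources.

```python
def detection_filename(template, n, start, end):
    """Expand {N}, {start} and {end} in a file name template.

    Hypothetical helper: times are rendered with two decimals here,
    which is an assumption and not necessarily what auditok does.
    """
    return template.format(
        N=n,
        start="{:.2f}".format(start),
        end="{:.2f}".format(end),
    )


first = detection_filename("det_{N}_{start}_{end}.wav", 1, 1.87, 2.67)
second = detection_filename("{start}-{end}.wav", 2, 3.05, 3.73)
print(first)
print(second)
```

Because the template is free text, a placeholder that is not used (such as `{N}` in the second call) is simply ignored by `str.format`.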
--- a/auditok/cmdline.py	Wed Dec 02 11:16:27 2015 +0100
+++ b/auditok/cmdline.py	Wed Dec 02 23:17:35 2015 +0100
@@ -14,7 +14,7 @@
 @license:    GPL v3

 @contact:    amine.sehili@gmail.com
-@deffield    updated: 28 nov 2015
+@deffield    updated: 02 Dec 2015
 '''

 import sys
@@ -52,7 +52,7 @@
 __all__ = []
 __version__ = version
 __date__ = '2015-11-23'
-__updated__ = '2015-11-28'
+__updated__ = '2015-12-02'

 DEBUG = 0
 TESTRUN = 1
@@ -550,14 +550,14 @@
         # setup option parser
         parser = OptionParser(version=program_version_string, epilog=program_longdesc, description=program_license)

-        group = OptionGroup(parser, "[Input-Outpu options]")
+        group = OptionGroup(parser, "[Input-Output options]")
         group.add_option("-i", "--input", dest="input", help="Input audio or video file. Use - for stdin [default: read from microphone using pyaudio]", metavar="FILE")
         group.add_option("-t", "--input-type", dest="input_type", help="Input audio file type. Mandatory if file name has no extension [default: %default]", type=str, default=None, metavar="String")
         group.add_option("-M", "--max_time", dest="max_time", help="Max data (in seconds) to read from microphone/file [default: read until the end of file/stream]", type=float, default=None, metavar="FLOAT")
         group.add_option("-O", "--output-main", dest="output_main", help="Save main stream as. If omitted main stream will not be saved [default: omitted]", type=str, default=None, metavar="FILE")
         group.add_option("-o", "--output-tokens", dest="output_tokens", help="Output file name format for detections. Use {N} and {start} and {end} to build file names, example: 'Det_{N}_{start}-{end}.wav'", type=str, default=None, metavar="STRING")
         group.add_option("-T", "--output-type", dest="output_type", help="Audio type used to save detections and/or main stream. If not supplied will: (1). guess from extension or (2). use wav format", type=str, default=None, metavar="STRING")
-        group.add_option("-u", "--use-channel", dest="use_channel", help="Choose channel to use from a multi-channel audio file (requires pydub). 'left', 'right' and 'mix' are accepted values. [Default: 1 (i.e. 1st/left channel)]", type=str, default="1", metavar="STRING")
+        group.add_option("-u", "--use-channel", dest="use_channel", help="Choose channel to use from a multi-channel audio file (requires pydub). 'left', 'right' and 'mix' are accepted values. [Default: 1 (i.e. 1st or left channel)]", type=str, default="1", metavar="STRING")
         parser.add_option_group(group)


@@ -577,16 +577,19 @@
         group.add_option("-w", "--width", dest="sample_width", help="Number of bytes per audio sample [default: %default]", type=int, default=2, metavar="INT")
         parser.add_option_group(group)

-        parser.add_option("-C", "--command", dest="command", help="Command to call when an audio detection occurs. Use $ to represent the file name to use with the command", default=None, type=str, metavar="STRING")
-        parser.add_option("-E", "--echo", dest="echo", help="Play back each detection immediately using pyaudio [default: do not play]",  action="store_true", default=False)
-        parser.add_option("-p", "--plot", dest="plot", help="Plot and show audio signal and detections (requires matplotlib)",  action="store_true", default=False)
-        parser.add_option("", "--save-image", dest="save_image", help="Save plotted audio signal and detections as a picture or a PDF file (requires matplotlib)",  type=str, default=None, metavar="FILE")
+        group = OptionGroup(parser, "[Do something with detections]", "Use these options to print, play or plot detections.")
+        group.add_option("-C", "--command", dest="command", help="Command to call when an audio detection occurs. Use $ to represent the file name to use with the command (e.g. -C 'du -h $')", default=None, type=str, metavar="STRING")
+        group.add_option("-E", "--echo", dest="echo", help="Play back each detection immediately using pyaudio [default: do not play]",  action="store_true", default=False)
+        group.add_option("-p", "--plot", dest="plot", help="Plot and show audio signal and detections (requires matplotlib)",  action="store_true", default=False)
+        group.add_option("", "--save-image", dest="save_image", help="Save plotted audio signal and detections as a picture or a PDF file (requires matplotlib)",  type=str, default=None, metavar="FILE")
+        group.add_option("", "--printf", dest="printf", help="print detections one per line using a user supplied format (e.g. '[{id}]: {start} -- {end}'). Available keywords {id}, {start} and {end}",  type=str, default="{id} {start} {end}", metavar="STRING")
+        group.add_option("", "--time-format", dest="time_format", help="format used to print {start} and {end}. [Default= %default]. %S: absolute time in sec. %I: absolute time in ms. If at least one of (%h, %m, %s, %i) is used, convert time into hours, minutes, seconds and millis (e.g. %h:%m:%s.%i). Only required fields are printed",  type=str, default="%S", metavar="STRING")
+        parser.add_option_group(group)
+
+        parser.add_option("-q", "--quiet", dest="quiet", help="Do not print any information about detections [default: print 'id', 'start' and 'end' of each detection]",  action="store_true", default=False)
         parser.add_option("-D", "--debug", dest="debug", help="Print processing operations to STDOUT",  action="store_true", default=False)
         parser.add_option("", "--debug-file", dest="debug_file", help="Print processing operations to FILE",  type=str, default=None, metavar="FILE")

-        parser.add_option("-q", "--quiet", dest="quiet", help="Do not print any information about detections [default: print start an end of each detection]",  action="store_true", default=False)
-        parser.add_option("", "--printf", dest="printf", help="print detections one per line using a user supplied format (e.g. '[{id}]: {start} -- {end}'). Available keywords {id}, {start} and {end}",  type=str, default="{id} {start} {end}", metavar="STRING")
-        parser.add_option("", "--time-format", dest="time_format", help="format used to print {start} and {end}. [Default= %default]. %S: absolute time in sec. %I: absolute time in ms. If at least one of (%h, %m, %s, %i) is used, convert time into hours, minutes, seconds and millis (e.g. %h:%m:%s.%i). Only required fields are printed",  type=str, default="%S", metavar="STRING")


         # process options
@@ -611,8 +614,9 @@
                 sys.stderr.write("You should either install pyaudio or read data from STDIN\n")
                 sys.exit(2)

-
         logger = logging.getLogger(LOGGER_NAME)
+        logger.setLevel(logging.DEBUG)
+
         handler = logging.StreamHandler(sys.stdout)
         if opts.quiet or not opts.debug:
             # only critical messages will be printed
@@ -630,7 +634,7 @@
             handler.setFormatter(fmt)
             handler.setLevel(logging.DEBUG)
             logger.addHandler(handler)
-
+
         record = opts.output_main is not None or opts.plot

         ads = ADSFactory.ads(audio_source = asource, block_dur = opts.analysis_window, max_time = opts.max_time, record = record)
@@ -763,7 +767,6 @@

         return 2

-
 if __name__ == "__main__":
     if DEBUG:
         sys.argv.append("-h")
@@ -781,4 +784,4 @@
         stats.print_stats()
         statsfile.close()
         sys.exit(0)
-    sys.exit(main())
+    sys.exit(main())
\ No newline at end of file
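Reviewer note on the option regrouping in `cmdline.py` above: a small, self-contained sketch of an `optparse` option group carrying `--printf` and `--time-format`, followed by one way their values could be applied to a detection. Only a subset of the options is reproduced. The helper `format_time`, its zero padding and the two-decimal rendering for `%S` are assumptions for illustration, not the program's actual code.

```python
from optparse import OptionGroup, OptionParser


def build_parser():
    """Build a parser with a subset of the options grouped in this changeset."""
    parser = OptionParser()
    group = OptionGroup(parser, "[Do something with detections]",
                        "Use these options to print, play or plot detections.")
    group.add_option("-E", "--echo", dest="echo", action="store_true",
                     default=False, help="Play back each detection immediately")
    group.add_option("--printf", dest="printf", type="string",
                     default="{id} {start} {end}", metavar="STRING",
                     help="Format used to print each detection")
    group.add_option("--time-format", dest="time_format", type="string",
                     default="%S", metavar="STRING",
                     help="Format used to print {start} and {end}")
    parser.add_option_group(group)
    parser.add_option("-q", "--quiet", dest="quiet", action="store_true",
                      default=False, help="Do not print detections")
    return parser


def format_time(seconds, time_format="%S"):
    """Render a time given in seconds (assumed semantics of --time-format)."""
    total_ms = int(round(seconds * 1000))
    if not any(d in time_format for d in ("%h", "%m", "%s", "%i")):
        # Absolute time only: %S is seconds, %I is milliseconds
        text = time_format.replace("%S", "{:.2f}".format(seconds))
        return text.replace("%I", str(total_ms))
    hours, rest = divmod(total_ms, 3600 * 1000)
    minutes, rest = divmod(rest, 60 * 1000)
    secs, millis = divmod(rest, 1000)
    text = time_format.replace("%h", "{:02d}".format(hours))
    text = text.replace("%m", "{:02d}".format(minutes))
    text = text.replace("%s", "{:02d}".format(secs))
    return text.replace("%i", "{:03d}".format(millis))


opts, _ = build_parser().parse_args(
    ["--printf", "[{id}]: {start} to {end}", "--time-format", "%h:%m:%s.%i"])
line = opts.printf.format(id=1,
                          start=format_time(1.87, opts.time_format),
                          end=format_time(2.67, opts.time_format))
print(line)
```

Keeping `-q` outside the group mirrors the changeset, where `--quiet` and the debug options stay on the top-level parser.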
--- a/auditok/core.py	Wed Dec 02 11:16:27 2015 +0100
+++ b/auditok/core.py	Wed Dec 02 23:17:35 2015 +0100
@@ -54,7 +54,7 @@
         `mode` : *(int, default=0)*
             `mode` can be:

-        #. `StreamTokenizer.STRICT_MIN_LENGTH`:
+        1. `StreamTokenizer.STRICT_MIN_LENGTH`:
         if token *i* is delivered because `max_length`
         is reached, and token *i+1* is immediately adjacent to
         token *i* (i.e. token *i* ends at frame *k* and token *i+1* starts
@@ -112,7 +112,7 @@
             [(['A', 'A', 'A', 'A'], 3, 6)]


-        #. `StreamTokenizer.DROP_TRAILING_SILENCE`: drop all tailing non-valid frames
+        2. `StreamTokenizer.DROP_TRAILING_SILENCE`: drop all trailing non-valid frames
         from a token to be delivered if and only if it is not **truncated**.
         This can be a bit tricky. A token is actually delivered if:

@@ -187,16 +187,13 @@
             raise ValueError("'max_length' must be > 0 (value={0})".format(max_length))

         if min_length <= 0 or min_length > max_length:
-            raise ValueError("'min_length' must be > 0 and <= 'max_length' \
-            (value={0})".format(min_length))
+            raise ValueError("'min_length' must be > 0 and <= 'max_length' (value={0})".format(min_length))

         if max_continuous_silence >= max_length:
-            raise ValueError("'max_continuous_silence' must be < \
-            'max_length' (value={0})".format(max_continuous_silence))
+            raise ValueError("'max_continuous_silence' must be < 'max_length' (value={0})".format(max_continuous_silence))

         if init_min >= max_length:
-            raise ValueError("'init_min' must be < \
-            'max_length' (value={0})".format(max_continuous_silence))
+            raise ValueError("'init_min' must be < 'max_length' (value={0})".format(init_min))

         self.validator = validator
         self.min_length = min_length
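Reviewer note on the validation block in `core.py` above: a standalone sketch of the same checks, written so that each message reports the value of the parameter that actually failed. The function name `check_tokenizer_lengths`, the default `init_min=0` and the exact `max_length <= 0` condition (whose `if` line is not visible in the hunk) are assumptions.

```python
def check_tokenizer_lengths(min_length, max_length, max_continuous_silence, init_min=0):
    """Validate length parameters (all in frames); raise ValueError on the first violation."""
    if max_length <= 0:  # assumed condition, inferred from the error message
        raise ValueError("'max_length' must be > 0 (value={0})".format(max_length))

    if min_length <= 0 or min_length > max_length:
        raise ValueError("'min_length' must be > 0 and <= 'max_length' (value={0})".format(min_length))

    if max_continuous_silence >= max_length:
        raise ValueError("'max_continuous_silence' must be < 'max_length' (value={0})".format(max_continuous_silence))

    if init_min >= max_length:
        raise ValueError("'init_min' must be < 'max_length' (value={0})".format(init_min))


# A valid combination passes silently
check_tokenizer_lengths(min_length=20, max_length=200, max_continuous_silence=30)

# An invalid init_min is reported with its own value
try:
    check_tokenizer_lengths(min_length=20, max_length=200,
                            max_continuous_silence=30, init_min=250)
except ValueError as err:
    message = str(err)
    print(message)
```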
--- a/doc/apitutorial.rst	Wed Dec 02 11:16:27 2015 +0100
+++ b/doc/apitutorial.rst	Wed Dec 02 23:17:35 2015 +0100
@@ -17,7 +17,7 @@
 criteria in terms of:

 1. Minimum length of a **valid** token (i.e. sub-sequence)
-2. Maximum length of a valid token
+2. Maximum length of a **valid** token
 3. Maximum tolerated consecutive **non-valid** observations within
    a valid token

@@ -142,7 +142,7 @@

 Notice the trailing lower case letters "dd" and "ee" at the end of the two
 tokens. The default behavior of :class:`auditok.core.StreamTokenizer` is to keep the *trailing
-silence* if it doesn't exceed `max_continuous_silence`. This can be changed
+silence* if it does not exceed `max_continuous_silence`. This can be changed
 using the `StreamTokenizer.DROP_TRAILING_SILENCE` mode (see next example).

 Remove trailing silence
@@ -226,25 +226,29 @@
 `auditok` and Audio Data
 ************************

-In this section we will use :class:`auditok.util.ADSFactory`, :class:`auditok.util.AudioEnergyValidator`
-and :class:`auditok.core.StreamTokenizer` for an AAD demonstration using audio data. Before we get any
+In the rest of this document we will use :class:`auditok.util.ADSFactory`, :class:`auditok.util.AudioEnergyValidator`
+and :class:`auditok.core.StreamTokenizer` for Audio Activity Detection demos using audio data. Before we get any
 further it is worth, explaining a certain number of points.

-:func:`auditok.util.ADSFactory.ads` method is called to create an
-:class:`auditok.util.ADSFactory.AudioDataSource` object that can be1
-passed to  :func:`auditok.core.StreamTokenizer.tokenize`. :func:`auditok.util.ADSFactory.ads`
-accepts a number of keyword arguments, of which none is mandatory. The returned
-:class:`auditok.util.ADSFactory.AudioDataSource` object's features and behavior can however greatly differ
-depending on the passed arguments. Further details can be found in the respective
-method documentation. Note however the following two calls that will
-create an :class:`auditok.util.ADSFactory.AudioDataSource` that read data from an
-audio file and from the built-in microphone respectively.
+The :func:`auditok.util.ADSFactory.ads` method is used to create an :class:`auditok.util.ADSFactory.AudioDataSource`
+object from a wave file, the built-in microphone or a user-supplied data buffer. Refer to the API reference
+for more information and examples on :func:`ADSFactory.ads` and :class:`AudioDataSource`.
+
+The created :class:`AudioDataSource` object is then passed to :func:`StreamTokenizer.tokenize` for tokenization.
+
+:func:`auditok.util.ADSFactory.ads` accepts a number of keyword arguments, of which none is mandatory.
+The returned :class:`AudioDataSource` object's features and behavior can however greatly differ
+depending on the passed arguments. Further details can be found in the respective method documentation.
+
+Note however the following two calls that will create an :class:`AudioDataSource`
+that reads data from an audio file and from the built-in microphone respectively.

 .. code:: python

     from auditok import ADSFactory

     # Get an AudioDataSource from a file
+    # use 'filename', alias 'fn' keyword argument
     file_ads = ADSFactory.ads(filename = "path/to/file/")

     # Get an AudioDataSource from the built-in microphone
@@ -253,50 +257,72 @@
     # documentation for customized values
     mic_ads = ADSFactory.ads()

-For `StreamTkenizer`, parameters `min_length`, `max_length` and `max_continuous_silence`
-are expressed in term of number of frames. If you want a `max_length` of *2 seconds* for
-your detected sound events and your *analysis window* is *10 ms* long, you have to specify
-a `max_length` of 200 (`int(2. / (10. / 1000)) == 200`). For a `max_continuous_silence` of *300 ms*
-for instance, the value to pass to StreamTokenizer is 30 (`int(0.3 / (10. / 1000)) == 30`).
+For :class:`StreamTokenizer`, parameters `min_length`, `max_length` and `max_continuous_silence`
+are expressed in terms of number of frames. Each call to :func:`AudioDataSource.read` returns
+one frame of data or None.

+If you want a `max_length` of 2 seconds for your detected sound events and your *analysis window*
+is *10 ms* long, you have to specify a `max_length` of 200 (`int(2. / (10. / 1000)) == 200`).
+For a `max_continuous_silence` of *300 ms* for instance, the value to pass to StreamTokenizer is 30
+(`int(0.3 / (10. / 1000)) == 30`).

-Where do you get the size of the **analysis window** from?
+Each time :class:`StreamTokenizer` calls the :func:`read` method (which takes no argument) of an
+:class:`AudioDataSource` object, the method returns the same amount of data, unless the stream is running out
+of data (in which case it returns what is left in the stream, or None).

+This fixed-length amount of data is referred to here as the **analysis window** and is a parameter of
+the :func:`ADSFactory.ads` method. By default :func:`ADSFactory.ads` uses an analysis window of 10 ms.

-Well this is a parameter you pass to `ADSFactory.ads`. By default `ADSFactory.ads` uses
-an analysis window of 10 ms. the number of samples that 10 ms of signal contain will
-vary depending on the sampling rate of your audio source (file, microphone, etc.).
+The number of samples that 10 ms of audio data contain will vary, depending on the sampling
+rate of your audio source/data (file, microphone, etc.).
 For a sampling rate of 16KHz (16000 samples per second), we have 160 samples for 10 ms.
-Therefore you can use block sizes of 160, 320, 1600 for analysis windows of 10, 20 and 100
-ms respectively.
+
+You can use the `block_size` keyword (alias `bs`) to define your analysis window:

 .. code:: python

     from auditok import ADSFactory

+    '''
+    Assume you have an audio file with a sampling rate of 16000
+    '''
+
+    # file_ads.read() will return blocks of 160 samples
     file_ads = ADSFactory.ads(filename = "path/to/file/", block_size = 160)

-    file_ads = ADSFactory.ads(filename = "path/to/file/", block_size = 320)
+    # file_ads.read() will return blocks of 320 samples
+    file_ads = ADSFactory.ads(filename = "path/to/file/", bs = 320)

-    # If no sampling rate is specified, ADSFactory use 16KHz as the default
-    # rate for the microphone. If you want to use a window of 100 ms, use
-    # a block size of 1600
-    mic_ads = ADSFactory.ads(block_size = 1600)
-
-So if your not sure what you analysis windows in seconds is, use the following:
+
+Fortunately, you can specify the size of your analysis window in seconds, thanks to keyword `block_dur`
+(alias `bd`):

 .. code:: python

-    my_ads = ADSFactory.ads(...)
-    analysis_win_seconds = float(my_ads.get_block_size()) / my_ads.get_sampling_rate()
+    from auditok import ADSFactory
+    # use an analysis window of 20 ms
+    file_ads = ADSFactory.ads(filename = "path/to/file/", bd = 0.02)
+
+For :class:`StreamTokenizer`, each :func:`read` call that does not return `None` is treated as a processing
+frame. :class:`StreamTokenizer` has no way to figure out the temporal length of that frame (why should it?). So to
+correctly initialize your :class:`StreamTokenizer`, based on your analysis window duration, use something like:
+
+
+.. code:: python
+
+    analysis_win_seconds = 0.01 # 10 ms
+    my_ads = ADSFactory.ads(block_dur = analysis_win_seconds)
     analysis_window_ms = analysis_win_seconds * 1000

-    # For a `max_continuous_silence` of 300 ms use:
+    # If you want your maximum continuous silence to be 300 ms use:
     max_continuous_silence = int(300. / analysis_window_ms)

-    # Which is the same as
+    # which is the same as:
     max_continuous_silence = int(0.3 / (analysis_window_ms / 1000))

+    # or simply:
+    max_continuous_silence = 30
+

 ******************************
 Examples using real audio data
@@ -411,7 +437,7 @@
 large analysis window (here of 100 ms) to ensure that the brief noise be surrounded by a much
 longer silence and hence the energy of the overall analysis window will be below 50.

-When using a shorter analysis window (of 10ms for instance, block_size == 441), the brief
+When using a shorter analysis window (of 10 ms for instance, block_size == 441), the brief
 noise contributes more to energy calculation which yields an energy of over 50 for the window.
 Again we can deal with this situation by using a higher energy threshold (55 for example).

@@ -444,7 +470,6 @@
     # Note the DROP_TRAILING_SILENCE mode that will ensure removing trailing silence
     trimmer = StreamTokenizer(validator, min_length = 20, max_length=99999999, init_min=3, init_max_silence=1, max_continuous_silence=9999999, mode=StreamTokenizer.DROP_TRAILING_SILENCE)

-
     tokens = trimmer.tokenize(asource)

     # Make sure we only have one token
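Reviewer note on the frame arithmetic in the tutorial hunks above: a minimal sketch, in plain Python without `auditok`, of the two conversions the text describes (analysis window duration to block size in samples, and a duration in seconds to a number of frames). The helper names `block_size` and `frames` are hypothetical. `round` is used rather than a bare `int(...)` so that a quotient that lands just below a whole number because of floating-point representation is not truncated.

```python
def block_size(sampling_rate, window_s):
    """Number of samples in one analysis window of `window_s` seconds."""
    return int(round(sampling_rate * window_s))


def frames(duration_s, window_s):
    """Number of analysis windows (frames) that cover `duration_s` seconds."""
    return int(round(duration_s / window_s))


analysis_win_seconds = 0.01  # 10 ms, the default mentioned in the tutorial

# Samples per window at 16 kHz and at 44.1 kHz
print(block_size(16000, analysis_win_seconds))
print(block_size(44100, analysis_win_seconds))

# StreamTokenizer parameters expressed in frames
max_length = frames(2.0, analysis_win_seconds)              # 2 seconds
max_continuous_silence = frames(0.3, analysis_win_seconds)  # 300 ms
print(max_length, max_continuous_silence)
```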
--- a/doc/cmdline.rst	Wed Dec 02 11:16:27 2015 +0100
+++ b/doc/cmdline.rst	Wed Dec 02 23:17:35 2015 +0100
@@ -3,7 +3,6 @@

 This user guide will go through a few of the most useful operations you can use **auditok** for and present two practical use cases.

-
 .. contents:: `Contents`
    :depth: 3

@@ -12,7 +11,8 @@
 Two-figure explanation
 **********************

-The following two figures illustrate an audio signal (blue) and regions detected as valid audio activities (green rectangles) according to a given threshold (red dashed line). They respectively depict the detection result when:
+The following two figures illustrate an audio signal (blue) and regions detected as valid audio activities (green rectangles) according to
+a given threshold (red dashed line). They respectively depict the detection result when:

 1. the detector tolerates phases of silence of up to 0.3 second (300 ms) within an audio activity (also referred to as acoustic event):

@@ -30,6 +30,8 @@
     :figclass: align-center
     :scale: 40 %

+Beyond plotting the signal and detections, you can play back audio activities as they are detected, save them, or run a user command each time an activity is detected,
+optionally passing the file name of the audio activity as an argument to the command.

 ******************
 Command line usage
@@ -38,30 +40,30 @@
 Try the detector with your voice
 ################################

-The first thing you want to check is perhaps how `auditok` detects your voice. If you have installed `PyAudio` just run (`Ctrl-C` to stop):
+The first thing you want to check is perhaps how **auditok** detects your voice. If you have installed `PyAudio` just run (`Ctrl-C` to stop):

 .. code:: bash

     auditok

-This will print **id** **start time** and **end time** for each detected activity. If you don't have `PyAudio`, you can use `sox` for data acquisition (`sudo apt-get install sox`) and tell `auditok` to read data from standard input:
+This will print the **id**, **start-time** and **end-time** of each detected activity. If you don't have `PyAudio`, you can use `sox` for data acquisition (`sudo apt-get install sox`) and tell **auditok** to read data from standard input:

     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i - -r 16000 -w 2 -c 1

-Note that when data is read from standard input the same audio parameters must be used for both `sox` (or any other data generation/acquisition tool) and `auditok`. The following table summarizes audio parameters.
+Note that when data is read from standard input the same audio parameters must be used for both `sox` (or any other data generation/acquisition tool) and **auditok**. The following table summarizes audio parameters.


-+-----------------+------------+----------------+-----------------------+
-| Audio parameter | sox option | auditok option | `auditok` default     |
-+=================+============+================+=======================+
-| Sampling rate   |     -r     |       -r       |      16000            |
-+-----------------+------------+----------------+-----------------------+
-| Sample width    |  -b (bits) |     -w (bytes) |      2                |
-+-----------------+------------+----------------+-----------------------+
-| Channels        |  -c        |     -c         |      1                |
-+-----------------+------------+----------------+-----------------------+
-| Encoding        |  -e        |     None       | always signed integer |
-+-----------------+------------+----------------+-----------------------+
++-----------------+------------+------------------+-----------------------+
+| Audio parameter | sox option | `auditok` option | `auditok` default     |
++=================+============+==================+=======================+
+| Sampling rate   |     -r     |       -r         |      16000            |
++-----------------+------------+------------------+-----------------------+
+| Sample width    |  -b (bits) |     -w (bytes)   |      2                |
++-----------------+------------+------------------+-----------------------+
+| Channels        |  -c        |     -c           |      1                |
++-----------------+------------+------------------+-----------------------+
+| Encoding        |  -e        |     None         | always signed integer |
++-----------------+------------+------------------+-----------------------+

 According to this table, the previous command can be run as:

@@ -76,19 +78,19 @@

     auditok -E

-OR
+:or:

 .. code:: bash

     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i - -E

-Option `-E` stands for echo, so `auditok` plays back whatever it detects. Using `-E` requires `PyAudio`, if you don't have `PyAudio` and want to play detections with sox, use the `-C` option:
+Option `-E` stands for echo, so **auditok** will play back whatever it detects. Using `-E` requires `PyAudio`; if you don't have `PyAudio` and want to play detections with sox, use the `-C` option:

 .. code:: bash

     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i - -C "play -q -t raw -r 16000 -c 1 -b 16 -e signed $"

-The `-C` option tells `auditok` to interpret its content as a command that should be run whenever `auditok` detects an audio activity, replacing the `$` by a name of a temporary file into which the activity is saved as raw audio. Here we use `play` to play the activity, giving the necessary `play` arguments for raw data.
+The `-C` option tells **auditok** to interpret its content as a command that should be run whenever **auditok** detects an audio activity, replacing the `$` by a name of a temporary file into which the activity is saved as raw audio. Here we use `play` to play the activity, giving the necessary `play` arguments for raw data.

 `rec` and `play` are just an alias for `sox`.

@@ -103,7 +105,7 @@

     auditok -E -e 55

-OR
+:or:

 .. code:: bash

@@ -114,7 +116,7 @@
 Set format for printed detections information
 #############################################

-By default, `auditok` prints the `id` `start time` `end time` of each detected activity:
+By default, **auditok** prints the **id**, **start-time** and **end-time** of each detected activity:

 .. code:: bash

@@ -123,11 +125,13 @@
     3 3.97 4.49
     ...

-If you want to personalize the output format, use `--printf` option:
+If you want to customize the output format, use the `--printf` option:
+
+.. code:: bash

     auditok -e 55 --printf "[{id}]: {start} to {end}"

-Output:
+:output:

 .. code:: bash

@@ -141,7 +145,7 @@

     auditok -e 55 --printf "[{id}]: {start} to {end}" --time-format "%h:%m:%s.%i"

-Output:
+:output:

 .. code:: bash

@@ -163,7 +167,7 @@

     auditok -e 55 -i input.wav -m 10 --printf "{id}\n{start} --> {end}\nPut some text here...\n" --time-format "%h:%m:%s.%i"

-Output:
+:output:

 .. code:: bash

@@ -204,20 +208,20 @@

     sox -t raw -r 16000 -c 1 -b 16 -e signed raw_input output.flac

-2- Send falc audio to google and get its filtred transcription using `speech-rec.sh <https://github.com/amsehili/gspeech-rec/blob/master/speech-rec.sh>`_ :
+2- Send flac audio data to Google and get its filtered transcription using `speech-rec.sh <https://github.com/amsehili/gspeech-rec/blob/master/speech-rec.sh>`_ :

 .. code:: bash

     speech-rec.sh -i output.flac -r 16000

-3- Use **grep** to select lines that coantain *transcript*:
+3- Use **grep** to select lines that contain *transcript*:

 .. code:: bash

     grep transcript


-4- Launch the followin script, giving it the transcription as input:
+4- Launch the following script, giving it the transcription as input:

 .. code:: bash

@@ -236,18 +240,21 @@

     exit 0

-As you can see, the script can handle one single voice command. It runs firefox if the text it receives contains **run firefox**.
+As you can see, the script can handle one single voice command. It runs firefox if the text it receives contains **open firefox**.
 Save a script into a file named voice-control.sh (don't forget to run a **chmod u+x voice-control.sh**).

-Now, thanks to option `-C`, we will use the three instructions with a pipe and tell auditok to run them for every time it detects
+Now, thanks to option `-C`, we will use the four instructions with a pipe and tell **auditok** to run them each time it detects
 an audio activity. Try the following command and say *open firefox*:


 .. code:: bash

-    rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -M 5 -m 3 -n 1 --debug-file log -e 60 -C "sox -t raw -r 16000 -c 1 -b 16 -e signed $ audio.flac ; speech-rec.sh -i audio.flac -r 16000 | grep transcript | ./voice-control.sh"
+    rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -M 5 -m 3 -n 1 --debug-file file.log -e 60 -C "sox -t raw -r 16000 -c 1 -b 16 -e signed $ audio.flac ; speech-rec.sh -i audio.flac -r 16000 | grep transcript | ./voice-control.sh"

+Here we used option `-M 5` to limit the amount of audio data read to 5 seconds (**auditok** stops if there are no more data) and
+option `-n 1` to tell **auditok** to only accept tokens of 1 second or more and discard any shorter token.

+With `--debug-file file.log`, all processing steps are written into file.log with their timestamps, including any command that was run and the file name it was given.


 Plot signal and detections
@@ -287,7 +294,7 @@

     auditok -M 12 ...

-Time is in seconds.
+Time is in seconds. This is valid for data read from an audio device, stdin or an audio file.


 Save the whole acquired audio signal
@@ -337,11 +344,79 @@
 +--------+-------------------------------------------------------+---------+------------------+


+Normally, `auditok` keeps the trailing silence of a detected activity. Trailing silence is at most as long as the maximum length of a continuous silence (option `-s`) and can be important for some applications such as speech recognition. If you want to drop trailing silence anyway, use option `-d`. The following two figures show the output of the detector when it keeps the trailing silence and when it drops it, respectively:
+
+
+.. figure:: figures/figure_3_keep_trailing_silence.png
+    :align: center
+    :alt: Output from a detector that keeps trailing silence
+    :figclass: align-center
+    :scale: 40 %
+
+
+.. code:: bash
+
+    auditok ...  -d
+
+
+.. figure:: figures/figure_4_drop_trailing_silence.png
+    :align: center
+    :alt: Output from a detector that drops trailing silence
+    :figclass: align-center
+    :scale: 40 %
+
+You might want to consider audio activities only if they exceed a certain duration. The next figure is the result of a detector that only accepts detections of 0.8 second or longer:
+
+.. code:: bash
+
+    auditok ...  -n 0.8
+
+
+.. figure:: figures/figure_5_min_800ms.png
+    :align: center
+    :alt: Output from a detector that detects activities of 800 ms or over
+    :figclass: align-center
+    :scale: 40 %
+
+
+Finally, it is almost always useful to limit the length of detected audio activities. In any case, one does not want an overly long audio event, such as an alarm or a drill, to hog the detector. For illustration purposes, we set the maximum duration to 0.4 second for this detector, so an audio activity is delivered as soon as it reaches 0.4 second:
+
+.. code:: bash
+
+    auditok ...  -m 0.4
+
+
+.. figure:: figures/figure_6_max_400ms.png
+    :align: center
+    :alt: Output from a detector that delivers audio activities that reach 400 ms
+    :figclass: align-center
+    :scale: 40 %
+
+
+Debugging
+#########
+
+If you want to print what happens when something is detected, use option `-D`.
+
+.. code:: bash
+
+    auditok ...  -D
+
+
+If you want to save everything into a log file, use `--debug-file file.log`.
+
+.. code:: bash
+
+    auditok ...  --debug-file file.log
+
+
+
+
 *******
 License
 *******

-`auditok` is published under the GNU General Public License Version 3.
+**auditok** is published under the GNU General Public License Version 3.

 ******
 Author
--- a/doc/conf.py	Wed Dec 02 11:16:27 2015 +0100
+++ b/doc/conf.py	Wed Dec 02 23:17:35 2015 +0100
@@ -119,11 +119,13 @@
 import os
 on_rtd = os.environ.get('READTHEDOCS', None) == 'True'

+
 if not on_rtd:  # only import and set the theme if we're building docs locally
     import sphinx_rtd_theme
     html_theme = 'sphinx_rtd_theme'
     html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]

+
 # Theme options are theme-specific and customize the look and feel of a theme
 # further.  For a list of options available for each theme, see the
 # documentation.
Binary file doc/figures/figure_3_keep_trailing_silence.png has changed
Binary file doc/figures/figure_4_drop_trailing_silence.png has changed
Binary file doc/figures/figure_5_min_800ms.png has changed
Binary file doc/figures/figure_6_max_400ms.png has changed
--- a/doc/index.rst	Wed Dec 02 11:16:27 2015 +0100
+++ b/doc/index.rst	Wed Dec 02 23:17:35 2015 +0100
@@ -6,6 +6,7 @@

 **auditok** is an **Audio Activity Detection** tool that can process online data (read from an audio device or from standard input) as well as audio files. It can be used as a command line program and offers an easy to use API.

+The latest version of this documentation can be found at `Readthedocs <http://auditok.readthedocs.org/en/latest/>`_.

 Requirements
 ------------
@@ -34,18 +35,28 @@
     sudo python setup.py install


+
 Getting started
 ---------------

 .. toctree::
     :titlesonly:
-    :maxdepth: 3
+    :maxdepth: 2

        Command-line Usage Guide <cmdline.rst>
        API Tutorial <apitutorial.rst>
-       API Reference <apireference.rst>


+API Reference
+-------------
+
+.. toctree::
+    :maxdepth: 3
+
+       auditok.core <core.rst>
+       auditok.util <util.rst>
+       auditok.io <io.rst>
+       auditok.dataset <dataset.rst>


 Indices and tables
--- a/setup.py	Wed Dec 02 11:16:27 2015 +0100
+++ b/setup.py	Wed Dec 02 23:17:35 2015 +0100
@@ -32,10 +32,6 @@
     include_package_data=True,
     package_data={'auditok': ['data/*']},

-    #data_files=[(['README.md', 'quickstart.rst', 'LICENSE', 'INSTALL', 'CHANGELOG']),
-    #            ('share/doc/pdoc', ['doc/pdoc/index.html']),
-    #           ],
-
     zip_safe=False,
     platforms='ANY',
     provides=['auditok'],