# HG changeset patch
# User Amine Sehili
# Date 1443001826 -7200
# Node ID 6b2cc3ca5b6a64549142e49619664a77fa8c9ca6
# Parent 61aa25f0eb317c40fe011308ddf74de7224dbd69
doc update

diff -r 61aa25f0eb31 -r 6b2cc3ca5b6a auditok/__init__.py
--- a/auditok/__init__.py	Wed Sep 23 11:28:43 2015 +0200
+++ b/auditok/__init__.py	Wed Sep 23 11:50:26 2015 +0200
@@ -1,5 +1,4 @@
 """
-
 `auditok` is a module that can be used as a generic tool for data
 tokenization. Although its core motivation is **Acoustic Activity
 Detection** (AAD) and extraction from audio streams (i.e. detect
@@ -51,7 +50,7 @@
  - Limit the amount (i.e. duration) of read data (very useful when reading
  data from the microphone)
  - Record and rewind data (also useful if you read data from the microphone
- and you want to process it many times offline and/or save it)
+ and you want to process it many times off-line and/or save it)
 
 
 Last but not least, the current version has only one audio window validator based on
@@ -75,7 +74,7 @@
 ## Extract sub-sequences of consecutive upper case letters
 
 We want to extract sub-sequences of characters that have:
- A minimu length of 1 (`min_length` = 1)
+ A minimum length of 1 (`min_length` = 1)
 - A maximum length of 9999 (`max_length` = 9999)
 - Zero consecutive lower case characters within them (`max_continuous_silence` = 0)
 
@@ -125,7 +124,7 @@
 
 Notice the tailing lower case letters "dd" and "ee" at the end of the two
 tokens. The default behavior of `StreamTokenizer` is to keep the *tailing
-silence* if it does'nt exceed `max_continuous_silence`. This can be changed
+silence* if it doesn't exceed `max_continuous_silence`. This can be changed
 using the `DROP_TAILING_SILENCE` mode (see next example).
 
 ## Remove tailing silence
@@ -350,7 +349,7 @@
 In other words, it removes the leading and tailing silence.
 
 Sampling rate is 44100 sample per second, we'll use an analysis window of 100 ms
-(i.e. bloc_ksize == 4410)
+(i.e. block_size == 4410)
 
 Energy threshold is 50.
@@ -368,7 +367,7 @@
 is at most 1 silent window.
 
 Still with this configuration we can get the tokenizer detect that noise as a valid event
-(if it actually contains 3 consecutive noisy frames). To circummvent this we use an enough
+(if it actually contains 3 consecutive noisy frames). To circumvent this we use an enough
 large analysis window (here of 100 ms) to ensure that the brief noise be surrounded by a much
 longer silence and hence the energy of the overall analysis window will be below 50.
 
@@ -425,12 +424,12 @@
 
 ## Online audio signal processing
 
-In the next example, audio data is directely acquired from the built-in microphone.
+In the next example, audio data is directly acquired from the built-in microphone.
 The `tokenize` method is passed a callback function
 so that audio activities are delivered as soon as they are detected. Each detected activity
 is played back using the build-in audio output device.
 
-As mentionned before , Signal energy is strongly related to many factors such
+As mentioned before , Signal energy is strongly related to many factors such
 microphone sensitivity, background noise (including noise inherent to the hardware),
 distance and your operating system sound settings. Try a lower `energy_threshold` if your noise
 does not seem to be detected and a higher threshold if you notice
@@ -483,8 +482,8 @@
 -------
 
 This package is published under GNU GPL Version 3.
+"""
 
-"""
 from core import *
 from io import *
 from util import *
diff -r 61aa25f0eb31 -r 6b2cc3ca5b6a quickstart.rst
--- a/quickstart.rst	Wed Sep 23 11:28:43 2015 +0200
+++ b/quickstart.rst	Wed Sep 23 11:50:26 2015 +0200
@@ -385,7 +385,7 @@
 In other words, it removes the leading and tailing silence.
 
 Sampling rate is 44100 sample per second, we'll use an analysis window of 100 ms
-(i.e. bloc_ksize == 4410)
+(i.e. block_size == 4410)
 
 Energy threshold is 50.