changeset   331:9741b52f194a
summary     Reformat code and documentation
author      Amine Sehili <amine.sehili@gmail.com>
date        Thu, 24 Oct 2019 20:49:51 +0200
parents     9665dc53c394
children    445d8bf27cf2
files       CHANGELOG LICENSE README.md auditok/cmdline.py auditok/dataset.py auditok/exceptions.py auditok/signal.py demos/audio_tokenize_demo.py demos/audio_trim_demo.py demos/echo.py doc/apitutorial.rst doc/cmdline.rst doc/conf.py doc/dataset.rst doc/index.rst doc/io.rst setup.py tests/test_cmdline_util.py tests/test_signal.py
diffstat    19 files changed, 490 insertions(+), 401 deletions(-)
--- a/CHANGELOG Wed Oct 23 21:24:33 2019 +0100
+++ b/CHANGELOG Thu Oct 24 20:49:51 2019 +0200
@@ -41,7 +41,7 @@
 - Read audio data from stdin
 - Add short alias for all ADSFactory.ads() keyword arguments
 - Add tests for alias
-- Break audio data into chunks for audio play back (to quickly detect interruptions like Ctrl-C) 
+- Break audio data into chunks for audio play back (to quickly detect interruptions like Ctrl-C)
 
 Version 0.1.3
--- a/LICENSE Wed Oct 23 21:24:33 2019 +0100
+++ b/LICENSE Thu Oct 24 20:49:51 2019 +0200
@@ -16,4 +16,4 @@
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
-THE SOFTWARE.
\ No newline at end of file
+THE SOFTWARE.
--- a/README.md Wed Oct 23 21:24:33 2019 +0100
+++ b/README.md Thu Oct 24 20:49:51 2019 +0200
@@ -1,6 +1,6 @@
 [](https://travis-ci.org/amsehili/auditok) [](http://auditok.readthedocs.org/en/latest/?badge=latest)
 
-AUDIo TOKenizer 
+AUDIo TOKenizer
 ===============
 
 `auditok` is an **Audio Activity Detection** tool that can process online data (read from an audio device or from standard input) as well as audio files. It can be used as a command line program and offers an easy to use API.
@@ -72,7 +72,7 @@
 This will print `id`, `start-time` and `end-time` for each detected activity. If you don't have `PyAudio`, you can use `sox` for data acquisition (`sudo apt-get install sox`) and tell `auditok` to read data from standard input:
 
     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i - -r 16000 -w 2 -c 1
-    
+
 Note that when data is read from standard input the same audio parameters must be used for both `sox` (or any other data generation/acquisition tool) and `auditok`. The following table summarizes audio parameters.
 
 | Audio parameter | sox option | `auditok` option | `auditok` default |
@@ -104,7 +104,7 @@
 Option `-E` stands for echo, so `auditok` plays back whatever it detects. Using `-E` requires `PyAudio`, if you don't have `PyAudio` and want to play detections with sox, use the `-C` option:
 
     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i - -C "play -q -t raw -r 16000 -c 1 -b 16 -e signed $"
-    
+
 The `-C` option tells `auditok` to interpret its content as a command that should be run whenever `auditok` detects an audio activity, replacing the `$` by a name of a temporary file into which the activity is saved as raw audio. Here we use `play` to play the activity, giving the necessary `play` arguments for raw data. `rec` and `play` are just an alias for `sox`.
@@ -131,7 +131,7 @@
     2 3.05 3.73
     3 3.97 4.49
     ...
-    
+
 If you want to customize the output format, use `--printf` option:
 
     auditok -e 55 --printf "[{id}]: {start} to {end}"
@@ -147,7 +147,7 @@
 Keywords `{id}`, `{start}` and `{end}` can be placed and repeated anywhere in the text. Time is shown in seconds, if you want a more detailed time information, use `--time-format`:
 
     auditok -e 55 --printf "[{id}]: {start} to {end}" --time-format "%h:%m:%s.%i"
-    
+
 Output:
 
     [1]: 00:00:01.080 to 00:00:01.760
@@ -197,7 +197,7 @@
 You can use a free text and place `{N}`, `{start}` and `{end}` wherever you want, they will be replaced by detection number, `start-time` and `end-time` respectively. Another example:
 
     auditok -o {start}-{end}.wav ...
-    
+
 Install `pydub` for more audio formats.
@@ -221,7 +221,7 @@
 ### 1st practical use case: generate a subtitles template
 
-Using `--printf ` and `--time-format`, the following command, used with an input audio or video file, will generate and an **srt** file template that can be later edited with a subtitles editor in a way that reduces the time needed to define when each utterance starts and where it ends: 
+Using `--printf ` and `--time-format`, the following command, used with an input audio or video file, will generate and an **srt** file template that can be later edited with a subtitles editor in a way that reduces the time needed to define when each utterance starts and where it ends:
 
     auditok -e 55 -i input.wav -m 10 --printf "{id}\n{start} --> {end}\nPut some text here...\n" --time-format "%h:%m:%s.%i"
@@ -230,7 +230,7 @@
     1
     00:00:00.730 --> 00:00:01.460
     Put some text here...
-    
+
     2
     00:00:02.440 --> 00:00:03.900
     Put some text here...
@@ -263,7 +263,7 @@
 2- Send flac audio data to Google and get its filtered transcription using [speech-rec.sh](https://github.com/amsehili/gspeech-rec/blob/master/speech-rec.sh):
 
     speech-rec.sh -i output.flac -r 16000
-    
+
 3- Use **grep** to select lines that contain *transcript*:
 
     grep transcript
@@ -307,4 +307,3 @@
 
 Author
 ------
 Amine Sehili (<amine.sehili@gmail.com>)
-
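The README hunks above repeatedly touch the sox-to-auditok pipe, whose whole point is that both ends agree on the raw-PCM parameters (16000 Hz, 16-bit signed, mono). As a rough sketch of that contract — a hypothetical reader, not part of this changeset — the following Python script consumes the same `rec` output from stdin; the byte arithmetic is what matters:

    import sys

    SAMPLING_RATE = 16000  # must match sox -r 16000 / auditok -r 16000
    SAMPLE_WIDTH = 2       # bytes per sample, sox -b 16 / auditok -w 2
    CHANNELS = 1           # sox -c 1 / auditok -c 1
    WINDOW_SEC = 0.01      # auditok's default 10 ms analysis window

    # One analysis window = rate * width * channels * duration bytes.
    window_bytes = int(SAMPLING_RATE * SAMPLE_WIDTH * CHANNELS * WINDOW_SEC)

    while True:
        chunk = sys.stdin.buffer.read(window_bytes)
        if len(chunk) < window_bytes:
            break
        # 320 bytes == 160 samples == 10 ms at 16 kHz, 16-bit mono.
        print("read one 10 ms window ({} bytes)".format(len(chunk)))

If either side declares a different rate or sample width, the same byte stream maps to a different number of milliseconds per window, which is why the README's table pins each sox option to its auditok counterpart.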
--- a/auditok/cmdline.py Wed Oct 23 21:24:33 2019 +0100
+++ b/auditok/cmdline.py Thu Oct 24 20:49:51 2019 +0200
@@ -11,7 +11,7 @@
 @copyright: 2015-2019 Mohamed El Amine SEHILI
 @license: MIT
 @contact: amine.sehili@gmail.com
-@deffield updated: 13 Oct 2019
+@deffield updated: 24 Oct 2019
 """
 
 import sys
@@ -29,9 +29,8 @@
 
 __all__ = []
-version = __version__
 __date__ = "2015-11-23"
-__updated__ = "2018-10-13"
+__updated__ = "2018-10-24"
 
 
 def main(argv=None):
@@ -43,7 +42,7 @@
         prog=program_name, description="An Audio Tokenization tool"
     )
     parser.add_argument(
-        "--version", "-v", action="version", version=version
+        "--version", "-v", action="version", version=__version__
     )
     group = parser.add_argument_group("Input-Output options")
     group.add_argument(
@@ -51,8 +50,8 @@
         help="Input audio or video file. Use '-' for stdin "
         "[default: read from microphone using pyaudio]",
         metavar="input",
-        nargs='?',
-        default=None
+        nargs="?",
+        default=None,
     )
     group.add_argument(
         "-I",
@@ -136,8 +135,8 @@
         type=str,
         default=None,
         help="Audio format used to save detections and/or main stream. "
-        "If not supplied, then it will: (1. be guessed from extension or (2. "
-        "use raw format",
+        "If not supplied, then it will: (1. be guessed from extension or "
+        "(2. use raw format",
         metavar="STRING",
     )
     group.add_argument(
@@ -168,7 +167,8 @@
         dest="analysis_window",
         default=0.01,
         type=float,
-        help="Size of analysis window in seconds [default: %(default)s (10ms)]",
+        help="Size of analysis window in seconds [default: %(default)s "
+        "(10ms)]",
         metavar="FLOAT",
     )
     group.add_argument(
@@ -177,7 +177,8 @@
         dest="min_duration",
         type=float,
         default=0.2,
-        help="Min duration of a valid audio event in seconds [default: %(default)s]",
+        help="Min duration of a valid audio event in seconds "
+        "[default: %(default)s]",
         metavar="FLOAT",
     )
     group.add_argument(
@@ -186,7 +187,8 @@
         dest="max_duration",
         type=float,
         default=5,
-        help="Max duration of a valid audio event in seconds [default: %(default)s]",
+        help="Max duration of a valid audio event in seconds "
+        "[default: %(default)s]",
         metavar="FLOAT",
     )
     group.add_argument(
@@ -195,8 +197,8 @@
         dest="max_silence",
         type=float,
         default=0.3,
-        help="Max duration of a consecutive silence within a valid audio event "
-        "in seconds [default: %(default)s]",
+        help="Max duration of a consecutive silence within a valid audio "
+        "event in seconds [default: %(default)s]",
         metavar="FLOAT",
     )
     group.add_argument(
@@ -298,14 +300,15 @@
         dest="plot",
         action="store_true",
         default=False,
-        help="Plot and show audio signal and detections (requires matplotlib)",
+        help="Plot and show audio signal and detections (requires "
+        "matplotlib)",
     )
     group.add_argument(
         "--save-image",
         dest="save_image",
         type=str,
-        help="Save plotted audio signal and detections as a picture or a PDF "
-        "file (requires matplotlib)",
+        help="Save plotted audio signal and detections as a picture or a "
+        "PDF file (requires matplotlib)",
         metavar="FILE",
     )
     group.add_argument(
@@ -332,9 +335,10 @@
         "placeholders used with --printf [default= %(default)s]. The "
         "following formats are accepted:\n"
         "%%S: absolute time in seconds. %%I: absolute time in ms. If at "
-        "least one of (%%h, %%m, %%s, %%i) is used, convert time into hours, "
-        "minutes, seconds and millis (e.g. %%h:%%m:%%s.%%i). Only supplied "
-        "fields are printed. Note that %%S and %%I can only be used alone",
+        "least one of (%%h, %%m, %%s, %%i) is used, convert time into "
+        "hours, minutes, seconds and millis (e.g. %%h:%%m:%%s.%%i). Only "
+        "supplied fields are printed. Note that %%S and %%I can only be "
+        "used alone",
         metavar="STRING",
     )
     group.add_argument(
@@ -351,8 +355,8 @@
         dest="quiet",
         action="store_true",
         default=False,
-        help="Do not print any information about detections [default: print "
-        "'id', 'start' and 'end' of each detection]",
+        help="Do not print any information about detections [default: "
+        "print 'id', 'start' and 'end' of each detection]",
     )
     parser.add_argument(
         "-D",
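The reflowed `--time-format` help above documents the %h/%m/%s/%i placeholders without showing the conversion. A minimal sketch of what such a formatter could look like — an illustration of the documented behavior, not auditok's actual implementation:

    def format_time(seconds, fmt="%h:%m:%s.%i"):
        # Split a float time in seconds into hours, minutes, seconds, millis.
        hours, rest = divmod(seconds, 3600)
        minutes, secs = divmod(rest, 60)
        millis = int(round((secs - int(secs)) * 1000))
        return (
            fmt.replace("%h", "{:02d}".format(int(hours)))
            .replace("%m", "{:02d}".format(int(minutes)))
            .replace("%s", "{:02d}".format(int(secs)))
            .replace("%i", "{:03d}".format(millis))
        )

    print(format_time(1.08))  # 00:00:01.080, as in the README examples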
--- a/auditok/dataset.py Wed Oct 23 21:24:33 2019 +0100
+++ b/auditok/dataset.py Thu Oct 24 20:49:51 2019 +0200
@@ -4,15 +4,22 @@
 
 import os
 
-__all__ = ["one_to_six_arabic_16000_mono_bc_noise", "was_der_mensch_saet_mono_44100_lead_trail_silence"]
+__all__ = [
+    "one_to_six_arabic_16000_mono_bc_noise",
+    "was_der_mensch_saet_mono_44100_lead_trail_silence",
+]
 
 _current_dir = os.path.dirname(os.path.realpath(__file__))
 
 one_to_six_arabic_16000_mono_bc_noise = "{cd}{sep}data{sep}1to6arabic_\
-16000_mono_bc_noise.wav".format(cd=_current_dir, sep=os.path.sep)
+16000_mono_bc_noise.wav".format(
+    cd=_current_dir, sep=os.path.sep
+)
 """A wave file that contains a pronunciation of Arabic numbers from 1 to 6"""
 
 was_der_mensch_saet_mono_44100_lead_trail_silence = "{cd}{sep}data{sep}was_\
 der_mensch_saet_das_wird_er_vielfach_ernten_44100Hz_mono_lead_trail_\
-silence.wav".format(cd=_current_dir, sep=os.path.sep)
+silence.wav".format(
+    cd=_current_dir, sep=os.path.sep
+)
 """
 A wave file that contains a sentence between long leading and trailing periods of silence"""
--- a/auditok/exceptions.py Wed Oct 23 21:24:33 2019 +0100
+++ b/auditok/exceptions.py Thu Oct 24 20:49:51 2019 +0200
@@ -16,7 +16,7 @@
 
 class EndOfProcessing(Exception):
-    """Raised within command line script's main function to jump to 
+    """Raised within command line script's main function to jump to
     postprocessing code"""
--- a/auditok/signal.py Wed Oct 23 21:24:33 2019 +0100
+++ b/auditok/signal.py Thu Oct 24 20:49:51 2019 +0200
@@ -17,6 +17,7 @@
     samples = array(fmt, data)
     return samples[selected::channels]
 
+
 def average_channels(data, fmt, channels):
     all_channels = array(fmt, data)
     mono_channels = [
@@ -28,6 +29,7 @@
     )
     return avg_arr
 
+
 def average_channels_stereo(data, sample_width):
     fmt = FORMAT[sample_width]
     arr = array(fmt, audioop.tomono(data, sample_width, 0.5, 0.5))
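The functions given breathing room above reduce multichannel data to mono; `average_channels_stereo` delegates to the standard library's `audioop.tomono` with equal weights for both channels, and channel extraction is strided slicing over the interleaved samples. A small self-contained demonstration of both calls, using synthetic samples rather than anything from the auditok code base:

    import audioop
    from array import array

    # Two interleaved 16-bit stereo frames: (L=1000, R=3000), (L=500, R=1500).
    stereo = array("h", [1000, 3000, 500, 1500]).tobytes()

    # Equal-weight averaging, as in average_channels_stereo above.
    mono = array("h", audioop.tomono(stereo, 2, 0.5, 0.5))
    print(list(mono))  # [2000, 1000]

    # extract_single_channel boils down to samples[selected::channels]:
    left = array("h", stereo)[0::2]
    print(list(left))  # [1000, 500]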
--- a/demos/audio_tokenize_demo.py Wed Oct 23 21:24:33 2019 +0100
+++ b/demos/audio_tokenize_demo.py Thu Oct 24 20:49:51 2019 +0200
@@ -3,60 +3,78 @@
 September, 2015
 """
 
-from auditok import ADSFactory, AudioEnergyValidator, StreamTokenizer, player_for, dataset
+from auditok import (
+    ADSFactory,
+    AudioEnergyValidator,
+    StreamTokenizer,
+    player_for,
+    dataset,
+)
 import sys
 
 try:
-    # We set the `record` argument to True so that we can rewind the source
-    asource = ADSFactory.ads(filename=dataset.one_to_six_arabic_16000_mono_bc_noise, record=True)
+    # We set the `record` argument to True so that we can rewind the source
+    asource = ADSFactory.ads(
+        filename=dataset.one_to_six_arabic_16000_mono_bc_noise, record=True
+    )
 
-    validator = AudioEnergyValidator(sample_width=asource.get_sample_width(), energy_threshold=65)
+    validator = AudioEnergyValidator(
+        sample_width=asource.get_sample_width(), energy_threshold=65
+    )
 
-    # Default analysis window is 10 ms (float(asource.get_block_size()) / asource.get_sampling_rate())
-    # min_length=20 : minimum length of a valid audio activity is 20 * 10 == 200 ms
-    # max_length=400 : maximum length of a valid audio activity is 400 * 10 == 4000 ms == 4 seconds
-    # max_continuous_silence=30 : maximum length of a tolerated silence within a valid audio activity is 30 * 30 == 300 ms
-    tokenizer = StreamTokenizer(validator=validator, min_length=20, max_length=400, max_continuous_silence=30)
+    # Default analysis window is 10 ms (float(asource.get_block_size()) / asource.get_sampling_rate())
+    # min_length=20 : minimum length of a valid audio activity is 20 * 10 == 200 ms
+    # max_length=400 : maximum length of a valid audio activity is 400 * 10 == 4000 ms == 4 seconds
+    # max_continuous_silence=30 : maximum length of a tolerated silence within a valid audio activity is 30 * 30 == 300 ms
+    tokenizer = StreamTokenizer(
+        validator=validator,
+        min_length=20,
+        max_length=400,
+        max_continuous_silence=30,
+    )
 
-    asource.open()
-    tokens = tokenizer.tokenize(asource)
+    asource.open()
+    tokens = tokenizer.tokenize(asource)
 
-    # Play detected regions back
-    player = player_for(asource)
+    # Play detected regions back
+    player = player_for(asource)
 
-    # Rewind and read the whole signal
-    asource.rewind()
-    original_signal = []
+    # Rewind and read the whole signal
+    asource.rewind()
+    original_signal = []
 
-    while True:
-        w = asource.read()
-        if w is None:
-            break
-        original_signal.append(w)
+    while True:
+        w = asource.read()
+        if w is None:
+            break
+        original_signal.append(w)
 
-    original_signal = b''.join(original_signal)
-    player.play(original_signal)
+    original_signal = b"".join(original_signal)
+    player.play(original_signal)
 
-    print("\n ** playing detected regions...\n")
-    for i,t in enumerate(tokens):
-        print("Token [{0}] starts at {1} and ends at {2}".format(i+1, t[1], t[2]))
-        data = b''.join(t[0])
-        player.play(data)
+    print("\n ** playing detected regions...\n")
+    for i, t in enumerate(tokens):
+        print(
+            "Token [{0}] starts at {1} and ends at {2}".format(
+                i + 1, t[1], t[2]
+            )
+        )
+        data = b"".join(t[0])
+        player.play(data)
 
-    assert len(tokens) == 8
+    assert len(tokens) == 8
 
-    asource.close()
-    player.stop()
+    asource.close()
+    player.stop()
 
 except KeyboardInterrupt:
-    player.stop()
-    asource.close()
-    sys.exit(0)
+    player.stop()
+    asource.close()
+    sys.exit(0)
 
 except Exception as e:
-    sys.stderr.write(str(e) + "\n")
-    sys.exit(1)
+    sys.stderr.write(str(e) + "\n")
+    sys.exit(1)
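The demo's comments convert durations to frame counts by hand (the last one presumably means 30 * 10 == 300 ms rather than "30 * 30"). Since every length parameter of StreamTokenizer counts analysis windows, the conversion is just a division; a hypothetical helper, not part of the demo itself:

    def frames(duration_sec, window_sec=0.01):
        # Number of analysis windows covering the given duration.
        return int(duration_sec / window_sec)

    # The demo's parameters, recovered from durations:
    min_length = frames(0.2)              # 20 windows == 200 ms
    max_length = frames(4.0)              # 400 windows == 4 s
    max_continuous_silence = frames(0.3)  # 30 windows == 300 ms
    print(min_length, max_length, max_continuous_silence)  # 20 400 30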
--- a/demos/audio_trim_demo.py Wed Oct 23 21:24:33 2019 +0100
+++ b/demos/audio_trim_demo.py Thu Oct 24 20:49:51 2019 +0200
@@ -5,13 +5,19 @@
 
 # Trim leading and trailing silence from a record
 
-from auditok import ADSFactory, AudioEnergyValidator, StreamTokenizer, player_for, dataset
+from auditok import (
+    ADSFactory,
+    AudioEnergyValidator,
+    StreamTokenizer,
+    player_for,
+    dataset,
+)
 import pyaudio
 import sys
 
 """
 The tokenizer in the following example is set up to remove the silence
-that precedes the first acoustic activity or follows the last activity 
+that precedes the first acoustic activity or follows the last activity
 in a record. It preserves whatever it founds between the two activities.
 In other words, it removes the leading and trailing silence.
@@ -21,12 +27,12 @@
 Energy threshold is 50.
 
 The tokenizer will start accumulating windows up from the moment it encounters
-the first analysis window of an energy >= 50. ALL the following windows will be 
+the first analysis window of an energy >= 50. ALL the following windows will be
 kept regardless of their energy. At the end of the analysis, it will drop trailing
 windows with an energy below 50.
 
 This is an interesting example because the audio file we're analyzing contains a very
-brief noise that occurs within the leading silence. We certainly do want our tokenizer 
+brief noise that occurs within the leading silence. We certainly do want our tokenizer
 to stop at this point and considers whatever it comes after as a useful signal.
 To force the tokenizer to ignore that brief event we use two other parameters `init_min`
 ans `init_max_silence`. By `init_min`=3 and `init_max_silence`=1 we tell the tokenizer
@@ -41,62 +47,74 @@
 When using a shorter analysis window (of 10ms for instance, block_size == 441), the brief
 noise contributes more to energy calculation which yields an energy of over 50 for the window.
 Again we can deal with this situation by using a higher energy threshold (55 for example)
-    
+
 """
 
 try:
-    # record = True so that we'll be able to rewind the source.
-    asource = ADSFactory.ads(filename=dataset.was_der_mensch_saet_mono_44100_lead_tail_silence,
-                             record=True, block_size=4410)
-    asource.open()
+    # record = True so that we'll be able to rewind the source.
+    asource = ADSFactory.ads(
+        filename=dataset.was_der_mensch_saet_mono_44100_lead_tail_silence,
+        record=True,
+        block_size=4410,
+    )
+    asource.open()
 
-    original_signal = []
-    # Read the whole signal
-    while True:
-        w = asource.read()
-        if w is None:
-            break
-        original_signal.append(w)
+    original_signal = []
+    # Read the whole signal
+    while True:
+        w = asource.read()
+        if w is None:
+            break
+        original_signal.append(w)
 
-    original_signal = b''.join(original_signal)
+    original_signal = b"".join(original_signal)
 
-    # rewind source
-    asource.rewind()
+    # rewind source
+    asource.rewind()
 
-    # Create a validator with an energy threshold of 50
-    validator = AudioEnergyValidator(sample_width=asource.get_sample_width(), energy_threshold=50)
+    # Create a validator with an energy threshold of 50
+    validator = AudioEnergyValidator(
+        sample_width=asource.get_sample_width(), energy_threshold=50
+    )
 
-    # Create a tokenizer with an unlimited token length and continuous silence within a token
-    # Note the DROP_TRAILING_SILENCE mode that will ensure removing trailing silence
-    trimmer = StreamTokenizer(validator, min_length = 20, max_length=99999999,
-                              max_continuous_silence=9999999, mode=StreamTokenizer.DROP_TRAILING_SILENCE, init_min=3, init_max_silence=1)
+    # Create a tokenizer with an unlimited token length and continuous silence within a token
+    # Note the DROP_TRAILING_SILENCE mode that will ensure removing trailing silence
+    trimmer = StreamTokenizer(
+        validator,
+        min_length=20,
+        max_length=99999999,
+        max_continuous_silence=9999999,
+        mode=StreamTokenizer.DROP_TRAILING_SILENCE,
+        init_min=3,
+        init_max_silence=1,
+    )
 
-    tokens = trimmer.tokenize(asource)
+    tokens = trimmer.tokenize(asource)
 
-    # Make sure we only have one token
-    assert len(tokens) == 1, "Should have detected one single token"
+    # Make sure we only have one token
+    assert len(tokens) == 1, "Should have detected one single token"
 
-    trimmed_signal = b''.join(tokens[0][0])
+    trimmed_signal = b"".join(tokens[0][0])
 
-    player = player_for(asource)
+    player = player_for(asource)
 
-    print("\n ** Playing original signal (with leading and trailing silence)...")
-    player.play(original_signal)
-    print("\n ** Playing trimmed signal...")
-    player.play(trimmed_signal)
+    print(
+        "\n ** Playing original signal (with leading and trailing silence)..."
+    )
+    player.play(original_signal)
+    print("\n ** Playing trimmed signal...")
+    player.play(trimmed_signal)
 
-    player.stop()
-    asource.close()
+    player.stop()
+    asource.close()
 
 except KeyboardInterrupt:
-    player.stop()
-    asource.close()
-    sys.exit(0)
+    player.stop()
+    asource.close()
+    sys.exit(0)
 
 except Exception as e:
-
-    sys.stderr.write(str(e) + "\n")
-    sys.exit(1)
+
+    sys.stderr.write(str(e) + "\n")
+    sys.exit(1)
--- a/demos/echo.py Wed Oct 23 21:24:33 2019 +0100
+++ b/demos/echo.py Thu Oct 24 20:49:51 2019 +0200
@@ -1,49 +1,64 @@
-
-from auditok import ADSFactory, AudioEnergyValidator, StreamTokenizer, player_for
+from auditok import (
+    ADSFactory,
+    AudioEnergyValidator,
+    StreamTokenizer,
+    player_for,
+)
 import pyaudio
 import sys
 
 try:
-    energy_threshold = 45
-    duration = 10 # seconds
+    energy_threshold = 45
+    duration = 10  # seconds
 
-    if len(sys.argv) > 1:
-        energy_threshold = float(sys.argv[1])
+    if len(sys.argv) > 1:
+        energy_threshold = float(sys.argv[1])
 
-    if len(sys.argv) > 2:
-        duration = float(sys.argv[2])
+    if len(sys.argv) > 2:
+        duration = float(sys.argv[2])
 
-    # record = True so that we'll be able to rewind the source.
-    # max_time = 10: read 10 seconds from the microphone
-    asource = ADSFactory.ads(record=True, max_time = duration)
+    # record = True so that we'll be able to rewind the source.
+    # max_time = 10: read 10 seconds from the microphone
+    asource = ADSFactory.ads(record=True, max_time=duration)
 
-    validator = AudioEnergyValidator(sample_width=asource.get_sample_width(), energy_threshold = energy_threshold)
-    tokenizer = StreamTokenizer(validator=validator, min_length=20, max_length=250, max_continuous_silence=30)
+    validator = AudioEnergyValidator(
+        sample_width=asource.get_sample_width(),
+        energy_threshold=energy_threshold,
+    )
+    tokenizer = StreamTokenizer(
+        validator=validator,
+        min_length=20,
+        max_length=250,
+        max_continuous_silence=30,
+    )
 
-    player = player_for(asource)
+    player = player_for(asource)
 
-    def echo(data, start, end):
-        print("Acoustic activity at: {0}--{1}".format(start, end))
-        player.play(b''.join(data))
+    def echo(data, start, end):
+        print("Acoustic activity at: {0}--{1}".format(start, end))
+        player.play(b"".join(data))
 
-    asource.open()
+    asource.open()
 
-    print("\n ** Make some noise (dur:{}, energy:{})...".format(duration, energy_threshold))
+    print(
+        "\n ** Make some noise (dur:{}, energy:{})...".format(
+            duration, energy_threshold
+        )
+    )
 
-    tokenizer.tokenize(asource, callback=echo)
+    tokenizer.tokenize(asource, callback=echo)
 
-    asource.close()
-    player.stop()
+    asource.close()
+    player.stop()
 
 except KeyboardInterrupt:
-    player.stop()
-    asource.close()
-    sys.exit(0)
+    player.stop()
+    asource.close()
+    sys.exit(0)
 
 except Exception as e:
-
-    sys.stderr.write(str(e) + "\n")
-    sys.exit(1)
+
+    sys.stderr.write(str(e) + "\n")
+    sys.exit(1)
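The `echo` callback above receives `(data, start, end)`, where `data` is a list of byte frames and `start`/`end` are frame indices. Any callable with that signature works; for instance, a sketch that saves each detection to disk instead of playing it back (the file-name scheme is made up for illustration):

    def save_token(data, start, end):
        # data: list of byte frames; start/end: frame indices.
        filename = "detection_{:05d}_{:05d}.raw".format(start, end)
        with open(filename, "wb") as f:
            f.write(b"".join(data))
        print("saved {}".format(filename))

    # tokenizer.tokenize(asource, callback=save_token)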
--- a/doc/apitutorial.rst Wed Oct 23 21:24:33 2019 +0100
+++ b/doc/apitutorial.rst Thu Oct 24 20:49:51 2019 +0200
@@ -6,7 +6,7 @@
 
 **auditok** is a module that can be used as a generic tool for data
-tokenization. Although its core motivation is **Acoustic Activity 
+tokenization. Although its core motivation is **Acoustic Activity
 Detection** (AAD) and extraction from audio streams (i.e. detect
 where a noise/an acoustic activity occurs within an audio stream and
 extract the corresponding portion of signal), it can easily be
@@ -28,7 +28,7 @@
 windows (whatever kind of noise: speech, baby cry, laughter, etc.).
 
 The most important component of `auditok` is the :class:`auditok.core.StreamTokenizer`
-class. An instance of this class encapsulates a :class:`auditok.util.DataValidator` and can be 
+class. An instance of this class encapsulates a :class:`auditok.util.DataValidator` and can be
 configured to detect the desired regions from a stream. The
 :func:`auditok.core.StreamTokenizer.tokenize` method accepts a :class:`auditok.util.DataSource`
 object that has a `read` method. Read data can be of any type accepted
@@ -43,7 +43,7 @@
 - A file on the disk
 - A buffer of data
 - The built-in microphone (requires PyAudio)
-    
+
 The :class:`auditok.util.ADSFactory.AudioDataSource` class inherits from
 :class:`auditok.util.DataSource` and supplies a higher abstraction level
@@ -57,7 +57,7 @@
   instead of raw audio samples.
 - Limit the amount (i.e. duration) of read data (if keyword `max_time` or `mt` is used, very useful when reading data from the microphone)
 - Record all read data and rewind if necessary (if keyword `record` or `rec` , also useful if you read data from the microphone and
-  you want to process it many times off-line and/or save it) 
+  you want to process it many times off-line and/or save it)
 
 See :class:`auditok.util.ADSFactory` documentation for more information.
@@ -69,7 +69,7 @@
 **********************************
 
 Let us look at some examples using the :class:`auditok.util.StringDataSource` class
-created for test and illustration purposes. Imagine that each character of 
+created for test and illustration purposes. Imagine that each character of
 :class:`auditok.util.StringDataSource` data represents an audio slice of 100 ms for
 example. In the following examples we will use upper case letters to represent
 noisy audio slices (i.e. analysis windows or frames) and lower case letter for
@@ -81,26 +81,26 @@
 
 We want to extract sub-sequences of characters that have:
-    
+
 - A minimum length of 1 (`min_length` = 1)
 - A maximum length of 9999 (`max_length` = 9999)
 - Zero consecutive lower case characters within them (`max_continuous_silence` = 0)
 
-We also create the `UpperCaseChecker` with a `read` method that returns `True` if the 
-checked character is in upper case and `False` otherwise. 
+We also create the `UpperCaseChecker` with a `read` method that returns `True` if the
+checked character is in upper case and `False` otherwise.
 
 .. code:: python
-    
+
     from auditok import StreamTokenizer, StringDataSource, DataValidator
-    
+
     class UpperCaseChecker(DataValidator):
        def is_valid(self, frame):
           return frame.isupper()
-    
+
     dsource = StringDataSource("aaaABCDEFbbGHIJKccc")
-    tokenizer = StreamTokenizer(validator=UpperCaseChecker(), 
+    tokenizer = StreamTokenizer(validator=UpperCaseChecker(),
                  min_length=1, max_length=9999, max_continuous_silence=0)
-    
+
     tokenizer.tokenize(dsource)
 
 The output is a list of two tuples, each contains the extracted sub-sequence and its
@@ -109,9 +109,9 @@
 
 .. code:: python
-    
+
     [(['A', 'B', 'C', 'D', 'E', 'F'], 3, 8), (['G', 'H', 'I', 'J', 'K'], 11, 15)]
-    
+
 
 Tolerate up to two non-valid (lower case) letters within an extracted sequence
 ##############################################################################
@@ -122,24 +122,24 @@
 
     from auditok import StreamTokenizer, StringDataSource, DataValidator
-    
+
     class UpperCaseChecker(DataValidator):
        def is_valid(self, frame):
          return frame.isupper()
-    
+
     dsource = StringDataSource("aaaABCDbbEFcGHIdddJKee")
-    tokenizer = StreamTokenizer(validator=UpperCaseChecker(), 
+    tokenizer = StreamTokenizer(validator=UpperCaseChecker(),
                  min_length=1, max_length=9999, max_continuous_silence=2)
-    
+
     tokenizer.tokenize(dsource)
 
 output:
 
 .. code:: python
-    
+
     [(['A', 'B', 'C', 'D', 'b', 'b', 'E', 'F', 'c', 'G', 'H', 'I', 'd', 'd'], 3, 16), (['J', 'K', 'e', 'e'], 18, 21)]
-    
+
 
 Notice the trailing lower case letters "dd" and "ee" at the end of the two tokens.
 The default behavior of :class:`auditok.core.StreamTokenizer` is to keep the *trailing
 silence* if it does not exceed `max_continuous_silence`. This can be changed
@@ -157,16 +157,16 @@
 .. code:: python
 
     from auditok import StreamTokenizer, StringDataSource, DataValidator
-    
+
     class UpperCaseChecker(DataValidator):
       def is_valid(self, frame):
          return frame.isupper()
-    
+
    dsource = StringDataSource("aaaABCDbbEFcGHIdddJKee")
-    tokenizer = StreamTokenizer(validator=UpperCaseChecker(), 
+    tokenizer = StreamTokenizer(validator=UpperCaseChecker(),
                  min_length=1, max_length=9999, max_continuous_silence=2,
                  mode=StreamTokenizer.DROP_TRAILING_SILENCE)
-    
+
     tokenizer.tokenize(dsource)
 
 output:
@@ -182,36 +182,36 @@
 
 Imagine that you just want to detect and recognize a small part of a long
-acoustic event (e.g. engine noise, water flow, etc.) and avoid that that 
+acoustic event (e.g. engine noise, water flow, etc.) and avoid that that
 event hogs the tokenizer and prevent it from feeding the event to the next
 processing step (i.e. a sound recognizer). You can do this by:
 
 - limiting the length of a detected token.
-    
+
 and
-    
+
 - using a callback function as an argument to :class:`auditok.core.StreamTokenizer.tokenize`
   so that the tokenizer delivers a token as soon as it is detected.
 
 The following code limits the length of a token to 5:
 
 .. code:: python
-    
+
     from auditok import StreamTokenizer, StringDataSource, DataValidator
-    
+
    class UpperCaseChecker(DataValidator):
       def is_valid(self, frame):
          return frame.isupper()
-    
+
     dsource = StringDataSource("aaaABCDEFGHIJKbbb")
     tokenizer = StreamTokenizer(validator=UpperCaseChecker(),
                  min_length=1, max_length=5, max_continuous_silence=0)
-    
+
    def print_token(data, start, end):
         print("token = '{0}', starts at {1}, ends at {2}".format(''.join(data), start, end))
-    
+
     tokenizer.tokenize(dsource, callback=print_token)
-    
+
 
 output:
@@ -226,7 +226,7 @@
 `auditok` and Audio Data
 ************************
 
-In the rest of this document we will use :class:`auditok.util.ADSFactory`, :class:`auditok.util.AudioEnergyValidator` 
+In the rest of this document we will use :class:`auditok.util.ADSFactory`, :class:`auditok.util.AudioEnergyValidator`
 and :class:`auditok.core.StreamTokenizer` for Audio Activity
 Detection demos using audio data. Before we get any further it is worth,
 explaining a certain number of points.
@@ -237,31 +237,31 @@
 The created :class:`AudioDataSource` object is then passed to :func:`StreamTokenizer.tokenize`
 for tokenization.
 
 :func:`auditok.util.ADSFactory.ads` accepts a number of keyword arguments, of which none is mandatory.
-The returned :class:`AudioDataSource` object's features and behavior can however greatly differ 
+The returned :class:`AudioDataSource` object's features and behavior can however greatly differ
 depending on the passed arguments. Further details can be found in the respective method documentation.
 
 Note however the following two calls that will create an :class:`AudioDataSource`
 that reads data from an audio file and from the built-in microphone respectively.
 
 .. code:: python
-    
+
     from auditok import ADSFactory
-    
+
     # Get an AudioDataSource from a file
     # use 'filename', alias 'fn' keyword argument
    file_ads = ADSFactory.ads(filename = "path/to/file/")
-    
+
     # Get an AudioDataSource from the built-in microphone
     # The returned object has the default values for sampling
     # rate, sample width an number of channels. see method's
-    # documentation for customized values 
+    # documentation for customized values
     mic_ads = ADSFactory.ads()
-    
+
 For :class:`StreamTkenizer`, parameters `min_length`, `max_length` and `max_continuous_silence`
 are expressed in terms of number of frames.
 
 Each call to :func:`AudioDataSource.read` returns one frame of data or None.
 
-If you want a `max_length` of 2 seconds for your detected sound events and your *analysis window* 
+If you want a `max_length` of 2 seconds for your detected sound events and your *analysis window*
 is *10 ms* long, you have to specify a `max_length` of 200 (`int(2. / (10. / 1000)) == 200`).
 For a `max_continuous_silence` of *300 ms* for instance, the value to pass to StreamTokenizer is 30 (`int(0.3 / (10. / 1000)) == 30`).
@@ -270,7 +270,7 @@
 :class:`AudioDataSource` object, it returns the same amount of data, except if there are no more
 data (returns what's left in stream or None).
 
-This fixed-length amount of data is referred here to as **analysis window** and is a parameter of 
+This fixed-length amount of data is referred here to as **analysis window** and is a parameter of
 :func:`ADSFactory.ads` method. By default :func:`ADSFactory.ads` uses an analysis window of 10 ms.
 
 The number of samples that 10 ms of audio data contain will vary, depending on the sampling
@@ -280,29 +280,29 @@
 You can use the `block_size` keyword (alias `bs`) to define your analysis window:
 
 .. code:: python
-    
+
     from auditok import ADSFactory
-    
+
     '''
     Assume you have an audio file with a sampling rate of 16000
     '''
-    
+
     # file_ads.read() will return blocks of 160 sample
     file_ads = ADSFactory.ads(filename = "path/to/file/", block_size = 160)
-    
+
     # file_ads.read() will return blocks of 320 sample
     file_ads = ADSFactory.ads(filename = "path/to/file/", bs = 320)
-    
+
 Fortunately, you can specify the size of your analysis window in seconds, thanks to keyword `block_dur`
 (alias `bd`):
 
 .. code:: python
-    
+
     from auditok import ADSFactory
     # use an analysis window of 20 ms
     file_ads = ADSFactory.ads(filename = "path/to/file/", bd = 0.02)
-    
+
 For :class:`StreamTkenizer`, each :func:`read` call that does not return `None` is treated as a processing
 frame. :class:`StreamTkenizer` has no way to figure out the temporal length of that frame (why sould it?). So to
 correctly initialize your :class:`StreamTokenizer`, based on your analysis window duration, use something like:
@@ -313,16 +313,16 @@
     analysis_win_seconds = 0.01 # 10 ms
     my_ads = ADSFactory.ads(block_dur = analysis_win_seconds)
     analysis_window_ms = analysis_win_seconds * 1000
-    
+
     # If you want your maximum continuous silence to be 300 ms use:
     max_continuous_silence = int(300. / analysis_window_ms)
-    
+
     # which is the same as:
     max_continuous_silence = int(0.3 / (analysis_window_ms / 1000))
-    
+
     # or simply:
     max_continuous_silence = 30
-    
+
 
 ******************************
 Examples using real audio data
@@ -332,36 +332,36 @@
 Extract isolated phrases from an utterance
 ##########################################
 
-We will build an :class:`auditok.util.ADSFactory.AudioDataSource` using a wave file from 
+We will build an :class:`auditok.util.ADSFactory.AudioDataSource` using a wave file from
 the database. The file contains of isolated pronunciation of digits from 1 to 1
 in Arabic as well as breath-in/out between 2 and 3. The code will play the
-original file then the detected sounds separately. Note that we use an 
+original file then the detected sounds separately. Note that we use an
 `energy_threshold` of 65, this parameter should be carefully chosen. It depends
-on microphone quality, background noise and the amplitude of events you want to 
+on microphone quality, background noise and the amplitude of events you want to
 detect.
 
 .. code:: python
 
     from auditok import ADSFactory, AudioEnergyValidator, StreamTokenizer, player_for, dataset
-    
+
     # We set the `record` argument to True so that we can rewind the source
     asource = ADSFactory.ads(filename=dataset.one_to_six_arabic_16000_mono_bc_noise, record=True)
-    
+
     validator = AudioEnergyValidator(sample_width=asource.get_sample_width(), energy_threshold=65)
-    
+
     # Default analysis window is 10 ms (float(asource.get_block_size()) / asource.get_sampling_rate())
     # min_length=20 : minimum length of a valid audio activity is 20 * 10 == 200 ms
     # max_length=4000 : maximum length of a valid audio activity is 400 * 10 == 4000 ms == 4 seconds
-    # max_continuous_silence=30 : maximum length of a tolerated silence within a valid audio activity is 30 * 30 == 300 ms 
+    # max_continuous_silence=30 : maximum length of a tolerated silence within a valid audio activity is 30 * 30 == 300 ms
     tokenizer = StreamTokenizer(validator=validator, min_length=20, max_length=400, max_continuous_silence=30)
-    
+
     asource.open()
     tokens = tokenizer.tokenize(asource)
-    
+
     # Play detected regions back
-    
+
     player = player_for(asource)
-    
+
     # Rewind and read the whole signal
     asource.rewind()
     original_signal = []
@@ -371,46 +371,46 @@
         if w is None:
             break
         original_signal.append(w)
-    
+
     original_signal = ''.join(original_signal)
-    
+
     print("Playing the original file...")
     player.play(original_signal)
-    
+
     print("playing detected regions...")
     for t in tokens:
         print("Token starts at {0} and ends at {1}".format(t[1], t[2]))
         data = ''.join(t[0])
         player.play(data)
-    
+
     assert len(tokens) == 8
-    
+
 The tokenizer extracts 8 audio regions from the signal, including all isolated digits
 (from 1 to 6) as well as the 2-phase respiration of the subject. You might have noticed
-that, in the original file, the last three digit are closer to each other than the 
+that, in the original file, the last three digit are closer to each other than the
 previous ones. If you wan them to be extracted as one single phrase, you can do so
 by tolerating a larger continuous silence within a detection:
-    
+
 .. code:: python
-    
+
     tokenizer.max_continuous_silence = 50
     asource.rewind()
     tokens = tokenizer.tokenize(asource)
-    
+
     for t in tokens:
         print("Token starts at {0} and ends at {1}".format(t[1], t[2]))
         data = ''.join(t[0])
         player.play(data)
-    
+
     assert len(tokens) == 6
-    
-    
+
+
 Trim leading and trailing silence
 #################################
-    
+
 The tokenizer in the following example is set up to remove the silence
-that precedes the first acoustic activity or follows the last activity 
+that precedes the first acoustic activity or follows the last activity
 in a record. It preserves whatever it founds between the two activities.
 In other words, it removes the leading and trailing silence.
@@ -420,12 +420,12 @@
 Energy threshold is 50.
 
 The tokenizer will start accumulating windows up from the moment it encounters
-the first analysis window of an energy >= 50. ALL the following windows will be 
+the first analysis window of an energy >= 50. ALL the following windows will be
 kept regardless of their energy. At the end of the analysis, it will drop trailing
 windows with an energy below 50.
 
 This is an interesting example because the audio file we're analyzing contains a very
-brief noise that occurs within the leading silence. We certainly do want our tokenizer 
+brief noise that occurs within the leading silence. We certainly do want our tokenizer
 to stop at this point and considers whatever it comes after as a useful signal.
 To force the tokenizer to ignore that brief event we use two other parameters `init_min`
 and `init_max_silence`. By `init_min` = 3 and `init_max_silence` = 1 we tell the tokenizer
@@ -457,33 +457,33 @@
         if w is None:
             break
         original_signal.append(w)
-    
+
     original_signal = ''.join(original_signal)
-    
+
     # rewind source
     asource.rewind()
-    
+
    # Create a validator with an energy threshold of 50
     validator = AudioEnergyValidator(sample_width=asource.get_sample_width(), energy_threshold=50)
-    
+
     # Create a tokenizer with an unlimited token length and continuous silence within a token
     # Note the DROP_TRAILING_SILENCE mode that will ensure removing trailing silence
     trimmer = StreamTokenizer(validator, min_length = 20, max_length=99999999, init_min=3, init_max_silence=1, max_continuous_silence=9999999, mode=StreamTokenizer.DROP_TRAILING_SILENCE)
-    
+
     tokens = trimmer.tokenize(asource)
-    
+
     # Make sure we only have one token
     assert len(tokens) == 1, "Should have detected one single token"
-    
+
     trimmed_signal = ''.join(tokens[0][0])
-    
+
     player = player_for(asource)
-    
+
     print("Playing original signal (with leading and trailing silence)...")
     player.play(original_signal)
     print("Playing trimmed signal...")
     player.play(trimmed_signal)
-    
+
 Online audio signal processing
 ##############################
@@ -494,7 +494,7 @@
 activity is played back using the build-in audio output device.
 
 As mentioned before , Signal energy is strongly related to many factors such
-microphone sensitivity, background noise (including noise inherent to the hardware), 
+microphone sensitivity, background noise (including noise inherent to the hardware),
 distance and your operating system sound settings. Try a lower `energy_threshold`
 if your noise does not seem to be detected and a higher threshold if you notice
 an over detection (echo method prints a detection where you have made no noise).
@@ -502,22 +502,22 @@
 .. code:: python
 
     from auditok import ADSFactory, AudioEnergyValidator, StreamTokenizer, player_for
-    
+
     # record = True so that we'll be able to rewind the source.
     # max_time = 10: read 10 seconds from the microphone
     asource = ADSFactory.ads(record=True, max_time=10)
-    
+
     validator = AudioEnergyValidator(sample_width=asource.get_sample_width(), energy_threshold=50)
     tokenizer = StreamTokenizer(validator=validator, min_length=20, max_length=250, max_continuous_silence=30)
-    
+
     player = player_for(asource)
-    
+
     def echo(data, start, end):
         print("Acoustic activity at: {0}--{1}".format(start, end))
         player.play(''.join(data))
-    
+
     asource.open()
-    
+
     tokenizer.tokenize(asource, callback=echo)
 
 If you want to re-run the tokenizer after changing of one or many parameters, use the following code:
@@ -534,7 +534,7 @@
 .. code:: python
 
     player.play(asource.get_audio_source().get_data_buffer())
-    
+
 
 ************
 Contributing
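A recurring pattern in the tutorial quoted above is recording the stream (`record=True`) so the same audio can be re-analyzed after a parameter change. Condensed into one runnable sketch, with parameter values borrowed from the tutorial's own examples:

    from auditok import ADSFactory, AudioEnergyValidator, StreamTokenizer

    asource = ADSFactory.ads(record=True, max_time=10)
    validator = AudioEnergyValidator(
        sample_width=asource.get_sample_width(), energy_threshold=50
    )
    tokenizer = StreamTokenizer(
        validator=validator,
        min_length=20,
        max_length=250,
        max_continuous_silence=30,
    )

    asource.open()
    tokens = tokenizer.tokenize(asource)

    # Loosen the silence tolerance and analyze the same recording again.
    tokenizer.max_continuous_silence = 50
    asource.rewind()
    tokens = tokenizer.tokenize(asource)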
--- a/doc/cmdline.rst Wed Oct 23 21:24:33 2019 +0100
+++ b/doc/cmdline.rst Thu Oct 24 20:49:51 2019 +0200
@@ -1,7 +1,7 @@
 `auditok` Command-line Usage Guide
 ==================================
 
-This user guide will go through a few of the most useful operations you can use **auditok** for and present two practical use cases. 
+This user guide will go through a few of the most useful operations you can use **auditok** for and present two practical use cases.
 
 .. contents:: `Contents`
    :depth: 3
@@ -49,7 +49,7 @@
 This will print **id** **start-time** and **end-time** for each detected activity. If you don't have `PyAudio`, you can use `sox` for data acquisition (`sudo apt-get install sox`) and tell **auditok** to read data from standard input:
 
     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i - -r 16000 -w 2 -c 1
-    
+
 Note that when data is read from standard input the same audio parameters must be used for both `sox` (or any other data generation/acquisition tool) and **auditok**. The following table summarizes audio parameters.
@@ -89,7 +89,7 @@
 .. code:: bash
 
     rec -q -t raw -r 16000 -c 1 -b 16 -e signed - | auditok -i - -C "play -q -t raw -r 16000 -c 1 -b 16 -e signed $"
-    
+
 The `-C` option tells **auditok** to interpret its content as a command that should be run whenever **auditok** detects an audio activity, replacing the `$` by a name of a temporary file into which the activity is saved as raw audio. Here we use `play` to play the activity, giving the necessary `play` arguments for raw data. `rec` and `play` are just an alias for `sox`.
@@ -124,7 +124,7 @@
     2 3.05 3.73
     3 3.97 4.49
     ...
-    
+
 If you want to customize the output format, use `--printf` option:
 
 .. code:: bash
@@ -144,7 +144,7 @@
 Keywords `{id}`, `{start}` and `{end}` can be placed and repeated anywhere in the text. Time is shown in seconds, if you want a more detailed time information, use `--time-format`:
 
     auditok -e 55 --printf "[{id}]: {start} to {end}" --time-format "%h:%m:%s.%i"
-    
+
 :output:
 
 .. code:: bash
@@ -161,7 +161,7 @@
 1st Practical use case example: generate a subtitles template
 #############################################################
 
-Using `--printf ` and `--time-format`, the following command, used with an input audio or video file, will generate and an **srt** file template that can be later edited with a subtitles editor in a way that reduces the time needed to define when each utterance starts and where it ends: 
+Using `--printf ` and `--time-format`, the following command, used with an input audio or video file, will generate and an **srt** file template that can be later edited with a subtitles editor in a way that reduces the time needed to define when each utterance starts and where it ends:
 
 .. code:: bash
@@ -174,7 +174,7 @@
     1
     00:00:00.730 --> 00:00:01.460
     Put some text here...
-    
+
     2
     00:00:02.440 --> 00:00:03.900
     Put some text here...
@@ -213,7 +213,7 @@
 .. code:: bash
 
     speech-rec.sh -i output.flac -r 16000
-    
+
 3- Use **grep** to select lines that contain *transcript*:
 
 .. code:: bash
@@ -319,7 +319,7 @@
 .. code:: bash
 
     auditok -o {start}-{end}.wav ...
-    
+
 Install `pydub` for more audio formats.
@@ -352,7 +352,7 @@
    :alt: Output from a detector that keeps trailing silence
    :figclass: align-center
    :scale: 40 %
-    
+
 
 .. code:: bash
@@ -364,7 +364,7 @@
    :alt: Output from a detector that drop trailing silence
    :figclass: align-center
    :scale: 40 %
-    
+
 You might want to only consider audio activities if they are above a certain duration. The next figure is the result of a detector that only accepts detections of 0.8 second and longer:
 
 .. code:: bash
@@ -377,8 +377,8 @@
    :alt: Output from a detector that detect activities of 800 ms or over
    :figclass: align-center
    :scale: 40 %
-    
-    
+
+
 Finally it is almost always interesting to limit the length of detected audio activities. In any case, one does not want a too long audio event such as an alarm or a drill to hog the detector. For illustration purposes, we set the maximum duration to 0.4 second for this detector, so an audio activity is delivered as soon as it reaches 0.4 second:
 
 .. code:: bash
@@ -391,7 +391,7 @@
    :alt: Output from a detector that delivers audio activities that reach 400 ms
    :figclass: align-center
    :scale: 40 %
-    
+
 Debugging
 #########
--- a/doc/conf.py Wed Oct 23 21:24:33 2019 +0100
+++ b/doc/conf.py Thu Oct 24 20:49:51 2019 +0200
@@ -19,49 +19,51 @@
 # If extensions (or modules to document with autodoc) are in another directory,
 # add these directories to sys.path here. If the directory is relative to the
 # documentation root, use os.path.abspath to make it absolute, like shown here.
-#sys.path.insert(0, os.path.abspath('.'))
+# sys.path.insert(0, os.path.abspath('.'))
 
 # -- General configuration ------------------------------------------------
 
 # If your documentation needs a minimal Sphinx version, state it here.
-#needs_sphinx = '1.0'
+# needs_sphinx = '1.0'
 
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
 extensions = [
-    'sphinx.ext.viewcode', 'sphinx.ext.autodoc', 'sphinx.ext.autosummary'
+    "sphinx.ext.viewcode",
+    "sphinx.ext.autodoc",
+    "sphinx.ext.autosummary",
 ]
 
-sys.path.insert(0, '../auditok/')
+sys.path.insert(0, "../auditok/")
 
 # Add any paths that contain templates here, relative to this directory.
-templates_path = ['_templates']
+templates_path = ["_templates"]
 
 # The suffix(es) of source filenames.
 # You can specify multiple suffix as a list of string:
 # source_suffix = ['.rst', '.md']
-source_suffix = '.rst'
+source_suffix = ".rst"
 
 # The encoding of source files.
-#source_encoding = 'utf-8-sig'
+# source_encoding = 'utf-8-sig'
 
 # The master toctree document.
-master_doc = 'index'
+master_doc = "index"
 
 # General information about the project.
-project = u'auditok'
-copyright = u'2015-2016, Amine Sehili'
-author = u'Amine Sehili'
+project = u"auditok"
+copyright = u"2015-2016, Amine Sehili"
+author = u"Amine Sehili"
 
 # The version info for the project you're documenting, acts as replacement for
 # |version| and |release|, also used in various other places throughout the
 # built documents.
 #
 # The short X.Y version.
-version = '0.1.5'
+version = "0.1.5"
 # The full version, including alpha/beta/rc tags.
-release = '0.1.5'
+release = "0.1.5"
 
 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.
@@ -72,37 +74,37 @@
 
 # There are two options for replacing |today|: either, you set today to some
 # non-false value, then it is used:
-#today = ''
+# today = ''
 # Else, today_fmt is used as the format for a strftime call.
-#today_fmt = '%B %d, %Y'
+# today_fmt = '%B %d, %Y'
 
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
-exclude_patterns = ['_build']
+exclude_patterns = ["_build"]
 
 # The reST default role (used for this markup: `text`) to use for all
 # documents.
-#default_role = None
+# default_role = None
 
 # If true, '()' will be appended to :func: etc. cross-reference text.
-#add_function_parentheses = True
+# add_function_parentheses = True
 
 # If true, the current module name will be prepended to all description
 # unit titles (such as .. function::).
-#add_module_names = True
+# add_module_names = True
 
 # If true, sectionauthor and moduleauthor directives will be shown in the
 # output. They are ignored by default.
-#show_authors = False
+# show_authors = False
 
 # The name of the Pygments (syntax highlighting) style to use.
-pygments_style = 'sphinx'
+pygments_style = "sphinx"
 
 # A list of ignored prefixes for module index sorting.
-#modindex_common_prefix = []
+# modindex_common_prefix = []
 
 # If true, keep warnings as "system message" paragraphs in the built documents.
-#keep_warnings = False
+# keep_warnings = False
 
 # If true, `todo` and `todoList` produce output, else they produce nothing.
 todo_include_todos = False
@@ -112,168 +114,163 @@
 
 # The theme to use for HTML and HTML Help pages.  See the documentation for
 # a list of builtin themes.
-#html_theme = 'sphinxdoc'
+# html_theme = 'sphinxdoc'
 
 # on_rtd is whether we are on readthedocs.org
-import os
-on_rtd = os.environ.get('READTHEDOCS', None) == 'True'
-
-"""
-if not on_rtd:  # only import and set the theme if we're building docs locally
-    import sphinx_rtd_theme
-    html_theme = 'sphinx_rtd_theme'
-    html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
-"""
+on_rtd = os.environ.get("READTHEDOCS", None) == "True"
 
+# if not on_rtd:  # only import and set the theme if we're building docs locally
+#     import sphinx_rtd_theme
+#     html_theme = 'sphinx_rtd_theme'
+#     html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
 
 # Theme options are theme-specific and customize the look and feel of a theme
 # further.  For a list of options available for each theme, see the
 # documentation.
-#html_theme_options = {}
+# html_theme_options = {}
 
 # Add any paths that contain custom themes here, relative to this directory.
-#html_theme_path = []
+# html_theme_path = []
 
 # The name for this set of Sphinx documents.  If None, it defaults to
 # "<project> v<release> documentation".
-#html_title = None
+# html_title = None
 
 # A shorter title for the navigation bar.  Default is the same as html_title.
-#html_short_title = None
+# html_short_title = None
 
 # The name of an image file (relative to this directory) to place at the top
 # of the sidebar.
-#html_logo = None
+# html_logo = None
 
 # The name of an image file (within the static path) to use as favicon of the
 # docs.  This file should be a Windows icon file (.ico) being 16x16 or 32x32
 # pixels large.
-#html_favicon = None
+# html_favicon = None
 
 # Add any paths that contain custom static files (such as style sheets) here,
 # relative to this directory. They are copied after the builtin static files,
 # so a file named "default.css" will overwrite the builtin "default.css".
-html_static_path = ['_static']
+html_static_path = ["_static"]
 
 # Add any extra paths that contain custom files (such as robots.txt or
 # .htaccess) here, relative to this directory. These files are copied
 # directly to the root of the documentation.
-#html_extra_path = []
+# html_extra_path = []
 
 # If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
 # using the given strftime format.
-#html_last_updated_fmt = '%b %d, %Y'
+# html_last_updated_fmt = '%b %d, %Y'
 
 # If true, SmartyPants will be used to convert quotes and dashes to
 # typographically correct entities.
-#html_use_smartypants = True
+# html_use_smartypants = True
 
 # Custom sidebar templates, maps document names to template names.
-#html_sidebars = {}
+# html_sidebars = {}
 
 # Additional templates that should be rendered to pages, maps page names to
 # template names.
-#html_additional_pages = {}
+# html_additional_pages = {}
 
 # If false, no module index is generated.
-#html_domain_indices = True
+# html_domain_indices = True
 
 # If false, no index is generated.
-#html_use_index = True
+# html_use_index = True
 
 # If true, the index is split into individual pages for each letter.
-#html_split_index = False
+# html_split_index = False
 
 # If true, links to the reST sources are added to the pages.
-#html_show_sourcelink = True
+# html_show_sourcelink = True
 
 # If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
-#html_show_sphinx = True
+# html_show_sphinx = True
 
 # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
-#html_show_copyright = True
+# html_show_copyright = True
 
 # If true, an OpenSearch description file will be output, and all pages will
 # contain a <link> tag referring to it.  The value of this option must be the
 # base URL from which the finished HTML is served.
-#html_use_opensearch = ''
+# html_use_opensearch = ''
 
 # This is the file name suffix for HTML files (e.g. ".xhtml").
-#html_file_suffix = None
+# html_file_suffix = None
 
 # Language to be used for generating the HTML full-text search index.
 # Sphinx supports the following languages:
 #   'da', 'de', 'en', 'es', 'fi', 'fr', 'hu', 'it', 'ja'
 #   'nl', 'no', 'pt', 'ro', 'ru', 'sv', 'tr'
-#html_search_language = 'en'
+# html_search_language = 'en'
 
 # A dictionary with options for the search language support, empty by default.
 # Now only 'ja' uses this config value
-#html_search_options = {'type': 'default'}
+# html_search_options = {'type': 'default'}
 
 # The name of a javascript file (relative to the configuration directory) that
 # implements a search results scorer. If empty, the default will be used.
-#html_search_scorer = 'scorer.js'
+# html_search_scorer = 'scorer.js'
 
 # Output file base name for HTML help builder.
-htmlhelp_basename = 'auditokdoc'
+htmlhelp_basename = "auditokdoc"
 
 # -- Options for LaTeX output ---------------------------------------------
 
 latex_elements = {
-# The paper size ('letterpaper' or 'a4paper').
-#'papersize': 'letterpaper',
-
-# The font size ('10pt', '11pt' or '12pt').
-#'pointsize': '10pt',
-
-# Additional stuff for the LaTeX preamble.
-#'preamble': '',
-
-# Latex figure (float) alignment
-#'figure_align': 'htbp',
+    # The paper size ('letterpaper' or 'a4paper').
+    # 'papersize': 'letterpaper',
+    # The font size ('10pt', '11pt' or '12pt').
+    # 'pointsize': '10pt',
+    # Additional stuff for the LaTeX preamble.
+    # 'preamble': '',
+    # Latex figure (float) alignment
+    # 'figure_align': 'htbp',
 }
 
 # Grouping the document tree into LaTeX files. List of tuples
 # (source start file, target name, title,
 #  author, documentclass [howto, manual, or own class]).
 latex_documents = [
-    (master_doc, 'auditok.tex', u'auditok Documentation',
-     u'Amine Sehili', 'manual'),
+    (
+        master_doc,
+        "auditok.tex",
+        u"auditok Documentation",
+        u"Amine Sehili",
+        "manual",
+    ),
 ]
 
 # The name of an image file (relative to this directory) to place at the top of
 # the title page.
-#latex_logo = None
+# latex_logo = None
 
 # For "manual" documents, if this is true, then toplevel headings are parts,
 # not chapters.
-#latex_use_parts = False
+# latex_use_parts = False
 
 # If true, show page references after internal links.
-#latex_show_pagerefs = False
+# latex_show_pagerefs = False
 
 # If true, show URL addresses after external links.
-#latex_show_urls = False
+# latex_show_urls = False
 
 # Documents to append as an appendix to all manuals.
-#latex_appendices = []
+# latex_appendices = []
 
 # If false, no module index is generated.
-#latex_domain_indices = True
+# latex_domain_indices = True
 
 # -- Options for manual page output ---------------------------------------
 
 # One entry per manual page. List of tuples
 # (source start file, name, description, authors, manual section).
-man_pages = [
-    (master_doc, 'auditok', u'auditok Documentation',
-     [author], 1)
-]
+man_pages = [(master_doc, "auditok", u"auditok Documentation", [author], 1)]
 
 # If true, show URL addresses after external links.
-#man_show_urls = False
+# man_show_urls = False
 
 # -- Options for Texinfo output -------------------------------------------
 
@@ -282,19 +279,25 @@
 # (source start file, target name, title, author,
 #  dir menu entry, description, category)
 texinfo_documents = [
-    (master_doc, 'auditok', u'auditok Documentation',
-     author, 'auditok', 'One line description of project.',
-     'Miscellaneous'),
+    (
+        master_doc,
+        "auditok",
+        u"auditok Documentation",
+        author,
+        "auditok",
+        "One line description of project.",
+        "Miscellaneous",
+    ),
 ]
 
# Documents to append as an appendix to all manuals.
-#texinfo_appendices = []
+# texinfo_appendices = []
 
 # If false, no module index is generated.
-#texinfo_domain_indices = True
+# texinfo_domain_indices = True
 
 # How to display URL addresses: 'footnote', 'no', or 'inline'.
-#texinfo_show_urls = 'footnote'
+# texinfo_show_urls = 'footnote'
 
 # If true, do not generate a @detailmenu in the "Top" node's menu.
-#texinfo_no_detailmenu = False
+# texinfo_no_detailmenu = False
--- a/doc/dataset.rst Wed Oct 23 21:24:33 2019 +0100
+++ b/doc/dataset.rst Thu Oct 24 20:49:51 2019 +0200
@@ -2,4 +2,4 @@
 ---------------
 
 .. automodule:: auditok.dataset
-    :members:
\ No newline at end of file
+    :members:
--- a/doc/index.rst Wed Oct 23 21:24:33 2019 +0100
+++ b/doc/index.rst Thu Oct 24 20:49:51 2019 +0200
@@ -3,7 +3,7 @@
 
 .. image:: https://travis-ci.org/amsehili/auditok.svg?branch=master
     :target: https://travis-ci.org/amsehili/auditok
-    
+
 .. image:: https://readthedocs.org/projects/auditok/badge/?version=latest
     :target: http://auditok.readthedocs.org/en/latest/?badge=latest
     :alt: Documentation Status
--- a/doc/io.rst Wed Oct 23 21:24:33 2019 +0100
+++ b/doc/io.rst Thu Oct 24 20:49:51 2019 +0200
@@ -2,4 +2,4 @@
 ----------
 
 .. automodule:: auditok.io
-    :members:
\ No newline at end of file
+    :members:
--- a/setup.py Wed Oct 23 21:24:33 2019 +0100
+++ b/setup.py Thu Oct 24 20:49:51 2019 +0200
@@ -4,56 +4,56 @@
 
 from setuptools import setup
 
-_version_re = re.compile(r'__version__\s+=\s+(.*)')
+_version_re = re.compile(r"__version__\s+=\s+(.*)")
 
 if sys.version_info >= (3, 0):
-    with open('auditok/__init__.py', 'rt') as f:
-        version = str(ast.literal_eval(_version_re.search(
-            f.read()).group(1)))
-    long_desc = open('doc/index.rst', 'rt').read()
+    with open("auditok/__init__.py", "rt") as f:
+        version = str(ast.literal_eval(_version_re.search(f.read()).group(1)))
+    long_desc = open("doc/index.rst", "rt").read()
 else:
-    with open('auditok/__init__.py', 'rb') as f:
-        version = str(ast.literal_eval(_version_re.search(
-            f.read().decode('utf-8')).group(1)))
-    long_desc = open('doc/index.rst', 'rt').read().decode('utf-8')
+    with open("auditok/__init__.py", "rb") as f:
+        version = str(
+            ast.literal_eval(
+                _version_re.search(f.read().decode("utf-8")).group(1)
+            )
+        )
+    long_desc = open("doc/index.rst", "rt").read().decode("utf-8")
 
 setup(
-    name='auditok',
+    name="auditok",
     version=version,
-    url='http://github.com/amsehili/auditok/',
-    license='GNU General Public License v3 (GPLv3)',
-    author='Amine Sehili',
-    author_email='amine.sehili@gmail.com',
-    description='A module for Audio/Acoustic Activity Detection',
-    long_description= long_desc,
-    packages=['auditok'],
+    url="http://github.com/amsehili/auditok/",
+    license="GNU General Public License v3 (GPLv3)",
+    author="Amine Sehili",
+    author_email="amine.sehili@gmail.com",
+    description="A module for Audio/Acoustic Activity Detection",
+    long_description=long_desc,
+    packages=["auditok"],
     include_package_data=True,
-    package_data={'auditok': ['data/*']},
-
+    package_data={"auditok": ["data/*"]},
     zip_safe=False,
-    platforms='ANY',
-    provides=['auditok'],
-    requires=['PyAudio'],
+    platforms="ANY",
+    provides=["auditok"],
+    requires=["PyAudio"],
     classifiers=[
-        'Development Status :: 3 - Alpha',
-        'Environment :: Console',
-        'Intended Audience :: Science/Research',
-        'Intended Audience :: Developers',
-        'Intended Audience :: Information Technology',
-        'Intended Audience :: Telecommunications Industry',
-        'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
-        'Operating System :: OS Independent',
-        'Programming Language :: Python',
-        'Programming Language :: Python :: 2.7',
-        'Programming Language :: Python :: 3',
-        'Programming Language :: Python :: 3.2',
-        'Programming Language :: Python :: 3.3',
-        'Programming Language :: Python :: 3.4',
-        'Topic :: Multimedia :: Sound/Audio :: Analysis',
-        'Topic :: Scientific/Engineering :: Information Analysis'
+        "Development Status :: 3 - Alpha",
+        "Environment :: Console",
+        "Intended Audience :: Science/Research",
+        "Intended Audience :: Developers",
+        "Intended Audience :: Information Technology",
+        "Intended Audience :: Telecommunications Industry",
+        "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
+        "Operating System :: OS Independent",
+        "Programming Language :: Python",
+        "Programming Language :: Python :: 2.7",
+        "Programming Language :: Python :: 3",
+        "Programming Language :: Python :: 3.2",
+        "Programming Language :: Python :: 3.3",
+        "Programming Language :: Python :: 3.4",
+        "Topic :: Multimedia :: Sound/Audio :: Analysis",
+        "Topic :: Scientific/Engineering :: Information Analysis",
     ],
-    entry_points = {'console_scripts': ['auditok = auditok.cmdline:main']}
-
+    entry_points={"console_scripts": ["auditok = auditok.cmdline:main"]},
 )
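The setup.py hunk keeps the version-scraping idiom: a regex pulls the right-hand side of the `__version__` assignment out of `auditok/__init__.py`, then `ast.literal_eval` safely turns the quoted literal into a string without importing the package. Reduced to a standalone sketch (the version value here is illustrative, not the project's):

    import ast
    import re

    _version_re = re.compile(r"__version__\s+=\s+(.*)")

    source = '__version__ = "0.1.8"\n'  # stand-in for the real __init__.py
    version = str(ast.literal_eval(_version_re.search(source).group(1)))
    print(version)  # 0.1.8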
--- a/tests/test_cmdline_util.py Wed Oct 23 21:24:33 2019 +0100
+++ b/tests/test_cmdline_util.py Thu Oct 24 20:49:51 2019 +0200
@@ -75,7 +75,15 @@
         record_plot=(None, None, True, None, None, None, True),
         record_save_image=(None, None, False, "image.png", None, None, True),
         int_use_channel=("stream.ogg", None, False, None, "1", 1, False),
-        save_detections_as=("stream.ogg", "{id}.wav", False, None, None, None, False)
+        save_detections_as=(
+            "stream.ogg",
+            "{id}.wav",
+            False,
+            None,
+            None,
+            None,
+            False,
+        ),
     )
     def test_make_kwargs(
         self,
@@ -112,7 +120,15 @@
             False,
             55,
         )
-        misc = (False, False, None, True, None, "TIME_FORMAT", "TIMESTAMP_FORMAT")
+        misc = (
+            False,
+            False,
+            None,
+            True,
+            None,
+            "TIME_FORMAT",
+            "TIMESTAMP_FORMAT",
+        )
         args_ns = _ArgsNamespece(*(args + misc))
 
         io_kwargs = {
--- a/tests/test_signal.py Wed Oct 23 21:24:33 2019 +0100
+++ b/tests/test_signal.py Thu Oct 24 20:49:51 2019 +0200
@@ -13,19 +13,20 @@
         self.data = b"012345679ABC"
         self.numpy_fmt = {"b": np.int8, "h": np.int16, "i": np.int32}
 
-
     @genty_dataset(
         int8_mono=(1, [48, 49, 50, 51, 52, 53, 54, 55, 57, 65, 66, 67]),
         int16_mono=(2, [12592, 13106, 13620, 14134, 16697, 17218]),
         int32_mono=(4, [858927408, 926299444, 1128415545]),
-        int8_stereo=(1, [[48, 50, 52, 54, 57, 66], [49, 51, 53, 55, 65, 67]]), 
+        int8_stereo=(1, [[48, 50, 52, 54, 57, 66], [49, 51, 53, 55, 65, 67]]),
         int16_stereo=(2, [[12592, 13620, 16697], [13106, 14134, 17218]]),
         int32_3channel=(4, [[858927408], [926299444], [1128415545]]),
     )
     def test_to_array(self, sample_width, expected):
         if isinstance(expected[0], list):
             channels = len(expected)
-            expected = [array_(signal_.FORMAT[sample_width], xi) for xi in expected]
+            expected = [
+                array_(signal_.FORMAT[sample_width], xi) for xi in expected
+            ]
         else:
             channels = 1
             expected = array_(signal_.FORMAT[sample_width], expected)
@@ -146,7 +147,6 @@
         energy = signal_numpy.calculate_energy_single_channel(x, sample_width)
         self.assertEqual(energy, expected)
 
-
     @genty_dataset(
         min_=(
             [[300, 320, 400, 600], [150, 160, 200, 300]],
@@ -161,13 +161,20 @@
             52.50624901923348,
         ),
     )
-    def test_calculate_energy_multichannel(self, x, sample_width, aggregation_fn, expected):
+    def test_calculate_energy_multichannel(
+        self, x, sample_width, aggregation_fn, expected
+    ):
         x = [array_(signal_.FORMAT[sample_width], xi) for xi in x]
-        energy = signal_.calculate_energy_multichannel(x, sample_width, aggregation_fn)
+        energy = signal_.calculate_energy_multichannel(
+            x, sample_width, aggregation_fn
+        )
         self.assertEqual(energy, expected)
 
-        energy = signal_numpy.calculate_energy_multichannel(x, sample_width, aggregation_fn)
+        energy = signal_numpy.calculate_energy_multichannel(
+            x, sample_width, aggregation_fn
+        )
         self.assertEqual(energy, expected)
 
+
 if __name__ == "__main__":
-    unittest.main()
\ No newline at end of file
+    unittest.main()
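The multichannel test above aggregates per-channel energies with min/max and expects 52.50624901923348 for the louder channel. That constant is consistent with an energy defined as 20·log10 of the truncated-integer RMS of the samples (the value audioop.rms would return); the sketch below reproduces it to float precision, though the library's actual formula should be checked in auditok/signal.py:

    import math

    def channel_energy(samples):
        # 20 * log10 of the truncated-integer RMS of the samples.
        rms = int(math.sqrt(sum(s * s for s in samples) / len(samples)))
        return 20 * math.log10(rms)

    channels = [[300, 320, 400, 600], [150, 160, 200, 300]]
    print(max(channel_energy(ch) for ch in channels))  # ~52.50624901923348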