comparison quickstart.rst @ 23:2beb3fb562f3

doc update
author Amine Sehili <amine.sehili@gmail.com>
date Sun, 29 Nov 2015 11:52:56 +0100
parents 6b2cc3ca5b6a
children
22:aceb9bc3d74e 23:2beb3fb562f3
140 140
141 .. code:: python 141 .. code:: python
142 142
143 [(['A', 'B', 'C', 'D', 'b', 'b', 'E', 'F', 'c', 'G', 'H', 'I', 'd', 'd'], 3, 16), (['J', 'K', 'e', 'e'], 18, 21)] 143 [(['A', 'B', 'C', 'D', 'b', 'b', 'E', 'F', 'c', 'G', 'H', 'I', 'd', 'd'], 3, 16), (['J', 'K', 'e', 'e'], 18, 21)]
144 144
145 Notice the tailing lower case letters "dd" and "ee" at the end of the two 145 Notice the trailing lower case letters "dd" and "ee" at the end of the two
146 tokens. The default behavior of `StreamTokenizer` is to keep the *tailing 146 tokens. The default behavior of `StreamTokenizer` is to keep the *trailing
147 silence* if it doesn't exceed `max_continuous_silence`. This can be changed 147 silence* if it doesn't exceed `max_continuous_silence`. This can be changed
148 using the `DROP_TAILING_SILENCE` mode (see next example). 148 using the `DROP_TRAILING_SILENCE` mode (see next example).
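
The behavior described above can be illustrated with a minimal, self-contained sketch. It is not auditok's implementation: `tokenize_sketch` and `_finish` are hypothetical names, `min_length`, `max_length` and the `init_*` parameters are ignored, and the only assumption is that a token tolerates up to `max_continuous_silence` consecutive non-valid frames and that the drop mode strips the non-valid frames left at the end of a token.

```python
def _finish(data, start, silence, drop_trailing_silence):
    """Build a (frames, start, end) tuple, optionally stripping trailing silence."""
    if drop_trailing_silence and silence:
        data = data[:len(data) - silence]
    return (list(data), start, start + len(data) - 1)


def tokenize_sketch(frames, is_valid, max_continuous_silence,
                    drop_trailing_silence=False):
    """Simplified illustration of keeping or dropping trailing silence."""
    tokens = []
    data, start, silence = [], None, 0
    for i, frame in enumerate(frames):
        if is_valid(frame):
            if start is None:
                start = i          # first valid frame opens a token
            data.append(frame)
            silence = 0
        elif start is not None:
            if silence < max_continuous_silence:
                data.append(frame)  # tolerated silence stays in the token
                silence += 1
            else:
                # too much silence: close the current token
                tokens.append(_finish(data, start, silence, drop_trailing_silence))
                data, start, silence = [], None, 0
    if start is not None:
        tokens.append(_finish(data, start, silence, drop_trailing_silence))
    return tokens


frames = "aaaABCDbbEFcGHIdddJKee"
kept = tokenize_sketch(frames, str.isupper, max_continuous_silence=2)
dropped = tokenize_sketch(frames, str.isupper, max_continuous_silence=2,
                          drop_trailing_silence=True)
print(kept)
print(dropped)
```

With trailing silence kept, the two tokens end at positions 16 and 21 (including "dd" and "ee"); with the drop flag set, they end at 14 and 19.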
149 149
150 Remove tailing silence 150 Remove trailing silence
151 ----------------------- 151 -----------------------
152 152
153 Tailing silence can be useful for many sound recognition applications, including 153 Trailing silence can be useful for many sound recognition applications, including
154 speech recognition. Moreover, from the human auditory system point of view, tailing 154 speech recognition. Moreover, from the human auditory system point of view, trailing
155 low energy signal helps removing abrupt signal cuts. 155 low energy signal helps removing abrupt signal cuts.
156 156
157 If you want to remove it anyway, you can do it by setting `mode` to `StreamTokenizer.DROP_TAILING_SILENCE`: 157 If you want to remove it anyway, you can do it by setting `mode` to `StreamTokenizer.DROP_TRAILING_SILENCE`:
158 158
159 .. code:: python 159 .. code:: python
160 160
161 from auditok import StreamTokenizer, StringDataSource, DataValidator 161 from auditok import StreamTokenizer, StringDataSource, DataValidator
162 162
165 return frame.isupper() 165 return frame.isupper()
166 166
167 dsource = StringDataSource("aaaABCDbbEFcGHIdddJKee") 167 dsource = StringDataSource("aaaABCDbbEFcGHIdddJKee")
168 tokenizer = StreamTokenizer(validator=UpperCaseChecker(), 168 tokenizer = StreamTokenizer(validator=UpperCaseChecker(),
169 min_length=1, max_length=9999, max_continuous_silence=2, 169 min_length=1, max_length=9999, max_continuous_silence=2,
170 mode=StreamTokenizer.DROP_TAILING_SILENCE) 170 mode=StreamTokenizer.DROP_TRAILING_SILENCE)
171 171
172 tokenizer.tokenize(dsource) 172 tokenizer.tokenize(dsource)
173 173
174 output: 174 output:
175 175
374 player.play(data) 374 player.play(data)
375 375
376 assert len(tokens) == 6 376 assert len(tokens) == 6
377 377
378 378
379 Trim leading and tailing silence 379 Trim leading and trailing silence
380 --------------------------------- 380 ---------------------------------
381 381
382 The tokenizer in the following example is set up to remove the silence 382 The tokenizer in the following example is set up to remove the silence
383 that precedes the first acoustic activity or follows the last activity 383 that precedes the first acoustic activity or follows the last activity
384 in a record. It preserves whatever it founds between the two activities. 384 in a record. It preserves whatever it founds between the two activities.
385 In other words, it removes the leading and tailing silence. 385 In other words, it removes the leading and trailing silence.
386 386
387 Sampling rate is 44100 sample per second, we'll use an analysis window of 100 ms 387 Sampling rate is 44100 sample per second, we'll use an analysis window of 100 ms
388 (i.e. block_size == 4410) 388 (i.e. block_size == 4410)
389 389
390 Energy threshold is 50. 390 Energy threshold is 50.
391 391
392 The tokenizer will start accumulating windows up from the moment it encounters 392 The tokenizer will start accumulating windows up from the moment it encounters
393 the first analysis window of an energy >= 50. ALL the following windows will be 393 the first analysis window of an energy >= 50. ALL the following windows will be
394 kept regardless of their energy. At the end of the analysis, it will drop tailing 394 kept regardless of their energy. At the end of the analysis, it will drop trailing
395 windows with an energy below 50. 395 windows with an energy below 50.
396 396
397 This is an interesting example because the audio file we're analyzing contains a very 397 This is an interesting example because the audio file we're analyzing contains a very
398 brief noise that occurs within the leading silence. We certainly do want our tokenizer 398 brief noise that occurs within the leading silence. We certainly do want our tokenizer
399 to stop at this point and considers whatever it comes after as a useful signal. 399 to stop at this point and considers whatever it comes after as a useful signal.
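
The trimming logic described above can be sketched on a plain list of per-window energies. This is only an illustration under stated assumptions, not auditok's code: `trim_bounds` and the sample `energies` list are hypothetical, and the reading of `init_min` (number of valid windows needed to confirm a start) and `init_max_silence` (consecutive low-energy windows tolerated while confirming it) is a simplification.

```python
def trim_bounds(energies, threshold, init_min=1, init_max_silence=0):
    """Return (first, last) indices of the windows to keep, or None."""
    n = len(energies)
    start = None
    for i in range(n):
        if energies[i] < threshold:
            continue
        # Candidate start: require init_min valid windows before the
        # run of low-energy windows exceeds init_max_silence.
        valid, gap = 0, 0
        for e in energies[i:]:
            if e >= threshold:
                valid += 1
                gap = 0
                if valid >= init_min:
                    break
            else:
                gap += 1
                if gap > init_max_silence:
                    break
        if valid >= init_min:
            start = i
            break
    if start is None:
        return None
    # Everything after the start is kept, except trailing low-energy windows.
    end = max(j for j in range(start, n) if energies[j] >= threshold)
    return start, end


# A brief noise at index 2 sits inside the leading silence.
energies = [0, 0, 80, 0, 0, 0, 60, 70, 10, 90, 65, 5, 0]
strict = trim_bounds(energies, threshold=50, init_min=3, init_max_silence=1)
naive = trim_bounds(energies, threshold=50)
print(strict, naive)
```

In this sketch the naive call starts at the isolated noise (index 2), while the stricter start condition skips it and starts at index 6. Both stop at index 10, the last window at or above the threshold.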
414 .. code:: python 414 .. code:: python
415 415
416 from auditok import ADSFactory, AudioEnergyValidator, StreamTokenizer, player_for, dataset 416 from auditok import ADSFactory, AudioEnergyValidator, StreamTokenizer, player_for, dataset
417 417
418 # record = True so that we'll be able to rewind the source. 418 # record = True so that we'll be able to rewind the source.
419 asource = ADSFactory.ads(filename=dataset.was_der_mensch_saet_mono_44100_lead_tail_silence, 419 asource = ADSFactory.ads(filename=dataset.was_der_mensch_saet_mono_44100_lead_trail_silence,
420 record=True, block_size=4410) 420 record=True, block_size=4410)
421 asource.open() 421 asource.open()
422 422
423 original_signal = [] 423 original_signal = []
424 # Read the whole signal 424 # Read the whole signal
435 435
436 # Create a validator with an energy threshold of 50 436 # Create a validator with an energy threshold of 50
437 validator = AudioEnergyValidator(sample_width=asource.get_sample_width(), energy_threshold=50) 437 validator = AudioEnergyValidator(sample_width=asource.get_sample_width(), energy_threshold=50)
438 438
439 # Create a tokenizer with an unlimited token length and continuous silence within a token 439 # Create a tokenizer with an unlimited token length and continuous silence within a token
440 # Note the DROP_TAILING_SILENCE mode that will ensure removing tailing silence 440 # Note the DROP_TRAILING_SILENCE mode that will ensure removing trailing silence
441 trimmer = StreamTokenizer(validator, min_length = 20, max_length=99999999, init_min=3, init_max_silence=1, max_continuous_silence=9999999, mode=StreamTokenizer.DROP_TAILING_SILENCE) 441 trimmer = StreamTokenizer(validator, min_length = 20, max_length=99999999, init_min=3, init_max_silence=1, max_continuous_silence=9999999, mode=StreamTokenizer.DROP_TRAILING_SILENCE)
442 442
443 443
444 tokens = trimmer.tokenize(asource) 444 tokens = trimmer.tokenize(asource)
445 445
446 # Make sure we only have one token 446 # Make sure we only have one token
448 448
449 trimmed_signal = ''.join(tokens[0][0]) 449 trimmed_signal = ''.join(tokens[0][0])
450 450
451 player = player_for(asource) 451 player = player_for(asource)
452 452
453 print("Playing original signal (with leading and tailing silence)...") 453 print("Playing original signal (with leading and trailing silence)...")
454 player.play(original_signal) 454 player.play(original_signal)
455 print("Playing trimmed signal...") 455 print("Playing trimmed signal...")
456 player.play(trimmed_signal) 456 player.play(trimmed_signal)
457 457
458 458