comparison quickstart.rst @ 5:252d698ae642
Version 1.3, bug and typos fixes
author | Amine Sehili <amine.sehili@gmail.com> |
---|---|
date | Wed, 23 Sep 2015 11:26:58 +0200 |
parents | 364eeb8e8bd2 |
children | 6b2cc3ca5b6a |
4:31c97510b16b | 5:252d698ae642 |
---|---|
52 handy features: | 52 handy features: |
53 | 53 |
54 - Define a fixed-length block_size (i.e. analysis window) | 54 - Define a fixed-length block_size (i.e. analysis window) |
55 - Allow overlap between two consecutive analysis windows (hop_size < block_size). This can be very important if your validator use the **spectral** information of audio data instead of raw audio samples. | 55 - Allow overlap between two consecutive analysis windows (hop_size < block_size). This can be very important if your validator use the **spectral** information of audio data instead of raw audio samples. |
56 - Limit the amount (i.e. duration) of read data (very useful when reading data from the microphone) | 56 - Limit the amount (i.e. duration) of read data (very useful when reading data from the microphone) |
57 - Record and rewind data (also useful if you read data from the microphone and you want to process it many times offline and/or save it) | 57 - Record and rewind data (also useful if you read data from the microphone and you want to process it many times off-line and/or save it) |
58 | 58 |
59 | 59 |
60 Last but not least, the current version has only one audio window validator based on | 60 Last but not least, the current version has only one audio window validator based on |
61 signal energy. | 61 signal energy. |
62 | 62 |
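
Editor's illustration for the feature list in this hunk: a minimal sketch of fixed-length analysis windows with overlap (hop_size < block_size). It is plain Python and is not auditok's implementation. The function name `iter_windows` is hypothetical, and returning a shorter final window is an assumption made for this sketch.

```python
def iter_windows(samples, block_size, hop_size=None):
    """Yield analysis windows of `block_size` items, advancing by `hop_size`.

    hop_size == block_size gives back-to-back windows;
    hop_size < block_size makes consecutive windows overlap.
    """
    if hop_size is None:
        hop_size = block_size
    if not 0 < hop_size <= block_size:
        raise ValueError("hop_size must satisfy 0 < hop_size <= block_size")
    start = 0
    while start < len(samples):
        # The last window may be shorter than block_size (assumption).
        yield samples[start:start + block_size]
        if start + block_size >= len(samples):
            break
        start += hop_size


# Consecutive windows share block_size - hop_size items.
overlapping = list(iter_windows("abcdefgh", block_size=4, hop_size=2))
contiguous = list(iter_windows("abcdefgh", block_size=4))
```

With a hop of 2 and a block of 4, each window repeats the last two items of the previous one, which is the kind of overlap a spectral validator would typically rely on.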
82 ------------------------------------------------------- | 82 ------------------------------------------------------- |
83 | 83 |
84 | 84 |
85 We want to extract sub-sequences of characters that have: | 85 We want to extract sub-sequences of characters that have: |
86 | 86 |
87 - A minimu length of 1 (`min_length` = 1) | 87 - A minimum length of 1 (`min_length` = 1) |
88 - A maximum length of 9999 (`max_length` = 9999) | 88 - A maximum length of 9999 (`max_length` = 9999) |
89 - Zero consecutive lower case characters within them (`max_continuous_silence` = 0) | 89 - Zero consecutive lower case characters within them (`max_continuous_silence` = 0) |
90 | 90 |
91 We also create the `UpperCaseChecker` whose `read` method returns `True` if the | 91 We also create the `UpperCaseChecker` whose `read` method returns `True` if the |
92 checked character is in upper case and `False` otherwise. | 92 checked character is in upper case and `False` otherwise. |
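
Editor's illustration for the parameters listed in this hunk: a self-contained sketch of extracting runs of upper case characters with `min_length`, `max_length` and zero tolerated lower case characters. It does not import auditok and is not how `StreamTokenizer` is implemented. The function name `extract_upper_runs` and the sample string are hypothetical; the `(data, start, end)` token shape follows the output shown later in this comparison.

```python
def extract_upper_runs(text, min_length=1, max_length=9999):
    """Return (chars, start, end) tokens for runs of upper case characters.

    Any lower case character ends the current run, which corresponds to
    tolerating zero "silent" items inside a token.
    """
    tokens = []
    current = []
    start = None
    for index, char in enumerate(text):
        if char.isupper():
            if start is None:
                start = index
            current.append(char)
            if len(current) == max_length:
                # Token reached its maximum length: deliver it and start over.
                tokens.append((current, start, index))
                current, start = [], None
        elif current:
            # A lower case character closes the run.
            if len(current) >= min_length:
                tokens.append((current, start, index - 1))
            current, start = [], None
    if current and len(current) >= min_length:
        # The input ended while a run was still open.
        tokens.append((current, start, len(text) - 1))
    return tokens


tokens = extract_upper_runs("aaaABCDEFbbGHIJKccc")
```

For this input the sketch yields two tokens, covering positions 3 to 8 and 11 to 15.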
142 | 142 |
143 [(['A', 'B', 'C', 'D', 'b', 'b', 'E', 'F', 'c', 'G', 'H', 'I', 'd', 'd'], 3, 16), (['J', 'K', 'e', 'e'], 18, 21)] | 143 [(['A', 'B', 'C', 'D', 'b', 'b', 'E', 'F', 'c', 'G', 'H', 'I', 'd', 'd'], 3, 16), (['J', 'K', 'e', 'e'], 18, 21)] |
144 | 144 |
145 Notice the tailing lower case letters "dd" and "ee" at the end of the two | 145 Notice the tailing lower case letters "dd" and "ee" at the end of the two |
146 tokens. The default behavior of `StreamTokenizer` is to keep the *tailing | 146 tokens. The default behavior of `StreamTokenizer` is to keep the *tailing |
147 silence* if it does'nt exceed `max_continuous_silence`. This can be changed | 147 silence* if it doesn't exceed `max_continuous_silence`. This can be changed |
148 using the `DROP_TAILING_SILENCE` mode (see next example). | 148 using the `DROP_TAILING_SILENCE` mode (see next example). |
149 | 149 |
150 Remove tailing silence | 150 Remove tailing silence |
151 ----------------------- | 151 ----------------------- |
152 | 152 |
401 ans `init_max_silence`. By `init_min` = 3 and `init_max_silence` = 1 we tell the tokenizer | 401 ans `init_max_silence`. By `init_min` = 3 and `init_max_silence` = 1 we tell the tokenizer |
402 that a valid event must start with at least 3 noisy windows, between which there | 402 that a valid event must start with at least 3 noisy windows, between which there |
403 is at most 1 silent window. | 403 is at most 1 silent window. |
404 | 404 |
405 Still with this configuration we can get the tokenizer detect that noise as a valid event | 405 Still with this configuration we can get the tokenizer detect that noise as a valid event |
406 (if it actually contains 3 consecutive noisy frames). To circummvent this we use an enough | 406 (if it actually contains 3 consecutive noisy frames). To circumvent this we use an enough |
407 large analysis window (here of 100 ms) to ensure that the brief noise be surrounded by a much | 407 large analysis window (here of 100 ms) to ensure that the brief noise be surrounded by a much |
408 longer silence and hence the energy of the overall analysis window will be below 50. | 408 longer silence and hence the energy of the overall analysis window will be below 50. |
409 | 409 |
410 When using a shorter analysis window (of 10ms for instance, block_size == 441), the brief | 410 When using a shorter analysis window (of 10ms for instance, block_size == 441), the brief |
411 noise contributes more to energy calculation which yields an energy of over 50 for the window. | 411 noise contributes more to energy calculation which yields an energy of over 50 for the window. |
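
Editor's illustration for the window-size argument in this hunk: a rough sketch of why a brief noise exceeds an energy threshold of 50 in a 10 ms window but not in a 100 ms one. It assumes a 44100 Hz sampling rate (consistent with block_size == 441 for 10 ms) and assumes energy is computed as 10 * log10 of the mean squared sample value; the formula actually used by the library is not shown in this comparison. All names and the burst values are hypothetical.

```python
import math


def block_size_for(sampling_rate, window_seconds):
    """Number of samples in an analysis window of the given duration."""
    return int(round(sampling_rate * window_seconds))


def log_energy(samples):
    """10 * log10(mean of squared samples); an assumed energy definition."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def window_with_burst(block_size, burst_length=100, amplitude=1000):
    """A window that is silent except for a brief burst at its start."""
    return [amplitude] * burst_length + [0] * (block_size - burst_length)


short_block = block_size_for(44100, 0.010)  # 10 ms window
long_block = block_size_for(44100, 0.100)   # 100 ms window

short_energy = log_energy(window_with_burst(short_block))
long_energy = log_energy(window_with_burst(long_block))
```

Under these assumptions the same burst gives roughly 53.6 in the short window and roughly 43.6 in the long one: the surrounding silence in the longer window dilutes the burst by about 10 dB, which puts it below a threshold of 50.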
457 | 457 |
458 | 458 |
459 Online audio signal processing | 459 Online audio signal processing |
460 ------------------------------ | 460 ------------------------------ |
461 | 461 |
462 In the next example, audio data is directely acquired from the built-in microphone. | 462 In the next example, audio data is directly acquired from the built-in microphone. |
463 The `tokenize` method is passed a callback function so that audio activities | 463 The `tokenize` method is passed a callback function so that audio activities |
464 are delivered as soon as they are detected. Each detected activity is played | 464 are delivered as soon as they are detected. Each detected activity is played |
465 back using the build-in audio output device. | 465 back using the build-in audio output device. |
466 | 466 |
467 As mentionned before , Signal energy is strongly related to many factors such | 467 As mentioned before , Signal energy is strongly related to many factors such |
468 microphone sensitivity, background noise (including noise inherent to the hardware), | 468 microphone sensitivity, background noise (including noise inherent to the hardware), |
469 distance and your operating system sound settings. Try a lower `energy_threshold` | 469 distance and your operating system sound settings. Try a lower `energy_threshold` |
470 if your noise does not seem to be detected and a higher threshold if you notice | 470 if your noise does not seem to be detected and a higher threshold if you notice |
471 an over detection (echo method prints a detection where you have made no noise). | 471 an over detection (echo method prints a detection where you have made no noise). |
472 | 472 |
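
Editor's illustration for the online-processing text in this hunk: a minimal sketch of callback-based delivery, where each activity is handed to a callback as soon as it ends, and of how the energy threshold changes what gets detected. It works on a list of per-window energies and does not read from a microphone. The function `detect_activities`, its callback signature and the energy values are hypothetical and do not reproduce auditok's `tokenize` API.

```python
def detect_activities(energies, energy_threshold, callback):
    """Call callback(data, start, end) for each run of windows whose
    energy is at or above `energy_threshold`, as soon as the run ends."""
    start = None
    for index, energy in enumerate(energies):
        if energy >= energy_threshold:
            if start is None:
                start = index
        elif start is not None:
            # The first window below the threshold closes the activity.
            callback(energies[start:index], start, index - 1)
            start = None
    if start is not None:
        # The stream ended while an activity was still open.
        callback(energies[start:], start, len(energies) - 1)


window_energies = [30, 55, 60, 40, 52, 20]  # made-up per-window energies

detected = []
detect_activities(window_energies, 50, lambda data, s, e: detected.append((s, e)))

detected_low = []
detect_activities(window_energies, 35, lambda data, s, e: detected_low.append((s, e)))
```

With a threshold of 50 the sketch reports two short activities; lowering it to 35 merges them into one longer activity, which mirrors the advice to lower the threshold when noise goes undetected and raise it when spurious detections appear.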