"""
@author: Amine SEHILI
September, 2015
"""

# Trim leading and trailing silence from a record

from auditok import (
    ADSFactory,
    AudioEnergyValidator,
    StreamTokenizer,
    player_for,
    dataset,
)
import pyaudio
import sys

"""
The tokenizer in the following example is set up to remove the silence
that precedes the first acoustic activity or follows the last activity
in a record. It preserves whatever it finds between the two activities.
In other words, it removes the leading and trailing silence.

The sampling rate is 44100 samples per second; we'll use an analysis window
of 100 ms (i.e. block_size == 4410).

The energy threshold is 50.

The tokenizer starts accumulating windows from the moment it encounters
the first analysis window with an energy >= 50. ALL the following windows are
kept regardless of their energy. At the end of the analysis, it drops trailing
windows with an energy below 50.

This is an interesting example because the audio file we're analyzing contains
a very brief noise that occurs within the leading silence. We certainly do not
want our tokenizer to stop at this point and consider whatever comes after it
as useful signal. To force the tokenizer to ignore that brief event we use two
other parameters, `init_min` and `init_max_silence`. With `init_min`=3 and
`init_max_silence`=1 we tell the tokenizer that a valid event must start with
at least 3 noisy windows, between which there is at most 1 silent window.

Even with this configuration, the tokenizer may still detect that noise as a
valid event (if it actually contains 3 consecutive noisy frames). To circumvent
this we use a large enough analysis window (here 100 ms) to ensure that the
brief noise is surrounded by a much longer silence, so that the energy of the
overall analysis window stays below 50.

When using a shorter analysis window (10 ms for instance, block_size == 441),
the brief noise contributes more to the energy calculation, which yields an
energy above 50 for the window. Again, we can deal with this situation by using
a higher energy threshold (55 for example).
"""

# Defined up front so the KeyboardInterrupt handler can safely test them.
asource = None
player = None

try:
    # record = True so that we'll be able to rewind the source.
    asource = ADSFactory.ads(
        filename=dataset.was_der_mensch_saet_mono_44100_lead_tail_silence,
        record=True,
        block_size=4410,
    )
    asource.open()

    original_signal = []
    # Read the whole signal
    while True:
        w = asource.read()
        if w is None:
            break
        original_signal.append(w)

    original_signal = b"".join(original_signal)

    # rewind source
    asource.rewind()

    # Create a validator with an energy threshold of 50
    validator = AudioEnergyValidator(
        sample_width=asource.get_sample_width(), energy_threshold=50
    )

    # Create a tokenizer with an unlimited token length and continuous silence within a token
    # Note the DROP_TRAILING_SILENCE mode that will ensure removing trailing silence
    trimmer = StreamTokenizer(
        validator,
        min_length=20,
        max_length=99999999,
        max_continuous_silence=9999999,
        mode=StreamTokenizer.DROP_TRAILING_SILENCE,
        init_min=3,
        init_max_silence=1,
    )

    tokens = trimmer.tokenize(asource)

    # Make sure we only have one token
    assert len(tokens) == 1, "Should have detected one single token"

    trimmed_signal = b"".join(tokens[0][0])

    player = player_for(asource)

    print(
        "\n ** Playing original signal (with leading and trailing silence)..."
    )
    player.play(original_signal)
    print("\n ** Playing trimmed signal...")
    player.play(trimmed_signal)

    player.stop()
    asource.close()

except KeyboardInterrupt:

    # The interrupt may arrive before the player or the source exists.
    if player is not None:
        player.stop()
    if asource is not None:
        asource.close()
    sys.exit(0)

except Exception as e:

    sys.stderr.write(str(e) + "\n")
    sys.exit(1)
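

# ---------------------------------------------------------------------------
# Supplementary sketch (not part of the original example): why the size of the
# analysis window matters when the leading silence contains a brief noise.
#
# Assumptions, all hypothetical and for illustration only:
#   - window energy is taken as 10 * log10(mean of squared samples), a common
#     log-energy definition; AudioEnergyValidator may differ in detail;
#   - `log_energy` is a helper defined here, not an auditok function;
#   - the sample values are synthetic and unrelated to the dataset file.
# ---------------------------------------------------------------------------
import math


def log_energy(samples):
    """Return 10 * log10(mean(x ** 2)), or -200.0 for an empty/all-zero window."""
    if not samples:
        return -200.0
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square <= 0:
        return -200.0
    return 10.0 * math.log10(mean_square)


# A 10 ms burst (441 samples at 44100 Hz) of amplitude 500, followed by
# near-silence of amplitude 10 that fills the rest of a 100 ms window.
burst = [500] * 441
quiet = [10] * (4410 - 441)

# 10 ms window: the window contains nothing but the burst.
short_window_energy = log_energy(burst)
# 100 ms window: the same burst is diluted by the surrounding silence.
long_window_energy = log_energy(burst + quiet)

print("10 ms window energy:  %.2f" % short_window_energy)
print("100 ms window energy: %.2f" % long_window_energy)

# With these synthetic values the short window exceeds a threshold of 50 while
# the long window stays below it; raising the threshold to 55 would also reject
# the short window, which mirrors the two remedies described in the docstring.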