Feature #926: Plot normalized waveform - Tony: a tool for melody transcription - Sound Software .ac.uk

Feature #926

Plot normalized waveform

Added by Rachel Bittner about 11 years ago. Updated almost 11 years ago.

Status:

Closed

Start date:

2014-04-09

Priority:

Urgent

Description

The waveform view has proven to be very useful, in particular for finding onsets/offsets.
However, it doesn't appear to be normalized: in quiet tracks you can hardly see anything.

History

#1 Updated by Matthias Mauch almost 11 years ago

Priority changed from Normal to High

This applies not only to the waveform visualisation but also to the spectrogram visualisation and to the actual pYIN processing, so I think normalisation is an important issue. I've just been discussing this with Simon: we would need it for our project.

So I've set this to high priority.

#2 Updated by Matthias Mauch almost 11 years ago

So I've been looking around a little bit to get a feel for how the normalisation should work.

Conceptually I think it's always going to look like this (sorry if this is obvious):

1. measure input gain
2. amplify so that new gain = x (where x is our chosen level)

Re 1: I think we should follow the Replay Gain guys, who calculate the gain of the original signal as the 95th percentile of framewise (50 ms) loudness measurements. This is apparently grounded in perception theory: http://wiki.hydrogenaudio.org/index.php?title=ReplayGain_specification
For simplicity we could just use the RMS of each frame as our loudness measurement, so taking the 95th percentile of those values gives us our input gain.
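
A rough sketch of that measurement step, in Python for illustration only (Tony itself is not written in Python). It assumes non-overlapping 50 ms frames, a nearest-rank percentile, and samples as floats with full scale at 1.0. None of these choices is fixed in this ticket, and the function names are made up for the example. It uses plain RMS as suggested above, not the equal-loudness filtering from the ReplayGain specification.

```python
import math

def frame_rms(samples, sample_rate, frame_seconds=0.05):
    """RMS of each non-overlapping frame (50 ms by default, an assumption)."""
    frame_len = max(1, int(round(sample_rate * frame_seconds)))
    values = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        values.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return values

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (assumed definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[min(rank, len(ordered)) - 1]

def input_level_db(samples, sample_rate, pct=95.0):
    """95th percentile of framewise RMS, in dB relative to full scale (1.0)."""
    rms_values = frame_rms(samples, sample_rate)
    if not rms_values:
        raise ValueError("signal is shorter than one frame")
    level = percentile(rms_values, pct)
    return 20.0 * math.log10(level) if level > 0 else float("-inf")
```

For a steady sine of amplitude 0.1 every frame has RMS 0.1/sqrt(2), so the measured level comes out at roughly -23 dB.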

Re 2: We need to decide how conservative we want to be on the output gain. If we do the normalisation internally in Tony, then it's not a big issue: we can adjust the floating-point representation to look and sound right, e.g. x = -10 dB, and abs values > 1 are fine. If we want to write a new file, which clips at abs values > 1, then we might need to be more conservative. x = -20 dB seems to be what Brecht suggested, and in a few informal experiments on our singing data no clipping happened in that case.
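
A minimal sketch of the amplification step, again in Python for illustration only. It assumes the input level has already been measured in dB relative to full scale (1.0) and that a single constant gain is applied to the whole signal. The function names are hypothetical. Counting samples with abs value > 1 is one simple way to check whether writing the result to a file would clip.

```python
def normalisation_gain(input_db, target_db=-20.0):
    """Linear factor that moves a measured level of input_db to target_db."""
    return 10.0 ** ((target_db - input_db) / 20.0)

def apply_gain(samples, gain):
    """Scale every sample by the same constant gain."""
    return [s * gain for s in samples]

def count_clipped(samples, limit=1.0):
    """Number of samples that would clip if written to a file."""
    return sum(1 for s in samples if abs(s) > limit)

# Example: a track measured at -30 dB, normalised to x = -20 dB.
gain = normalisation_gain(-30.0, -20.0)   # +10 dB, a factor of about 3.16
louder = apply_gain([0.1, -0.2, 0.4], gain)
clipped = count_clipped(louder)           # only the 0.4 sample exceeds 1.0
```

With x = -10 dB the same track would get a factor of 10, so any sample above 0.1 would exceed 1.0. That is fine internally in floating point but would clip on export.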