Sound Software .ac.uk https://code.soundsoftware.ac.uk/ 2014-05-20T13:02:30Z
Tony: a tool for melody transcription - Feature #926: Plot normalized waveform
https://code.soundsoftware.ac.uk/issues/926?journal_id=1903 2014-05-20T13:02:30Z Matthias Mauch
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>High</i></li></ul><p>This applies not just to the waveform visualisation but also to the spectrogram visualisation and the actual pYIN processing, so I think normalisation is an important issue. I have just discussed this with Simon: we would need it for our project.</p>
<p>So I have set the priority to High.</p>
https://code.soundsoftware.ac.uk/issues/926?journal_id=1904 2014-05-21T11:23:30Z Matthias Mauch
<ul></ul><p>So I've been looking around a little bit to get a feel for how the normalisation should work.</p>
<p>Conceptually I think it's always going to look like this (sorry if this is obvious):</p>
<p>1. measure input gain<br />2. amplify so that new gain = x (where x is our chosen level)</p>
<p>Re 1: I think we should follow the ReplayGain approach, which calculates the gain of the original signal as the 95th percentile of framewise (50 ms) loudness measurements. This is apparently grounded in perception theory: <a class="external" href="http://wiki.hydrogenaudio.org/index.php?title=ReplayGain_specification">http://wiki.hydrogenaudio.org/index.php?title=ReplayGain_specification</a><br />For simplicity we could just use the RMS of each frame as our loudness measurement, so taking the 95th percentile of those values gives us our input gain.</p>
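<p>The measurement in step 1 and the scaling in step 2 could be sketched roughly as follows in Python. This is only an illustration, not Tony's implementation: the function names are hypothetical, the frames are assumed to be non-overlapping, the percentile uses a simple nearest-rank rule, and plain RMS stands in for a proper loudness measure (no equal-loudness filtering).</p>

```python
import math


def frame_rms(samples, sample_rate, frame_seconds=0.05):
    """RMS of consecutive non-overlapping frames (50 ms by default)."""
    frame_len = max(1, int(round(sample_rate * frame_seconds)))
    values = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        values.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    return values


def percentile(values, fraction):
    """Nearest-rank percentile; fraction is in [0, 1]. Returns 0.0 if empty."""
    if not values:
        return 0.0
    ordered = sorted(values)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]


def input_level_db(samples, sample_rate):
    """Step 1: 95th percentile of framewise RMS, in dB relative to 1.0."""
    level = percentile(frame_rms(samples, sample_rate), 0.95)
    return 20.0 * math.log10(level) if level > 0 else float("-inf")


def normalise_to_level(samples, sample_rate, target_db):
    """Step 2: scale the samples so the measured level equals target_db."""
    current_db = input_level_db(samples, sample_rate)
    if current_db == float("-inf"):
        return list(samples)  # silent or empty input: nothing to scale
    gain = 10.0 ** ((target_db - current_db) / 20.0)
    return [s * gain for s in samples]
```

<p>Here <code>target_db</code> plays the role of the chosen level x. Nothing in this sketch prevents individual samples from exceeding 1 in absolute value after scaling.</p>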
<p>Re 2: We need to decide how conservative we want to be on the output gain. If we do the normalisation internally in Tony, then it's not a big issue: we can adjust the floating-point representation to look and sound right, e.g. x = -10 dB, and absolute values > 1 are fine. If we want to write a new file, which clips at absolute values > 1, then we might need to be more conservative. x = -20 dB seems to be what Brecht suggested, and in a few informal experiments on our singing data no clipping happened in that case.</p>
https://code.soundsoftware.ac.uk/issues/926?journal_id=1948 2014-06-03T16:14:04Z Matthias Mauch
<ul><li><strong>Priority</strong> changed from <i>High</i> to <i>Urgent</i></li></ul>
https://code.soundsoftware.ac.uk/issues/926?journal_id=2003 2014-06-13T13:27:23Z Matthias Mauch
<ul></ul><p>A quick fix could be to simply normalise to maximum level, i.e. in Matlabby code:<br /><pre>
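peak = max(abs(x));              % largest absolute sample value
if peak > 0, x = x / peak; end   % skip silent input to avoid dividing by zero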
</pre></p>
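<p>The same quick fix as a minimal Python sketch. The function name is hypothetical, and the guard for silent or empty input is an added assumption, not part of the one-liner above.</p>

```python
def normalise_to_peak(samples):
    """Scale samples so that the largest absolute value becomes 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent or empty input: avoid dividing by zero
    return [s / peak for s in samples]
```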
<p>That would, in many cases, be enough, and would introduce no clipping. (It might introduce some quantisation error, but I assume that's negligible.)</p>
https://code.soundsoftware.ac.uk/issues/926?journal_id=2012 2014-06-13T15:35:25Z Chris Cannam cannam@all-day-breakfast.com
<ul></ul><p>OK, as of <a href="https://code.soundsoftware.ac.uk/projects/tony/repository/revisions/ed9296a27a276a10f38abd06cbc0098b403d8df2" class="changeset" title="Merge">ed9296a27a27</a> we normalise to max level == 1. Try it out.</p>
https://code.soundsoftware.ac.uk/issues/926?journal_id=2013 2014-06-13T15:35:46Z Chris Cannam cannam@all-day-breakfast.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Resolved</i></li></ul>
https://code.soundsoftware.ac.uk/issues/926?journal_id=2014 2014-06-13T16:22:35Z Matthias Mauch
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>Closed</i></li></ul><p>very nice. works.</p>