Feature #597

algorithm for interactive tracking

Added by Matthias Mauch almost 11 years ago. Updated over 10 years ago.

Status: New
Start date: -
Priority: Normal
Due date: -
Assignee: Matthias Mauch
% Done: 0%
Category: -
Target version: -

Description

This is research. The idea is that the note and pitch track are modelled in one probabilistic model, which enables users to automatically update the pitch track by editing the note track and vice versa. Maybe some even funkier concepts.

History

#1 Updated by Matthias Mauch almost 11 years ago

  • Due date set to 2013-08-31
  • Priority changed from Normal to Low
  • Estimated time set to 40.00

#2 Updated by Matthias Mauch over 10 years ago

  • Project changed from [tonioni] Tony: a tool for melody annotation to Tony: a tool for melody transcription

#3 Updated by Matthias Mauch over 10 years ago

  • Priority changed from Low to High

Changed this to high because it's likely to be important for the NYU project.

What would be relatively easy to do is this:

  • select a time-pitch region (something like a rectangle) in the main Tony pane
    • this could be done either by directly selecting a rectangle, or by selecting a time region and then drag-moving the mouse to the centre of the desired vertical region
  • the pYIN algorithm re-calculates the pitch (and note?) track for that time region, with the constraint that it stays within the selected frequency region.

(I know the original idea was to focus on moving notes and then let the pitch track follow, but we don't need the notes as such for now -- a selected rectangle is enough.)
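The rectangle-constrained re-estimation above could be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and the toy autocorrelation estimator stands in for pYIN, which exposes no such interface.

```python
import numpy as np

def constrained_f0(frame, sr, fmin, fmax):
    """Estimate F0 for one frame, restricted to [fmin, fmax] Hz.

    A toy stand-in for re-running pYIN with a frequency constraint:
    compute an autocorrelation and only consider lags whose
    corresponding frequencies fall inside the selected rectangle.
    """
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)                     # high frequency -> short lag
    lag_max = min(int(sr / fmin), len(ac) - 1)   # low frequency -> long lag
    if lag_min >= lag_max:
        return None
    best_lag = lag_min + np.argmax(ac[lag_min:lag_max + 1])
    return sr / best_lag

def retrack_region(audio, sr, t0, t1, fmin, fmax, hop=256, win=1024):
    """Re-estimate F0 only for frames inside the selected time region."""
    f0s = []
    for start in range(int(t0 * sr), int(t1 * sr) - win, hop):
        f0s.append(constrained_f0(audio[start:start + win], sr, fmin, fmax))
    return f0s
```

In this sketch, Tony would call `retrack_region` with the rectangle's time bounds (t0, t1) and frequency bounds (fmin, fmax), then splice the returned values back into the displayed pitch track.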

Problems:

pYIN currently works as a Vamp plugin, i.e. data flows one way: Audio > Plugin > Tony. We need an interface that allows Tony to talk back to pYIN, something like Audio > Plugin <-> Tony.

Do we need a different architecture for that? I anticipate two opinions:

  1. No. We can just add additional parameters to pYIN for minimum and maximum frequency. Then Tony can send only a limited part of the audio to pYIN (the selected time region) and update the values in that region accordingly.
  2. Yes. Doing it as above is a cheap hack and will eventually leave us frustrated, because
    • we cannot include more complex parameters (such as "for the first second, consider all frequencies, then for 0.35 seconds only the ones selected, and then another second of all frequencies")
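A time-varying constraint like the one in that example could be expressed as a small schedule that Tony hands to the plugin. A sketch under assumed names (nothing like this exists in pYIN today):

```python
# A hypothetical constraint schedule: each entry is
# (duration_seconds, fmin_hz, fmax_hz); None means "unconstrained".
schedule = [
    (1.0, None, None),      # first second: consider all frequencies
    (0.35, 300.0, 500.0),   # then 0.35 s restricted to the selection
    (1.0, None, None),      # then another second of all frequencies
]

def schedule_to_frames(schedule, sr, hop):
    """Expand the schedule into one (fmin, fmax) pair per analysis frame,
    so the tracker can apply a per-frame frequency constraint."""
    frames = []
    for duration, fmin, fmax in schedule:
        n = round(duration * sr / hop)
        frames.extend([(fmin, fmax)] * n)
    return frames
```

The point of the sketch is that a schedule like this cannot be squeezed into two scalar plugin parameters, which is the crux of the architecture question above.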

#4 Updated by Matthias Mauch over 10 years ago

  • Subject changed from algorithm for interactive pitch and note tracking to algorithm for interactive tracking

changed title to reflect slightly new angle

#5 Updated by Matthias Mauch over 10 years ago

Someone did something like that last year at SMC:

Semi-Automatic Melody Extraction Using Note Onset Time and Pitch Information from Users
http://smcnetwork.org/system/files/SEMI-AUTOMATIC%20MELODY%20EXTRACTION%20USING%20NOTE%20ONSET%20TIME%20AND%20PITCH%20INFORMATION%20FROM%20USERS.pdf

#6 Updated by Justin Salamon over 10 years ago

Another interface for interactive melody extraction:

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5430181&tag=1
