Feature #848

good interaction for pitch track annotation

Added by Matthias Mauch about 11 years ago. Updated about 11 years ago.

Status: Closed
Start date: 2014-01-21
Priority: Normal
Due date: -
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

Current (octave shift)

There currently is only one way of changing the pitch track and that is by

  • selecting the relevant time interval on the selection bar (or panel, or whatever)
  • moving it by one octave (up or down) by pressing page up/down (or by using the corresponding Edit menu entry)

This is, of course, not enough.

Missing functionality

  1. delete pitch track where it's been erroneously detected
  2. add pitch track where it's been erroneously detected as not pitched
  3. indicate a corrected pitch where the program should look for the most likely pitch track and extract it
    • move pitch track by an octave but get frequency estimation from the octave shifted frequency rather than a parallel translation of the original

UI suggestion: paint thick pitch corridor

In my view it would be good to let the user paint a thick pitch corridor around where she believes the correct pitch to be. Then the underlying pitch extractor can be constrained to extract the pitch only in that region. Ideally, the user could choose how thick that corridor would be.

This would make it unnecessary for the user to exactly draw the pitch track, but it would give a local feel. The pYIN pitch tracker could then be instructed to search for a pitch track only in that region. I attach a mockup of that.

Erasing

Erasing could be done in a similar paint-y way.

pitch corridor mockup.png 137 KB, downloaded 53 times Matthias Mauch, 2014-01-21 02:32 PM

pyin.dylib 232 KB, downloaded 30 times Matthias Mauch, 2014-01-24 01:37 PM

Screen Shot 2014-02-02 at 12.35.24 PM.png - happy birthday pitch track candidates 67.5 KB, downloaded 81 times Rachel Bittner, 2014-02-02 05:55 PM

Screen Shot 2014-02-02 at 12.42.05 PM.png 56.1 KB, downloaded 87 times Rachel Bittner, 2014-02-02 05:55 PM

pyin.dylib 249 KB, downloaded 36 times Matthias Mauch, 2014-02-02 10:07 PM

SwingKlarinett.wav 7.95 MB, downloaded 10 times Rachel Bittner, 2014-02-02 11:21 PM

History

#1 Updated by Matthias Mauch about 11 years ago

  • Private changed from Yes to No

#2 Updated by Chris Cannam about 11 years ago

I'm also interested in whether features like "nudge up or down a bit" might be useful (i.e. any very simple-to-implement-and-use mechanisms). Anything involving painting a region and updating interactively is obviously more complicated to implement, but it may also be fiddly to use.

Then of course there might be "choose an alternative pitch candidate" (from a limited set of returned pitch candidates, e.g. just switching a region to a different candidate rather than selecting by pitch).

Would appreciate more feedback, and input from prospective users from NYU.

#3 Updated by Matthias Mauch about 11 years ago

First of all apologies... I'd written the original message as "private" and expected it to appear to the project members. Apparently it was private to me alone though...

Chris Cannam wrote:

> I'm also interested in whether features like "nudge up or down a bit" might be useful (i.e. any very simple-to-implement-and-use mechanisms). Anything involving painting a region and updating interactively is obviously more complicated to implement, but it may also be fiddly to use.

Ok, good point, maybe the painting is too fiddly. But simply using a nudge feature might get you results that are not compatible with the audio any more.
Maybe both can be done: one could say "nudge up", and then the system calculates the most likely solution in the vicinity of the pitch indicated by the nudge.

> Then of course there might be "choose an alternative pitch candidate" (from a limited set of returned pitch candidates, e.g. just switching a region to a different candidate rather than selecting by pitch).

This does sound attractive, but, again, I'm quite sure that a simple solution wouldn't necessarily work. Here's why: when pYIN hasn't found the right pitch track, the reason is usually that the correct alternative has a gap, i.e. the system as it stands does not provide a correct alternative. However, pYIN can usually find a correct candidate if the pitch range is restricted. So what one could do here is:

  1. select a time interval that encloses the incorrect piece of pitch track (at most, say, 10 seconds long)
  2. calculate
    • pYIN salience once
    • pYIN smoothed pitch track with different frequency constraints (i.e. overlapping frequency windows)
  3. let user choose the best unique one of these
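As a rough illustration of the "overlapping frequency windows" idea in step 2, the sketch below generates search ranges that overlap by half their width. It is not pYIN code: the function name `overlapping_windows`, the one-octave window width, the half-octave hop and the 110 to 440 Hz example range are all assumptions chosen for illustration.

```python
import math


def overlapping_windows(fmin_hz, fmax_hz, width_semitones=12.0, hop_semitones=6.0):
    """Return (low_hz, high_hz) search windows covering [fmin_hz, fmax_hz].

    Windows are laid out on a semitone (log-frequency) axis. Each window is
    `width_semitones` wide and starts `hop_semitones` above the previous one,
    so neighbouring windows overlap whenever hop < width.
    All default values are illustrative assumptions, not pYIN settings.
    """
    if not 0 < fmin_hz < fmax_hz:
        raise ValueError("need 0 < fmin_hz < fmax_hz")
    if hop_semitones <= 0 or width_semitones < hop_semitones:
        raise ValueError("need 0 < hop_semitones <= width_semitones")

    total = 12.0 * math.log2(fmax_hz / fmin_hz)  # full range in semitones
    windows = []
    start = 0.0
    while True:
        end = min(start + width_semitones, total)
        low = fmin_hz * 2.0 ** (start / 12.0)
        high = fmin_hz * 2.0 ** (end / 12.0)
        windows.append((low, high))
        if end >= total:
            break
        start += hop_semitones
    return windows


if __name__ == "__main__":
    for low, high in overlapping_windows(110.0, 440.0):
        print(f"{low:7.2f} Hz .. {high:7.2f} Hz")
```

Each window would then be handed to the pitch tracker as a frequency constraint, giving one smoothed candidate track per window for the user to choose from.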

Would indeed be good to know what Justin thinks of this, since he's the one who has worked with such systems.

#4 Updated by Justin Salamon about 11 years ago

Hello gents!

My sincere apologies for not responding until now - but I'm afraid I haven't received any email notifications at all on this issue, and only found it by chance whilst browsing the repository. Perhaps due to its original settings as private?

Anyway, back to the matter at hand:

Yes, the most important functionalities currently missing are the ones indicated by Matthias:
1. Delete pitch (=remove voicing false alarm)
2. Request pitch where none provided (=increase voicing recall)
3. Correct pitch where incorrect values returned

Of these, 1 is the most straightforward I guess. For 2, I imagine the user being able to select a region and click a button to obtain a pitch estimate, internally forcing pYIN to return a series of values for the selected time range. The trickiest is 3, as you have already noted. I'm personally not entirely sure whether the nudge up/down will be useful - it's very hard (or practically impossible) for users to correct pitch on a sub-semitone level manually.

I think that in an ideal world, the procedure for correcting pitch would be something like this:
a) the user indicates a time range where the pitch needs to be corrected
b) the system calculates several alternative pitch tracks (i.e. groups of continuous pitch values in the selected range)
c) the user chooses the correct pitch group by clicking on it.

Which I think is what Matthias was proposing. If we ignore the fact that we're using continuous f0 for a moment, this would be the note equivalent of selecting a time range and asking the system for alternative notes, and then being able to select the correct series of notes out of all the suggested notes. This way there is no requirement to draw any region.

The option of jumping to an alternative candidate is basically the same except that rather than displaying all candidates and letting the user choose, the system just switches to the next candidate. This might be easier to implement, though perhaps a little less convenient to use?

#5 Updated by Matthias Mauch about 11 years ago

Justin Salamon wrote:

> My sincere apologies for not responding until now - but I'm afraid I haven't received any email notifications at all on this issue, and only found it by chance whilst browsing the repository. Perhaps due to its original settings as private?

Don't think this was your mistake at all. I mis-interpreted the private tick box to mean private to members when it meant private to me alone.

> 1. Delete pitch (=remove voicing false alarm)
> 2. Request pitch where none provided (=increase voicing recall)
> 3. Correct pitch where incorrect values returned

> I think that in an ideal world, the procedure for correcting pitch would be something like this:
> a) the user indicates a time range where the pitch needs to be corrected
> b) the system calculates several alternative pitch tracks (i.e. groups of continuous pitch values in the selected range)
> c) the user chooses the correct pitch group by clicking on it.

I agree that this is much more straightforward than my initial proposal, so I'm happy to shelve that.

Maybe 2 and 3 could be done in the same way. (Actually, one could possibly do 1 in the same way as well by allowing the null pitch track as a choice via a button or so.)

I'm quite excited at the idea of making a plugin that returns n pitch track candidates for a segment. What would stretch my capabilities is how to make it interface with Tony. So technically, what should happen is (sorry for re-iterating your list Justin, just wanted to make the steps clearer):

  1. user selects time interval
  2. Tony sends relevant audio (± 1 second) to Vamp plugin
  3. Vamp plugin calculates n pitch tracks in that region
  4. Tony displays each of them, in red! :)
  5. ...and makes them clickable?
  6. user selects preferred pitch track
  7. Tony replaces original pitch track with the selected pitch track
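The data flow in steps 2 and 7 can be sketched as follows. This is only an illustration under stated assumptions, not Tony or Vamp code: a pitch track is modelled as a list of `(time_seconds, f0_hz)` pairs, and the helper names `padded_interval` and `replace_segment` are hypothetical.

```python
def padded_interval(start, end, duration, pad=1.0):
    """Step 2: widen the selection by `pad` seconds on each side,
    clamped to the bounds of the recording."""
    return max(0.0, start - pad), min(duration, end + pad)


def replace_segment(track, candidate, start, end):
    """Step 7: replace the part of `track` inside [start, end) with the
    points of the chosen `candidate` that fall inside the same interval.

    Both arguments are lists of (time_seconds, f0_hz) pairs. Candidate
    points outside the selection (e.g. from the padding) are discarded.
    """
    kept = [(t, f) for t, f in track if t < start or t >= end]
    chosen = [(t, f) for t, f in candidate if start <= t < end]
    return sorted(kept + chosen)


if __name__ == "__main__":
    original = [(0.0, 220.0), (1.0, 220.0), (2.0, 440.0), (3.0, 220.0)]
    picked = [(1.5, 225.0), (2.0, 220.0), (2.5, 221.0), (3.5, 300.0)]
    print(padded_interval(1.5, 3.0, duration=4.0))
    print(replace_segment(original, picked, 1.5, 3.0))
```

Passing an empty candidate list to `replace_segment` simply removes the pitch inside the selection, which corresponds to allowing the "null pitch track" as one of the choices.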

If we can agree on something like this, then I think this might actually be a more urgent issue than the volume controls (because our current target users Justin/Rachel could, in the worst case, actually set their preferred volume/pan settings in the code and compile).

I could try to tackle the interface as well...

#6 Updated by Chris Cannam about 11 years ago

> this might actually be a more urgent issue than the volume controls

Relative priorities only matter if the two tasks would involve the same people -- in this case I wouldn't suggest that someone new to the SV code should take on the interactive editing work, while the playback adjustment stuff is a better fit.

I plan to work on this feature (on the Tony side rather than the pYIN side) next week, so the more information (and consensus) we can get before then about how best it should work, the better.

In the mean time anything you can do to support this in pYIN would be excellent.

#7 Updated by Matthias Mauch about 11 years ago

That sounds good, and I think we should try it.

It's likely that we can decide whether our mode of interaction is good enough only after we've tested it.

So I'll wait for objections for a day or so and shall be working on the pYIN side then.

#8 Updated by Justin Salamon about 11 years ago

Sounds good!

Matthias, I like your breakdown of the pitch correction steps. Something perhaps obvious to keep in mind is that we probably want the user to be able to still select the original pitch track (or click cancel or something like that) in case they're not happy with the alternatives proposed. Another issue I think we might run into is what happens at the limits (start/end) of a time selection (i.e. how precise will a user have to be when selecting a time segment), but let's just see what happens first.

Regarding workload, I'll have a second look at the sonification controls now that Chris has added some pointers; hopefully Rachel and I can tackle that and let you two focus on the pitch editing (both on the pYIN and Tony side).

#9 Updated by Matthias Mauch about 11 years ago

Justin Salamon wrote:

> Sounds good!

> Matthias, I like your breakdown of the pitch correction steps. Something perhaps obvious to keep in mind is that we probably want the user to be able to still select the original pitch track (or click cancel or something like that) in case they're not happy with the alternatives proposed. Another issue I think we might run into is what happens at the limits (start/end) of a time selection (i.e. how precise will a user have to be when selecting a time segment), but let's just see what happens first.

Great.

I now have a rudimentary, slightly hacky new plugin, "Local Candidate PYIN", as part of the pyin library; it extracts multiple pitch track candidates.

I believe that this is more than enough to get multiple candidates for a single note, and it seems to work for that. One current limitation of the implementation is that each candidate track can span at most nRange semitones (at the moment nRange = 9), so the method is very likely to produce unusable results when the melody's actual range is larger (probably in longer phrases).

How much of a problem is that?

The reason I'm asking is that I don't really see an easy way around that at the moment, even if I were to choose a softer range; but maybe we could discuss.
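To get a feel for when the nRange limit would bite, one could measure how many semitones a selected stretch of melody actually spans. The sketch below is an assumption-laden illustration, not part of the plugin: the helper names are hypothetical, and treating non-positive values as unvoiced frames is a convention assumed here.

```python
import math


def span_semitones(frequencies_hz):
    """Distance in semitones between the lowest and highest voiced
    frequency. Non-positive values are treated as unvoiced and ignored
    (an assumed convention for this sketch)."""
    voiced = [f for f in frequencies_hz if f > 0]
    if not voiced:
        return 0.0
    return 12.0 * math.log2(max(voiced) / min(voiced))


def fits_in_range(frequencies_hz, n_range=9):
    """True if the melody could be covered by a single candidate track
    limited to `n_range` semitones (9 is the value quoted above)."""
    return span_semitones(frequencies_hz) <= n_range


if __name__ == "__main__":
    print(fits_in_range([220.0, 247.0, 262.0]))  # a few scale steps
    print(fits_in_range([220.0, 330.0, 440.0]))  # a full octave
```

A phrase that covers a full octave (12 semitones) would already exceed a 9-semitone corridor, which suggests keeping selections short.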

The code is pushed to branch "tony" on the pyin repository. I attach an OSX build that might just work for you. Maybe you can try it in Sonic Visualiser by selecting a time range and then running the plugin "Local Candidate PYIN".

#10 Updated by Justin Salamon about 11 years ago

I'm having trouble getting the plugin to appear in SV - neither the appended dylib nor my own build from source get listed by SV (as opposed to the pre-compiled pyin-v1.0-osx.tar.gz which does work). OSX 10.9, SV 2.2-2.3. Any clue?

#11 Updated by Matthias Mauch about 11 years ago

Justin Salamon wrote:

> I'm having trouble getting the plugin to appear in SV - neither the appended dylib nor my own build from source get listed by SV (as opposed to the pre-compiled pyin-v1.0-osx.tar.gz which does work). OSX 10.9, SV 2.2-2.3. Any clue?

Sadly, no clue... I'll just write some things down; maybe they'll help, maybe they won't.

Have you tried with Sonic Annotator?

I get this:

sonic-annotator -l | grep pyin
vamp:pyin:localcandidatepyin:pitchtrackcandidates
vamp:pyin:yin:f0
vamp:pyin:yin:periodicity
vamp:pyin:yin:rms
vamp:pyin:yin:salience
vamp:pyin:pyin:f0probs
vamp:pyin:pyin:candidatesalience
vamp:pyin:pyin:f0candidates
vamp:pyin:pyin:notes
vamp:pyin:pyin:smoothedpitchtrack
vamp:pyin:pyin:voicedprob

... the first one is new.

Is the new stuff definitely in the Vamp plugin directory? Maybe you could try giving the path explicitly?

VAMP_PATH=<whereYouPutThePlugin> sonic-annotator -l | grep pyin

I also use SV 2.3 on OSX 10.9, so I really don't know what would throw a Vamp buff like you... Will keep thinking.

#12 Updated by Rachel Bittner about 11 years ago

Matthias Mauch wrote:

> Justin Salamon wrote:

>> I'm having trouble getting the plugin to appear in SV - neither the appended dylib nor my own build from source get listed by SV (as opposed to the pre-compiled pyin-v1.0-osx.tar.gz which does work). OSX 10.9, SV 2.2-2.3. Any clue?

> Sadly, no clue... I'll just write some things, maybe they help maybe they won't.

> Have you tried with Sonic Annotator?

> I get this:

> [...]

> ... the first one is new.

> Is the new stuff definitely in the Vamp plugin directory? Maybe you could try giving the path explicitly?

> VAMP_PATH=<whereYouPutThePlugin> sonic-annotator -l | grep pyin

> I also use SV 2.3 on OSX 10.9, so I really don't know what would throw a Vamp buff like you... Will keep thinking.

I'm having the same issue as Justin. Setting the vamp path did something though it still seems a bit unhappy:

mtechdoc3:~ rachelbittner$ sonic-annotator -l | grep pyin
mtechdoc3:~ rachelbittner$ VAMP_PATH=/Library/Audio/Plug-Ins/Vamp/ sonic-annotator -l | grep pyin
WARNING: FeatureExtractionPluginFactory::getPluginIdentifiers: No descriptor function in /Library/Audio/Plug-Ins/Vamp/libvamp-hostsdk.dylib
WARNING: FeatureExtractionPluginFactory::getPluginIdentifiers: No descriptor function in /Library/Audio/Plug-Ins/Vamp/libvamp-sdk.dylib
WARNING: FeatureExtractionPluginFactory::getPluginIdentifiers: Failed to load library /Library/Audio/Plug-Ins/Vamp/pyin_newbuggy.dylib: dlopen(/Library/Audio/Plug-Ins/Vamp/pyin_newbuggy.dylib, 5): Library not loaded: libvamp-sdk.dylib
  Referenced from: /Library/Audio/Plug-Ins/Vamp/pyin_newbuggy.dylib
  Reason: image not found
vamp:pyin:yin:f0
vamp:pyin:yin:periodicity
vamp:pyin:yin:rms
vamp:pyin:yin:salience
vamp:pyin:pyin:f0probs
vamp:pyin:pyin:candidatesalience
vamp:pyin:pyin:f0candidates
vamp:pyin:pyin:notes
vamp:pyin:pyin:smoothedpitchtrack
vamp:pyin:pyin:voicedprob

#13 Updated by Matthias Mauch about 11 years ago

Sadly, this is a bit beyond me... Chris might have a better idea on what the issue is. Might it be because you have both plugins (old and new) in the same directory? They have the same output names, so that might lead to conflicts. Maybe you could try temporarily removing the old plugin from the

/Library/Audio/Plug-Ins/Vamp/

directory. -- Not sure, but might be worth a try.

#14 Updated by Chris Cannam about 11 years ago

I can't quite tell from the output which library is which, but here's my supposition.

I'm going to guess that pyin.dylib is the "old" library and pyin_newbuggy.dylib is the "new" one, because it looks as if the pyin.dylib library works.

The reason pyin_newbuggy.dylib is not loading, I think, is that it has a dependency on both the Vamp SDK libraries (the host-side one as well as the plugin-side one, for some reason!) which the dynamic linker can't resolve.

(As an aside, if you run "otool -L whatever.dylib" for a library, OS X will report what its dependencies are and the paths at which it expects to find them.)

Since these libraries are not in the general dynamic linker search path, it won't load them -- just putting them in the same directory doesn't work (except sometimes on Windows).

You could try installing the Vamp SDK libraries to /usr/lib or /usr/local/lib or similar. But to be honest I think a better way, when building plugins, is to link the plugin with a static version of the Vamp SDK library (the libvamp-sdk.a file rather than the .dylib file) so that it doesn't have to be loaded at runtime.

A simple, if hacky, way to achieve that is just to delete the libvamp-sdk.dylib and libvamp-hostsdk.dylib from the Vamp SDK directory after building the SDK and before building any plugins, leaving only the .a files, so that any plugins compiled against the libraries in that directory will then load without these additional dependencies.

If that makes no sense, let me know and I'll try to rephrase.

Finally, note that the plugin library's name is part of the plugin identifier, so you can't have two copies of the pyin plugin installed at once (well, you can, but only one will show up to any host looking for vamp:pyin:...)
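The naming point can be illustrated with a small sketch. It assumes the four-part layout `vamp:<library>:<plugin>:<output>` that the sonic-annotator listing earlier in this thread suggests, and it assumes the library part is the file name without its extension. The helper names and the second path in the example are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def parse_output_id(identifier):
    """Split an identifier such as 'vamp:pyin:pyin:notes' into
    (library, plugin, output). The layout is inferred from the
    listing in this thread, not taken from a specification."""
    parts = identifier.split(":")
    if len(parts) != 4 or parts[0] != "vamp":
        raise ValueError(f"unexpected identifier: {identifier!r}")
    _, library, plugin, output = parts
    return library, plugin, output


def library_key(path):
    """Library name as assumed to appear in the identifier:
    the file name without its extension."""
    return PurePosixPath(path).stem


def find_collisions(paths):
    """Group plugin library paths that would map to the same library
    name, so that only one of them could be addressed by a host."""
    by_key = defaultdict(list)
    for path in paths:
        by_key[library_key(path)].append(path)
    return {key: found for key, found in by_key.items() if len(found) > 1}


if __name__ == "__main__":
    installed = [
        "/Library/Audio/Plug-Ins/Vamp/pyin.dylib",
        "/Library/Audio/Plug-Ins/Vamp/pyin_newbuggy.dylib",
        "/tmp/vamp-test/pyin.dylib",  # hypothetical second copy
    ]
    print(parse_output_id("vamp:pyin:localcandidatepyin:pitchtrackcandidates"))
    print(find_collisions(installed))
```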

#15 Updated by Rachel Bittner about 11 years ago

Ok, I think I have it all working now. What fixed it was to:
1. remove the old pyin.dylib
2. add the vamp sdk libraries to /usr/local/bin

I've played around with Chris' updates. When I use your test file (happy birthday) it looks nice. On most of my test files I'm getting "blobs" of pitch tracks. Don't know if that's just a visualization bug, or maybe I'm selecting regions that are too long?

Side question - how many candidate pitch tracks should it be returning?

#16 Updated by Chris Cannam about 11 years ago

Interesting -- looks like a bug that is perhaps putting more than one pitch candidate into the same layer (the underlying layer code would "connect" the two if that happened, and "blobs" like these would be the expected result).

Can you either attach, or email separately, a test file that exhibits this problem?

#17 Updated by Matthias Mauch about 11 years ago

Did you maybe use the pyin binary I posted here? I think what you see might have to do with the order of the output being different after Chris's changes (Tony now expects them in a different order). I attach my newest build.

#18 Updated by Rachel Bittner about 11 years ago

Chris Cannam wrote:

> Interesting -- looks like a bug that is perhaps putting more than one pitch candidate into the same layer (the underlying layer code would "connect" the two if that happened, and "blobs" like these would be the expected result).

> Can you either attach, or email separately, a test file that exhibits this problem?

Test file attached.

#19 Updated by Rachel Bittner about 11 years ago

Matthias Mauch wrote:

> Did you maybe use the pyin binary I posted here? I think what you see might have to do with the order of the output being different after Chris's changes (Tony now expects them in a different order). I attach my newest build.

Yeah, that was the problem. Blob bug has disappeared :)

#20 Updated by Justin Salamon about 11 years ago

Finally got pYIN appearing in SV (and Tony). For me the solution was copying the SV libraries to /usr/lib/.

#21 Updated by Rachel Bittner about 11 years ago

It would be useful to have a key/option to "revert to original". The functionality I have in mind would be to (1) select a region (2) select revert to original (3) the original pyin output would appear for the selected region.

My thinking for this is that if somehow the user completely screws things up in a section, they can go back to the start without too much pain.

#22 Updated by Chris Cannam about 11 years ago

  • Status changed from New to Closed

I think this issue has become a bit unwieldy and we'd be better off with individual issues for the remaining outstanding points. Hope you agree (you can always reopen this one if you disagree strongly!)
