Matthias Mauch, 2012-11-12 04:11 PM

h1. Wiki

h2. Specification

The Tony tool will be a very simple user interface for the exact annotation of notes, note pitches and performance of human singing and other monophonic instruments.
The reason it can be kept very simple is that it exports beautifully to other applications such as Sonic Visualiser, which have different strengths.

h3. Minimum Use Case

The first prototype is ready once the following scenario is possible:

# User loads a .wav file of a monophonic singing recording,
# Tony automatically extracts a pitch track and a note track,
# User corrects the onsets and offsets in the note track, inserting new ones if necessary, or setting notes to rests,
# User exports the note track to a CSV file with the columns ONSET, DURATION, MEDIAN_PITCH.
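As a rough illustration of the export step in the scenario above, the following Python sketch writes a note track to CSV with the three columns named there. The data layout (notes as @(onset, duration)@ pairs in seconds, the pitch track as @(time, f0)@ pairs with @None@ for unvoiced frames), the function name @export_notes_csv@ and the use of Hz for MEDIAN_PITCH are assumptions made for the example, not part of the specification.

```python
import csv
import io
from statistics import median


def export_notes_csv(notes, pitch_track, out):
    """Write ONSET, DURATION, MEDIAN_PITCH rows for each note to `out`.

    notes       -- list of (onset_seconds, duration_seconds) pairs (assumed layout)
    pitch_track -- list of (time_seconds, f0_hz or None) pairs (assumed layout)
    out         -- any text file-like object
    """
    writer = csv.writer(out)
    writer.writerow(["ONSET", "DURATION", "MEDIAN_PITCH"])
    for onset, duration in notes:
        offset = onset + duration
        # Collect the voiced pitch values that fall inside the note.
        voiced = [f0 for t, f0 in pitch_track
                  if onset <= t < offset and f0 is not None]
        # A note without any voiced frame is treated as a rest: empty pitch cell.
        pitch = f"{median(voiced):.2f}" if voiced else ""
        writer.writerow([f"{onset:.3f}", f"{duration:.3f}", pitch])


# Hypothetical example data: two sung notes separated by a rest.
pitch_track = [(0.05, 220.0), (0.15, 222.0), (0.25, 218.0),
               (0.35, None), (0.45, None), (0.55, None),
               (0.65, 440.0), (0.75, 442.0)]
notes = [(0.0, 0.3), (0.3, 0.3), (0.6, 0.2)]

buffer = io.StringIO()
export_notes_csv(notes, pitch_track, buffer)
rows = buffer.getvalue().splitlines()
```

Writing an empty MEDIAN_PITCH cell for rests is one possible convention; a sentinel value would work equally well as long as it is documented for the users who open the file in a spreadsheet.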

h3. Users

We expect users to be unfamiliar with programming, but familiar with music and research in general. To illustrate this, here are some possible users:

# Friedrich (56) is a musicologist at a university in northern Germany. Sample publication: "Zur Ästhetik der Stimme bei Wagner - Eine vergleichende Analyse historischer und moderner Aufnahmen". He uses a Windows Vista laptop, mainly for email correspondence and for keeping lists of audio recordings in an Excel spreadsheet. He listens to music from CDs, noting down singing characteristics on a paper notepad. He plays the cello and organises outings to the Berlin Philharmonic with his seminar students.
# Sam (31) is an ethnomusicologist at the University of Essex studying the music of Brazilian immigrants in Portugal. She records the immigrants' music on location in Lisbon and compares the rhythmic and melodic variations of the music between Brazil and Portugal. Sam uses a MacBook from 2007. She organises the field recordings and edits them using Audacity. She regularly practises capoeira with the local capoeira group in Colchester, Essex.
# Etienne (25) is a graduate student in the psychology department of a Belgian university and part of a project that aims at replicating the "Levitin" effect. He's a Windows user with a bit of a knack for writing Python scripts. He's frustrated with the way his boss asks him to annotate melodies in Praat, but doesn't quite have the experience or support to build a tool that does it better. Etienne also works with a computer scientist who has looked into African rhythms.
# David (25) is a super-bright statistics graduate who's also a clarinet player and has just started his PhD on music transcription of wind instruments at a Canadian university. He realises that there's not enough training data out there to train his models, so he would like to annotate lots of clarinet recordings.

h3. Framework

It's not entirely decided what the framework should be, but there are some restrictions:

* it should work on PCs and Macs (and Linux boxes)
** tablets are not a priority because interaction with free audio editing and annotating programs is needed, i.e. this is a work tool, not a play tool.
** PCs are often used as standard by scientists and teachers outside of CS/maths/stats
** Macs are often used by the creative guys, and MIR people

h3. Components

* automatic pitch and note transcription methods
* GUI for correction, and additional note-based or phrase-based annotations
* export to RDF and graphics

h4. Automatic Transcription Methods

* pitch extraction
** monophonic via YIN
** interface for other algorithms, potentially polyphonic
** output of a salience function, or at least N candidate pitches (even if the pitch estimator thinks there is no pitch)
* note alignment: align notes to a MIDI-like representation given a pitch file
* non-alignment note detection
* methods should be easily re-executable with time-dependent parameter settings
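To make the "monophonic via YIN" item more concrete, here is a minimal pure-Python sketch of the core YIN steps for a single frame: the difference function, the cumulative mean normalised difference, and the absolute threshold. It is a simplified illustration, not Tony's implementation; it leaves out parabolic interpolation and the candidate or salience output mentioned above, and the function names, the default frequency range and the threshold of 0.1 are assumptions chosen for the example.

```python
import math


def yin_pitch(frame, sample_rate, min_f0=80.0, max_f0=1000.0, threshold=0.1):
    """Return an f0 estimate in Hz for one frame, or None if no pitch is found."""
    tau_min = max(1, int(sample_rate / max_f0))
    tau_max = int(sample_rate / min_f0)
    window = len(frame) - tau_max
    if window <= 0:
        raise ValueError("frame too short for the requested min_f0")

    # Difference function d(tau): squared difference between the
    # frame and a copy of itself shifted by tau samples.
    diff = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff[tau] = sum((frame[j] - frame[j + tau]) ** 2 for j in range(window))

    # Cumulative mean normalised difference: d(tau) divided by the mean
    # of d(1..tau). This removes the trivial dip at very small lags.
    cmnd = [1.0] * (tau_max + 1)
    running = 0.0
    for tau in range(1, tau_max + 1):
        running += diff[tau]
        cmnd[tau] = diff[tau] * tau / running if running > 0 else 1.0

    # Absolute threshold: take the first lag whose value drops below the
    # threshold, then walk down to the bottom of that dip.
    tau = tau_min
    while tau <= tau_max:
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sample_rate / tau
        tau += 1
    return None  # treated as unpitched


def sine_frame(freq, sample_rate, n):
    """Hypothetical test signal: n samples of a sine wave at freq Hz."""
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]


estimate = yin_pitch(sine_frame(220.0, 8000, 1024), 8000)
silence = yin_pitch([0.0] * 1024, 8000)
```

Because the lag is restricted to whole samples here, the estimate for the 220 Hz test tone is only accurate to within a few Hz, and a silent frame yields @None@. Re-running the function with different @threshold@ or frequency-range values per time region is one simple way to picture the "time-dependent parameter settings" requirement.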

h4. GUI

The most important aspect of the GUI is simplicity, because the users are unlikely to be familiar with complicated programs. The layout in particular should be kept simple. Hence, first, some *forbidden things*:

* polyphonic melody processing (diverts attention from main use, prevents easy interaction with monophonic interface)
* audio editing (this is an analysis and annotation program, leave that to Audacity)
* more than one central pane (too confusing, this is a one-task application -- audio and visual are already two dimensions, let's not add more)
* loading more than one audio file (we leave that to Sonic Visualiser)
* more than one "instrument" (pointer, scissors, eraser, move tool should be superfluous in a simple monophonic environment)
* additional structural annotation: beats, bars... that can be done post-hoc in Sonic Visualiser

So here are some requirements:

* One *main pane* that always stays in roughly the same, central place.
** displays a fixed set of features: pitch track, overlaid note track; hidden: note salience, other pitch track candidates.
** note track is editable
** pitch track is not editable separately (why should it be?), only via the note track
* Two permanent *single-purpose strips* across the length of the main pane
** select (notes or note ranges)
** annotate note (double click for simple annotation, drop-down for specific fields)
* Note editing
** edit note start and end times
** edit continuous note pitch
** turn note into rest (unpitched)
** impossible to delete notes (notes can only be made into rests -- we assume a contiguous, monophonic melody)
** (lock notes, i.e. prevent re-estimation)
* Audio playback
** play back original audio
** sonify notes
** sonify pitch track
* Undo note and pitch track changes
* Export notes, pitch track and annotations in the following formats
** as CSV files (very important)
** as graphics (quite important): PDF, possibly a self-contained R script
** as SV session file
** as RDF
** as MIDI
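The note-editing rules above (contiguous melody, no deletion, notes turned into rests, split and join, undo) can be pictured with a small data model. The following Python sketch is only one possible reading of those requirements; the class and method names, the choice that a joined note keeps the pitch of its left half, and the snapshot-based undo are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Note:
    onset: float             # seconds
    offset: float            # seconds
    pitch: Optional[float]   # None marks a rest (unpitched)


class Melody:
    """A contiguous, monophonic note track: each note starts where the last ends."""

    def __init__(self, notes):
        self.notes = list(notes)
        self._history = []   # earlier versions of the note list, for undo

    def _commit(self, new_notes):
        self._history.append(self.notes)
        self.notes = new_notes

    def undo(self):
        if self._history:
            self.notes = self._history.pop()

    def set_rest(self, i):
        # Notes are never deleted, only turned into rests.
        notes = list(self.notes)
        notes[i] = replace(notes[i], pitch=None)
        self._commit(notes)

    def move_boundary(self, i, t):
        # Moving the boundary between note i and note i + 1 keeps the melody gap-free.
        left, right = self.notes[i], self.notes[i + 1]
        if not left.onset < t < right.offset:
            raise ValueError("boundary must stay between its neighbours")
        notes = list(self.notes)
        notes[i] = replace(left, offset=t)
        notes[i + 1] = replace(right, onset=t)
        self._commit(notes)

    def split(self, i, t):
        note = self.notes[i]
        if not note.onset < t < note.offset:
            raise ValueError("split time must lie inside the note")
        notes = list(self.notes)
        notes[i:i + 1] = [replace(note, offset=t), replace(note, onset=t)]
        self._commit(notes)

    def join(self, i):
        # Assumption: the joined note keeps the pitch of its left half.
        left, right = self.notes[i], self.notes[i + 1]
        notes = list(self.notes)
        notes[i:i + 2] = [replace(left, offset=right.offset)]
        self._commit(notes)


melody = Melody([Note(0.0, 1.0, 220.0), Note(1.0, 2.0, 247.0)])
melody.split(0, 0.5)   # now three notes
melody.set_rest(1)     # the middle one becomes a rest
```

Since every edit either changes a shared boundary or replaces a note in place, the track stays gap-free by construction, which matches the "contiguous melody" assumption without needing a separate consistency check after each edit.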

Some explicit differences from Sonic Visualiser:

* Tony has only one pane (I know I mentioned that before, but it can't be mentioned too often)
* arrow keys should only change the playback position; the visualisation position changes only as a consequence of that
* zoom sliders should be very large -- if possible along the whole sides of the main pane
* only one instrument (i.e. no separate selection tool or scissors tool)
** necessitates a permanent selection strip (at the top of the main pane) in order to select contiguous notes
** double click splits/joins

Some explicit differences from Praat:

* Tony has only one window
* Tony has a concept of a contiguous melody

h4. Further possible GUI features

* automatic re-estimation after edits (some kind of maximum a posteriori estimation)