gyorgyf@175
|
Paper title.
Matthias Mauch and Chris Cannam: Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications

Abstract.
We present **Tony**, a free, open-source software tool for
computer-aided pitch track and note annotation of melodic audio content.
The accurate annotation of fundamental frequencies and notes
is essential to the scientific study of
intonation in singing and other instruments.
Unlike commercial applications for singers and producers
or other academic tools for generic music annotation and visualisation,
**Tony** has been designed for the scientific study of monophonic music:
a) it implements state-of-the-art algorithms for pitch and note estimation from audio,
b) it provides visual and auditory feedback of the extracted pitches
for the identification of detection errors,
c) it provides an intelligent graphical user interface
through which the user can identify and rapidly correct estimation errors,
d) it provides functions for exporting the pitch track and note track,
enabling further processing in spreadsheets or other applications.
Software versions for Windows, OS X and Linux platforms can be downloaded from
http://code.soundsoftware.ac.uk/projects/tony

Keyword 1.
Pitch/Note Analysis

Keyword 2.
Software

Keyword 3.
Singing

Aims.
We aim to make the scientific annotation of melodic content more efficient.
==> We aim to make the annotation of melodic content for scientific purposes more efficient.
(also, possibly move this sentence to the end)

Music psychologists interested in the analysis of pitch and intonation
usually use software programs originally aimed at the analysis of speech
(e.g. Praat http://www.fon.hum.uva.nl/praat/) or generic audio annotation
tools (e.g. Sonic Visualiser http://www.sonicvisualiser.org/)
to extract pitches of notes from audio recordings.
Since these programs were not conceived for musical pitch analysis,
the process of extracting note frequencies remains laborious and can take
many times the duration of the recording.

On the other hand, commercial tools such as
Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or
Sing&See (http://www.singandsee.com/) have
unknown frequency estimation procedures (proprietary code)
and do not provide export formats needed for scientific analysis.

==> Commercial tools such as Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or
Sing&See (http://www.singandsee.com/) also exist for these purposes; however,
their frequency estimation procedures are typically not public (proprietary code),
and they do not provide export formats suitable for scientific analysis.

An academic note annotation system [1] exists, but does not feature
note extraction. It is also not openly available.

==> A note annotation system [1] developed for academic purposes exists, but it does not feature
note extraction. It is also not openly available. (openly ?? => open source, free/prop.? )

This is why, during our own research on intonation [2],
we decided to code our own pitch extraction tool that would avoid these shortcomings.

==> This is why we decided to develop our own pitch extraction tool that would avoid
the above shortcomings during our own research on intonation [2].

Methods.
For automatic pitch and note estimation we use the pYIN method [3].
The method provides precise pitch and note estimates and
automatically determines which parts of the recording are voiced.

The graphical user interface is based upon the
open source software libraries from Sonic Visualiser.

==> The graphical user interface is based upon
open source software libraries originally developed for the Sonic Visualiser software.

The interface displays the audio waveform, a spectrogram representation,
the pitch track and notes. Users can scroll and zoom in time.
**Tony** not only plays back the original audio
but also, optionally, sonifications of the pitch track (melody line)
and the note track (discrete pitches with durations).
Notes' pitches are robustly estimated as the median of the pitch track
values that occur within the duration of the note.

(robustly? I know it's good, but nothing really supports the fairly strong statement here...)

The user can delete, move, cut, merge, crop and extend notes,
and the note's frequency is adapted accordingly.
The user can delete spurious parts of the pitch track
and shift the pitch track in frequency.
In order to efficiently correct erroneous pitch tracks, the user can select
a time interval, and **Tony** will provide various alternative
pitch tracks. The user can then pick the correct one.

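As a rough illustration of the median-based note pitch estimate described above, the following Python sketch computes a note's pitch from the pitch-track frames that fall inside the note, and recomputes it after two notes are merged. It is not Tony's actual code: the function names (note_pitch, merge_notes), the representation of the pitch track as parallel lists of times and frequencies, and the use of None for unvoiced frames are all assumptions made for this example.

```python
from statistics import median


def note_pitch(times, freqs, onset, offset):
    """Median of the voiced pitch-track values (Hz) with onset <= t < offset.

    Hypothetical helper: `times` and `freqs` are parallel lists, and an
    unvoiced frame is represented by None (an assumption of this sketch).
    Returns None if no voiced frame falls inside the note.
    """
    inside = [f for t, f in zip(times, freqs)
              if onset <= t < offset and f is not None and f > 0]
    if not inside:
        return None
    return median(inside)


def merge_notes(times, freqs, note_a, note_b):
    """Merge two (onset, offset) notes and re-estimate the pitch.

    The merged note spans both inputs; its pitch is recomputed from the
    pitch track rather than averaged from the two old note pitches.
    """
    onset = min(note_a[0], note_b[0])
    offset = max(note_a[1], note_b[1])
    return (onset, offset, note_pitch(times, freqs, onset, offset))


# Toy pitch track: one frame (440 Hz) is an octave error, one is unvoiced.
times = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
freqs = [220.0, 221.0, 440.0, 219.0, None, 246.0]

print(note_pitch(times, freqs, 0.0, 0.04))                 # 220.5
print(merge_notes(times, freqs, (0.0, 0.02), (0.02, 0.06)))  # (0.0, 0.06, 221.0)
```

In the toy data the single 440 Hz frame barely moves the median, whereas a mean over the same four frames would be 275 Hz. This is one sense in which a median can be called robust, although the sketch by itself is not evidence about the behaviour of the actual system.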
Outcomes.
The system is currently being used for two projects:
for the generation of new training and test data for Music Informatics research,
and for a new project on intonation in unaccompanied solo singing.

==> The system is currently being used for two projects:
1) for the generation of new training and test data for Music Informatics research,
and 2) for a research project on intonation in unaccompanied solo singing.

Preliminary feedback from users suggests that
the system does indeed facilitate pitch annotation
and provides vital features that cannot be found in other tools.

Title for final section.
Conclusions

[Q37].
We presented **Tony**, a new software tool for computer-assisted
annotation of melodic audio content for scientific analysis.
No other existing program combines pitch and note estimation,
a graphical user interface with auditory feedback,
rapid, computer-aided correction of pitches
and extensive exporting facilities.
**Tony** is freely available for use on Windows, OS X and Linux platforms
from http://code.soundsoftware.ac.uk/projects/tony/.

Acknowledgements.
Matthias Mauch is funded by the Royal Academy of Engineering.
We would like to thank Justin Salamon, Rachel Bittner and Juan Bello
for their comments and coding help.

Three key references. (APA v6)
[1] Pant, S., Rao, V., & Rao, P. (2010). A melody detection user interface for polyphonic music. In 2010 National Conference on Communications (NCC).
[2] Mauch, M., Frieler, K., & Dixon, S. (under review). Intonation in Unaccompanied Singing: Accuracy, Drift and a Model of Intonation Memory.
[3] Mauch, M., & Dixon, S. (2014). pYIN: A Fundamental Frequency Estimator Using Probabilistic Threshold Distributions. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014).

Comments/queries to organisers.