annotate publications/sempre2014/mauch_sempre2014_GF_edits.txt @ 698:ee97c742d184 tip

Default branch is now named default on git as well as hg, in case we ever want to switch to mirroring in the other direction
author Chris Cannam
date Thu, 27 Aug 2020 15:58:43 +0100
parents 26224791546f
children
rev   line source
gyorgyf@175 1 Paper title.
gyorgyf@175 2 Matthias Mauch and Chris Cannam: Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications
gyorgyf@175 3
gyorgyf@175 4 Abstract.
gyorgyf@175 5 We present **Tony**, a free, open-source software tool for
gyorgyf@175 6 computer-aided pitch track and note annotation of melodic audio content.
gyorgyf@175 7 The accurate annotation of fundamental frequencies and notes
gyorgyf@175 8 is essential to the scientific study of
gyorgyf@175 9 intonation in singing and other instruments.
gyorgyf@175 10 Unlike commercial applications for singers and producers
gyorgyf@175 11 or other academic tools for generic music annotation and visualisation,
gyorgyf@175 12 **Tony** has been designed for the scientific study of monophonic music:
gyorgyf@175 13 a) it implements state-of-the-art algorithms for pitch and note estimation from audio,
gyorgyf@175 14 b) it provides visual and auditory feedback of the extracted pitches
gyorgyf@175 15 for the identification of detection errors,
gyorgyf@175 16 c) it provides an intelligent graphical user interface
gyorgyf@175 17 through which the user can identify and rapidly correct estimation errors,
gyorgyf@175 18 d) it provides functions for exporting pitch track and note track
gyorgyf@175 19 enabling further processing in spreadsheets or other applications.
gyorgyf@175 20 Software versions for Windows, OSX and Linux platforms can be downloaded from
gyorgyf@175 21 http://code.soundsoftware.ac.uk/projects/tony
gyorgyf@175 22
gyorgyf@175 23 Keyword 1.
gyorgyf@175 24 Pitch/Note Analysis
gyorgyf@175 25
gyorgyf@175 26 Keyword 2.
gyorgyf@175 27 Software
gyorgyf@175 28
gyorgyf@175 29 Keyword 3.
gyorgyf@175 30 Singing.
gyorgyf@175 31
gyorgyf@175 32 Aims.
gyorgyf@175 33 We aim to make the scientific annotation of melodic content more efficient.
gyorgyf@175 34 ==> We aim to make the annotation of melodic content for scientific purposes more efficient.
gyorgyf@175 35 (also, possibly move this sentence to the end)
gyorgyf@175 36
gyorgyf@175 37 Music psychologists interested in the analysis of pitch and intonation
gyorgyf@175 38 usually use software programs originally aimed at the analysis of speech
gyorgyf@175 39 (e.g. Praat http://www.fon.hum.uva.nl/praat/) or generic audio annotation
gyorgyf@175 40 tools (e.g. Sonic Visualiser http://www.sonicvisualiser.org/)
gyorgyf@175 41 to extract pitches of notes from audio recordings.
gyorgyf@175 42 Since these programs were not conceived for musical pitch analysis,
gyorgyf@175 43 the process of extracting note frequencies remains laborious and can take
gyorgyf@175 44 many times the duration of the recording.
gyorgyf@175 45
gyorgyf@175 46 On the other hand, commercial tools such as
gyorgyf@175 47 Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or
gyorgyf@175 48 Sing&See (http://www.singandsee.com/) have
gyorgyf@175 49 unknown frequency estimation procedures (proprietary code)
gyorgyf@175 50 and do not provide export formats needed for scientific analysis.
gyorgyf@175 51
gyorgyf@175 52 ==> Commercial tools such as Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or
gyorgyf@175 53 Sing&See (http://www.singandsee.com/) also exist for these purposes; however,
gyorgyf@175 54 their frequency estimation procedures are typically not public (proprietary code),
gyorgyf@175 55 and they do not provide export formats suitable for scientific analysis.
gyorgyf@175 56
gyorgyf@175 57
gyorgyf@175 58 An academic note annotation system [1] exists, but does not feature
gyorgyf@175 59 note extraction. It is also not openly available.
gyorgyf@175 60
gyorgyf@175 61 ==> A note annotation system [1] developed for academic purposes exists, but it does not feature
gyorgyf@175 62 note extraction. It is also not openly available. (openly ?? => open source, free/prop.? )
gyorgyf@175 63
gyorgyf@175 64 This is why, during our own research on intonation [2],
gyorgyf@175 65 we decided to code our own pitch extraction tool that would avoid the shortcomings.
gyorgyf@175 66
gyorgyf@175 67 ==> This is why we decided to develop our own pitch extraction tool that would avoid
gyorgyf@175 68 the above shortcomings during our own research on intonation [2].
gyorgyf@175 69
gyorgyf@175 70
gyorgyf@175 71 Methods.
gyorgyf@175 72 For automatic pitch and note estimation we use the pYIN method [3].
gyorgyf@175 73 The method provides precise pitch and note estimates and
gyorgyf@175 74 automatically determines which parts of the recording are voiced.
gyorgyf@175 75
gyorgyf@175 76 The graphical user interface is based upon the
gyorgyf@175 77 open source software libraries from Sonic Visualiser.
gyorgyf@175 78
gyorgyf@175 79 ==> The graphical user interface is based upon
gyorgyf@175 80 open source software libraries originally developed for the Sonic Visualiser software.
gyorgyf@175 81
gyorgyf@175 82 It features the audio waveform, a spectrogram representation,
gyorgyf@175 83 the pitch track and notes. Users can scroll and zoom in time.
gyorgyf@175 84 **Tony** not only plays back the original audio,
gyorgyf@175 85 but also, optionally, sonifications of the pitch track (melody line)
gyorgyf@175 86 and the note track (discrete pitches with durations).
gyorgyf@175 87 Each note's pitch is robustly estimated as the median of the pitch track
gyorgyf@175 88 values that fall within the duration of the note.
gyorgyf@175 89
gyorgyf@175 90 (robustly? I know it's good, but nothing really supports the fairly strong statement here…)
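
The median-based note pitch described above can be illustrated with a minimal sketch. This is not Tony's actual code: the function name `note_pitch_hz`, the list-based pitch track, and the convention that unvoiced frames are stored as non-positive values are all assumptions made for illustration. The example shows why a median is a reasonable choice: a single octave error inside the note barely moves the estimate, whereas it would pull a mean far upwards.

```python
from statistics import median

def note_pitch_hz(times, freqs, onset, offset):
    """Return the median pitch (Hz) of the frames inside [onset, offset).

    Assumption: frames with a non-positive frequency are unvoiced and ignored.
    Returns None if no voiced frame falls inside the note.
    """
    in_note = [f for t, f in zip(times, freqs)
               if onset <= t < offset and f > 0]
    if not in_note:
        return None
    return median(in_note)

# Hypothetical pitch track: one frame every 10 ms, with one octave error (440 Hz).
times = [0.00, 0.01, 0.02, 0.03, 0.04]
freqs = [219.0, 220.0, 440.0, 221.0, 220.5]

print(note_pitch_hz(times, freqs, 0.0, 0.05))  # 220.5
```

For comparison, the arithmetic mean of the same five frames is 264.1 Hz, more than three semitones above the sung pitch.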
gyorgyf@175 91
gyorgyf@175 92 The user can delete, move, cut, merge, crop and extend notes,
gyorgyf@175 93 and the note's frequency is adapted accordingly.
gyorgyf@175 94 The user can delete spurious parts of the pitch track
gyorgyf@175 95 and shift the pitch track in frequency.
gyorgyf@175 96 In order to efficiently correct erroneous pitch tracks, the user can select
gyorgyf@175 97 a time interval, and **Tony** will provide various alternative
gyorgyf@175 98 pitch tracks. The user can then pick the correct one.
gyorgyf@175 99
gyorgyf@175 100 Outcomes.
gyorgyf@175 101 The system is currently being used for two projects:
gyorgyf@175 102 for the generation of new training and test data for Music Informatics research,
gyorgyf@175 103 and for a new project on intonation in unaccompanied solo singing.
gyorgyf@175 104
gyorgyf@175 105 ==> The system is currently being used for two projects:
gyorgyf@175 106 1) for the generation of new training and test data for Music Informatics research,
gyorgyf@175 107 and 2) a research project on intonation in unaccompanied solo singing.
gyorgyf@175 108
gyorgyf@175 109 Preliminary feedback from users suggests that
gyorgyf@175 110 the system does indeed facilitate pitch annotation
gyorgyf@175 111 and provides vital features that cannot be found in other tools.
gyorgyf@175 112
gyorgyf@175 113
gyorgyf@175 114 Title for final section.
gyorgyf@175 115 Conclusions
gyorgyf@175 116
gyorgyf@175 117 [Q37].
gyorgyf@175 118 We presented **Tony**, a new software tool for computer-assisted
gyorgyf@175 119 annotation of melodic audio content for scientific analysis.
gyorgyf@175 120 No other existing program combines pitch and note estimation,
gyorgyf@175 121 a graphical user interface with auditory feedback,
gyorgyf@175 122 rapid, computer-aided correction of pitches,
gyorgyf@175 123 and extensive exporting facilities.
gyorgyf@175 124 **Tony** is freely available for use on Windows, OSX and Linux platforms
gyorgyf@175 125 from http://code.soundsoftware.ac.uk/projects/tony/.
gyorgyf@175 126
gyorgyf@175 127 Acknowledgements.
gyorgyf@175 128 Matthias Mauch is funded by the Royal Academy of Engineering.
gyorgyf@175 129 We would like to thank Justin Salamon, Rachel Bittner and Juan Bello
gyorgyf@175 130 for their comments and coding help.
gyorgyf@175 131
gyorgyf@175 132 Three key references. (APA v6)
gyorgyf@175 133 [1] Pant, S., Rao, V., & Rao, P. (2010). A melody detection user interface for polyphonic music. In 2010 National Conference on Communications (NCC).
gyorgyf@175 134 [2] Mauch, M., Frieler, K., & Dixon, S. (under review). Intonation in Unaccompanied Singing: Accuracy, Drift and a Model of Intonation Memory.
gyorgyf@175 135 [3] Mauch, M., & Dixon, S. (2014). pYIN: A Fundamental Frequency Estimator Using Probabilistic Threshold Distributions. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014).
gyorgyf@175 136
gyorgyf@175 137 Comments/queries to organisers.
gyorgyf@175 138