annotate publications/sempre2014/mauch_sempre2014_GF_edits.txt @ 352:7178bb4dcdfb v0.6

author Chris Cannam
date Mon, 16 Jun 2014 13:04:09 +0100
parents 26224791546f
children
rev   line source
gyorgyf@175 1 Paper title.
gyorgyf@175 2 Matthias Mauch and Chris Cannam: Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications
gyorgyf@175 3
gyorgyf@175 4 Abstract.
gyorgyf@175 5 We present **Tony**, a free, open-source software tool for
gyorgyf@175 6 computer-aided pitch track and note annotation of melodic audio content.
gyorgyf@175 7 The accurate annotation of fundamental frequencies and notes
gyorgyf@175 8 is essential to the scientific study of
gyorgyf@175 9 intonation in singing and other instruments.
gyorgyf@175 10 Unlike commercial applications for singers and producers
gyorgyf@175 11 or other academic tools for generic music annotation and visualisation,
gyorgyf@175 12 **Tony** has been designed for the scientific study of monophonic music:
gyorgyf@175 13 a) it implements state-of-the-art algorithms for pitch and note estimation from audio,
gyorgyf@175 14 b) it provides visual and auditory feedback of the extracted pitches
gyorgyf@175 15 for the identification of detection errors,
gyorgyf@175 16 c) it provides an intelligent graphical user interface
gyorgyf@175 17 through which the user can identify and rapidly correct estimation errors,
gyorgyf@175 18 d) it provides functions for exporting pitch track and note track
gyorgyf@175 19 enabling further processing in spreadsheets or other applications.
gyorgyf@175 20 Software versions for Windows, OSX and Linux platforms can be downloaded from
gyorgyf@175 21 http://code.soundsoftware.ac.uk/projects/tony
gyorgyf@175 22
gyorgyf@175 23 Keyword 1.
gyorgyf@175 24 Pitch/Note Analysis
gyorgyf@175 25
gyorgyf@175 26 Keyword 2.
gyorgyf@175 27 Software
gyorgyf@175 28
gyorgyf@175 29 Keyword 3.
gyorgyf@175 30 Singing.
gyorgyf@175 31
gyorgyf@175 32 Aims.
gyorgyf@175 33 We aim to make the scientific annotation of melodic content more efficient.
gyorgyf@175 34 ==> We aim to make the annotation of melodic content for scientific purposes more efficient.
gyorgyf@175 35 (also, possibly move this sentence to the end)
gyorgyf@175 36
gyorgyf@175 37 Music psychologists interested in the analysis of pitch and intonation
gyorgyf@175 38 usually use software programs originally aimed at the analysis of speech
gyorgyf@175 39 (e.g. Praat http://www.fon.hum.uva.nl/praat/) or generic audio annotation
gyorgyf@175 40 tools (e.g. Sonic Visualiser http://www.sonicvisualiser.org/)
gyorgyf@175 41 to extract pitches of notes from audio recordings.
gyorgyf@175 42 Since these programs were not conceived for musical pitch analysis,
gyorgyf@175 43 the process of extracting note frequencies remains laborious and can take
gyorgyf@175 44 many times the duration of the recording.
gyorgyf@175 45
gyorgyf@175 46 On the other hand, commercial tools such as
gyorgyf@175 47 Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or
gyorgyf@175 48 Sing&See (http://www.singandsee.com/) have
gyorgyf@175 49 unknown frequency estimation procedures (proprietary code)
gyorgyf@175 50 and do not provide export formats needed for scientific analysis.
gyorgyf@175 51
gyorgyf@175 52 ==> Commercial tools such as Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or
gyorgyf@175 53 Sing&See (http://www.singandsee.com/) also exist for these purposes; however,
gyorgyf@175 54 their frequency estimation procedures are typically not public (proprietary code),
gyorgyf@175 55 and they do not provide export formats suitable for scientific analysis.
gyorgyf@175 56
gyorgyf@175 57
gyorgyf@175 58 An academic note annotation system [1] exists, but does not feature
gyorgyf@175 59 note extraction. It is also not openly available.
gyorgyf@175 60
gyorgyf@175 61 ==> A note annotation system [1] developed for academic purposes exists, but it does not feature
gyorgyf@175 62 note extraction. It is also not openly available. (openly ?? => open source, free/prop.? )
gyorgyf@175 63
gyorgyf@175 64 This is why, during our own research on intonation [2],
gyorgyf@175 65 we decided to code our own pitch extraction tool that would avoid the shortcomings.
gyorgyf@175 66
gyorgyf@175 67 ==> This is why, during our own research on intonation [2], we decided to develop
gyorgyf@175 68 our own pitch extraction tool that would avoid the above shortcomings.
gyorgyf@175 69
gyorgyf@175 70
gyorgyf@175 71 Methods.
gyorgyf@175 72 For automatic pitch and note estimation we use the pYIN method [3].
gyorgyf@175 73 The method provides precise pitch and note estimates and
gyorgyf@175 74 automatically determines which parts of the recording are voiced.
gyorgyf@175 75
gyorgyf@175 76 The graphical user interface is based upon the
gyorgyf@175 77 open source software libraries from Sonic Visualiser.
gyorgyf@175 78
gyorgyf@175 79 ==> The graphical user interface is based upon
gyorgyf@175 80 open source software libraries originally developed for the Sonic Visualiser software.
gyorgyf@175 81
gyorgyf@175 82 It features the audio waveform, a spectrogram representation,
gyorgyf@175 83 the pitch track and notes. Users can scroll and zoom in time.
gyorgyf@175 84 **Tony** plays back not only the original audio,
gyorgyf@175 85 but also, optionally, sonifications of the pitch track (melody line)
gyorgyf@175 86 and the note track (discrete pitches with durations).
gyorgyf@175 87 Notes' pitches are robustly estimated as the median of the pitch track
gyorgyf@175 88 that occurs during the duration of the note.
gyorgyf@175 89
gyorgyf@175 90 (robustly? I know it's good, but nothing really supports the fairly strong statement here…)
gyorgyf@175 91
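The median-based note pitch estimate described above can be illustrated with a minimal sketch. This is not Tony's actual code: the function name `note_pitch`, the representation of the pitch track as `(time, frequency)` pairs, and the convention that non-positive frequencies mark unvoiced frames are all assumptions made for illustration. The example also shows why the median tolerates a single octave error where the mean does not.

```python
from statistics import mean, median
from typing import List, Optional, Tuple

# Hypothetical representation: one (time in seconds, frequency in Hz) pair
# per analysis frame; a frequency <= 0 marks an unvoiced frame (assumption).
PitchTrack = List[Tuple[float, float]]


def note_pitch(track: PitchTrack, onset: float, offset: float) -> Optional[float]:
    """Return the median frequency of the voiced frames in [onset, offset)."""
    voiced = [f for t, f in track if onset <= t < offset and f > 0]
    return median(voiced) if voiced else None


# Five frames around 220 Hz, one of them an octave error (440 Hz).
track = [(0.00, 220.0), (0.01, 221.0), (0.02, 219.0),
         (0.03, 440.0), (0.04, 220.0), (0.05, 330.0)]

estimate = note_pitch(track, 0.0, 0.05)                # median: 220.0
naive = mean(f for t, f in track if 0.0 <= t < 0.05)   # mean: 264.0
```

The single outlying frame moves the mean by 44 Hz but leaves the median unchanged, which is the sense in which a median estimate can be called robust.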
gyorgyf@175 92 The user can delete, move, cut, merge, crop and extend notes,
gyorgyf@175 93 and the note's frequency is adapted accordingly.
gyorgyf@175 94 The user can delete spurious parts of the pitch track
gyorgyf@175 95 and shift the pitch track in frequency.
gyorgyf@175 96 In order to efficiently correct erroneous pitch tracks, the user can select
gyorgyf@175 97 a time interval, and **Tony** will provide various alternative
gyorgyf@175 98 pitch tracks. The user can then pick the correct one.
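As a rough illustration of how a note's frequency could be adapted after an edit, the sketch below merges two notes and re-estimates the pitch over the new time span. It assumes that re-estimation means taking the median of the voiced pitch track frames inside the edited note; the `Note` class and the functions `estimate_pitch` and `merge_notes` are hypothetical names and do not reflect Tony's internal API.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional, Tuple

PitchTrack = List[Tuple[float, float]]  # (time in s, frequency in Hz); <= 0 means unvoiced (assumption)


@dataclass
class Note:
    onset: float                 # seconds
    offset: float                # seconds
    pitch_hz: Optional[float]    # None if no voiced frame falls inside the note


def estimate_pitch(track: PitchTrack, onset: float, offset: float) -> Optional[float]:
    """Median of the voiced frames in [onset, offset), or None if there are none."""
    voiced = [f for t, f in track if onset <= t < offset and f > 0]
    return median(voiced) if voiced else None


def merge_notes(a: Note, b: Note, track: PitchTrack) -> Note:
    """Merge two notes into one spanning both, then re-estimate its pitch."""
    onset = min(a.onset, b.onset)
    offset = max(a.offset, b.offset)
    return Note(onset, offset, estimate_pitch(track, onset, offset))
```

Other edits such as cropping or extending a note would follow the same pattern: change the onset or offset, then call `estimate_pitch` again over the new span.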
gyorgyf@175 99
gyorgyf@175 100 Outcomes.
gyorgyf@175 101 The system is currently being used for two projects:
gyorgyf@175 102 for the generation of new training and test data for Music Informatics research,
gyorgyf@175 103 and for a new project on intonation in unaccompanied solo singing.
gyorgyf@175 104
gyorgyf@175 105 ==> The system is currently being used for two projects:
gyorgyf@175 106 1) for the generation of new training and test data for Music Informatics research,
gyorgyf@175 107 and 2) for a research project on intonation in unaccompanied solo singing.
gyorgyf@175 108
gyorgyf@175 109 Preliminary feedback by the users suggests that
gyorgyf@175 110 the system does indeed facilitate pitch annotation
gyorgyf@175 111 and provides vital features that cannot be found in other tools.
gyorgyf@175 112
gyorgyf@175 113
gyorgyf@175 114 Title for final section.
gyorgyf@175 115 Conclusions
gyorgyf@175 116
gyorgyf@175 117 [Q37].
gyorgyf@175 118 We presented **Tony**, a new software tool for computer-assisted
gyorgyf@175 119 annotation of melodic audio content for scientific analysis.
gyorgyf@175 120 No other existing program combines pitch and note estimation,
gyorgyf@175 121 a graphical user interface with auditory feedback,
gyorgyf@175 122 rapid, computer-aided correction of pitches,
gyorgyf@175 123 and extensive exporting facilities.
gyorgyf@175 124 **Tony** is freely available for use on Windows, OSX and Linux platforms
gyorgyf@175 125 from http://code.soundsoftware.ac.uk/projects/tony/.
gyorgyf@175 126
gyorgyf@175 127 Acknowledgements.
gyorgyf@175 128 Matthias Mauch is funded by the Royal Academy of Engineering.
gyorgyf@175 129 We would like to thank Justin Salamon, Rachel Bittner and Juan Bello
gyorgyf@175 130 for their comments and coding help.
gyorgyf@175 131
gyorgyf@175 132 Three key references. (APA v6)
gyorgyf@175 133 [1] Pant, S., Rao, V., & Rao, P. (2010). A melody detection user interface for polyphonic music. In 2010 National Conference on Communications (NCC).
gyorgyf@175 134 [2] Mauch, M., Frieler, K., & Dixon, S. (under review). Intonation in unaccompanied singing: Accuracy, drift and a model of intonation memory.
gyorgyf@175 135 [3] Mauch, M., & Dixon, S. (2014). pYIN: A fundamental frequency estimator using probabilistic threshold distributions. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014).
gyorgyf@175 136
gyorgyf@175 137 Comments/queries to organisers.
gyorgyf@175 138