view publications/sempre2014/mauch_sempre2014_GF_edits.txt @ 492:6e484c58ca25 recording
author | Chris Cannam |
---|---|
date | Mon, 12 Oct 2015 13:24:12 +0100 |
Paper title. Matthias Mauch and Chris Cannam: Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications

Abstract. We present **Tony**, a free, open-source software tool for computer-aided pitch track and note annotation of melodic audio content. The accurate annotation of fundamental frequencies and notes is essential to the scientific study of intonation in singing and other instruments. Unlike commercial applications for singers and producers, or other academic tools for generic music annotation and visualisation, **Tony** has been designed for the scientific study of monophonic music: a) it implements state-of-the-art algorithms for pitch and note estimation from audio, b) it provides visual and auditory feedback of the extracted pitches for the identification of detection errors, c) it provides an intelligent graphical user interface through which the user can identify and rapidly correct estimation errors, d) it provides functions for exporting pitch track and note track, enabling further processing in spreadsheets or other applications. Software versions for Windows, OSX and Linux platforms can be downloaded from http://code.soundsoftware.ac.uk/projects/tony

Keyword 1. Pitch/Note Analysis
Keyword 2. Software
Keyword 3. Singing

Aims. We aim to make the scientific annotation of melodic content more efficient.
==> We aim to make the annotation of melodic content for scientific purposes more efficient. (also, possibly move this sentence to the end)
Music psychologists interested in the analysis of pitch and intonation usually use software programs originally aimed at the analysis of speech (e.g. Praat http://www.fon.hum.uva.nl/praat/) or generic audio annotation tools (e.g. Sonic Visualiser http://www.sonicvisualiser.org/) to extract pitches of notes from audio recordings.
Since these programs were not conceived for musical pitch analysis, the process of extracting note frequencies remains laborious and can take many times the duration of the recording. On the other hand, commercial tools such as Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or Sing&See (http://www.singandsee.com/) have unknown frequency estimation procedures (proprietary code) and do not provide export formats needed for scientific analysis.
==> Commercial tools such as Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or Sing&See (http://www.singandsee.com/) also exist for these purposes; however, their frequency estimation procedures are typically not public (proprietary code), and they do not provide export formats suitable for scientific analysis.
An academic note annotation system [1] exists, but does not feature note extraction. It is also not openly available.
==> A note annotation system [1] developed for academic purposes exists, but it does not feature note extraction. It is also not openly available. (openly ?? => open source, free/prop.? )
This is why, during our own research on intonation [2], we decided to code our own pitch extraction tool that would avoid the shortcomings.
==> This is why we decided to develop our own pitch extraction tool that would avoid the above shortcomings during our own research on intonation [2].

Methods. For automatic pitch and note estimation we use the pYIN method [3]. The method provides precise pitch and note estimates and automatically determines which parts of the recording are voiced. The graphical user interface is based upon the open source software libraries from Sonic Visualiser.
==> The graphical user interface is based upon open source software libraries originally developed for the Sonic Visualiser software.
It features the audio waveform, a spectrogram representation, the pitch track and notes. Users can scroll and zoom in time.
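
Illustration (voiced parts of a pitch track). To make the voicing decision mentioned above concrete, the following minimal Python sketch groups the voiced frames of a pitch track into time segments. It is not the pYIN method and not Tony's code. The function name `voiced_segments`, the sample values, and the convention that an unvoiced frame carries a non-positive frequency are all assumptions made for this illustration.

```python
def voiced_segments(times, freqs):
    """Group consecutive voiced frames into (start_time, end_time) pairs.

    Assumption (not from the abstract): a frame counts as unvoiced
    when its frequency value is <= 0.
    """
    segments = []
    start = None   # time of the first frame of the current voiced run
    last = None    # time of the most recent voiced frame
    for t, f in zip(times, freqs):
        if f > 0:
            if start is None:
                start = t
            last = t
        elif start is not None:
            # An unvoiced frame closes the current voiced run.
            segments.append((start, last))
            start = None
    if start is not None:
        # The track ended while still voiced.
        segments.append((start, last))
    return segments


# Hypothetical pitch track: frame times in seconds, frequencies in Hz.
times = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
freqs = [0.0, 220.0, 221.0, 0.0, 330.0, 331.0]
print(voiced_segments(times, freqs))  # [(0.01, 0.02), (0.04, 0.05)]
```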
**Tony** not only plays back the original audio but also, optionally, sonifications of the pitch track (melody line) and the note track (discrete pitches with durations). Notes' pitches are robustly estimated as the median of the pitch track that occurs during the duration of the note. (robustly? I know it's good, but nothing really supports the fairly strong statement here…)
The user can delete, move, cut, merge, crop and extend notes, and the note's frequency is adapted accordingly. The user can delete spurious parts of the pitch track and shift the pitch track in frequency. In order to efficiently correct erroneous pitch tracks, the user can select a time interval, and **Tony** will provide various alternative pitch tracks. The user can then pick the correct one.

Outcomes. The system is currently being used for two projects: for the generation of new training and test data for Music Informatics research, and for a new project on intonation in unaccompanied solo singing.
==> The system is currently being used for two projects: 1) for the generation of new training and test data for Music Informatics research, and 2) for a research project on intonation in unaccompanied solo singing.
Preliminary feedback from the users suggests that the system does indeed facilitate pitch annotation and provides vital features that cannot be found in other tools.

Title for final section. Conclusions [Q37]. We presented **Tony**, a new software tool for computer-assisted annotation of melodic audio content for scientific analysis. No other existing program combines pitch and note estimation, a graphical user interface with auditory feedback, rapid, computer-aided correction of pitches, and extensive exporting facilities. **Tony** is freely available for use on Windows, OSX and Linux platforms from http://code.soundsoftware.ac.uk/projects/tony/.

Acknowledgements. Matthias Mauch is funded by the Royal Academy of Engineering.
We would like to thank Justin Salamon, Rachel Bittner and Juan Bello for their comments and coding help.

Three key references. (APA v6)
[1] Pant, S., Rao, V., & Rao, P. (2010). A melody detection user interface for polyphonic music. In 2010 National Conference on Communications (NCC).
[2] Mauch, M., Frieler, K., & Dixon, S. (under review). Intonation in unaccompanied singing: Accuracy, drift and a model of intonation memory.
[3] Mauch, M., & Dixon, S. (2014). pYIN: A fundamental frequency estimator using probabilistic threshold distributions. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014).

Comments/queries to organisers.
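
Illustration (note pitch as a median). The Methods section states that a note's pitch is estimated as the median of the pitch track during the note. The minimal Python sketch below shows that computation under stated assumptions; it is not Tony's actual code. The function name `note_pitch`, the half-open interval from onset to offset, the rule that non-positive frequencies mark unvoiced frames, and the sample values are all hypothetical.

```python
from statistics import median


def note_pitch(times, freqs, onset, offset):
    """Median pitch (Hz) of the voiced frames with onset <= time < offset.

    Assumptions (not from the abstract): frames with frequency <= 0 are
    unvoiced and ignored; None is returned if no voiced frame is found.
    """
    inside = [f for t, f in zip(times, freqs)
              if onset <= t < offset and f > 0]
    return median(inside) if inside else None


# Hypothetical pitch track with one octave error (440 Hz) and one
# unvoiced frame (0 Hz) inside the note.
times = [0.00, 0.01, 0.02, 0.03, 0.04]
freqs = [219.0, 220.0, 440.0, 221.0, 0.0]
print(note_pitch(times, freqs, 0.0, 0.05))  # 220.5
```

In this example the mean of the four voiced frames would be 275 Hz, whereas the median stays at 220.5 Hz, close to the three consistent frames. One way to adapt a note's frequency after an edit such as a merge or crop would be to call the same function again with the new onset and offset; whether Tony does exactly this is not stated in the abstract.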