tony: publications/sempre2014/mauch_sempre2014_GF

annotate publications/sempre2014/mauch_sempre2014_GF_edits.txt @ 698:ee97c742d184 tip

Default branch is now named default on git as well as hg, in case we ever want to switch to mirroring in the other direction

author	Chris Cannam
date	Thu, 27 Aug 2020 15:58:43 +0100
parents	26224791546f
children

rev	line source
gyorgyf@175	1 Paper title.
gyorgyf@175	2 Matthias Mauch and Chris Cannam: Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications
gyorgyf@175	3
gyorgyf@175	4 Abstract.
gyorgyf@175	5 We present Tony, a free, open-source software tool for
gyorgyf@175	6 computer-aided pitch track and note annotation of melodic audio content.
gyorgyf@175	7 The accurate annotation of fundamental frequencies and notes
gyorgyf@175	8 is essential to the scientific study of
gyorgyf@175	9 intonation in singing and other instruments.
gyorgyf@175	10 Unlike commercial applications for singers and producers
gyorgyf@175	11 or other academic tools for generic music annotation and visualisation,
gyorgyf@175	12 Tony has been designed for the scientific study of monophonic music:
gyorgyf@175	13 a) it implements state-of-the-art algorithms for pitch and note estimation from audio,
gyorgyf@175	14 b) it provides visual and auditory feedback of the extracted pitches
gyorgyf@175	15 for the identification of detection errors,
gyorgyf@175	16 c) it provides an intelligent graphical user interface
gyorgyf@175	17 through which the user can identify and rapidly correct estimation errors,
gyorgyf@175	18 d) it provides functions for exporting pitch track and note track
gyorgyf@175	19 enabling further processing in spreadsheets or other applications.
gyorgyf@175	20 Software versions for Windows, OSX and Linux platforms can be downloaded from
gyorgyf@175	21 http://code.soundsoftware.ac.uk/projects/tony
gyorgyf@175	22
gyorgyf@175	23 Keyword 1.
gyorgyf@175	24 Pitch/Note Analysis
gyorgyf@175	25
gyorgyf@175	26 Keyword 2.
gyorgyf@175	27 Software
gyorgyf@175	28
gyorgyf@175	29 Keyword 3.
gyorgyf@175	30 Singing
gyorgyf@175	31
gyorgyf@175	32 Aims.
gyorgyf@175	33 We aim to make the scientific annotation of melodic content more efficient.
gyorgyf@175	34 ==> We aim to make the annotation of melodic content for scientific purposes more efficient.
gyorgyf@175	35 (also, possibly move this sentence to the end)
gyorgyf@175	36
gyorgyf@175	37 Music psychologists interested in the analysis of pitch and intonation
gyorgyf@175	38 usually use software programs originally aimed at the analysis of speech
gyorgyf@175	39 (e.g. Praat http://www.fon.hum.uva.nl/praat/) or generic audio annotation
gyorgyf@175	40 tools (e.g. Sonic Visualiser http://www.sonicvisualiser.org/)
gyorgyf@175	41 to extract pitches of notes from audio recordings.
gyorgyf@175	42 Since these programs were not conceived for musical pitch analysis,
gyorgyf@175	43 the process of extracting note frequencies remains laborious and can take
gyorgyf@175	44 many times the duration of the recording.
gyorgyf@175	45
gyorgyf@175	46 On the other hand, commercial tools such as
gyorgyf@175	47 Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or
gyorgyf@175	48 Sing&See (http://www.singandsee.com/) have
gyorgyf@175	49 unknown frequency estimation procedures (proprietary code)
gyorgyf@175	50 and do not provide export formats needed for scientific analysis.
gyorgyf@175	51
gyorgyf@175	52 ==> Commercial tools such as Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or
gyorgyf@175	53 Sing&See (http://www.singandsee.com/) also exist for these purposes; however,
gyorgyf@175	54 their frequency estimation procedures are typically not public (proprietary code),
gyorgyf@175	55 and they do not provide export formats suitable for scientific analysis.
gyorgyf@175	56
gyorgyf@175	57
gyorgyf@175	58 An academic note annotation system [1] exists, but does not feature
gyorgyf@175	59 note extraction. It is also not openly available.
gyorgyf@175	60
gyorgyf@175	61 ==> A note annotation system [1] developed for academic purposes exists, but it does not feature
gyorgyf@175	62 note extraction. It is also not openly available. (openly ?? => open source, free/prop.? )
gyorgyf@175	63
gyorgyf@175	64 This is why, during our own research on intonation [2],
gyorgyf@175	65 we decided to code our own pitch extraction tool that would avoid the shortcomings.
gyorgyf@175	66
gyorgyf@175	67 ==> This is why we decided to develop our own pitch extraction tool that would avoid
gyorgyf@175	68 the above shortcomings during our own research on intonation [2].
gyorgyf@175	69
gyorgyf@175	70
gyorgyf@175	71 Methods.
gyorgyf@175	72 For automatic pitch and note estimation we use the pYIN method [3].
gyorgyf@175	73 The method provides precise pitch and note estimates and
gyorgyf@175	74 automatically determines which parts of the recording are voiced.
gyorgyf@175	75
gyorgyf@175	76 The graphical user interface is based upon the
gyorgyf@175	77 open source software libraries from Sonic Visualiser.
gyorgyf@175	78
gyorgyf@175	79 ==> The graphical user interface is based upon
gyorgyf@175	80 open source software libraries originally developed for the Sonic Visualiser software.
gyorgyf@175	81
gyorgyf@175	82 It features the audio waveform, a spectrogram representation,
gyorgyf@175	83 the pitch track and notes. Users can scroll and zoom in time.
gyorgyf@175	84 Tony plays back not only the original audio
gyorgyf@175	85 but also, optionally, sonifications of the pitch track (melody line)
gyorgyf@175	86 and the note track (discrete pitches with durations).
gyorgyf@175	87 Notes' pitches are robustly estimated as the median of the pitch track
gyorgyf@175	88 values that fall within the duration of the note.
gyorgyf@175	89
gyorgyf@175	90 (robustly? I know it's good, but nothing really supports the fairly strong statement here…)
gyorgyf@175	91
gyorgyf@175	92 The user can delete, move, cut, merge, crop and extend notes,
gyorgyf@175	93 and the note's frequency is adapted accordingly.
gyorgyf@175	94 The user can delete spurious parts of the pitch track
gyorgyf@175	95 and shift the pitch track in frequency.
gyorgyf@175	96 In order to efficiently correct erroneous pitch tracks, the user can select
gyorgyf@175	97 a time interval, and Tony will provide various alternative
gyorgyf@175	98 pitch tracks. The user can then pick the correct one.
gyorgyf@175	99
gyorgyf@175	100 Outcomes.
gyorgyf@175	101 The system is currently being used for two projects:
gyorgyf@175	102 for the generation of new training and test data for Music Informatics research,
gyorgyf@175	103 and for a new project on intonation in unaccompanied solo singing.
gyorgyf@175	104
gyorgyf@175	105 ==> The system is currently being used for two projects:
gyorgyf@175	106 1) for the generation of new training and test data for Music Informatics research,
gyorgyf@175	107 and 2) a research project on intonation in unaccompanied solo singing.
gyorgyf@175	108
gyorgyf@175	109 Preliminary feedback from users suggests that
gyorgyf@175	110 the system does indeed facilitate pitch annotation
gyorgyf@175	111 and provides vital features that cannot be found in other tools.
gyorgyf@175	112
gyorgyf@175	113
gyorgyf@175	114 Title for final section.
gyorgyf@175	115 Conclusions
gyorgyf@175	116
gyorgyf@175	117 [Q37].
gyorgyf@175	118 We presented Tony, a new software tool for computer-assisted
gyorgyf@175	119 annotation of melodic audio content for scientific analysis.
gyorgyf@175	120 No other existing program combines pitch and note estimation,
gyorgyf@175	121 a graphical user interface with auditory feedback,
gyorgyf@175	122 rapid, computer-aided correction of pitches,
gyorgyf@175	123 and extensive exporting facilities.
gyorgyf@175	124 Tony is freely available for use on Windows, OSX and Linux platforms
gyorgyf@175	125 from http://code.soundsoftware.ac.uk/projects/tony/.
gyorgyf@175	126
gyorgyf@175	127 Acknowledgements.
gyorgyf@175	128 Matthias Mauch is funded by the Royal Academy of Engineering.
gyorgyf@175	129 We would like to thank Justin Salamon, Rachel Bittner and Juan Bello
gyorgyf@175	130 for their comments and coding help.
gyorgyf@175	131
gyorgyf@175	132 Three key references. (APA v6)
gyorgyf@175	133 [1] Pant, S., Rao, V., & Rao, P. (2010). A melody detection user interface for polyphonic music. 2010 National Conference on Communications (NCC).
gyorgyf@175	134 [2] Mauch, M., Frieler, K., & Dixon, S. (under review). Intonation in Unaccompanied Singing: Accuracy, Drift and a Model of Intonation Memory.
gyorgyf@175	135 [3] Mauch, M., & Dixon, S. (2014). pYIN: A Fundamental Frequency Estimator Using Probabilistic Threshold Distributions. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014).
gyorgyf@175	136
gyorgyf@175	137 Comments/queries to organisers.
gyorgyf@175	138
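
The Methods text above says that a note's pitch is estimated as the median of the pitch track over the note's duration. The following is a minimal sketch of that idea, not Tony's actual code: the function name note_pitch_hz, the representation of the pitch track as parallel lists of times and frequencies, and the convention that unvoiced frames carry a frequency of 0 or less are all assumptions made for illustration.

```python
from statistics import median


def note_pitch_hz(times, freqs, onset, duration):
    """Median of the voiced pitch-track values inside [onset, onset + duration).

    times, freqs: parallel lists (seconds, Hz); hypothetical layout.
    Frames with freq <= 0 are treated as unvoiced (an assumption).
    Returns None when the note contains no voiced frames.
    """
    offset = onset + duration
    voiced = [f for t, f in zip(times, freqs)
              if onset <= t < offset and f > 0]
    if not voiced:
        return None
    return median(voiced)


# Illustrative data: a note near 220 Hz with one octave error (440 Hz)
# and one unvoiced frame at the end.
times = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
freqs = [218.0, 219.0, 220.0, 221.0, 440.0, 0.0]

print(note_pitch_hz(times, freqs, onset=0.0, duration=0.06))  # 220.0
```

In this made-up example the single octave error does not move the median, whereas the mean of the same five voiced values would be 263.6 Hz. That is the sense in which a median is less sensitive to outliers; it is not evidence about Tony's accuracy on real recordings.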
