tony: publications/sempre2014/mauch_sempre2014_GF

annotate publications/sempre2014/mauch_sempre2014_GF_edits.txt @ 352:7178bb4dcdfb v0.6

author	Chris Cannam
date	Mon, 16 Jun 2014 13:04:09 +0100
parents	26224791546f
children

rev	line source
gyorgyf@175	1 Paper title.
gyorgyf@175	2 Matthias Mauch and Chris Cannam: Efficient Computer-Aided Pitch Track and Note Estimation for Scientific Applications
gyorgyf@175	3
gyorgyf@175	4 Abstract.
gyorgyf@175	5 We present Tony, a free, open-source software tool for
gyorgyf@175	6 computer-aided pitch track and note annotation of melodic audio content.
gyorgyf@175	7 The accurate annotation of fundamental frequencies and notes
gyorgyf@175	8 is essential to the scientific study of
gyorgyf@175	9 intonation in singing and other instruments.
gyorgyf@175	10 Unlike commercial applications for singers and producers
gyorgyf@175	11 or other academic tools for generic music annotation and visualisation,
gyorgyf@175	12 Tony has been designed for the scientific study of monophonic music:
gyorgyf@175	13 a) it implements state-of-the-art algorithms for pitch and note estimation from audio,
gyorgyf@175	14 b) it provides visual and auditory feedback of the extracted pitches
gyorgyf@175	15 for the identification of detection errors,
gyorgyf@175	16 c) it provides an intelligent graphical user interface
gyorgyf@175	17 through which the user can identify and rapidly correct estimation errors,
gyorgyf@175	18 and d) it provides functions for exporting the pitch track and note track,
gyorgyf@175	19 enabling further processing in spreadsheets or other applications.
gyorgyf@175	20 Software versions for Windows, OSX and Linux platforms can be downloaded from
gyorgyf@175	21 http://code.soundsoftware.ac.uk/projects/tony
gyorgyf@175	22
gyorgyf@175	23 Keyword 1.
gyorgyf@175	24 Pitch/Note Analysis
gyorgyf@175	25
gyorgyf@175	26 Keyword 2.
gyorgyf@175	27 Software
gyorgyf@175	28
gyorgyf@175	29 Keyword 3.
gyorgyf@175	30 Singing
gyorgyf@175	31
gyorgyf@175	32 Aims.
gyorgyf@175	33 We aim to make the scientific annotation of melodic content more efficient.
gyorgyf@175	34 ==> We aim to make the annotation of melodic content for scientific purposes more efficient.
gyorgyf@175	35 (also, possibly move this sentence to the end)
gyorgyf@175	36
gyorgyf@175	37 Music psychologists interested in the analysis of pitch and intonation
gyorgyf@175	38 usually use software programs originally aimed at the analysis of speech
gyorgyf@175	39 (e.g. Praat http://www.fon.hum.uva.nl/praat/) or generic audio annotation
gyorgyf@175	40 tools (e.g. Sonic Visualiser http://www.sonicvisualiser.org/)
gyorgyf@175	41 to extract pitches of notes from audio recordings.
gyorgyf@175	42 Since these programs were not conceived for musical pitch analysis,
gyorgyf@175	43 the process of extracting note frequencies remains laborious and can take
gyorgyf@175	44 many times the duration of the recording.
gyorgyf@175	45
gyorgyf@175	46 On the other hand, commercial tools such as
gyorgyf@175	47 Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or
gyorgyf@175	48 Sing&See (http://www.singandsee.com/) have
gyorgyf@175	49 unknown frequency estimation procedures (proprietary code)
gyorgyf@175	50 and do not provide export formats needed for scientific analysis.
gyorgyf@175	51
gyorgyf@175	52 ==> Commercial tools such as Melodyne (http://www.celemony.com/), Songs2See (http://www.songs2see.com/) or
gyorgyf@175	53 Sing&See (http://www.singandsee.com/) also exist for these purposes; however,
gyorgyf@175	54 their frequency estimation procedures are typically not public (proprietary code),
gyorgyf@175	55 and they do not provide export formats suitable for scientific analysis.
gyorgyf@175	56
gyorgyf@175	57
gyorgyf@175	58 An academic note annotation system [1] exists, but does not feature
gyorgyf@175	59 note extraction. It is also not openly available.
gyorgyf@175	60
gyorgyf@175	61 ==> A note annotation system [1] developed for academic purposes exists, but it does not feature
gyorgyf@175	62 note extraction. It is also not openly available. (openly ?? => open source, free/prop.? )
gyorgyf@175	63
gyorgyf@175	64 This is why, during our own research on intonation [2],
gyorgyf@175	65 we decided to code our own pitch extraction tool that would avoid the shortcomings.
gyorgyf@175	66
gyorgyf@175	67 ==> This is why, during our own research on intonation [2], we decided to develop
gyorgyf@175	68 our own pitch extraction tool that would avoid the above shortcomings.
gyorgyf@175	69
gyorgyf@175	70
gyorgyf@175	71 Methods.
gyorgyf@175	72 For automatic pitch and note estimation we use the pYIN method [3].
gyorgyf@175	73 The method provides precise pitch and note estimates and
gyorgyf@175	74 automatically determines which parts of the recording are voiced.
gyorgyf@175	75
gyorgyf@175	76 The graphical user interface is based upon the
gyorgyf@175	77 open source software libraries from Sonic Visualiser.
gyorgyf@175	78
gyorgyf@175	79 ==> The graphical user interface is based upon
gyorgyf@175	80 open source software libraries originally developed for the Sonic Visualiser software.
gyorgyf@175	81
gyorgyf@175	82 It features the audio waveform, a spectrogram representation,
gyorgyf@175	83 the pitch track and notes. Users can scroll and zoom in time.
gyorgyf@175	84 Tony not only plays back the original audio,
gyorgyf@175	85 but also, optionally, sonifications of the pitch track (melody line)
gyorgyf@175	86 and the note track (discrete pitches with durations).
gyorgyf@175	87 Notes' pitches are robustly estimated as the median of the pitch track
gyorgyf@175	88 that occurs during the duration of the note.
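
The median-based note pitch described above can be illustrated with a minimal sketch. This is not Tony's actual implementation: the function name `note_pitch_hz`, the representation of the pitch track as `(time, f0)` pairs, and the convention that a non-positive `f0` marks an unvoiced frame are all assumptions made for the example.

```python
from statistics import median

def note_pitch_hz(pitch_track, onset, offset):
    """Return the median f0 (Hz) of voiced frames with onset <= time < offset.

    pitch_track is a list of (time_seconds, f0_hz) pairs; this layout and the
    use of f0 <= 0 for unvoiced frames are assumptions of this sketch.
    Returns None if the note contains no voiced frames.
    """
    frames = [f0 for t, f0 in pitch_track if onset <= t < offset and f0 > 0]
    if not frames:
        return None
    return median(frames)

# Hypothetical pitch track with one octave error (440 Hz) inside a note near 220 Hz.
track = [(0.00, 220.0), (0.01, 221.0), (0.02, 440.0), (0.03, 219.0), (0.04, 220.5)]
pitch = note_pitch_hz(track, 0.0, 0.05)
print(pitch)  # 220.5
```

In this made-up example the single octave error leaves the median at 220.5 Hz, whereas the mean of the same frames would be pulled up to 264.1 Hz. This illustrates why a median is a plausible choice here, but it is not evidence about the tool's accuracy.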
gyorgyf@175	89
gyorgyf@175	90 (robustly? I know it's good, but nothing really supports the fairly strong statement here…)
gyorgyf@175	91
gyorgyf@175	92 The user can delete, move, cut, merge, crop and extend notes,
gyorgyf@175	93 and the note's frequency is adapted accordingly.
gyorgyf@175	94 The user can delete spurious parts of the pitch track
gyorgyf@175	95 and shift the pitch track in frequency.
gyorgyf@175	96 In order to efficiently correct erroneous pitch tracks, the user can select
gyorgyf@175	97 a time interval, and Tony will provide various alternative
gyorgyf@175	98 pitch tracks. The user can then pick the correct one.
gyorgyf@175	99
gyorgyf@175	100 Outcomes.
gyorgyf@175	101 The system is currently being used for two projects:
gyorgyf@175	102 for the generation of new training and test data for Music Informatics research,
gyorgyf@175	103 and for a new project on intonation in unaccompanied solo singing.
gyorgyf@175	104
gyorgyf@175	105 ==> The system is currently being used for two projects:
gyorgyf@175	106 1) the generation of new training and test data for Music Informatics research,
gyorgyf@175	107 and 2) a research project on intonation in unaccompanied solo singing.
gyorgyf@175	108
gyorgyf@175	109 Preliminary feedback from users suggests that
gyorgyf@175	110 the system does indeed facilitate pitch annotation
gyorgyf@175	111 and provides vital features that cannot be found in other tools.
gyorgyf@175	112
gyorgyf@175	113
gyorgyf@175	114 Title for final section.
gyorgyf@175	115 Conclusions
gyorgyf@175	116
gyorgyf@175	117 [Q37].
gyorgyf@175	118 We presented Tony, a new software tool for computer-assisted
gyorgyf@175	119 annotation of melodic audio content for scientific analysis.
gyorgyf@175	120 No other existing program combines pitch and note estimation,
gyorgyf@175	121 a graphical user interface with auditory feedback,
gyorgyf@175	122 rapid, computer-aided correction of pitches,
gyorgyf@175	123 and extensive exporting facilities.
gyorgyf@175	124 Tony is freely available for use on Windows, OSX and Linux platforms
gyorgyf@175	125 from http://code.soundsoftware.ac.uk/projects/tony/.
gyorgyf@175	126
gyorgyf@175	127 Acknowledgements.
gyorgyf@175	128 Matthias Mauch is funded by the Royal Academy of Engineering.
gyorgyf@175	129 We would like to thank Justin Salamon, Rachel Bittner and Juan Bello
gyorgyf@175	130 for their comments and coding help.
gyorgyf@175	131
gyorgyf@175	132 Three key references. (APA v6)
gyorgyf@175	133 [1] Pant, S., Rao, V., & Rao, P. (2010). A melody detection user interface for polyphonic music. 2010 National Conference on Communications (NCC).
gyorgyf@175	134 [2] Mauch, M., Frieler, K., & Dixon, S. (under review). Intonation in Unaccompanied Singing: Accuracy, Drift and a Model of Intonation Memory.
gyorgyf@175	135 [3] Mauch, M., & Dixon, S. (2014). pYIN: A Fundamental Frequency Estimator Using Probabilistic Threshold Distributions. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014).
gyorgyf@175	136
gyorgyf@175	137 Comments/queries to organisers.
gyorgyf@175	138
