comparison Code/Descriptors/yin/yin.html @ 4:92ca03a8fa99 tip
Update to ICASSP 2013 benchmark
author | Dawn Black |
---|---|
date | Wed, 13 Feb 2013 11:02:39 +0000 |
3:e1cfa7765647 | 4:92ca03a8fa99 |
---|---|
1 <html> | |
2 <head> | |
3 <title> YIN </title> | |
4 </head> | |
5 | |
6 <body> | |
7 | |
8 See README for copyright information. | |
9 | |
10 <hr> | |
11 <h2> YIN: fundamental frequency estimator </h2> | |
12 <hr> | |
13 | |
14 YIN estimates the fundamental frequency (F0) of an audio signal. | |
15 Features are: | |
16 <ul> | |
17 <li> Reliability (based on tests, see reference below). | |
18 <li> Accuracy (subsample resolution). | |
19 <li> Wide search range (default is 30 Hz to sr/4, where sr is the sampling rate). | |
20 <li> Good temporal resolution. | |
21 <li> Ease of use. | |
22 </ul> | |
23 | |
24 YIN operates on vectors or files. YIN outputs a structure containing a set | |
25 of four vectors: F0 vs time, two estimates of aperiodic/total power | |
26 (one gross estimate, one fine estimate), and a period-smoothed estimate | |
27 of instantaneous power. | |
28 <p> | |
29 If no output argument is specified, YIN | |
30 plots F0 as a function of time (in octaves re: 440 Hz), aperiodicity, and power. | |
31 <p> | |
32 In the F0 plot, samples in blue are reckoned reliable (aperiodicity&lt;threshold), | |
33 green are intermediate (aperiodicity&lt;2*threshold), and | |
34 yellow unreliable (aperiodicity&gt;2*threshold). | |
35 | |
36 <p> | |
37 Type 'help yin' for a description of the parameters. Read the reference below and | |
38 the code to understand their meaning. In brief: | |
39 <ul> | |
40 <li> To increase speed: increase 'hop' or 'minf0'. | |
41 <li> To reduce memory needs: reduce 'bufsize', or increase 'hop' or 'minf0'. | |
42 <li> To slightly increase reliability: reduce 'hop'. | |
43 <li> To slightly increase precision: upsample before processing. | |
44 <li> To improve temporal resolution: increase 'minf0', decrease 'hop'. | |
45 <li> To process lower F0s: reduce 'minf0'. Higher F0s: upsample and increase 'maxf0'. | |
46 <li> To avoid subharmonic errors: increase 'thresh'. | |
47 <li> To avoid harmonic/formant errors: reduce 'thresh'. | |
48 <li> Make sure that the range [minf0 maxf0] includes the expected f0. | |
49 </ul> | |
50 Parameter 'thresh' sets the proportion of aperiodic power that is | |
51 tolerated within a "periodic" signal. This may vary according to the application. | |
52 <p> | |
53 For speech or musical instruments a value of 0.1 is usually adequate. Singing voice | |
54 may require a smaller value (as low as 0.001) if a harmonic is reinforced by a | |
55 sharp formant. | |
56 <p> | |
57 Some signals are inherently ambiguous. For example | |
58 the response of a high-Q resonator excited by a pulse train may be seen either as a complex | |
59 tone with an F0 equal to that of the pulse train, or as an | |
60 amplitude modulated pure tone with an F0 equal to the resonant frequency. | |
61 Neither is more "correct" than the other. To obtain the result | |
62 that you expect, you must set the threshold to an appropriate value: small for the | |
63 fundamental periodicity, large for the resonance periodicity. | |
64 | |
65 <p> | |
66 YIN is described in: | |
67 <br><tt> | |
68 de Cheveigné, A., and Kawahara, H. (2002). "YIN, a fundamental frequency estimator | |
69 for speech and music," J. Acoust. Soc. Am., 111, 1917-1930. (<a href= | |
70 "http://www.ircam.fr/pcm/cheveign/ps/yin.pdf">pdf</a>) | |
71 </tt> | |
72 | |
73 <hr> | |
74 [Code is <a href="http://www.ircam.fr/pcm/cheveign/sw/yin.zip">here</a>] | |
75 [<a href="http://www.ircam.fr/pcm/cheveign">Alain de Cheveigné</a>] | |
76 | |
77 | |
78 | |
79 | |
80 | |
81 </body> | |
82 </html> | |
83 | |
84 |
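
The F0 plot described in yin.html (F0 in octaves re: 440 Hz, with samples colored by aperiodicity relative to the threshold) can be illustrated with a small sketch. This is not part of the YIN distribution: the function names `hz_to_octaves_re_440` and `reliability_color` are hypothetical, the default threshold of 0.1 is borrowed from the value the text calls usually adequate for speech or musical instruments, and the handling of values exactly equal to a boundary (assigned to the less reliable class) is an assumption, since the text does not specify it.

```python
import math


def hz_to_octaves_re_440(f0_hz):
    """Convert a frequency in Hz to octaves relative to 440 Hz."""
    return math.log2(f0_hz / 440.0)


def reliability_color(aperiodicity, thresh=0.1):
    """Map an aperiodicity value to the plot color described in the text.

    blue   : aperiodicity < thresh      (reckoned reliable)
    green  : aperiodicity < 2 * thresh  (intermediate)
    yellow : otherwise                  (unreliable)

    Assumption: a value exactly on a boundary falls into the less
    reliable class, because the text leaves that case open.
    """
    if aperiodicity < thresh:
        return "blue"
    if aperiodicity < 2 * thresh:
        return "green"
    return "yellow"


if __name__ == "__main__":
    # Made-up (F0 in Hz, aperiodicity) pairs, for illustration only.
    samples = [(220.0, 0.05), (440.0, 0.15), (880.0, 0.30)]
    for f0_hz, aperiodicity in samples:
        octaves = hz_to_octaves_re_440(f0_hz)
        color = reliability_color(aperiodicity)
        print(f0_hz, octaves, color)
```

Passing a different `thresh` shifts both boundaries together, which mirrors how the text ties the blue/green/yellow classes to the threshold and to twice the threshold.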