Mercurial > hg > camir-aes2014
comparison toolboxes/MIRtoolbox1.3.2/MIRToolboxDemos/demo4segmentation.m @ 0:e9a9cd732c1e tip
first hg version after svn
author | wolffd |
---|---|
date | Tue, 10 Feb 2015 15:05:51 +0000 |
function demo4segmentation
% Get familiar with some approaches to segmenting audio files using
% MIRtoolbox.

% 1. Load an audio file (for instance, guitar.wav).
a = miraudio('guitar');

% 2. We will perform the segmentation strategy proposed in (Foote &
% Cooper, 2003). First, decompose the file into successive frames of 50 ms
% without overlap.
help mirframe
fr = mirframe(a,0.05,1)
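% Illustration only (plain MATLAB, not MIRtoolbox internals): a rough
% sketch of the arithmetic behind a 50 ms framing with a hop equal to the
% frame length. The sampling rate and duration are assumed values, not
% properties of guitar.wav, and all toy* names are hypothetical.
toySr = 44100;                      % assumed sampling rate (Hz)
toyDur = 2;                         % assumed signal duration (s)
toyFrameLen = round(0.05*toySr);    % frame length in samples
toyHop = toyFrameLen;               % hop of one frame length: no overlap
toyNumFrames = floor((toySr*toyDur - toyFrameLen)/toyHop) + 1;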

% 3. Compute the spectrum representation (FFT) of the frames.
sp = mirspectrum(fr)
clear fr
% (Remove from memory any data that will no longer be used.)

% 4. Compute the similarity matrix that shows the similarity between the
% spectra of different frames.
help mirsimatrix
sm = mirsimatrix(sp)
clear sp
% Look at the structures shown in the matrix and relate them to the
% structure heard when listening to the extract.

% 5. Estimate the novelty score related to the similarity matrix. It
% consists of a convolution along the diagonal of the matrix with a
% checkerboard Gaussian kernel. Use the novelty function for that purpose.
help mirnovelty
nv = mirnovelty(sm)
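% Illustration only: a plain-MATLAB sketch of the idea behind step 5, not
% the code that mirnovelty actually runs. toyS is a made-up 8x8 similarity
% matrix with two homogeneous blocks; the kernel width and all toy* names
% are assumptions chosen for this example.
toyS = [ones(4) zeros(4); zeros(4) ones(4)];
toyHalf = 2;                                    % kernel half-width (frames)
toyAx = (-toyHalf:toyHalf-1) + 0.5;             % symmetric axis without zero
[toyX,toyY] = meshgrid(toyAx);
toyKer = sign(toyX).*sign(toyY) ...             % checkerboard of +1 and -1
    .* exp(-(toyX.^2 + toyY.^2)/(2*toyHalf^2)); % Gaussian taper
toyNov = zeros(1,size(toyS,1));
for toyI = toyHalf+1 : size(toyS,1)-toyHalf+1
    % Window straddling the boundary between frames toyI-1 and toyI.
    toyIdx = toyI-toyHalf : toyI+toyHalf-1;
    toyNov(toyI) = sum(sum(toyKer .* toyS(toyIdx,toyIdx)));
end
% The score is largest where the two blocks meet (between frames 4 and 5).
[toyMax,toyPeak] = max(toyNov);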

% 6. Detect the peaks in the novelty score.
help mirpeaks
p1 = mirpeaks(nv)

% You can change the contrast threshold of the peak picker in order to
% get better results.
p2 = mirpeaks(nv,'Contrast',0.01)
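% Illustration only: a simplified plain-MATLAB sketch of contrast-based
% peak picking, assuming a peak is kept when it rises above both adjacent
% local minima by at least the threshold on a curve rescaled to [0,1].
% This is not the mirpeaks implementation; toyCurve and all toy* names are
% made up for the example.
toyCurve = [0 0.3 0.1 0.9 0.2 0.25 0.2 1 0];
toyNorm = (toyCurve - min(toyCurve)) / (max(toyCurve) - min(toyCurve));
toyThrs = [0.1 0.01];                 % two contrast thresholds to compare
toyPeaks = cell(1,numel(toyThrs));
for toyT = 1:numel(toyThrs)
    for toyK = 2:numel(toyNorm)-1
        if toyNorm(toyK) > toyNorm(toyK-1) && toyNorm(toyK) >= toyNorm(toyK+1)
            toyL = toyK;              % walk down to the previous local minimum
            while toyL > 1 && toyNorm(toyL-1) <= toyNorm(toyL)
                toyL = toyL - 1;
            end
            toyR = toyK;              % walk down to the next local minimum
            while toyR < numel(toyNorm) && toyNorm(toyR+1) <= toyNorm(toyR)
                toyR = toyR + 1;
            end
            toyRise = min(toyNorm(toyK)-toyNorm(toyL), toyNorm(toyK)-toyNorm(toyR));
            if toyRise >= toyThrs(toyT)
                toyPeaks{toyT}(end+1) = toyK;
            end
        end
    end
end
% With the lower threshold, the small bump at index 6 is also kept.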

clear nv

% 7. Segment the original audio file, using the peaks as segmentation
% positions.
help mirsegment
s1 = mirsegment(a,p1)
clear p1

% 8. Listen to the results.
mirplay(s1)

%s2 = mirsegment(a,p2)
%clear p2
%mirplay(s2)

% 9. Compute the similarity matrix of the resulting segmentation, in order
% to view the relationships between the different segments and their
% possible clustering into higher-level groups.
mirsimatrix(s1,'Similarity')
clear s1
%mirsimatrix(s2)
%clear s2

display('Strike any key to continue...');
pause
close all

% 10. Change the size of the kernel used in the novelty function, in order
% to obtain segmentations at different levels of detail, from a detailed
% analysis of the local texture to a very simple segmentation of the whole
% piece.
n100 = mirnovelty(sm,'KernelSize',100)
n50 = mirnovelty(sm,'KernelSize',50)
n10 = mirnovelty(sm,'KernelSize',10)
clear sm
% As you can see, the smaller the Gaussian kernel is, the more peaks can be
% found in the novelty score. Indeed, if the kernel is small, the sum of
% the products of its elements with the superposed elements of the
% similarity matrix varies more easily as the kernel slides along the
% diagonal of the similarity matrix, so local changes of texture are more
% easily detected. Conversely, when the kernel is large, only large-scale
% changes of texture are detected.

display('Strike any key to continue...');
pause
close all

p100 = mirpeaks(n100,'NoBegin','NoEnd')
clear n100
p50 = mirpeaks(n50,'NoBegin','NoEnd')
clear n50
p10 = mirpeaks(n10,'NoBegin','NoEnd')
clear n10
s100 = mirsegment(a,p100)
clear p100
mirplay(s100)
clear s100
s50 = mirsegment(a,p50)
clear p50
mirplay(s50)
clear s50
s10 = mirsegment(a,p10)
clear p10
mirplay(s10)
clear s10

display('Strike any key to continue...');
pause
close all

% A more compact way of writing these commands is as follows:
mirsegment(a,'Novelty')
mirsegment(a,'Novelty','Contrast',0.01)
mirsegment(a,'Novelty','KernelSize',100)

display('Strike any key to continue...');
pause
close all

% If you also want to see the novelty curve with the peaks, just add a
% second output:
[s50 p50] = mirsegment(a,'Novelty','KernelSize',50)
clear s50 p50
[s10 p10] = mirsegment(a,'Novelty','KernelSize',10)
clear a s10 p10

display('Strike any key to continue...');
pause
close all


% 11. Try the whole process with MFCCs instead of spectrum analysis. Take
% the first ten MFCCs, for instance.
help mirsegment
% The segment function can simply be called as follows:
sc = mirsegment('czardas','Novelty','MFCC','Rank',1:10)
clear sc

% Here are some other examples of use:
[ssp p m b] = mirsegment('valse_triste_happy','Spectrum',...
    'KernelSize',150,'Contrast',.1)
clear p m b
mirplay(ssp)
clear ssp

display('Strike any key to continue...');
pause
close all

[smfcc2 p m a] = mirsegment('valse_triste_happy','MFCC',2:10,...
    'KernelSize',150,'Contrast',.1)
clear p m a
mirplay(smfcc2)