function demo4segmentation
% To get familiar with some approaches to the segmentation of audio files
% using MIRtoolbox.

% 1. Load an audio file (for instance, guitar.wav).
a = miraudio('guitar');

% 2. We will follow the segmentation strategy proposed in (Foote &
% Cooper, 2003). First, decompose the file into successive frames of 50 ms
% without overlap.
help mirframe
fr = mirframe(a,0.05,1)

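% Illustration (not MIRtoolbox code): a minimal sketch of what a 50 ms
% frame decomposition without overlap amounts to, on a synthetic signal.
% All ex_* names are hypothetical, and the 44100 Hz sampling rate is an
% assumption, not a property of guitar.wav.
ex_sr = 44100;                                 % assumed sampling rate (Hz)
ex_x = sin(2*pi*440*(0:ex_sr-1)'/ex_sr);       % one second of a 440 Hz tone
ex_len = round(0.05*ex_sr);                    % samples per 50 ms frame
ex_nfr = floor(numel(ex_x)/ex_len);            % number of complete frames
ex_frames = reshape(ex_x(1:ex_nfr*ex_len),ex_len,ex_nfr); % one frame per column
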
% 3. Compute the spectrum representation (FFT) of the frames.
sp = mirspectrum(fr)
clear fr
% (Remove from memory any data that will no longer be used.)

% 4. Compute the similarity matrix that shows the similarity between the
% spectra of different frames.
help mirsimatrix
sm = mirsimatrix(sp)
clear sp
% Look at the structures shown in the matrix and relate them to the
% structure heard when listening to the extract.

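% Illustration (not MIRtoolbox code): a minimal sketch of a frame-by-frame
% similarity matrix, assuming cosine similarity between magnitude spectra.
% ex_S is a hypothetical toy spectrogram (one column per frame); the
% measure that mirsimatrix applies by default may differ (see the help
% text above).
ex_S = [1 1 0 0; 0 0 1 1; 0.1 0.1 0.1 0.1];    % frames 1-2 and 3-4 are alike
ex_norm = sqrt(sum(ex_S.^2,1));                % Euclidean norm of each frame
ex_U = ex_S ./ repmat(ex_norm,size(ex_S,1),1); % unit-norm columns
ex_sim = ex_U' * ex_U;                         % ex_sim(i,j): cosine of frames i and j
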
% 5. Estimate the novelty score related to the similarity matrix. It
% consists of a convolution along the diagonal of the matrix with a
% checkerboard Gaussian kernel. Use the mirnovelty function for that purpose.
help mirnovelty
nv = mirnovelty(sm)

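% Illustration (not MIRtoolbox code): a minimal sketch of a novelty curve
% in the spirit of the description above, assuming a toy 0/1 similarity
% matrix made of two homogeneous blocks. All ex_* names are hypothetical,
% and the kernel half-width and Gaussian taper are arbitrary choices.
ex_m = 40;
ex_sim2 = zeros(ex_m);
ex_sim2(1:20,1:20) = 1;            % frames 1-20 are similar to each other
ex_sim2(21:40,21:40) = 1;          % frames 21-40 are similar to each other
ex_half = 4;                       % kernel half-width, in frames
ex_v = [-ex_half:-1, 1:ex_half];   % offsets around the kernel centre
[ex_c,ex_r] = meshgrid(ex_v);      % column and row offsets
% Checkerboard sign pattern (+ within past or future, - across them),
% tapered by a Gaussian.
ex_K = sign(ex_r).*sign(ex_c).*exp(-(ex_r.^2+ex_c.^2)/(2*ex_half^2));
ex_nov = zeros(1,ex_m);
for ex_i = ex_half:ex_m-ex_half
    % The kernel is centred on the boundary between frames ex_i and ex_i+1.
    ex_idx = ex_i + [-ex_half+1:0, 1:ex_half];
    ex_nov(ex_i) = sum(sum(ex_K .* ex_sim2(ex_idx,ex_idx)));
end
[ex_max,ex_pos] = max(ex_nov);     % the maximum falls at the block boundary
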
% 6. Detect the peaks in the novelty score.
help mirpeaks
p1 = mirpeaks(nv)

% You can change the threshold value of the peak-picking function in order
% to get better results.
p2 = mirpeaks(nv,'Contrast',0.01)

clear nv

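% Illustration (not MIRtoolbox code): a minimal sketch of contrast-based
% peak picking on a hypothetical toy curve. It assumes a simplified reading
% of a contrast threshold: a local maximum is kept if it exceeds both
% neighbouring local minima by more than a fraction ex_thr of the total
% range. mirpeaks itself may apply further rules (see the help text above).
ex_y = [0 0.15 0.1 1 0.3 0.35 0.3 0];          % toy novelty curve
ex_thr = 0.1;                                  % contrast threshold
ex_yn = (ex_y-min(ex_y))/(max(ex_y)-min(ex_y)); % normalise to [0,1]
ex_d = diff(ex_yn);
ex_ismax = [false, ex_d(1:end-1)>0 & ex_d(2:end)<0, false]; % strict local maxima
ex_ismin = [true, ex_d(1:end-1)<0 & ex_d(2:end)>0, true];   % local minima and ends
ex_peaks = [];
for ex_k = find(ex_ismax)
    ex_prev = ex_yn(find(ex_ismin(1:ex_k-1),1,'last'));   % preceding minimum
    ex_next = ex_yn(ex_k + find(ex_ismin(ex_k+1:end),1)); % following minimum
    if ex_yn(ex_k) - max(ex_prev,ex_next) > ex_thr
        ex_peaks(end+1) = ex_k;                % keep this maximum as a peak
    end
end
% With ex_thr = 0.1 only the maximum at index 4 is kept; lowering ex_thr to
% 0.01 would also keep the shallow maxima at indices 2 and 6.
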
% 7. Segment the original audio file, using the peaks as segmentation
% positions.
help mirsegment
s1 = mirsegment(a,p1)
clear p1

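% Illustration (not MIRtoolbox code): a minimal sketch of cutting a signal
% at given boundary times. The signal, the sampling rate and the boundary
% times are hypothetical; mirsegment takes the positions from the peaks.
ex_sr2 = 1000;                                 % assumed sampling rate (Hz)
ex_sig = (1:3000)';                            % 3 s toy signal
ex_t = [0.8 2.1];                              % boundary times (s)
ex_b = [0, round(ex_t*ex_sr2), numel(ex_sig)]; % boundaries in samples
ex_seg = cell(1,numel(ex_b)-1);
for ex_j = 1:numel(ex_seg)
    % Segment ex_j runs from just after one boundary up to the next one.
    ex_seg{ex_j} = ex_sig(ex_b(ex_j)+1:ex_b(ex_j+1));
end
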
% 8. Listen to the results.
mirplay(s1)

%s2 = mirsegment(a,p2)
%clear p2
%mirplay(s2)

% 9. Compute the similarity matrix of the resulting segmentation, in order
% to view the relationships between the different segments and their
% possible clustering into higher-level groups.
mirsimatrix(s1,'Similarity')
clear s1
%mirsimatrix(s2)
%clear s2

disp('Strike any key to continue...');
pause
close all

% 10. Change the size of the kernel used in the novelty function, in order
% to obtain segmentations at different levels of detail, from a detailed
% analysis of the local texture to a very simple segmentation of the whole
% piece.
n100 = mirnovelty(sm,'KernelSize',100)
n50 = mirnovelty(sm,'KernelSize',50)
n10 = mirnovelty(sm,'KernelSize',10)
clear sm
% As you can see, the smaller the Gaussian kernel is, the more peaks can be
% found in the novelty score. Indeed, if the kernel is small, the sum of the
% products of its elements with the superposed elements of the similarity
% matrix varies more easily as the kernel slides along the diagonal of the
% similarity matrix, so local changes of texture are more easily detected.
% Conversely, when the kernel is large, only large-scale changes of texture
% are detected.

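% Illustration (not MIRtoolbox code): a minimal sketch of the kernel-size
% effect described above, assuming a toy similarity matrix in which a short
% texture B (4 frames) interrupts a long texture A. All ex_* names, the
% labels and the two half-widths are hypothetical choices.
ex_lab = [ones(1,20), 2*ones(1,4), ones(1,36)]; % frame labels: A, short B, A
ex_sim3 = double(bsxfun(@eq,ex_lab',ex_lab));   % 1 within a texture, 0 across
ex_b0 = 20;                                     % boundary between frames 20 and 21
ex_halves = [2 10];                             % small and large kernel half-widths
ex_score = zeros(size(ex_halves));
for ex_q = 1:numel(ex_halves)
    ex_h = ex_halves(ex_q);
    ex_o = [-ex_h:-1, 1:ex_h];                  % offsets around the kernel centre
    [ex_cc,ex_rr] = meshgrid(ex_o);
    ex_Kq = sign(ex_rr).*sign(ex_cc).*exp(-(ex_rr.^2+ex_cc.^2)/(2*ex_h^2));
    ex_ix = ex_b0 + [-ex_h+1:0, 1:ex_h];        % frames covered by the kernel
    % Novelty at the boundary, relative to its largest possible value.
    ex_score(ex_q) = sum(sum(ex_Kq.*ex_sim3(ex_ix,ex_ix))) / sum(ex_Kq(ex_Kq>0));
end
% The small kernel responds fully to the A/B boundary, whereas the large
% kernel also covers the return to A and responds much more weakly.
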
disp('Strike any key to continue...');
pause
close all

p100 = mirpeaks(n100,'NoBegin','NoEnd')
clear n100
p50 = mirpeaks(n50,'NoBegin','NoEnd')
clear n50
p10 = mirpeaks(n10,'NoBegin','NoEnd')
clear n10
s100 = mirsegment(a,p100)
clear p100
mirplay(s100)
clear s100
s50 = mirsegment(a,p50)
clear p50
mirplay(s50)
clear s50
s10 = mirsegment(a,p10)
clear p10
mirplay(s10)
clear s10

disp('Strike any key to continue...');
pause
close all

% A more compact way of writing these commands is as follows:
mirsegment(a,'Novelty')
mirsegment(a,'Novelty','Contrast',0.01)
mirsegment(a,'Novelty','KernelSize',100)

disp('Strike any key to continue...');
pause
close all

% In addition, if you want to see the novelty curve with the peaks, just
% add a second output:
[s50,p50] = mirsegment(a,'Novelty','KernelSize',50)
clear s50 p50
[s10,p10] = mirsegment(a,'Novelty','KernelSize',10)
clear a s10 p10

disp('Strike any key to continue...');
pause
close all


% 11. Try the whole process with MFCCs instead of spectrum analysis. Take
% the first ten MFCCs, for instance.
help mirsegment
% The segment function can simply be called as follows:
sc = mirsegment('czardas','Novelty','MFCC','Rank',1:10)
clear sc

% Here are some other examples of use:
[ssp,p,m,b] = mirsegment('valse_triste_happy','Spectrum',...
    'KernelSize',150,'Contrast',.1)
clear p m b
mirplay(ssp)
clear ssp

disp('Strike any key to continue...');
pause
close all

[smfcc2,p,m,a] = mirsegment('valse_triste_happy','MFCC',2:10,...
    'KernelSize',150,'Contrast',.1)
clear p m a
mirplay(smfcc2)