echo on;

clc;

% This is a very basic demo of the mixture of factor analyzer software
% written in Matlab by Zoubin Ghahramani
% Dept of Computer Science
% University of Toronto

pause; % Hit any key to continue

% To demonstrate the software we generate a sample data set
% from a mixture of two Gaussians

pause; % Hit any key to continue

X1=randn(300,5);   % zero mean 5 dim Gaussian data
X2=randn(200,5)+2; % 5 dim Gaussian data with mean [2 2 2 2 2]
X=[X1;X2];         % total 500 data points from mixture
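
% As a quick check (added sketch, not part of the original demo), the
% sample means of the two groups should be close to 0 and to 2 in
% every dimension, since the second group is shifted by +2:

mean(X1)
mean(X2)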

% Fitting the model is very easy. For example to fit a mixture of 2
% factor analyzers with three factors each...

pause; % Hit any key to continue

[Lh,Ph,Mu,Pi,LL]=mfa(X,2,3);

% Lh, Ph, Mu, and Pi are the factor loadings, observation
% variances, observation means for each mixture component, and mixing
% proportions. LL is the vector of log likelihoods (the learning
% curve). For more information type: help mfa
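
% To see how the fitted parameters are laid out, their sizes can be
% inspected. (Added sketch: no particular shapes are assumed here;
% what is printed depends on the mfa implementation.)

size(Lh)
size(Ph)
size(Mu)
size(Pi)
size(LL)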

% to plot the learning curve (log likelihood at each step of EM)...

pause; % Hit any key to continue

plot(LL);

% you get a more informative picture of convergence by looking at the
% log of the first difference of the log likelihoods...

pause; % Hit any key to continue

semilogy(diff(LL));
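
% Labelling the axes makes the plot easier to read (added sketch using
% the standard xlabel and ylabel calls). Only positive increases are
% drawn, because the y axis is logarithmic.

xlabel('EM iteration');
ylabel('increase in log likelihood');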

% you can look at some of the parameters of the fitted model...

pause; % Hit any key to continue

Mu

Pi

% ...to see whether they make any sense given that we know how the
% data was generated.

% you can also evaluate the log likelihood of another data set under
% the model we have just fitted using the mfa_cl (for Calculate
% Likelihood) function. For example, here we generate a test set from
% the same distribution.

X1=randn(300,5);
X2=randn(200,5)+2;
Xtest=[X1; X2];

pause; % Hit any key to continue

mfa_cl(Xtest,Lh,Ph,Mu,Pi)

% we should expect the log likelihood of the test set to be lower than
% that of the training set.
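
% One way to check this is to evaluate both sets with the same function
% and compare the average log likelihood per data point.
% (Added sketch: it assumes the first output of mfa_cl is the total log
% likelihood of the data set passed in; lik_train and lik_test are
% illustrative names, not part of the original software.)

pause; % Hit any key to continue

lik_train=mfa_cl(X,Lh,Ph,Mu,Pi);
lik_test=mfa_cl(Xtest,Lh,Ph,Mu,Pi);
[lik_train/size(X,1) lik_test/size(Xtest,1)]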

% finally, we can also fit a regular factor analyzer using the ffa
% function (Fast Factor Analysis)...

pause; % Hit any key to continue

[L,Ph,LL]=ffa(X,3); % note: this overwrites Ph and LL from the mfa fit
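
% A factor analysis model implies the covariance L*L' + diag(Ph), so as
% a rough check it can be compared with the sample covariance of X.
% (Added sketch: it assumes L is a 5 x 3 loading matrix and Ph is a
% vector of the 5 observation variances; C_model is an illustrative
% name, not part of the original software.)

pause; % Hit any key to continue

C_model=L*L'+diag(Ph);
max(max(abs(C_model-cov(X))))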