function [S, LL] = bic_score(counts, CPT, ncases)
% BIC_SCORE Bayesian Information Criterion score for a single family
% [S, LL] = bic_score(counts, CPT, ncases)
%
% S is a large sample approximation to the log marginal likelihood,
% which can be computed using dirichlet_score.
%
% S = \log [ prod_j prod_k theta_ijk ^ N_ijk ] - 0.5*d*log(ncases)
% where counts encode N_ijk, theta_ijk is the MLE computed from counts,
% and d is the num of free parameters.

%CPT = mk_stochastic(counts);
tiny = exp(-700);
LL = sum(log(CPT(:) + tiny) .* counts(:));
% CPT(i) = 0 iff counts(i) = 0 so it is okay to add tiny

ns = mysize(counts);
ns_ps = ns(1:end-1);
ns_self = ns(end);
nparams = prod([ns_ps (ns_self-1)]);
% sum-to-1 constraint reduces the effective num. vals of the node by 1

S = LL - 0.5*nparams*log(ncases);
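
% Worked example (an illustrative sketch, not part of the original file).
% Assumptions: one binary parent and a binary child, counts(j,k) = N_ijk with
% the child on the last dimension, CPT taken as the row-normalised counts
% (which is what the commented-out mk_stochastic call is assumed to produce),
% and mysize behaving like size for this 2x2 array.
%
%   counts = [3 1; 2 4];                           % 10 cases in total
%   CPT = counts ./ repmat(sum(counts, 2), 1, 2);  % [0.75 0.25; 1/3 2/3]
%   [S, LL] = bic_score(counts, CPT, 10);
%
% By hand: LL = 3*log(3/4) + 1*log(1/4) + 2*log(1/3) + 4*log(2/3), about -6.0684
%          d  = 2*(2-1) = 2 free parameters
%          S  = LL - 0.5*2*log(10), about -8.3710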