已知 EM 算法可以应用于拟合高斯混合模型。 是否有任何算法示例,其中使用 MATLAB 解释了 k-means
?
我找到了这个 m 文件:
function [label, model, llh] = emgm(X, init)% Perform EM algorithm for fitting the Gaussian mixture model.% X: d x n data matrix% init: k (1 x 1) or label (1 x n, 1<=label(i)<=k) or center (d x k)% Written by Michael Chen ([email protected]).%% initializationfprintf('EM for Gaussian mixture: running ... \n');R = initialization(X,init);[~,label(1,:)] = max(R,[],2);R = R(:,unique(label));tol = 1e-10;maxiter = 500;llh = -inf(1,maxiter);converged = false;t = 1;while ~converged && t < maxiter t = t+1; model = maximization(X,R); [R, llh(t)] = expectation(X,model); [~,label(:)] = max(R,[],2); u = unique(label); % non-empty components if size(R,2) ~= size(u,2) R = R(:,u); % remove empty components else converged = llh(t)-llh(t-1) < tol*abs(llh(t)); endendllh = llh(2:t);if converged fprintf('Converged in %d steps.\n',t-1);else fprintf('Not converged in %d steps.\n',maxiter);endfunction R = initialization(X, init)[d,n] = size(X);if isstruct(init) % initialize with a model R = expectation(X,init);elseif length(init) == 1 % random initialization k = init; idx = randsample(n,k); m = X(:,idx); [~,label] = max(bsxfun(@minus,m'*X,dot(m,m,1)'/2),[],1); [u,~,label] = unique(label); while k ~= length(u) idx = randsample(n,k); m = X(:,idx); [~,label] = max(bsxfun(@minus,m'*X,dot(m,m,1)'/2),[],1); [u,~,label] = unique(label); end R = full(sparse(1:n,label,1,n,k,n));elseif size(init,1) == 1 && size(init,2) == n % initialize with labels label = init; k = max(label); R = full(sparse(1:n,label,1,n,k,n));elseif size(init,1) == d %initialize with only centers k = size(init,2); m = init; [~,label] = max(bsxfun(@minus,m'*X,dot(m,m,1)'/2),[],1); R = full(sparse(1:n,label,1,n,k,n));else error('ERROR: init is not valid.');endfunction [R, llh] = expectation(X, model)mu = model.mu;Sigma = model.Sigma;w = model.weight;n = size(X,2);k = size(mu,2);logRho = zeros(n,k);for i = 1:k logRho(:,i) = loggausspdf(X,mu(:,i),Sigma(:,:,i));endlogRho = bsxfun(@plus,logRho,log(w));T = logsumexp(logRho,2);llh = sum(T)/n; % loglikelihoodlogR = bsxfun(@minus,logRho,T);R = exp(logR);function model = maximization(X, R)[d,n] = size(X);k = size(R,2);nk = sum(R,1);w = nk/n;mu = bsxfun(@times, X*R, 1./nk);Sigma = zeros(d,d,k);sqrtR = sqrt(R);for i = 1:k Xo = bsxfun(@minus,X,mu(:,i)); Xo = bsxfun(@times,Xo,sqrtR(:,i)'); Sigma(:,:,i) = Xo*Xo'/nk(i); Sigma(:,:,i) = Sigma(:,:,i)+eye(d)*(1e-6); % add a prior for numerical stabilityendmodel.mu = mu;model.Sigma = Sigma;model.weight = w;function y = loggausspdf(X, mu, Sigma)d = size(X,1);X = bsxfun(@minus,X,mu);[U,p]= chol(Sigma);if p ~= 0 error('ERROR: Sigma is not PD.');endQ = U'\X;q = dot(Q,Q,1); % quadratic term (M distance)c = d*log(2*pi)+2*sum(log(diag(U))); % normalization constanty = -(c+q)/2;
回答:
k-means
用于初始化 EM
算法中的均值。 尽管我建议您编写自己的 EM 算法,但您会发现 Mathworks 文件交换中的 这个 EM 程序 对您入门有所帮助。 作者使用了 k-means
,这正是您想要的。