Introduction

In this post I’m going to discuss collapsed Gibbs sampling and how we can apply it to our Gaussian Mixture Model to build an Infinite Gaussian Mixture Model, also known as the Dirichlet Process Gaussian Mixture Model or DPGMM. The name comes from placing a Dirichlet Process prior on the mixture, which allows the number of components to be inferred from the data instead of relying on the user to guess the correct number of components.

Collapsed Gibbs

Collapsed Gibbs sampling is similar to regular Gibbs sampling except that we integrate out \(\mu\), \(\Sigma\), and \(\pi\), which reduces the number of sampling steps required.
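To make the idea concrete before the derivation, here is a minimal sketch of what a single collapsed sweep can look like. It assumes univariate data with a known component variance `sigma_sq` and a `Normal(mu0, tau0_sq)` prior on each component mean, so the weights and means integrate out in closed form (the full DPGMM also integrates out \(\Sigma\)); the function name and parameter values are illustrative, not the post's actual code.

```python
import numpy as np

def collapsed_gibbs_sweep(x, z, alpha, mu0, tau0_sq, sigma_sq, rng):
    """One collapsed sweep over the cluster assignments z.

    The component weights and means are integrated out analytically, so
    only the assignments are resampled (univariate data, known variance
    sigma_sq, Normal(mu0, tau0_sq) prior on each component mean).
    """
    for i in range(len(x)):
        z[i] = -1                                   # remove point i from its cluster
        labels, counts = np.unique(z[z >= 0], return_counts=True)
        log_p = []
        for k, n_k in zip(labels, counts):
            s_k = x[z == k].sum()                   # sufficient statistic of cluster k
            tau_n_sq = 1.0 / (1.0 / tau0_sq + n_k / sigma_sq)
            mu_n = tau_n_sq * (mu0 / tau0_sq + s_k / sigma_sq)
            var = tau_n_sq + sigma_sq               # posterior predictive variance
            log_p.append(np.log(n_k) - 0.5 * np.log(2 * np.pi * var)
                         - 0.5 * (x[i] - mu_n) ** 2 / var)
        var0 = tau0_sq + sigma_sq                   # prior predictive for a brand-new cluster
        log_p.append(np.log(alpha) - 0.5 * np.log(2 * np.pi * var0)
                     - 0.5 * (x[i] - mu0) ** 2 / var0)
        log_p = np.asarray(log_p)
        p = np.exp(log_p - log_p.max())
        p /= p.sum()
        choice = rng.choice(len(p), p=p)
        z[i] = labels[choice] if choice < len(labels) else z.max() + 1
    return z

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-4.0, 1.0, 50), rng.normal(3.0, 1.0, 50)])
z = np.zeros(len(x), dtype=int)                     # start with everything in one cluster
for _ in range(100):
    z = collapsed_gibbs_sweep(x, z, alpha=1.0, mu0=0.0, tau0_sq=10.0, sigma_sq=1.0, rng=rng)
print(np.unique(z).size)                            # number of occupied components found
```

Note that the only quantities being sampled are the assignments; everything else enters through the predictive densities, which is exactly what "collapsing" buys us.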
Dirichlet Processes

A Dirichlet Process (DP) is best described as a stochastic process used in Bayesian non-parametric modeling. Each draw from a DP is itself a distribution, making it a distribution over distributions. The DP derives its name from the fact that its finite-dimensional marginal distributions are Dirichlet distributed, just as a Gaussian Process has finite-dimensional Gaussian marginal distributions. The DP has an infinite number of parameters, which places it in the family of non-parametric models. To motivate our exploration of DPs, we can imagine a situation where we would like to extend our finite mixture model to the case of infinitely many components.
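As a rough illustration of "a draw from a DP is itself a distribution", the sketch below draws an (approximate) sample from a DP with concentration `alpha` and base measure \(H = N(0,1)\) using a truncated stick-breaking construction; the names and the truncation level are assumptions made for illustration, not part of the derivation in this post.

```python
import numpy as np

def stick_breaking_draw(alpha, base_sampler, truncation=1000, rng=None):
    """One (truncated) draw from a DP: a discrete distribution over `truncation` atoms."""
    rng = np.random.default_rng(rng)
    betas = rng.beta(1.0, alpha, size=truncation)                # stick-breaking proportions
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    weights = betas * remaining                                  # mixture weights (sum to ~1)
    atoms = base_sampler(truncation, rng)                        # atom locations drawn from H
    return weights, atoms

# Base measure H = N(0, 1); each call yields a different random distribution.
weights, atoms = stick_breaking_draw(alpha=2.0,
                                     base_sampler=lambda n, rng: rng.normal(0.0, 1.0, size=n))
rng = np.random.default_rng(1)
samples = rng.choice(atoms, size=5, p=weights / weights.sum())   # sample from that drawn distribution
```

The draw is almost surely discrete, which is what makes the DP a natural prior for mixture models with an unbounded number of components.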
Introduction

In my previous post, we extended our Gibbs sampler to handle any number of components \(K\) and fit our model in the case where \(K=4\). We can now move on and extend our model to handle the multivariate case. I stick to the bivariate case for simplicity and to better enable visualization of the results.

K-Component Bivariate GMM

The K-component bivariate GMM is defined exactly like the univariate case, \(p(x|\theta) = \sum_{j=1}^K\pi_j\phi_{\theta_j}(x)\), except that \(\phi_{\theta_j}(x)\) is now the multivariate Gaussian density:

\[
\begin{align*}
\phi_{\theta_j}(x) & = \frac{1}{(2\pi)^{d/2}\left|\Sigma_j\right|^{1/2}}\exp\left(-\frac{1}{2}(x-\pmb{\mu}_j)^T\Sigma_j^{-1}(x-\pmb{\mu}_j)\right)
\end{align*}
\]

where \(d\) is the dimension of \(x\) (here \(d=2\)). We can generate our data using a similar method as before, except that each mean \(\mu_j\) becomes the mean vector \(\pmb{\mu}_j\) and each \(\sigma_j\) becomes the covariance matrix \(\Sigma_j\).
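For example, data from a K-component bivariate mixture can be simulated along these lines; the parameter values below are made up purely for illustration and are not the ones used to fit the model in the post.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical parameters for a K=3 component bivariate mixture.
pis = np.array([0.3, 0.5, 0.2])                          # mixing weights
mus = np.array([[0.0, 0.0], [4.0, 4.0], [-3.0, 5.0]])    # mean vectors
Sigmas = [np.eye(2),
          np.array([[1.0, 0.6], [0.6, 1.5]]),
          np.array([[0.5, 0.0], [0.0, 0.5]])]            # covariance matrices

N = 500
z = rng.choice(len(pis), size=N, p=pis)                  # latent component labels
X = np.array([rng.multivariate_normal(mus[k], Sigmas[k]) for k in z])
```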
Introduction

In my previous post, I derived a Gibbs sampler for a univariate Gaussian Mixture Model (GMM). In this post I will extend the sampler to handle the K-component univariate GMM. As a quick reminder, Gibbs sampling is an MCMC method for sampling from multivariate distributions that may be difficult to sample from directly. The method is commonly used in Bayesian inference when sampling from the posterior or joint distribution in question. The samples generated from the Markov chain converge to the desired distribution as the number of iterations \(N\) grows large.

K-Component GMM

The K-component GMM can be defined as \(p(x|\theta) = \sum_{j=1}^K\pi_j\phi_{\theta_j}(x)\), where the \(\pi_j\) are mixing weights and \(\phi_{\theta_j}\) is the Gaussian density with parameters \(\theta_j = (\mu_j, \sigma_j)\).
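As a quick illustration of this definition, the sketch below evaluates the mixture density at a few points for some hypothetical parameter values; the values and the helper name are assumptions for illustration, not the fitted model from the post.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical parameters for a K=3 component univariate GMM.
pis = np.array([0.2, 0.5, 0.3])     # mixing weights, sum to 1
mus = np.array([-3.0, 0.0, 4.0])    # component means
sigmas = np.array([1.0, 0.5, 2.0])  # component standard deviations

def gmm_density(x, pis, mus, sigmas):
    """Evaluate p(x | theta) = sum_j pi_j * phi_{theta_j}(x)."""
    x = np.atleast_1d(x)[:, None]                       # broadcast x against the K components
    return (pis * norm.pdf(x, loc=mus, scale=sigmas)).sum(axis=1)

print(gmm_density([-3.0, 0.0, 4.0], pis, mus, sigmas))
```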
Gibbs Sampling

Gibbs sampling is a Markov Chain Monte Carlo method for sampling from a posterior distribution, usually written \(p(\theta|data)\). The idea behind the Gibbs sampler is to sweep through each of the parameters and sample from its conditional distribution, holding the other parameters fixed. For example, consider the random variables \(X_1, X_2, X_3\) and assume that I can write out the analytic form of \(p(X_1|X_2,X_3)\), \(p(X_2|X_1,X_3)\), and \(p(X_3|X_1,X_2)\). We start by initializing \(x_{1,t}, x_{2,t}, x_{3,t}\) and for each iteration \(t\) we sample \(p(X_{1,t+1}|X_{2,t},X_{3,t})\), then \(p(X_{2,t+1}|X_{1,t+1},X_{3,t})\), and then \(p(X_{3,t+1}|X_{1,t+1},X_{2,t+1})\). This process continues until convergence (Peng 2021). Algorithm 1 details a general Gibbs sampler.
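As a concrete example of this sweep, the sketch below runs a Gibbs sampler for a standard bivariate normal target with correlation \(\rho\), whose full conditionals are available in closed form. Two variables are used for brevity, and the target and values are assumptions for illustration; the three-variable sweep described above follows exactly the same pattern.

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8                      # correlation of the (hypothetical) bivariate normal target

# Full conditionals of a standard bivariate normal:
#   X1 | X2 = x2  ~  N(rho * x2, 1 - rho^2),  and symmetrically for X2 | X1.
def sample_x1_given_x2(x2):
    return rng.normal(rho * x2, np.sqrt(1.0 - rho**2))

def sample_x2_given_x1(x1):
    return rng.normal(rho * x1, np.sqrt(1.0 - rho**2))

n_iter = 5000
x1, x2 = 0.0, 0.0              # initialize the chain
samples = np.empty((n_iter, 2))
for t in range(n_iter):
    x1 = sample_x1_given_x2(x2)   # sweep each variable, conditioning on the newest values
    x2 = sample_x2_given_x1(x1)
    samples[t] = (x1, x2)

print(np.corrcoef(samples[1000:].T)[0, 1])   # should be close to rho after burn-in
```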