Description

18 years ago
Probabilistic modeling for data mining and machine learning
problems is a fundamental research area. The general approach is to
assume a generative model underlying the observed data, and
estimate model parameters via likelihood maximization. The approach
is grounded in probability theory and draws on a large body of
methods from statistical learning, sampling theory and Bayesian
statistics. In this thesis we study several
advanced probabilistic models for data clustering and feature
projection, two important unsupervised learning
problems. The goal of clustering is to group similar data points
together to uncover the data clusters. While numerous methods exist
for various clustering tasks, one important question remains open:
how to determine the number of clusters automatically. The
first part of the thesis answers this question from a mixture
modeling perspective. A finite mixture model is first introduced
for clustering, in which each mixture component is assumed to be an
exponential family distribution for generality. The model is then
extended to an infinite mixture model, and its strong connection to
the Dirichlet process (DP), a non-parametric Bayesian framework, is
uncovered. A variational Bayesian algorithm called VBDMA
is derived from this new insight to learn the number of clusters
automatically, and empirical studies on some 2D data sets and an
image data set verify the effectiveness of this algorithm. In
feature projection, we are interested in dimensionality reduction
and aim to find a low-dimensional feature representation for the
data. We first review the well-known principal component analysis
(PCA) and its probabilistic interpretation (PPCA), and then
generalize PPCA to a novel probabilistic model that can handle the
non-linear projection known as kernel PCA. An
expectation-maximization (EM) algorithm is derived for kernel PCA,
making it fast and applicable to large data sets. Then we
propose a novel supervised projection method called MORP, which can
take the output information into account in a supervised learning
context. Empirical studies on various data sets show much better
results than unsupervised projection and other supervised
projection methods. Finally, we generalize MORP probabilistically
to obtain SPPCA for supervised projection, and naturally extend
the model to S2PPCA, a semi-supervised
projection method. This allows us to incorporate both the label
information and the unlabeled data into the projection process. In
the third part of the thesis, we introduce a unified probabilistic
model which can handle data clustering and feature projection
jointly. The model can be viewed both as a clustering model with
projected features and as a projection model with structured
documents. A variational Bayesian learning algorithm is derived
that alternates between clustering and projection operations
until convergence. Superior performance can
be obtained for both clustering and projection.
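
As a rough illustration of the probabilistic PCA (PPCA) building block
reviewed in the thesis, the sketch below fits standard PPCA using the
well-known closed-form maximum-likelihood solution. It is not the
thesis's kernel PCA, MORP, SPPCA or S2PPCA algorithm. The function
names `ppca_fit` and `ppca_project`, the synthetic data, and the choice
of the identity as the arbitrary rotation are all assumptions made for
this example.

```python
import numpy as np


def ppca_fit(X, q):
    """Closed-form maximum-likelihood PPCA for q < X.shape[1].

    Assumption: the arbitrary rotation in the solution is the identity.
    """
    mu = X.mean(axis=0)
    Xc = X - mu
    S = Xc.T @ Xc / X.shape[0]            # sample covariance
    eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]     # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Noise variance: average of the discarded eigenvalues.
    sigma2 = eigvals[q:].mean()
    # Loading matrix: leading eigenvectors scaled by sqrt(lambda - sigma2).
    W = eigvecs[:, :q] * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return mu, W, sigma2


def ppca_project(X, mu, W, sigma2):
    """Posterior mean of the latent variables given the observations."""
    M = W.T @ W + sigma2 * np.eye(W.shape[1])
    return np.linalg.solve(M, W.T @ (X - mu).T).T


# Synthetic data (hypothetical): 2 latent dimensions embedded in 5.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 2))
A = rng.normal(size=(2, 5))
X = Z @ A + 0.1 * rng.normal(size=(500, 5))

mu, W, sigma2 = ppca_fit(X, q=2)
latent = ppca_project(X, mu, W, sigma2)
```

Here `latent` holds one low-dimensional representation per data point,
and `sigma2` estimates the isotropic noise variance that distinguishes
PPCA from plain PCA.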
