Beschreibung

vor 17 Jahren
It is well-known that traditional clustering methods considering
all dimensions of the feature space usually fail in terms of
efficiency and effectivity when applied to high-dimensional data.
This poor behavior is based on the fact that clusters may not be
found in the high-dimensional feature space, although clusters
exist in subspaces of the feature space. To overcome these
limitations of traditional clustering methods, several methods for
subspace clustering have been proposed recently. Subspace
clustering algorithms aim at automatically identifying lower
dimensional subspaces of the feature space in which clusters exist.
There exist two types of subspace clustering algorithms: Algorithms
for detecting clusters in axis-parallel subspaces and, as an
extension, algorithms for finding clusters in subspaces which are
arbitrarily oriented. Generally, the subspace clusters may be
hierarchically nested, i.e., several subspace clusters of low
dimensionality may form a subspace cluster of higher
dimensionality. Since existing subspace clustering methods are not
able to detect these complex structures, hierarchical approaches
for subspace clustering have to be applied. The goal of this
dissertation is to develop new efficient and effective methods for
hierarchical subspace clustering by identifying novel challenges
for the hierarchical approach and proposing innovative and solid
solutions for these challenges. The first Part of this work deals
with the analysis of hierarchical subspace clusters in
axis-parallel subspaces. Two new methods are proposed that search
simultaneously for subspace clusters of arbitrary dimensionality in
order to detect complex hierarchies of subspace clusters.
Furthermore, a new visualization model of the clustering result by
means of a graph representation is provided. In the second Part of
this work new methods for hierarchical clustering in arbitrarily
oriented subspaces of the feature space are discussed. The
so-called correlation clustering can be seen as an extension of
axis-parallel subspace clustering. Correlation clustering aims at
grouping the data set into subsets, the so-called correlation
clusters, such that the objects in the same correlation cluster
show uniform attribute correlations. Two new hierarchical
approaches are proposed which combine density-based clustering with
Principal Component Analysis in order to identify hierarchies of
correlation clusters. The last Part of this work addresses the
analysis and interpretation of the results obtained from
correlation clustering algorithms. A general method is introduced
to extract quantitative information on the linear dependencies
between the objects of given correlation clusters. Furthermore,
these quantitative models can be used to predict the probability
that an object is created by one of these models. Both, the
efficiency and the effectiveness of the presented techniques are
thoroughly analyzed. The benefits over traditional approaches are
shown by evaluating the new methods on synthetic as well as
real-world test data sets.

Kommentare (0)

Lade Inhalte...

Abonnenten

15
15
:
: