Enhanced Query Processing on Complex Spatial and Temporal Data
Description
17 years ago
Innovative technologies in the area of multimedia and mechanical
engineering as well as novel methods for data acquisition in
different scientific subareas, including geo-science, environmental
science, medicine, biology and astronomy, enable a more exact
representation of the data, and thus, a more precise data analysis.
The resulting quantitative and qualitative growth of specifically
spatial and temporal data leads to new challenges for the
management and processing of complex structured objects and
requires the employment of efficient and effective methods for data
analysis. Spatial data describe objects in space by a well-defined
extent, a specific location, and their relationships to other
objects. Classical examples of complex structured spatial objects
are three-dimensional CAD data from mechanical engineering and
two-dimensional bounded regions from geography. For industrial
applications, efficient collision and intersection queries are of
great importance. Temporal data describe time-dependent processes,
for instance the duration of specific events or the time-varying
attributes of objects. Time series are among the most popular and
complex types of temporal data and are the most important form of
description for time-varying processes. An elementary type of query
in time series databases is the similarity query, which serves as a
basic query for data mining applications. The main goal of this
thesis is to
develop an effective and efficient algorithm supporting collision
queries on spatial data as well as similarity queries on temporal
data, in particular, time series. The presented concepts are based
on the efficient management of interval sequences which are
suitable for spatial and temporal data. The effective analysis of
the underlying objects will be efficiently supported by adequate
access methods. First, this thesis deals with collision queries on
complex spatial objects which can be reduced to intersection
queries on interval sequences. We introduce statistical methods for
the grouping of subsequences. Combined with multi-step query
processing, these methods enable the user to accelerate query
processing drastically. Furthermore, we develop a cost model for
multi-step query processing of interval sequences in distributed
systems. The proposed approach successfully supports a cost-based
query strategy. Second, we
introduce a novel similarity measure for time series. It allows the
user to focus on specific time series amplitudes when measuring
similarity. The new similarity model defines two time series to be
similar iff they show similar temporal behavior w.r.t. being below
or above a specific threshold. This type of query is primarily
required in natural science applications. The main goal of this new
query method is the detection of anomalies and the adaptation to
new requirements in the area of data mining in time series
databases. In
addition, a semi-supervised cluster analysis method will be
presented which is based on the introduced similarity model for
time series. The efficiency and effectiveness of the proposed
techniques are discussed extensively, and their advantages over
existing methods are demonstrated experimentally on datasets
derived from real-world applications.
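
The two query types described above can be illustrated with a small
sketch. It is not the thesis's implementation: the function names, the
closed integer intervals, and in particular the use of a Jaccard
distance over above-threshold time steps are assumptions chosen only to
make the ideas concrete. The first function tests whether two sorted
interval sequences intersect, the second turns a time series into the
interval sequence where it lies above a threshold, and the third
compares two series by their above/below-threshold behavior alone.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # closed interval [start, end]; an assumed representation


def intervals_intersect(a: Sequence[Interval], b: Sequence[Interval]) -> bool:
    """Return True if two sorted, disjoint interval sequences share a point."""
    i = j = 0
    while i < len(a) and j < len(b):
        a_lo, a_hi = a[i]
        b_lo, b_hi = b[j]
        if a_lo <= b_hi and b_lo <= a_hi:
            return True
        # Advance the sequence whose current interval ends first.
        if a_hi < b_hi:
            i += 1
        else:
            j += 1
    return False


def above_threshold_intervals(series: Sequence[float], tau: float) -> List[Interval]:
    """Return maximal index ranges [start, end] where the series is above tau."""
    result: List[Interval] = []
    start = None
    for idx, value in enumerate(series):
        if value > tau:
            if start is None:
                start = idx
        elif start is not None:
            result.append((start, idx - 1))
            start = None
    if start is not None:
        result.append((start, len(series) - 1))
    return result


def threshold_distance(x: Sequence[float], y: Sequence[float], tau: float) -> float:
    """Jaccard distance between the sets of time steps above tau.

    Hypothetical stand-in for a threshold-based similarity measure:
    0.0 means identical above/below behavior, 1.0 means no overlap.
    """
    above_x = {i for i, v in enumerate(x) if v > tau}
    above_y = {i for i, v in enumerate(y) if v > tau}
    union = above_x | above_y
    if not union:
        return 0.0
    return 1.0 - len(above_x & above_y) / len(union)


x = [0, 2, 3, 0, 5]
y = [10, 20, 30, 0, 1]
print(above_threshold_intervals(x, 1))
print(threshold_distance(x, y, 1))
```

Note that the amplitudes of y are much larger than those of x, yet the
distance depends only on when each series is above the threshold, which
is the behavior the abstract describes.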