Evaluation of XPath Queries against XML Streams
Beschreibung
vor 19 Jahren
XML is nowadays the de facto standard for electronic data
interchange on the Web. Available XML data ranges from small Web
pages to ever-growing repositories of, e.g., biological and
astronomical data, and even to rapidly changing and possibly
unbounded streams, as used in Web data integration and
publish-subscribe systems. Animated by the ubiquity of XML data,
the basic task of XML querying is becoming of great theoretical and
practical importance. The last years witnessed efforts as well from
practitioners, as also from theoreticians towards defining an
appropriate XML query language. At the core of this common effort
has been identified a navigational approach for information
localization in XML data, comprised in a practical and simple query
language called XPath. This work brings together the two
aforementioned ``worlds'', i.e., the XPath query evaluation and the
XML data streams, and shows as well theoretical as also practical
relevance of this fusion. Its relevance can not be subsumed by
traditional database management systems, because the latter are not
designed for rapid and continuous loading of individual data items,
and do not directly support the continuous queries that are typical
for stream applications. The first central contribution of this
work consists in the definition and the theoretical investigation
of three term rewriting systems to rewrite queries with reverse
predicates, like parent or ancestor, into equivalent forward
queries, i.e., queries without reverse predicates. Our rewriting
approach is vital to the evaluation of queries with reverse
predicates against unbounded XML streams, because neither the
storage of past fragments of the stream, nor several stream
traversals, as required by the evaluation of reverse predicates,
are affordable. Beyond their declared main purpose of providing
equivalences between queries with reverse predicates and forward
queries, the applications of our rewriting systems shed light on
other query language properties, like the expressivity of some of
its fragments, the query minimization, or even the complexity of
query evaluation. For example, using these systems, one can rewrite
any graph query into an equivalent forward forest query. The second
main contribution consists in a streamed and progressive evaluation
strategy of forward queries against XML streams. The evaluation is
specified using compositions of so-called stream processing
functions, and is implemented using networks of deterministic
pushdown transducers. The complexity of this evaluation strategy is
polynomial in both the query and the data sizes for forward forest
queries and even for a large fragment of graph queries. The third
central contribution consists in two real monitoring applications
that use directly the results of this work: the monitoring of
processes running on UNIX computers, and a system for providing
graphically real-time traffic and travel information, as
broadcasted within ubiquitous radio signals.
interchange on the Web. Available XML data ranges from small Web
pages to ever-growing repositories of, e.g., biological and
astronomical data, and even to rapidly changing and possibly
unbounded streams, as used in Web data integration and
publish-subscribe systems. Animated by the ubiquity of XML data,
the basic task of XML querying is becoming of great theoretical and
practical importance. The last years witnessed efforts as well from
practitioners, as also from theoreticians towards defining an
appropriate XML query language. At the core of this common effort
has been identified a navigational approach for information
localization in XML data, comprised in a practical and simple query
language called XPath. This work brings together the two
aforementioned ``worlds'', i.e., the XPath query evaluation and the
XML data streams, and shows as well theoretical as also practical
relevance of this fusion. Its relevance can not be subsumed by
traditional database management systems, because the latter are not
designed for rapid and continuous loading of individual data items,
and do not directly support the continuous queries that are typical
for stream applications. The first central contribution of this
work consists in the definition and the theoretical investigation
of three term rewriting systems to rewrite queries with reverse
predicates, like parent or ancestor, into equivalent forward
queries, i.e., queries without reverse predicates. Our rewriting
approach is vital to the evaluation of queries with reverse
predicates against unbounded XML streams, because neither the
storage of past fragments of the stream, nor several stream
traversals, as required by the evaluation of reverse predicates,
are affordable. Beyond their declared main purpose of providing
equivalences between queries with reverse predicates and forward
queries, the applications of our rewriting systems shed light on
other query language properties, like the expressivity of some of
its fragments, the query minimization, or even the complexity of
query evaluation. For example, using these systems, one can rewrite
any graph query into an equivalent forward forest query. The second
main contribution consists in a streamed and progressive evaluation
strategy of forward queries against XML streams. The evaluation is
specified using compositions of so-called stream processing
functions, and is implemented using networks of deterministic
pushdown transducers. The complexity of this evaluation strategy is
polynomial in both the query and the data sizes for forward forest
queries and even for a large fragment of graph queries. The third
central contribution consists in two real monitoring applications
that use directly the results of this work: the monitoring of
processes running on UNIX computers, and a system for providing
graphically real-time traffic and travel information, as
broadcasted within ubiquitous radio signals.
Weitere Episoden
vor 11 Jahren
vor 11 Jahren
vor 11 Jahren
In Podcasts werben
Kommentare (0)