You have reached Galax's XQuery optimization page!

The purpose of this page is to document our progress in developing state-of-the-art XML query optimization in Galax. We are currently redesigning Galax's architecture to support query optimization, and we are working on a number of specific optimization features. What you will find here:

Overview and Architecture

One of the challenges in building an efficient query processor is the need to work at many different levels of the system in a cohesive fashion. The following figure presents an

From a query processing point of view, evaluation in Galax can be separated into the following phases:

Query pre-processing

Before optimization techniques can be applied, a number of transformations must be performed on the query. This phase includes parsing, query normalization, static typing, and a number of syntactic simplifications applied to the query's abstract syntax tree (AST). Some of these transformations already have a direct impact on performance, for instance removing implicit casts of nodes to values or removing sorting by document order. In addition, many optimization techniques based on XML algebras cannot be applied before a serious clean-up of the XQuery AST. More details about this phase and its impact can be found in the following papers.
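
To give a flavor of what a syntactic simplification on the AST can look like, here is a minimal Python sketch. It is not Galax's implementation: the `Var`, `Step`, and `DocOrder` node types and the `simplify` function are hypothetical names introduced for illustration only. The sketch applies a single rule, namely that sorting by document order is idempotent, so a sort applied directly to the result of another sort can be dropped.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    """A variable reference such as $doc (hypothetical AST node)."""
    name: str


@dataclass(frozen=True)
class Step:
    """A path step applied to the result of another expression."""
    source: "Expr"
    axis: str
    test: str


@dataclass(frozen=True)
class DocOrder:
    """Sort the argument by document order and remove duplicates."""
    arg: "Expr"


Expr = Union[Var, Step, DocOrder]


def simplify(expr: Expr) -> Expr:
    """Collapse directly nested DocOrder nodes, working bottom-up."""
    if isinstance(expr, Var):
        return expr
    if isinstance(expr, Step):
        return Step(simplify(expr.source), expr.axis, expr.test)
    inner = simplify(expr.arg)
    # Sorting an already sorted, duplicate-free sequence changes nothing,
    # so the outer sort is redundant when the argument is itself a sort.
    return inner if isinstance(inner, DocOrder) else DocOrder(inner)
```

A real simplifier would need many more rules, and deciding when a sort can be removed in general requires reasoning about the order properties of each step, which this sketch deliberately avoids.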

Algebraic Query Optimization

After the query has been normalized and simplified, the next step is to compile it into an algebra. Galax currently uses a variant of the XQuery core with support for tuples as its algebra. We are interested in using a more complete algebra, but which algebra is right for XML query processing is still largely an open issue. One possible candidate is the TAX algebra developed by our friends at the University of Michigan.

Specific XQuery Operations

We are currently developing specific algorithms for certain classes of operations that are expensive or used very often during XQuery processing. Note that efficient support for those operations typically assumes specific knowledge about the physical representation.

XML Projection

One of the bottlenecks of query processing in main-memory XQuery implementations is the size of the tree representation of XML documents (e.g., DOM or the XQuery Data Model). XML projection is a physical operation that removes unnecessary nodes from the XML data model, based on the paths used in a given query. Document projection takes an XML stream as input and loads a projected document according to a set of input path expressions. Those paths are inferred from the query using a static analysis algorithm described in detail in the following papers.
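
As a rough illustration of the loading side of document projection, the Python sketch below reads an XML stream and builds only the nodes that a set of paths needs. It is a simplified stand-in, not the algorithm used in Galax. It assumes the paths are absolute and use only child steps with plain element names (for example `/catalog/book/title`), it ignores namespaces, and it drops text that follows a child element (mixed content). The names `project` and `project_string` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def project(stream, paths):
    """Load a projected copy of the document read from `stream`."""
    wanted = [tuple(p.strip("/").split("/")) for p in paths]

    def on_route(path):
        # The node is an ancestor of (or equal to) a node selected by a path.
        return any(w[:len(path)] == path for w in wanted)

    def selected(path):
        # The node is selected by a path or lies inside a selected subtree.
        return any(path[:len(w)] == w for w in wanted)

    path = []     # tag names from the root down to the current element
    copies = []   # projected copy of each open element, or None if pruned
    root = None
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            path.append(elem.tag)
            current = tuple(path)
            if on_route(current) or selected(current):
                copy = ET.Element(elem.tag, dict(elem.attrib))
                if copies:
                    # A kept node always has a kept parent, because every
                    # prefix of a kept path is itself kept.
                    copies[-1].append(copy)
                else:
                    root = copy
                copies.append(copy)
            else:
                copies.append(None)
        else:
            copy = copies.pop()
            # Ancestors are kept only as structure; text is copied just for
            # nodes that are selected or inside a selected subtree.
            if copy is not None and selected(tuple(path)):
                copy.text = elem.text
            path.pop()
            elem.clear()  # release the parser's own content for this node
    return root


def project_string(xml_text, paths):
    """Convenience wrapper for projecting an in-memory document."""
    return project(io.BytesIO(xml_text.encode("utf-8")), paths)
```

For example, projecting a catalog document on `/catalog/book/title` keeps the `catalog` and `book` elements as structure and each `title` with its text, while prices, reviews, and other siblings are never copied into the result. In this sketch the paths are supplied by hand; inferring them from a query is the job of the static analysis mentioned above.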

If you want to try out document projection for yourself, you can download the following archive, which contains the complete source code for Galax with document projection:

Physical representation, storage and indices

XML is a very versatile markup language, suitable for many kinds of applications in many kinds of environments. Galax aims to be a very versatile XQuery implementation that works on a variety of physical XML representations. We are especially interested in experimenting with Galax in the following environments.

Currently, Galax only supports access to local XML files, but we are working on hooking up an HTTP/SOAP client inside Galax that will allow it to receive XML streams from the network. We are also considering the development of a storage manager that will allow Galax users to process large amounts of XML data efficiently. Please come back to check on our progress!