This page contains a list of ``low supervision'' projects, appropriate for upper-level undergraduates and graduate students. If you decide to do one of these projects, please send us mail. If you want to work on a project that involves the optimizer or query evaluation strategy, it is probably worth discussing it with us first.


Galax ``Low Supervision'' Projects

  • Extension Functions
    • Add support for external functions written in Java, C, or O'Caml.
    • Add support for profiling user-defined and external functions.
  • XQuery function libraries for common vocabularies: RDF, SOAP, business dialects, etc.
    • Problem statement: XQuery now supports modules. A module is a set of user-defined XQuery functions bundled together. A good use of modules is as ``libraries'' that support common operations on specific dialects.
    • Objective: Start writing libraries for the most important dialects (e.g., SOAP, RDF).
    • Technical aspects: The difficulty is to clearly identify what the most useful operations on those dialects are.
    • Pointers:
  • XML Schema facets.
    • Problem statement: Currently, Galax does not support facets for XML Schema simple types, so facets are ignored during XML Schema validation.
    • Objective: Implement support for XML Schema facets in Galax.
    • Technical aspects: The first goal is to understand how to change the type system to add facet descriptions systematically. The second goal is to implement facets, starting with the most important ones (e.g., minimum and maximum values, string regular expressions). This includes both parsing and validation for those facets.
    • Pointers:

      Facets are described in [XML Schema Part 2: Datatypes]
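To make the facets project concrete, here is a minimal sketch of what checking a value against a few facets could look like. It is written in Python for brevity and is not Galax code: the `Facets` record, its field names, and the `validate` function are all hypothetical. It also assumes that Python's `re` syntax is close enough to XML Schema regular expressions for simple patterns, which is not true in general.

```python
import re
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import List, Optional


@dataclass(frozen=True)
class Facets:
    """Hypothetical facet record attached to a decimal-based simple type."""
    min_inclusive: Optional[Decimal] = None
    max_inclusive: Optional[Decimal] = None
    # XML Schema patterns are implicitly anchored; approximated with re.fullmatch.
    pattern: Optional[str] = None


def validate(lexical: str, facets: Facets) -> List[str]:
    """Return the names of the violated facets (an empty list means valid)."""
    errors = []
    if facets.pattern is not None and re.fullmatch(facets.pattern, lexical) is None:
        errors.append("pattern")
    if facets.min_inclusive is not None or facets.max_inclusive is not None:
        try:
            value = Decimal(lexical)
        except InvalidOperation:
            return errors + ["not-a-decimal"]
        if not value.is_finite():
            # NaN and infinities cannot be ordered against the bounds.
            return errors + ["not-a-decimal"]
        if facets.min_inclusive is not None and value < facets.min_inclusive:
            errors.append("minInclusive")
        if facets.max_inclusive is not None and value > facets.max_inclusive:
            errors.append("maxInclusive")
    return errors


# A hypothetical "percentage" type: digits only, between 0 and 100.
percent = Facets(min_inclusive=Decimal("0"), max_inclusive=Decimal("100"), pattern=r"[0-9]+")
print(validate("42", percent))
print(validate("142", percent))
```

In a real implementation the facet record would presumably live in the type system next to the simple type it restricts, so that parsing a schema fills it in and validation consults it.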

  • Collations: support for various collations; implement string functions that support collations
    • Problem statement: Currently, Galax has minimal support for Unicode, and no support for collations.
    • Objective: Implement complete and efficient support for Unicode, including collations.
    • Technical directions: This problem is fairly important and complex. It requires a good understanding of the Unicode standard and of the approach to Unicode support in XML and in XQuery. It might require some changes to the Galax parsing infrastructure (limited to the lexing level). Currently, Galax uses some basic lexing support for Unicode from the PXP engine (see ./tools/pxp-engine and ./tools/netstring).
    • Pointers:

      There is a good Unicode library for Caml called camomile. It can be found at http://camomile.sourceforge.net/. It should probably be used as the basis for string, Unicode, and collation support in Galax.

  • Testing & performance
    • Run Galax against all publicly available test suites: NIST, BumbleBee, etc.
  • Portability
    • Build Linux RPMs; figure out dynamic linking under Windows.
  • Alternative data model implementations:
    • Write DM wrapper for LDAP, other semi-structured sources
  • User interface
    • Emacs mode for XQuery
  • Add support for Relax NG.
    • Problem statement: Relax NG is an alternative to XML Schema that some people prefer. Currently, Galax only supports XML Schema.
    • Objective: Add support for Relax NG to Galax.
    • Technical directions: Unlike the mapping from DTDs, the mapping from Relax NG to XML Schema is not trivial. Notably, there are two key differences: Relax NG does not use named typing, and it allows ambiguous content models. The first goal is to define a reasonable mapping from Relax NG to the XQuery type system. Then a Relax NG parser and the mapping to the type system should be implemented.
    • Pointers:
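
To illustrate what ``ambiguous content models'' means for the Relax NG project, the sketch below checks whether a content model is deterministic: at every point, the next element name must identify a single position in the model. XML Schema requires this of its content models, while Relax NG does not. The check uses the standard Glushkov position construction. It is written in Python for brevity and is not Galax code: the tuple encoding and the helper names (`elem`, `seq`, `choice`, `opt`, `star`, `is_deterministic`) are hypothetical, and Relax NG features such as interleave and attributes are ignored.

```python
from itertools import count

# Hypothetical encoding of content models as nested tuples.
def elem(name): return ("elem", name)
def seq(a, b): return ("seq", a, b)
def choice(a, b): return ("choice", a, b)
def opt(a): return ("opt", a)
def star(a): return ("star", a)


def is_deterministic(model):
    """True if no two competing positions carry the same element name."""
    names = {}    # position -> element name
    follow = {}   # position -> positions that may come immediately after it
    fresh = count()

    def walk(node):
        """Return (nullable, first, last) for node, filling names and follow."""
        kind = node[0]
        if kind == "elem":
            p = next(fresh)
            names[p] = node[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind in ("opt", "star"):
            _, first, last = walk(node[1])
            if kind == "star":
                for p in last:          # repetition loops back to the start
                    follow[p] |= first
            return True, first, last
        n1, f1, l1 = walk(node[1])
        n2, f2, l2 = walk(node[2])
        if kind == "choice":
            return n1 or n2, f1 | f2, l1 | l2
        # kind == "seq": whatever ends the left part may be followed by the right part
        for p in l1:
            follow[p] |= f2
        first = f1 | f2 if n1 else f1
        last = l1 | l2 if n2 else l2
        return n1 and n2, first, last

    _, first, _ = walk(model)

    def distinct_names(positions):
        seen = [names[p] for p in positions]
        return len(seen) == len(set(seen))

    return distinct_names(first) and all(distinct_names(s) for s in follow.values())


a, b, c = elem("a"), elem("b"), elem("c")
print(is_deterministic(choice(seq(a, b), seq(a, c))))  # (a, b) | (a, c)
print(is_deterministic(seq(a, choice(b, c))))          # a, (b | c)
```

The two models accept the same sequences, but only the second is deterministic. A mapping from Relax NG would need either to accept models like the first one as they are or to rewrite them, which is one reason the mapping is not trivial.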