Ontology Databases
Paea Le Pendu
Committee: Dejing Dou (chair), Zena Ariola, Christopher Wilson, Monte Westerfield
Dissertation Defense (Mar 2010)

On the one hand, ontologies provide a means of formally specifying complex descriptions and relationships about information in a way that is expressive yet amenable to automated processing and reasoning. When data are annotated using terms from an ontology, the instances inhere in formal semantics. An ontology may have as few as a dozen or as many as tens of thousands of terms, but the set of annotated instances is often several orders of magnitude larger, ranging from millions to possibly trillions of instances. Unfortunately, existing reasoning techniques cannot scale to these sizes.

On the other hand, relational database management systems provide mechanisms for storing, retrieving, and maintaining the integrity of large amounts of data. Such systems are well known for scaling to extremely large data sizes, with some claiming to manage over a quadrillion data items.

This dissertation defines ontology databases as a mapping from ontologies to relational databases in order to combine the expressiveness of ontologies with the scalability of relational databases. This mapping is sound and, under certain conditions, complete. That is, the database behaves like a knowledge base which is faithful to the semantics of a given ontology. What distinguishes this work is the treatment of the relational database management system as an active reasoning component rather than as a passive storage and retrieval system. The main contributions this dissertation will highlight include: (i) the theory and implementation particulars for mapping ontologies to databases, (ii) subsumption-based reasoning, (iii) inconsistency detection, (iv) scalability studies, and (v) information integration (specifically, information exchange). This work is novel because it is the first attempt to embed a logical reasoning system, specified by a Semantic Web ontology, into a plain relational database management system using active database technologies. It also introduces the not-gadget, which relaxes the closed-world assumption and increases the expressive power of the logical system without significant cost. Finally, it demonstrates how to deploy the same framework as an information integration system for data exchange scenarios, which is an important step toward semantic information integration over distributed data repositories.
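As a rough illustration of what treating the database as an active reasoning component could look like, the sketch below stores each ontology class as a one-column table and uses triggers to propagate subsumption and to reject inconsistent data. It is not the dissertation's implementation: the use of SQLite, the helper name build_ontology_db, the trigger naming scheme, and the example classes (Dog, Mammal, Animal, Plant) are all assumptions made for illustration, and the not-gadget is not modeled here.

```python
import sqlite3


def build_ontology_db(subclass_axioms, disjoint_axioms):
    """Build an in-memory SQLite database with one unary table per class.

    subclass_axioms: list of (sub, sup) pairs, read as "every sub is a sup".
    disjoint_axioms: list of (a, b) pairs, read as "nothing is both a and b".
    Hypothetical helper for illustration only; class names are assumed to be
    simple identifiers.
    """
    conn = sqlite3.connect(":memory:")
    classes = sorted({c for pair in subclass_axioms + disjoint_axioms for c in pair})
    for cls in classes:
        conn.execute(f'CREATE TABLE "{cls}" (id TEXT PRIMARY KEY)')

    # Subsumption: inserting into a subclass table also inserts into the
    # superclass table. Triggers chain, so Dog -> Mammal -> Animal is followed.
    for sub, sup in subclass_axioms:
        conn.execute(f'''
            CREATE TRIGGER "sub_{sub}_{sup}" AFTER INSERT ON "{sub}"
            BEGIN
                INSERT OR IGNORE INTO "{sup}" (id) VALUES (NEW.id);
            END''')

    # Inconsistency detection: refuse an insert if the same individual is
    # already asserted in a disjoint class (checked in both directions).
    for a, b in disjoint_axioms:
        for left, right in ((a, b), (b, a)):
            conn.execute(f'''
                CREATE TRIGGER "disj_{left}_{right}" BEFORE INSERT ON "{left}"
                WHEN EXISTS (SELECT 1 FROM "{right}" WHERE id = NEW.id)
                BEGIN
                    SELECT RAISE(ABORT, 'inconsistent: disjoint classes');
                END''')
    return conn


conn = build_ontology_db(
    subclass_axioms=[("Dog", "Mammal"), ("Mammal", "Animal")],
    disjoint_axioms=[("Animal", "Plant")],
)
conn.execute('INSERT INTO "Dog" (id) VALUES (?)', ("rex",))
# A plain SELECT on Animal now answers a subsumption query without a reasoner.
print(conn.execute('SELECT id FROM "Animal"').fetchall())
```

In this sketch the inference work happens at load time, when the triggers fire, so that query time needs only ordinary lookups. That is one plausible reading of the trade-off the abstract describes, under the assumptions stated above.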