Ontology-based Error Detection in Text
Fernando Gutierrez
Committee: Dejing Dou (chair), Stephen Fickas, Daniel Lowd
Area Exam (Feb 2014)
Keywords: Ontology, Information Extraction, Inconsistency

In general, research on text analysis assumes that the information contained in text, although ambiguous, is correct with respect to the domain to which the text belongs. This assumption comes in part from the fact that text analysis has historically been performed on scientific documents. As text understanding is extended to broader domains, such as the Internet, we need to consider the presence of incorrect text in our data sets. By incorrect text, we refer to a natural language statement that is either false or contradicts the knowledge of the domain.

We propose the use of Ontology-based Information Extraction (OBIE) to identify incorrect statements in text. OBIE, a subfield of Information Extraction (IE), uses the formal and explicit specification provided by an ontology to guide the information extraction process. OBIE can capture the semantic elements of the text through its IE component, and it can determine whether these semantics contradict the domain through its ontology component, and thus conclude whether the text is correct or incorrect.
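
As a minimal sketch of this idea, and not the method developed in this work, the two components can be illustrated in Python over a toy domain. Everything in the block is an assumption made for illustration: the ontology (the SUBCLASS_OF and DISJOINT axioms), the single extraction pattern for sentences of the form "A X is a Y", and the names ancestors and check_statement. A real OBIE system would rely on a full ontology language, a reasoner, and far richer extraction rules.

```python
import re

# Hypothetical toy ontology: subclass axioms and one disjointness axiom.
SUBCLASS_OF = {"Whale": "Mammal", "Shark": "Fish"}
DISJOINT = {frozenset({"Mammal", "Fish"})}

# IE component (illustrative): one pattern for "A <X> is a <Y>." sentences.
PATTERN = re.compile(
    r"^(?:an?|the)\s+(\w+) is (?:an?|the)\s+(\w+)\.?$", re.IGNORECASE
)


def ancestors(cls):
    """Return cls followed by all of its superclasses under SUBCLASS_OF."""
    chain = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def check_statement(sentence):
    """Classify a sentence as 'correct', 'incorrect', or 'unknown'.

    For an incorrect sentence, also return the pair of disjoint classes
    (the violated axiom) that explains the contradiction.
    """
    match = PATTERN.match(sentence.strip())
    if match is None:
        return ("unknown", None)  # nothing could be extracted
    subject = match.group(1).capitalize()
    claimed = match.group(2).capitalize()
    known = ancestors(subject)
    # Ontology component: does the claimed class clash with a known one?
    for claimed_cls in ancestors(claimed):
        for known_cls in known:
            if frozenset({known_cls, claimed_cls}) in DISJOINT:
                return ("incorrect", (known_cls, claimed_cls))
    if claimed in known:
        return ("correct", None)  # entailed by the subclass axioms
    return ("unknown", None)  # neither entailed nor contradicted


if __name__ == "__main__":
    for text in ["A whale is a mammal.", "A whale is a fish."]:
        print(text, check_statement(text))
```

In this sketch, a statement that is neither entailed nor contradicted by the toy axioms is reported as unknown rather than incorrect. This keeps statements that contradict the domain separate from statements the ontology simply does not cover.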

In the present work, we review the topics in Ontology Inconsistency that are most relevant to the task of identifying and explaining incorrect statements, and we also review the most relevant Information Extraction research. We believe that research on the detection of logical contradictions in ontologies (i.e., Ontology Inconsistency) can provide useful insight into identifying incorrect text and determining the specific elements (e.g., axioms) that participate in the contradiction. Research in Information Extraction (and OBIE), in turn, can inform us about the complexity of the analysis that can be performed on the text, given the semantics that can be extracted from it.