Colloquium Details
Faculty Search Colloquium: From Nodes to Networks: Exploiting Autocorrelation to Improve Statistical Models of Relational Data
Author: | Jennifer Neville University of Massachusetts, Amherst |
---|---|
Date: | February 28, 2006 |
Time: | 15:30 |
Location: | 220 Deschutes |
Host: | Allen Malony |
Abstract
Statistical relational learning is transforming the field of automated learning and discovery by moving beyond the conventional analysis of entities in isolation to analyze networks of interconnected entities. In domains such as bioinformatics, citation analysis, epidemiology, fraud detection, intelligence analysis, and web analytics, there is often limited information about any one entity in isolation; instead, it is the connections among entities that are of crucial importance to pattern discovery.
One of the most compelling reasons to use relational models is the ubiquitous presence of autocorrelation in relational datasets. Autocorrelation is a statistical dependency between the values of the same variable on related entities (e.g., hyperlinked web pages are likely to discuss the same topic), which can be exploited to improve predictions by learning models for collective inference. In this talk, I will discuss two graphical models I have developed for collective inference: relational dependency networks (RDNs) and latent group models (LGMs). RDNs are the first statistical relational models capable of learning cyclic autocorrelation dependencies. LGMs are the first models to exploit latent group structures to improve inference accuracy and efficiency.
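As a rough illustration of the autocorrelation described above, the following sketch measures it as the Pearson correlation between the values found at the two ends of each link. This is one common way to quantify the dependency and is an assumption here, not necessarily the measure used in the talk; the function name `relational_autocorrelation` and the toy graph of six web pages are hypothetical.

```python
from statistics import mean


def relational_autocorrelation(values, edges):
    """Pearson correlation between the values at the two ends of each link.

    values: dict mapping an entity id to a numeric attribute value
    edges:  list of (u, v) pairs, treated as undirected links
    """
    # Count every undirected link in both directions so the measure is symmetric.
    xs = [values[u] for u, v in edges] + [values[v] for u, v in edges]
    ys = [values[v] for u, v in edges] + [values[u] for u, v in edges]
    mu = mean(xs)  # xs and ys hold the same multiset of values, so they share a mean
    cov = sum((x - mu) * (y - mu) for x, y in zip(xs, ys))
    var = sum((x - mu) ** 2 for x in xs)
    # With no variation in the attribute, the correlation is undefined; report 0.
    return cov / var if var else 0.0


# Hypothetical toy graph: pages 0-2 discuss topic 1, pages 3-5 discuss topic 0.
topic = {0: 1, 1: 1, 2: 1, 3: 0, 4: 0, 5: 0}
links = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
score = relational_autocorrelation(topic, links)
print(f"autocorrelation of topic across links: {score:.3f}")
```

On this toy graph six of the seven links join pages with the same topic, so the score is strongly positive (5/7, about 0.71). A graph whose links only join pages with different topics would score -1.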
To understand the performance differences between RDNs and LGMs, I have developed an extended bias-variance analysis framework that incorporates errors due to both learning and inference. Using this framework, I will demonstrate the effects of data characteristics on model performance and illustrate the mechanisms behind model performance that can be used to drive the development of improved models and algorithms.
Biography
Jennifer Neville is a Ph.D. candidate in the Department of Computer Science at the University of Massachusetts, Amherst, working with Professor David Jensen in the Knowledge Discovery Laboratory. Her research focuses on data mining and machine learning in relational data, with applications in bioinformatics, citation analysis, epidemiology, fraud detection, and web analytics. Jennifer received her B.S. with honors in 2000 and her M.S. in 2004 from the University of Massachusetts, Amherst. She was awarded graduate research fellowships by both NSF and AT&T Laboratories.