Colloquium Details

Faculty Search Colloquium: From Nodes to Networks: Exploiting Autocorrelation to Improve Statistical Models of Relational Data

Author: Jennifer Neville, University of Massachusetts, Amherst
Date: February 28, 2006
Time: 15:30
Location: 220 Deschutes
Host: Allen Malony

Abstract

Statistical relational learning is transforming the field of automated learning and discovery by moving beyond the conventional analysis of entities in isolation to analyze networks of interconnected entities. In domains such as bioinformatics, citation analysis, epidemiology, fraud detection, intelligence analysis, and web analytics, there is often limited information about any one entity in isolation; instead, it is the connections among entities that are of crucial importance to pattern discovery.

One of the most compelling reasons to use relational models is the ubiquitous presence of autocorrelation in relational datasets. Autocorrelation is a statistical dependency between the values of the same variable on related entities (e.g., hyperlinked web pages are likely to discuss the same topic), which can be exploited to improve predictions by learning models for collective inference. In this talk, I will discuss two graphical models I have developed for collective inference: relational dependency networks (RDNs) and latent group models (LGMs). RDNs are the first statistical relational models capable of learning cyclic autocorrelation dependencies. LGMs are the first models to exploit latent group structures to improve inference accuracy and efficiency.
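As a rough illustration of the autocorrelation described above, the sketch below measures it as the Pearson correlation of one variable's values across linked pairs of entities. This is only one common way to quantify the dependency, not necessarily the measure used in the talk; the function name `relational_autocorrelation`, the toy page labels, and the convention of returning 0.0 when the variable does not vary are all assumptions made for the example.

```python
from statistics import mean


def relational_autocorrelation(values, edges):
    """Pearson correlation of a variable's values across linked pairs.

    values: dict mapping entity id -> numeric value of the variable
    edges:  list of (entity_a, entity_b) links, treated as undirected
    """
    # Count each undirected link in both directions so the measure is symmetric.
    pairs = [(values[a], values[b]) for a, b in edges]
    pairs += [(y, x) for x, y in pairs]

    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)

    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)

    # Assumed convention: report 0.0 when the variable has no variance.
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


# Toy example: four web pages labeled with a binary topic (hypothetical data).
topic = {"a": 1, "b": 1, "c": 0, "d": 0}
same_topic_links = [("a", "b"), ("c", "d")]    # links join pages on the same topic
cross_topic_links = [("a", "c"), ("b", "d")]   # links join pages on different topics

high = relational_autocorrelation(topic, same_topic_links)
low = relational_autocorrelation(topic, cross_topic_links)
```

In this toy data, links between same-topic pages give a strongly positive value and links between different-topic pages give a strongly negative one. A collective inference model aims to exploit the former kind of dependency when predicting the label of a page from the labels of its neighbors.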

To understand the performance differences between RDNs and LGMs, I have developed an extended bias-variance analysis framework that incorporates errors due to both learning and inference. Using this framework, I will demonstrate the effects of data characteristics on model performance and illustrate the mechanisms behind model performance that can be used to drive the development of improved models and algorithms.

Biography

Jennifer Neville is a Ph.D. candidate in the Department of Computer Science at the University of Massachusetts, Amherst, working with Professor David Jensen in the Knowledge Discovery Laboratory. Her research focuses on data mining and machine learning in relational data, with applications in bioinformatics, citation analysis, epidemiology, fraud detection, and web analytics. Jennifer received her B.S. with honors in 2000 and her M.S. in 2004 from the University of Massachusetts, Amherst. She was awarded graduate research fellowships by both NSF and AT&T Laboratories.