Colloquium Details

End-User Feature Labeling: A Locally-Weighted Regression Approach

Author: Weng-Keen Wong, Oregon State University
Date: February 24, 2011
Time: 15:30
Location: 220 Deschutes
Host: Daniel Lowd

Abstract

When intelligent interfaces, such as intelligent desktop assistants, email classifiers, and recommender systems, customize themselves to a particular end user, such customizations can decrease productivity and increase frustration due to inaccurate predictions, especially in early stages, when training data is limited. The end user can improve the learning algorithm by tediously labeling a substantial amount of additional training data, but this takes time and is too ad hoc to target a particular area of inaccuracy. To solve this problem, we propose a new learning algorithm based on locally weighted regression for feature labeling by end users, enabling them to point out which features are important for a class, rather than providing new training instances. In our user study, the first to allow ordinary end users to freely choose features to label directly from text documents, our algorithm was both more effective than others at leveraging end users' feature labels to improve the learning algorithm, and more robust to real users' noisy feature labels. These results strongly suggest that allowing users to freely choose features to label is a promising method for allowing end users to improve learning algorithms effectively.

Biography

Weng-Keen Wong is an Assistant Professor of Computer Science at Oregon State University. He received his Ph.D. (2004) and M.S. (2001) in Computer Science from Carnegie Mellon University, and his B.Sc. (1997) from the University of British Columbia. After graduating from Carnegie Mellon University in 2004, he joined the Department of Biomedical Informatics at the University of Pittsburgh as a Postdoctoral Associate. His research areas are data mining and machine learning, with specific interests in anomaly detection, species distribution mapping, and "human in the loop" machine learning.