Exploiting Domain Structure with Hybrid Generative-Discriminative Models
Austen Kelly
Committee: Daniel Lowd (chair), Dejing Dou, Chris Wilson
Directed Research Project (Sep 2019)
Keywords: machine learning, graphical models

Machine learning methods often face a tradeoff between the accuracy of discriminative models and the lower sample complexity of their generative counterparts, which motivates hybrid methods. We present the graphical ensemble classifier (GEC), a novel combination of logistic regression and naive Bayes. By partitioning the feature space according to known independence structure, GEC can handle datasets with diverse features and achieve higher accuracy than a purely discriminative model with less training data. In addition to describing the theoretical basis of our model, we demonstrate its practical effectiveness on artificial data as well as the 20-newsgroups and MediFor datasets.
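
To make the idea of combining per-partition discriminative models through a naive Bayes style product concrete, the following is a minimal sketch. It is not necessarily GEC's exact formulation, which this abstract does not spell out. It assumes the feature groups x_1, ..., x_K are conditionally independent given the class y, in which case P(y | x) is proportional to P(y)^(1-K) times the product of the per-group posteriors P(y | x_k). The function name `combine_group_posteriors` and the example numbers are hypothetical. In practice each per-group posterior might come from a logistic regression trained on that group's features alone.

```python
import math


def combine_group_posteriors(prior, group_posteriors):
    """Combine per-group class posteriors into one posterior.

    Assumption (illustrative, not taken from the source): feature groups are
    conditionally independent given the class, so
        P(y | x) is proportional to P(y)^(1 - K) * prod_k P(y | x_k).
    All probabilities are assumed to be strictly positive.

    prior: list of class priors P(y), one entry per class.
    group_posteriors: list of K lists, each holding P(y | x_k) per class.
    """
    k = len(group_posteriors)
    log_scores = []
    for c, p_c in enumerate(prior):
        # Work in log space to avoid underflow when K is large.
        score = (1 - k) * math.log(p_c)
        score += sum(math.log(post[c]) for post in group_posteriors)
        log_scores.append(score)

    # Normalize with the log-sum-exp trick.
    m = max(log_scores)
    log_norm = m + math.log(sum(math.exp(s - m) for s in log_scores))
    return [math.exp(s - log_norm) for s in log_scores]


# Hypothetical two-class example with two feature groups.
prior = [0.5, 0.5]
group_posteriors = [[0.8, 0.2], [0.6, 0.4]]
posterior = combine_group_posteriors(prior, group_posteriors)
```

With a single group (K = 1) the rule reduces to that group's own posterior. With several groups that agree, the combined posterior is more confident than any single group's.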