Exploiting Domain Structure with Hybrid Generative-Discriminative Models
Austen Kelly
Committee: Daniel Lowd (chair)
Master's Thesis (May 2024)
Keywords:

Machine learning methods often face a tradeoff between the accuracy of discriminative models and the lower sample complexity of their generative counterparts. This tradeoff motivates hybrid methods. In this thesis we present the graphical ensemble classifier (GEC), a novel combination of logistic regression and naive Bayes. By partitioning the feature space according to known independence structure, GEC can handle datasets with diverse feature types and achieve higher accuracy than a purely discriminative model with less training data. In addition to describing the theoretical basis of the model, we demonstrate its practical effectiveness on artificial data and on the 20-newsgroups, MNIST, and MediFor datasets.
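To make the idea of combining discriminative and generative components concrete, the following is a minimal sketch and not necessarily the exact GEC formulation. It assumes the features have been split into K blocks that are conditionally independent given the class, and that each block already has its own discriminative model (for example, a logistic regression) producing a posterior P(y | x_k). Under that assumption, Bayes' rule gives P(y | x_1, ..., x_K) proportional to P(y)^(1-K) times the product of the per-block posteriors. The function name `combine_block_posteriors` and its arguments are hypothetical, introduced only for illustration.

```python
import math


def combine_block_posteriors(block_posteriors, prior):
    """Combine per-block class posteriors naive-Bayes style (illustrative sketch).

    block_posteriors: list of K lists, each holding P(y = c | x_k) for every class c.
    prior: list holding the class prior P(y = c).
    Assumes (hypothetically) that the K feature blocks are conditionally
    independent given the class and that all probabilities are strictly positive.
    """
    k = len(block_posteriors)
    log_scores = []
    for c, p_c in enumerate(prior):
        # log P(y)^(1-K): corrects for the prior being counted once per block.
        score = (1 - k) * math.log(p_c)
        for posterior in block_posteriors:
            score += math.log(posterior[c])
        log_scores.append(score)

    # Normalize in a numerically stable way (subtract the max before exponentiating).
    max_score = max(log_scores)
    unnormalized = [math.exp(s - max_score) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two blocks that each favor class 0 reinforce one another.
combined = combine_block_posteriors([[0.8, 0.2], [0.8, 0.2]], prior=[0.5, 0.5])
```

With a single block the function returns that block's posterior unchanged, so the sketch reduces to a purely discriminative model when no independence structure is assumed.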