CIS607, Spring 2010
Data Mining and Data Integration in Bioinformatics
Course Description:
Data mining and data integration are two active research areas that draw on several disciplines, such as Databases, Machine Learning, Statistics, AI and the Semantic Web, and have applications in science, engineering and industry. Researchers in these fields are working extensively on semantically and structurally heterogeneous data. The general goal of data mining and data integration in biomedical informatics is to design systems that can automatically extract knowledge and compare and integrate information from multiple heterogeneous biomedical data resources, such as those produced by genetics and neuroscience labs. This graduate research seminar will cover the following topics: data heterogeneity and integration in biomedical data, knowledge discovery in biomedical data, and ontologies and the Semantic Web in bioinformatics.
The instructor will introduce these topics in a couple of lectures.
Students are expected to read and discuss papers from journals,
conference proceedings, or unpublished manuscripts on the Web. Each
student is expected to give a presentation on the paper(s) or the topic
he/she is interested in. The final report for each student can be a
survey paper or a small implementation.
Prerequisites:
None. Basic knowledge of Databases, Machine Learning and AI will be helpful.
Time and Place:
Tuesdays 2:00-3:20, 260 Deschutes Hall.
Instructor:
Dejing Dou, 303 Deschutes, phone 346-4572, email dou@cs.uoregon.edu.
Office hours:
Fridays 3:30-5:00 or by appointment.
Evaluation:
There is no exam for this seminar. Attendance and participation, paper
reading, paper presentation and the final report will determine the course score.
Students are encouraged to pursue further research projects on the topics
discussed in this seminar, but this is not a requirement. Some more detail.
Lecture Notes:
Homework:
Papers for Reading (updated regularly):
- Surveys on Data Mining, Data Integration and Bioinformatics
- Data Mining and Knowledge Discovery in Biomedical Data
- Michael B. Eisen, Paul T. Spellman, Patrick O. Brown, and David Botstein. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. Vol. 95, pp. 14863-14868, December 1998
- Isabelle Guyon, Jason Weston, Stephen Barnhill, M.D., and Vladimir Vapnik. Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning, 46, 389-422, 2002
- Ying Jin, T. M. Murali and Naren Ramakrishnan. Compositional Mining of Multirelational Biological Datasets. ACM Transactions on Knowledge Discovery from Data, Vol. 2, No. 1, Article 2, March 2008
- Padmini Srinivasan and Xin Ying Qiu. GO for gene documents. BMC Bioinformatics 2007, 8(Suppl 9):S3
- Data Integration and Data Fusion in Biomedical Data
- Jieping Ye, Kewei Chen, Teresa Wu, Jing Li, Zheng Zhao, Rinkal Patel, Min Bae, Ravi Janardan, Huan Liu, Gene Alexander, and Eric Reiman. Heterogeneous Data Fusion for Alzheimer's Disease Study. In Proceedings of KDD 2008, pp. 1025-1033
- Aaron Birkland and Golan Yona. BIOZON: a system for unification, management and analysis of heterogeneous biological data. BMC Bioinformatics 2006, 7:70
- Gudmundur A. Thorisson, Juha Muilu and Anthony J. Brookes. Genotype-phenotype databases: challenges and solutions for the post-genomic era. Nature Reviews Genetics, Vol. 10, pp. 9-18, January 2009
- Jacob Kohler, Stephan Philippi and Matthias Lange. SEMEDA: ontology based semantic integration of biological databases. Bioinformatics Vol. 19, no. 18, pp. 2420-2427, 2003
Useful Links