Colloquium Details

Database Support for Scientific Data Analysis Application: Optimizations and Challenges

Author: Henrique Andrade, University of Maryland, College Park
Date: March 12, 2004
Time: 15:30
Location: 220 Deschutes

Note: Special Day

Abstract

The efficient storage, management, and manipulation of large datasets is important in many fields of science, engineering, and business. Simulations and experimental measurements are the main sources of data in these fields, and the amount of data available for analysis is increasing rapidly, due both to the increased capability to collect and store data and to the capability to process it. In many cases, data analysis is performed in a collaborative environment, where multiple clients access the same datasets and perform similar processing on the data. For instance, in medical training, a large group of students may want to simultaneously explore a similar set of digitized microscopy slides, or visualize the same high-resolution Magnetic Resonance Imaging (MRI) results.

In this talk, I will discuss a generic optimization framework that can be used as a common platform to deploy data analysis applications that are able to efficiently handle multiple simultaneous queries and can leverage previously computed results to partially or fully compute the results for new queries. Two optimization strategies will be discussed: active semantic caching and compiler-based techniques. I will also elaborate on some recent work for deploying a scientific database framework on a widely distributed Grid environment.

Biography

Henrique Andrade is a post-doc research associate at the University of Maryland, College Park, a consultant for the Department of Biomedical Informatics at the Ohio State University, and an adjunct assistant professor at the University of Maryland, University College. He obtained his PhD in Computer Science in 2002 at the University of Maryland, College Park. He also holds two Master's degrees in Computer Science (Federal University of Minas Gerais - 1997 and University of Maryland, College Park - 1999). His research interests are in the area of high-performance computing, in particular middleware technologies and optimization techniques for data analysis applications, grid computing, data mining, and parallel debugging techniques.