2008 Poster Sessions : Integrating Multiple Publically Available Gene Expression Datasets to Predict Therapeutic Options across the Disease Nosology

Student Name : Marina Sirota
Advisor : Atul J. Butte
Research Areas: Artificial Intelligence
Abstract
The goal of translational bioinformatics is to enable the transformation of increasingly voluminous genomic and biological data into diagnostics and therapeutics for the clinician. Traditionally, comparative microarray analysis has been used to pinpoint genetic abnormalities in a disease of interest. By examining genes that are upregulated and downregulated in a disease state relative to a normal state, we can create a genetic profile of a disease. In addition, microarrays have been used to monitor changes in gene expression in response to drug treatments. Combining the results of disease-related and drug-related microarray experiments enables the discovery of possible functional connections between drugs, genes and diseases through their common gene expression changes. In a recent study, Lamb et al. present a collection of genome-wide transcriptional expression data from cultured human cells treated with bioactive small molecules. The study consisted of 453 experiments with different dosages of 164 compound perturbagens and corresponding vehicle controls. The selected compounds included several FDA-approved drugs as well as some nondrug bioactive compounds chosen to represent a broad range of effects. The authors manually created 11 disease signatures from the relevant literature and examined connections among small molecules, such as HDAC inhibitors, as well as connections between small molecules and disease states such as diet-induced obesity and Alzheimer's disease. In this work, we recreate and extend the drug-disease "connectivity map" using publicly available disease-related gene expression data obtained from the Gene Expression Omnibus. We automate the process of creating disease signatures from these data, extend the original set of 11 signatures to nearly 100 diseases, and predict possible therapeutics based on the drug-disease connectivity scores. We validate our findings using known drug-disease associations from the Micromedex database.
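
To illustrate how a drug-disease connectivity score can be computed, the following is a minimal sketch of a Kolmogorov-Smirnov-style enrichment score in the spirit of the one described by Lamb et al. It is not the implementation used in this work: the function names, the gene identifiers, and the convention that the drug profile is a list of genes ranked from most upregulated to most downregulated are all assumptions made for illustration.

```python
def ks_enrichment(tag_genes, ranked_genes):
    """KS-style statistic describing where tag_genes fall in ranked_genes.

    ranked_genes is assumed to be ordered from most upregulated to most
    downregulated by the drug. Positive values mean the tag genes cluster
    near the top of the ranking; negative values mean near the bottom.
    """
    n = len(ranked_genes)
    position = {gene: i + 1 for i, gene in enumerate(ranked_genes)}
    # 1-based ranks of the tag genes that are present in the drug profile
    ranks = sorted(position[g] for g in tag_genes if g in position)
    t = len(ranks)
    if t == 0:
        return 0.0
    a = max((j + 1) / t - v / n for j, v in enumerate(ranks))
    b = max(v / n - j / t for j, v in enumerate(ranks))
    return a if a > b else -b


def connectivity_score(disease_up, disease_down, drug_ranked_genes):
    """Combine the up and down enrichment statistics into one score.

    Returns 0.0 when both statistics have the same sign, i.e. the drug
    profile does not separate the two halves of the disease signature.
    """
    ks_up = ks_enrichment(disease_up, drug_ranked_genes)
    ks_down = ks_enrichment(disease_down, drug_ranked_genes)
    if ks_up * ks_down > 0:
        return 0.0
    return ks_up - ks_down


# Hypothetical toy data: ten genes ranked by their response to a drug.
drug_profile = [f"g{i}" for i in range(1, 11)]
score = connectivity_score({"g9", "g10"}, {"g1", "g2"}, drug_profile)
```

Under these assumptions, a negative score means that genes upregulated in the disease tend to be downregulated by the drug and vice versa, which is the pattern one would look for when proposing a candidate therapeutic. A positive score means the drug profile resembles the disease signature.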

Bio
I am a second-year Ph.D. student in the Biomedical Informatics department at Stanford University. Previously, I completed an undergraduate degree in Biomedical Computation and a coterminal master's degree in Biomedical Informatics at Stanford.

As an undergraduate and master's student, I worked in Serafim Batzoglou's laboratory. My previous work focused on computational biology, specifically on developing algorithms for comparative sequence analysis.

My current research advisor is Atul Butte. We are primarily interested in developing bioinformatics methods in integrative biology, or reasoning over the many available genome-scale measurements and experimental modalities, and applying these methods to study complex disorders in genomic medicine.