2008 Poster Sessions : Boolean Analysis of Large Gene-Expression Datasets

Student Name : Debashis Sahoo
Advisor : David Dill
Research Areas: Computer Systems
We present a new algorithm for building Boolean networks from very large amounts of gene expression data. The resulting networks include not only symmetric relationships between genes, such as co-expression, but also asymmetric relationships that represent if-then rules. The approach is conceptually simple and fast: it builds a complete gene network from 3 billion gene pairs, with more than 9,500 expression values per gene pair, in less than 3 hours on an ordinary office computer. The algorithm was applied to publicly available data from thousands of microarrays for humans, mice, and fruit flies (a total of 365 million Affymetrix probeset expression levels). The resulting network consists of hundreds of millions of relationships between genes and contains biologically meaningful information about gender differences, tissue differences, development, differentiation, and co-expression. We also examine relationships that are conserved among humans, mice, and fruit flies. The full set of Boolean relationships is available for exploration at http://gourd.stanford.edu/BooleanNet.
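The if-then rules described above can be illustrated with a minimal sketch. It is not the poster's actual algorithm, which the abstract does not spell out. It assumes each gene is discretized into "low" and "high" by a fixed threshold, and that a rule such as "A high => B high" is reported when the contradicting quadrant (A high, B low) is nearly empty. The function names, the threshold of 5.0, and the 2% sparsity cutoff are all hypothetical choices made for illustration.

```python
def discretize(values, threshold):
    """Map expression values to booleans: True means 'high'. Threshold is an assumption."""
    return [v > threshold for v in values]


def quadrant_counts(a_high, b_high):
    """Count samples in each (A state, B state) quadrant."""
    counts = {(False, False): 0, (False, True): 0,
              (True, False): 0, (True, True): 0}
    for a, b in zip(a_high, b_high):
        counts[(a, b)] += 1
    return counts


# Each rule is named by the quadrant that must be (nearly) empty for it to hold.
RULES = {
    (True, False): "A high => B high",
    (False, True): "A low => B low",
    (True, True): "A high => B low",
    (False, False): "A low => B high",
}


def boolean_relationships(a_values, b_values, threshold=5.0, max_fraction=0.02):
    """Return the rules whose contradicting quadrant holds at most max_fraction of samples.

    The fixed cutoff stands in for a proper statistical test (hypothetical simplification).
    """
    counts = quadrant_counts(discretize(a_values, threshold),
                             discretize(b_values, threshold))
    total = sum(counts.values())
    sparse = {q for q, n in counts.items() if n <= max_fraction * total}
    # Two sparse off-diagonal quadrants give a symmetric relationship.
    if sparse == {(True, False), (False, True)}:
        return ["A equivalent to B"]
    if sparse == {(True, True), (False, False)}:
        return ["A opposite to B"]
    return sorted(RULES[q] for q in sparse)


# Synthetic data: B is high whenever A is high, but B can also be high when A is low.
a = [1.0] * 100 + [9.0] * 50
b = [1.0] * 50 + [9.0] * 100
asymmetric = boolean_relationships(a, b)
symmetric = boolean_relationships(a, a)
print(asymmetric)
print(symmetric)
```

With this synthetic data, only the (A high, B low) quadrant is empty, so the sketch reports a single asymmetric rule. Comparing a gene with itself empties both off-diagonal quadrants and yields the symmetric case.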

Boolean networks were constructed from 4,787 publicly available Affymetrix U133 Plus 2.0 human arrays, 2,154 Affymetrix mouse 430 2.0 arrays, and 450 Affymetrix Drosophila genome 1 arrays from the Gene Expression Omnibus (Edgar et al. 2002). All datasets were normalized using the RMA algorithm (Irizarry et al. 2003). There are 208 million, 336 million, and 17 million Boolean relationships in human, mouse, and fruit fly, respectively. Additionally, 4 million Boolean relationships are conserved between human and mouse, and 41,260 Boolean relationships are conserved across human, mouse, and fruit fly.
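The headline figures in the abstract can be checked against these array counts with a short calculation. The array counts come from the text. The probeset counts per platform (54,675 for U133 Plus 2.0, 45,101 for mouse 430 2.0, 14,010 for Drosophila genome 1) are not stated here and are assumptions. The sketch also assumes that "3 billion gene pairs" counts ordered pairs of human probesets, and that the values per pair are two values per human array.

```python
# (number of arrays, probesets per array); probeset counts are assumptions,
# array counts are taken from the text.
ARRAYS = {
    "human": (4787, 54675),
    "mouse": (2154, 45101),
    "fruit fly": (450, 14010),
}


def total_expression_levels(arrays):
    """Total probeset expression levels: arrays times probesets, summed over species."""
    return sum(n_arrays * n_probesets for n_arrays, n_probesets in arrays.values())


def ordered_pairs(n_probesets):
    """Number of ordered probeset pairs, including self-pairs (assumed counting convention)."""
    return n_probesets ** 2


def values_per_pair(n_arrays):
    """Each array contributes one value for each of the two probesets in a pair."""
    return 2 * n_arrays


human_arrays, human_probesets = ARRAYS["human"]
total_levels = total_expression_levels(ARRAYS)
human_pairs = ordered_pairs(human_probesets)
human_values = values_per_pair(human_arrays)

print(f"total expression levels: {total_levels:,}")
print(f"human probeset pairs:    {human_pairs:,}")
print(f"values per human pair:   {human_values:,}")
```

Under these assumptions the totals come to roughly 365 million expression levels, about 3 billion human probeset pairs, and a little over 9,500 values per pair, in line with the figures quoted in the abstract.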

I am a PhD candidate in the Department of Electrical Engineering at Stanford University. I completed my undergraduate degree in Computer Science and Engineering at the Indian Institute of Technology, Kharagpur. My research currently focuses on bioinformatics. Specifically, I am developing tools to understand gene expression changes in various cancers. More broadly, I am interested in cancer biology, immunology, and genetics.