2017 Poster Sessions : How To Train Your DragoNN (Deep Regulatory Genomic Neural Network)

Student Name : Johnny Israeli
Advisor : None
Deep learning models have recently been applied to several key problems in regulatory genomics, including prediction of protein-DNA interactions, context-specific chromatin state and non-coding regulatory variants. However, the design, parameter selection and training of Deep RegulAtory GenOmics Neural Networks (DragoNNs) remain something of a black art. When is a DragoNN a good choice for a learning problem in genomics? How does one design a high-performance model? And more importantly, can we interpret these models to discover novel patterns in the input data and the induced features to gain new biological insights? To address these questions, we developed the dragonn toolkit (http://kundajelab.github.io/dragonn/), an interactive, cloud-based framework that allows users with interdisciplinary backgrounds to learn and experiment with strategies for training and interpreting DragoNNs that model regulatory DNA sequence data. The dragonn toolkit provides a customizable simulation engine for regulatory DNA sequences; instructive built-in simulations capturing key properties of regulatory DNA; interactive IPython notebook tutorials for novice users; a command-line interface for simple applications of DragoNNs to custom user-defined sequence data; an interpretation toolkit for model exploration, pattern discovery and visualization; and cloud resources for easy access to software and hardware. We will showcase the dragonn toolkit on simulated and real regulatory genomic data, demystify popular DragoNN architectures and provide guidelines for modeling and interpreting regulatory sequences using DragoNN models. We have used dragonn in several introductory workshops and tutorials on deep learning for genomics. We plan to continue its development into a community resource with guidelines for best practices and support for a model zoo allowing rapid access to and development of high-performance, interpretable deep learning models for genomics.
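
To make the idea of simulated regulatory sequence data concrete, the following is a minimal sketch in plain Python. It does not use the dragonn API: the function names, the example motif and all parameters are hypothetical and chosen only for illustration. It assumes a simple setup in which positive sequences carry one embedded motif instance, negative sequences are random background, and sequences are one-hot encoded before being given to a model.

```python
import random

ALPHABET = "ACGT"


def simulate_sequence(length, motif=None, rng=random):
    """Return a random DNA string of the given length.

    If a motif is given, it overwrites the background at a random position,
    so the sequence length is unchanged.
    """
    bases = [rng.choice(ALPHABET) for _ in range(length)]
    if motif is not None:
        start = rng.randrange(length - len(motif) + 1)
        bases[start:start + len(motif)] = motif
    return "".join(bases)


def one_hot(sequence):
    """Encode a DNA string as a list of 4-element rows in A, C, G, T order."""
    return [[1 if base == letter else 0 for letter in ALPHABET]
            for base in sequence]


def simulate_dataset(n_per_class, length, motif, seed=0):
    """Build (sequence, label) pairs: label 1 has the motif embedded, 0 does not.

    Note: a negative sequence can still contain the motif by chance; this
    sketch does not filter those cases out.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_per_class):
        data.append((simulate_sequence(length, motif, rng), 1))
        data.append((simulate_sequence(length, None, rng), 0))
    return data


if __name__ == "__main__":
    # "TGACTCA" is an arbitrary example motif used only for this sketch.
    dataset = simulate_dataset(n_per_class=3, length=30, motif="TGACTCA", seed=1)
    for sequence, label in dataset:
        print(label, sequence)
```

A dataset built this way has a known ground truth (the embedded motif), which is what makes simulations useful for checking whether a trained model and its interpretation recover the planted pattern.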

Johnny Israeli is a PhD student in the biophysics program and a Bio-X SIGF fellow. Working with Prof. Anshul Kundaje in the departments of Genetics and Computer Science, he has developed deep learning models of protein-DNA interactions underlying gene regulation and tools to democratize deep learning for genomics. In his spare time, he enjoys traveling and playing the guitar. Johnny earned a Bachelor's in Math and a Master's in Physics at the University of Kansas.