2016 Poster Sessions : DeepDive: Extracting Databases from Dark Data

Student Name : Jaeho Shin
Advisor : Christopher Ré
Research Areas: Computer Systems
DeepDive is an open-source engine for building machine learning systems that extract databases from dark data, i.e., unstructured information sources such as text, tables, figures, and images. DeepDive helps answer challenging macroscopic questions that require high-quality structured data at massive scale. It powers many award-winning knowledge base construction systems that have been accelerating science and creating positive societal impact. With its tools and abstractions, DeepDive enables users to rapidly construct, train, and debug massive statistical inference models without having to deal with lower-level details such as feature engineering and algorithm tuning.

Jaeho Shin is a PhD candidate advised by Christopher Ré. His work focuses on making DeepDive faster, lighter, and more usable by solving data management challenges in machine learning systems and by creating languages, abstractions, and tools that accelerate the humans in the loop.