2011 Poster Sessions : Wrangler: Interactive Visual Specification of Data Transformation Scripts

Student Name : Sean Kandel
Advisor : Jeffrey M. Heer
Research Areas: Graphics/HCI
Though data analysis tools continue to improve, analysts still expend an inordinate amount of time and effort manipulating data and assessing data quality issues. Such “data wrangling” regularly involves reformatting data values or layout, correcting erroneous or missing values, and integrating multiple data sources. These transforms are often difficult to specify and difficult to reuse across analysis tasks, teams, and tools. In response, we introduce Wrangler, an interactive system for creating data transformations. Wrangler combines direct manipulation of visualized data with automatic inference of relevant transforms, enabling analysts to iteratively explore the space of applicable operations and preview their effects. Wrangler supports script export as generated Python or JavaScript code, enabling analysts to apply transformations to large data sets offline.
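
To make the idea of a reusable transformation script concrete, the following is a minimal sketch of what such a script might look like in plain Python. It is not Wrangler's generated output: the function names (split_column, fill_down, drop_empty_rows, wrangle) and the sample rows are hypothetical and chosen only for illustration. It covers the kinds of operations named above: reformatting values, repairing missing values, and removing empty rows.

```python
# Hypothetical sketch of a wrangling script. The function names and sample
# data are illustrative assumptions, not Wrangler's actual generated code.

def split_column(rows, index, delimiter):
    """Split the cell at `index` on the first `delimiter` into two cells."""
    out = []
    for row in rows:
        left, _, right = row[index].partition(delimiter)
        out.append(row[:index] + [left.strip(), right.strip()] + row[index + 1:])
    return out

def fill_down(rows, index):
    """Replace empty cells in a column with the nearest non-empty value above."""
    out, last = [], ""
    for row in rows:
        value = row[index] or last
        last = value
        out.append(row[:index] + [value] + row[index + 1:])
    return out

def drop_empty_rows(rows):
    """Remove rows whose cells are all empty or whitespace."""
    return [row for row in rows if any(cell.strip() for cell in row)]

def wrangle(rows):
    """Apply the transforms in a fixed order, as a replayable script would."""
    rows = drop_empty_rows(rows)
    rows = split_column(rows, 1, ":")
    rows = fill_down(rows, 0)
    return rows

# Made-up sample input: a group label that appears once, followed by rows
# that omit it, plus one fully empty row.
raw = [
    ["Alabama", "2004: 4029"],
    ["", "2005: 3900"],
    ["", ""],
    ["Alaska", "2004: 3370"],
]
cleaned = wrangle(raw)
```

Because the transforms are ordinary functions applied in sequence, the same wrangle step can be rerun offline on a larger file with the same layout, which is the reuse scenario the abstract describes.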

Sean is a Computer Science Ph.D. student at Stanford University, where he is advised by Professor Jeffrey Heer. He currently builds graphical systems for data cleaning. His most recent project is Data Wrangler, an interactive tool for authoring data transformation scripts. His research investigates visualization and interaction techniques for identifying and addressing data quality issues.