2011 Poster Sessions: An Introduction to Generic Entity Resolution

Student Name: Steven Whang
Advisor: Hector Garcia-Molina
Research Areas: Computer Systems
Abstract:
Entity Resolution (ER) is an important information integration problem: the same "real-world entities" (e.g., customers or products) are referred to in different ways in multiple data records. For instance, two records on the same person may spell the name differently or list different addresses. The goal of ER is to "resolve" entities by identifying the records that represent the same entity and reconciling them to obtain one record per entity. In our approach, the functions that "match" records (i.e., decide whether they represent the same entity) and "merge" them are viewed as black boxes, which permits generic and extensible ER solutions. In this paper, we provide an overview of various techniques aimed at improving the accuracy, scalability, and maintainability of ER.
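
To make the black-box view concrete, the following is a minimal sketch of a resolution loop that receives the match and merge functions as parameters and knows nothing about how they work. It is an illustration under stated assumptions, not necessarily the algorithm presented in the poster: the record layout (attribute name mapped to a set of values), the names resolve, example_match, example_merge, and RECORDS, and the sample data are all hypothetical. The loop assumes that a merged record still matches everything its source records matched, which holds for the example functions below.

```python
from typing import Callable, Dict, FrozenSet, List

# Hypothetical record layout: attribute name -> set of observed values.
Record = Dict[str, FrozenSet[str]]


def resolve(
    records: List[Record],
    match: Callable[[Record, Record], bool],
    merge: Callable[[Record, Record], Record],
) -> List[Record]:
    """Merge matching records until no two remaining records match.

    `match` and `merge` are treated as black boxes: this loop never
    inspects the contents of a record itself.
    """
    pending = list(records)
    resolved: List[Record] = []
    while pending:
        current = pending.pop()
        # Look for an already-resolved record that matches the current one.
        index = next(
            (i for i, other in enumerate(resolved) if match(current, other)),
            None,
        )
        if index is None:
            resolved.append(current)
        else:
            # Put the merged record back on the worklist so it is
            # compared against the remaining resolved records again.
            partner = resolved.pop(index)
            pending.append(merge(current, partner))
    return resolved


def example_match(a: Record, b: Record) -> bool:
    """Assumed rule: records match if they share a name or an email."""
    return any(
        a.get(key, frozenset()) & b.get(key, frozenset())
        for key in ("name", "email")
    )


def example_merge(a: Record, b: Record) -> Record:
    """Assumed rule: keep the union of values for every attribute."""
    return {
        key: a.get(key, frozenset()) | b.get(key, frozenset())
        for key in a.keys() | b.keys()
    }


# Made-up sample data: the first three records describe one person.
RECORDS: List[Record] = [
    {"name": frozenset({"J. Smith"}), "email": frozenset({"js@example.com"})},
    {"name": frozenset({"John Smith"}), "email": frozenset({"js@example.com"})},
    {"name": frozenset({"John Smith"}), "email": frozenset({"john@example.org"})},
    {"name": frozenset({"Ann Lee"}), "email": frozenset({"ann@example.com"})},
]

if __name__ == "__main__":
    for entity in resolve(RECORDS, example_match, example_merge):
        print({key: sorted(values) for key, values in entity.items()})
```

Because resolve only calls the two functions it is given, a different domain (say, products instead of people) can be handled by supplying a different match and merge pair without touching the loop, which is the sense in which such a solution is generic and extensible.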

Bio:
Steven Whang is a computer science PhD candidate at Stanford University advised by Prof. Hector Garcia-Molina. His research interests include information integration and data privacy. He received his B.S. in computer science from the Korea Advanced Institute of Science and Technology (KAIST) in 2003 and his M.S. in computer science from Stanford University in 2007. He is a recipient of the IBM PhD Fellowship.