2008 Poster Sessions : Using HiCarbString for handling large-volume web documents

Student Name : Seungbeom Kim
Advisor : David Cheriton
Research Areas: Computer Systems
Many applications, such as web crawlers and archives, process enormous volumes of string data that typically contain a great deal of duplicated information. Exploiting this duplication can make storing and handling the data much more efficient. We present HiCarbString, an unconventional data structure for strings, and show how it can be used to reduce storage requirements and to process the data efficiently, including in distributed environments.

Seungbeom Kim is a Ph.D. candidate at Stanford University, where he is a member of the Distributed Systems Group. His research focuses on using HiCarbString to process large-volume data efficiently in distributed environments. Seungbeom graduated summa cum laude with a B.Sc.(Eng) from Seoul National University.