Paul Heymann: 2010 InfoLab Workshop


Thursday, April 29, 2010
Location: Fisher Conference Center, Arrillaga Alumni Center

"Tagging Human Knowledge"


A fundamental premise of tagging systems is that regular users can organize large collections for browsing and other tasks using uncontrolled vocabularies. Until now, that premise has remained relatively unexamined. Using library data, we test the tagging approach to organizing a collection. We find that tagging systems have three major large-scale organizational features: consistency, quality, and completeness. In addition to testing these features, we present results suggesting that users produce tags similar to the topics designed by experts, that paid tagging can effectively supplement tags in a tagging system, and that information integration may be possible across tagging systems.


Paul Heymann is a Computer Science Ph.D. candidate at Stanford University, advised by Prof. Hector Garcia-Molina. His work looks at instances where a large number of people interact with large amounts of data. At Stanford, his work on collaborative tagging systems has touched on problems in web search and library science. Most recently, he has been working on crowdsourcing, looking at ways to utilize vast pools of transient labor.