2010 Poster Sessions: Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities

Student Name: Bangpeng Yao
Advisor: Fei-Fei Li
Research Areas: Artificial Intelligence
Detecting objects in cluttered scenes and estimating articulated human body parts are two challenging problems in computer vision. The difficulty is particularly pronounced in activities involving human-object interactions (e.g. playing tennis), where the relevant object tends to be small or only partially visible, and the human body parts are often self-occluded. We observe, however, that objects and human poses can serve as mutual context to each other: recognizing one facilitates the recognition of the other. In this paper we propose a new random field model to encode the mutual context of objects and human poses in human-object interaction activities. We then cast model learning as a structure learning problem, in which the structural connectivity between the object, the overall human pose, and the different body parts is estimated through a structure search approach, and the parameters of the model are estimated by a new max-margin algorithm. On a sports data set of six classes of human-object interactions, we show that our mutual context model significantly outperforms the state of the art in detecting very difficult objects and human poses.
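The mutual-context idea can be illustrated with a toy example: score each (object, pose) pair jointly, combining each hypothesis's standalone detector score with a pairwise compatibility term, so that recognizing one helps disambiguate the other. All labels, scores, and the compatibility table below are illustrative stand-ins, not the paper's learned potentials or inference procedure.

```python
# Toy sketch of mutual context between object and pose hypotheses.
# Unary scores mimic what independent detectors might output; the
# pairwise table rewards (object, pose) pairs that co-occur in the
# same activity. All numbers here are made up for illustration.
import itertools

# Standalone detector scores for each hypothesis in isolation.
object_scores = {"tennis_racket": 0.30, "croquet_mallet": 0.40}
pose_scores = {"tennis_serve": 0.50, "croquet_shot": 0.45}

# Pairwise compatibility: how well an object fits a pose.
compatibility = {
    ("tennis_racket", "tennis_serve"): 0.9,
    ("tennis_racket", "croquet_shot"): 0.1,
    ("croquet_mallet", "tennis_serve"): 0.1,
    ("croquet_mallet", "croquet_shot"): 0.8,
}

def joint_score(obj, pose):
    """Sum of unary terms and the pairwise compatibility term."""
    return object_scores[obj] + pose_scores[pose] + compatibility[(obj, pose)]

def best_joint():
    """Exhaustively search the (small) joint hypothesis space."""
    return max(itertools.product(object_scores, pose_scores),
               key=lambda pair: joint_score(*pair))
```

Note how context flips the decision: in isolation the object detector prefers `croquet_mallet` (0.40 vs. 0.30), but the strong compatibility between a tennis racket and a serving pose makes `("tennis_racket", "tennis_serve")` the best joint hypothesis (1.7 vs. 1.65).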

Bangpeng Yao is a Ph.D. student in the Computer Science Department at Stanford University, where he is advised by Fei-Fei Li. His research interests include computer vision, machine learning, and computational neuroscience. He earned a B.E. degree in Automation and an M.E. degree in Computer Science from Tsinghua University in China. He worked with Prof. Li for one year at Princeton University before moving to Stanford.