Abstract for Paper rosin.2011.tr1692

A Bayesian model for image sense ambiguity in pictorial communication systems
J. Rosin, A. Goldberg, X. Zhu, and C. R. Dyer, Computer Sciences Department Technical Report 1692, University of Wisconsin - Madison, June 2011.

Abstract

Pictorial communication systems use synthesized pictures, rather than text, to communicate with users. Because such systems depend on images to convey meanings, it is critical to understand how a human user perceives the image meaning (sense). This paper offers an empirical and theoretical study of how humans perceive image senses. We conduct a user study with 113 users to elicit their perceived senses on 400 image sets, from which we discover widespread image sense ambiguities. We examine how the number of images shown relates to sense ambiguity and discover several significant patterns. We then propose a Bayesian model to explain human image perception behaviors, based on a novel random walk process on a WordNet-like sense hierarchy. Our model makes qualitative and quantitative predictions that largely agree with our observations of human perception. It can explain the "basic level" phenomenon known in psychology, and suggests a method for image sense disambiguation in pictorial communication systems.