
Active 3D Surface Modeling using Perception-Based, Differential Geometric Primitives
Liangyin Yu, Ph.D. Dissertation, Computer Sciences Department, University of Wisconsin - Madison, August 1999.

Abstract

Computational vision is about why a biological vision system functions as it does and how to emulate its performance on computers. The central topics of this thesis are how a differential geometry language can be used to describe the essential elements of visual perception in both 2D and 3D domains, and how the components of this geometric language can be computed in ways closely related to how the human visual system performs similar functions.

The thesis starts by showing that at the earliest stage of vision, biological systems implement a mechanism that is computationally equivalent to computing local geometric invariants at the two-dimensional curve level. The availability of this information establishes the foundation for computing components of a differential geometry language from sensory inputs. The mathematical framework of scale space, which makes this computational approach possible, likewise has a biological basis.
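As an illustration of what a local geometric invariant of a 2D curve looks like in practice, the following minimal sketch estimates the curvature of a sampled closed curve. It is not the method developed in the thesis: the function names `curvature` and `circle` are hypothetical, and simple periodic central differences stand in for the scale-space derivative operators the abstract refers to.

```python
import math


def curvature(points):
    """Estimate signed curvature at each sample of a closed 2D curve.

    Assumption: `points` is a list of (x, y) samples, evenly spaced in the
    curve parameter. Derivatives are periodic central differences, and
    kappa = (x' y'' - y' x'') / (x'^2 + y'^2)^(3/2).
    """
    n = len(points)
    kappas = []
    for i in range(n):
        x0, y0 = points[i - 1]          # previous sample (wraps around)
        x1, y1 = points[i]              # current sample
        x2, y2 = points[(i + 1) % n]    # next sample (wraps around)
        dx, dy = (x2 - x0) / 2.0, (y2 - y0) / 2.0
        ddx, ddy = x2 - 2.0 * x1 + x0, y2 - 2.0 * y1 + y0
        kappas.append((dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5)
    return kappas


def circle(radius, n=720, cx=0.0, cy=0.0, phase=0.0):
    """Sample a counterclockwise circle; used here only as a test curve."""
    return [
        (cx + radius * math.cos(phase + 2.0 * math.pi * i / n),
         cy + radius * math.sin(phase + 2.0 * math.pi * i / n))
        for i in range(n)
    ]


# Curvature of a circle of radius 2 should be close to 1/2 everywhere.
k_plain = curvature(circle(2.0))

# Translating and rotating the curve should leave the curvature unchanged.
k_moved = curvature(circle(2.0, cx=5.0, cy=-3.0, phase=0.7))
```

Because curvature depends only on the shape of the curve, `k_plain` and `k_moved` agree up to floating-point error. That independence from position and orientation is what makes it an invariant.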

On the other hand, visual perception is a global phenomenon that generally occurs in 3D space. Understanding this process, and designing computational systems whose performance is comparable to that of humans, requires specifying how a 2D local computational mechanism can be used in this global 3D environment. This goal is achieved in two steps. First, a global surface representation is formulated by extending the 2D framework. It is shown how local geometric features that are sparse and perceptually meaningful can be naturally used to represent global 3D surfaces. Second, active motion by an observer is introduced as an additional dimension of the data set, so that the observer becomes mobile and can actively react to observations or verify hypotheses. This also makes dynamic data such as optical flow available to the observer. These added abilities enable the observer to perform tasks such as surface recovery and 3D navigation. In addition, the modeling of 3D objects is naturally constrained by the computational resources available to the observer, so that the model is inherently incremental.

This thesis contributes in the following areas: (1) direct computation of 2D differential geometric invariants from images using methods comparable to those of the human visual system, (2) perception-based global representations of 2D and 3D objects using geometric invariants, (3) novel methods for optical flow computation and segmentation, and (4) active methods for global surface recovery and navigation using stationary contours, apparent contours, and textured surfaces.