The methods presented in this thesis are called Semiparametric Geometric Modeling (SGM). An SGM fits a nonlinear manifold to Diffusion Weighted Magnetic Resonance Imaging data and produces a nonlinear coordinate system. Specifically, an SGM simultaneously extracts white matter structures and produces a set of functions that together define a model of the white matter. An SGM produces manifold models of the physical white matter structures, allowing those structures to be mapped by a multi-dimensional, nonlinear coordinate system in which points, curves, surfaces, and volumes are defined by the manifold model. Associated SGM functions can interpolate to the level of a single neural fiber, reveal the path of nerve fiber bundles, and be used to study the interactions (e.g., crossing, touching, bifurcating) of fiber bundles throughout the brain. SGM functions can also be used to query the manifold structure, allowing data to be organized so that methods such as Functional Data Analysis can be applied for statistical analysis of the data.

Software to build SGMs was implemented and a series of experiments were
carried out on Diffusion Weighted Magnetic Resonance Imaging data. The data
consisted of control subjects and subjects with autism. An SGM was used to
simultaneously extract and model two structures for each subject: a portion of the
genu of the corpus callosum and the right corticospinal tract. The SGMs were used
to map data from imaging space to curves on the manifold. These data curves were
the input for group differential analysis using Functional Data Analysis. Group
differences based on these structures were found that are consistent with results
from other sources. However, the results also indicate that the group differences
were the result of differences in rates of change in data distributions along the
structure rather than simply point-wise differences in data at specific locations.
}
BIBTEX
{
@phdthesis{ Pack:2012:phd,
author = "Gary D. Pack",
title = "Semiparametric Geometric Methods for Extracting and Modeling White Matter Volumetric Structures of the Brain",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 2012 }
}
NOPS {}
--------------------------------------------------------------
KEY { maynord.2012.tr1774 }
TITLE { An image-to-speech iPad app }
AUTHORS { M. Maynord, J. Tiachunpun, X. Zhu, C. R. Dyer, K.-S. Jun, and J. Rosin }
PUBLISHEDIN { Computer Sciences Department Technical Report 1774,
University of Wisconsin - Madison, July 2012 }
ABSTRACT
{
We describe an iPad app which assists in language acquisition and development. Such an application can be used by clinicians treating human developmental disabilities. A user drags images around on the screen. The app generates and speaks random (but sensible) phrases that match the image interaction. For example, if a user drags an image of a squirrel onto an image of a tree, the app may say ``the squirrel ran up the tree.'' A key challenge is the automated creation of ``sensible'' English phrases, which we solve by using a large corpus and machine learning.
}
BIBTEX
{
@techreport{ maynord.2012.tr1774,
author = "Maynord, M. and Tiachunpun, J. and Zhu, X. and Dyer, C. R. and Jun, K.-S. and Rosin, J.",
title = "An image-to-speech iPad app",
institution = "Computer Sciences Department, University of Wisconsin - Madison",
number = 1774,
year = 2012 }
}
NOPS {}
--------------------------------------------------------------
KEY { rosin.2012.tr1697 }
TITLE { The multimodal focused attribute model: A nonparametric Bayesian approach to simultaneous object classification and
attribute discovery}
AUTHORS { J. Rosin, C. R. Dyer, and X. Zhu }
PUBLISHEDIN { Computer Sciences Department Technical Report 1697,
University of Wisconsin - Madison, January 2012 }
ABSTRACT
{
A nonparametric Bayesian model for attribute-based
object recognition and image-based class attribute inference is presented. This model
draws on existing work in Bayesian nonparametrics such as the focused topic model.
The model allows either the classification of objects or the inference of attributes
over known classes (or both simultaneously). Attributes inferred from image datasets
allow an improvement in classification accuracy when combined with attributes from
other sources, including ``layperson'' knowledge and existing inference methods.
}
BIBTEX
{
@techreport{ rosin.2012.tr1697,
author = "Rosin, J. and Dyer, C. R. and Zhu, X.",
title = "The multimodal focused attribute model: A nonparametric Bayesian approach to simultaneous object classification and attribute discovery",
institution = "Computer Sciences Department, University of Wisconsin - Madison",
number = 1697,
year = 2012 }
}
NOPS {}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2011 }
--------------------------------------------------------------
KEY { rosin.2011.tr1692 }
TITLE { A Bayesian model for image sense ambiguity in pictorial communication systems }
AUTHORS { J. Rosin, A. Goldberg, X. Zhu, and C. R. Dyer }
PUBLISHEDIN { Computer Sciences Department Technical Report 1692,
University of Wisconsin - Madison, June 2011 }
ABSTRACT
{
Pictorial communication systems use synthesized pictures,
rather than text, to communicate with users. Because such
systems depend on images to convey meanings, it is critical
to understand how a human user perceives the image meaning
(sense). This paper offers an empirical and theoretical
study of how humans perceive image senses. We conduct a
user study with 113 users to elicit their perceived senses on
400 image sets, from which we discover widespread image
sense ambiguities. We examine how the number of images
shown relates to sense ambiguity and discover several significant
patterns. We then propose a Bayesian model to explain
human image perception behaviors, based on a novel
random walk process on a WordNet-like sense hierarchy.
Our model makes qualitative and quantitative predictions that
largely agree with our observations of human perception. It
can explain the "basic level" phenomenon known in psychology,
and suggests a method for image sense disambiguation
in pictorial communication systems.
}
BIBTEX
{
@techreport{ rosin.2011.tr1692,
author = "Rosin, J. and Goldberg, A. and Zhu, X. and Dyer, C. R.",
title = "A Bayesian model for image sense ambiguity in pictorial communication systems",
institution = "Computer Sciences Department, University of Wisconsin - Madison",
number = 1692,
year = 2011 }
}
NOPS {}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2010 }
--------------------------------------------------------------
KEY { arora.2010.dvsn }
TITLE { Joint projective invariants for distributed camera networks }
AUTHORS { R. Arora and C. R. Dyer }
PUBLISHEDIN { Distributed Video Sensor Networks, B. Bhanu, C. V. Ravishankar, A. K. Roy-Chowdhury, H. Aghajan, and D. Terzopoulos, eds., Springer, New York, 2010. }
ABSTRACT
{
A novel method is presented for distributed matching across different viewpoints.
The fundamental perspective invariants for curves in the real projective
space are the volume cross-ratios. Probabilistic analysis of projective invariants
shows that they are not unique and therefore not discriminative. However, a
curve in m-dimensional Euclidean space is completely prescribed by the signature
manifold of joint invariants generated by taking all possible combinations
of n points on the projective curve where n is at least m + 2. Furthermore, submanifolds
given by the projection of the signature manifold also represent the
curve uniquely. Sections of the submanifolds that admit large enough variation
of cross ratios are found to be sufficient, statistically, for matching of curves.
Such sectional signatures allow fast computation and matching of features while
keeping the descriptors compact. These features are computed independently at
cameras with different viewpoints and shared, thereby achieving the matching
of objects in the image. Experimental results with simulated as well as real data
are provided.
}
BIBTEX
{
@incollection{ arora.2010.dvsn,
author = "Arora, R. and Dyer, C. R.",
title = "Joint projective invariants for distributed camera networks",
booktitle = "Distributed Video Sensor Networks",
editor = "B. Bhanu and C. V. Ravishankar and A. K. Roy-Chowdhury and H. Aghajan and D. Terzopoulos",
publisher = "Springer",
address = "New York",
year = 2010 }
}
NOPS {}
--------------------------------------------------------------
KEY { arora.2010.icdsc }
TITLE { Distributed curve matching in camera networks using projective joint invariant signatures }
AUTHORS { R. Arora, C. R. Dyer, Y. H. Hu and N. Boston }
PUBLISHEDIN { Proc. 4th Int. Conf. on Distributed Smart Cameras, 2010. }
ABSTRACT
{
An efficient method based on projective joint invariant signatures
is presented for distributed matching of curves in a
camera network. The fundamental projective joint invariants
for curves in the real projective space are the volume
cross-ratios. A curve in m-dimensional projective space is
represented by a signature manifold comprising n-point projective
joint invariants, where n is at least m + 2. The
signature manifold can be used to establish equivalence of
two curves in projective space. However, without correspondence
between the two curves, matching signature manifolds
is a computational challenge. In this paper we overcome
this challenge by finding discriminative sections of signature
manifolds consistently across varying viewpoints and scoring
the similarity between these sections. This motivates a
simple yet powerful method for distributed curve matching
in a camera network. Experimental results with real data
demonstrate the classification performance of the proposed
algorithm with respect to the size of the sections of the invariant
signature in various noisy conditions.
}
BIBTEX
{
@inproceedings{ arora.2010.icdsc,
author = "Arora, R. and Dyer, C. R. and Hu, Y. H. and Boston, N.",
title = "Distributed curve matching in camera networks using projective joint invariant signatures",
booktitle = "Proc. 4th Int. Conf. on Distributed Smart Cameras",
year = 2010 }
}
NOPS {}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2009 }
--------------------------------------------------------------
KEY { goldberg.2009.nips }
TITLE { Toward text-to-picture synthesis }
AUTHORS { A. B. Goldberg, J. Rosin, X. Zhu and C. R. Dyer }
PUBLISHEDIN { Proc. NIPS 2009 Symposium on Assistive Machine Learning for People with Disabilities, 2009. }
ABSTRACT
{
It is estimated that more than 2 million people in the United States have significant communication
impairments that result in them relying on methods other than natural speech alone for communication.
One type of commonly used augmentative and alternative communication system is
pictorial communication software such as SymWriter, which uses a lookup table to transliterate
each word (or common phrase) in a sentence into an icon. This is an example of converting information
between modalities. However, the resulting sequence of icons can be difficult to understand.
We have been developing general-purpose Text-to-Picture (TTP) synthesis algorithms to
improve understandability using machine learning techniques. Our goal is to help users with special
needs, such as the elderly or those with disabilities, to rapidly browse documents through pictorial
summaries. Our TTP system targets general English. This differs from other pictorial
conversion systems that require hand-crafted narrative descriptions of a scene, 3D models,
or special domains. Instead, we use a concatenative or collage approach and present
how machine learning enables the key components of our TTP system.
}
BIBTEX
{
@inproceedings{ goldberg.2009.nips,
author = "Goldberg, A. B. and Rosin, J. and Zhu, X. and Dyer, C. R.",
title = "Toward text-to-picture synthesis",
booktitle = "Proc. NIPS 2009 Symposium on Assistive Machine Learning for People with Disabilities",
year = 2009 }
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2009.iccv }
TITLE { A study on automatic age estimation using a large database }
AUTHORS { G-D. Guo, G. Mu, Y. Fu, C. R. Dyer and T. S. Huang }
PUBLISHEDIN { Proc. Int. Conf. Computer Vision, 2009. }
ABSTRACT
{
In this paper we study some problems related to human
age estimation using a large database. First, we study the
influence of gender on age estimation based on face representations
that combine biologically-inspired features with
manifold learning techniques. Second, we study age estimation
using smaller gender and age groups rather than on
all ages. Significant error reductions are observed in both
cases. Based on these results, we designed three frameworks
for automatic age estimation that exhibit high performance.
Unlike previous methods that require manual separation
of males and females prior to age estimation, our
work is the first to estimate age automatically on a large
database. Furthermore, a data fusion approach is proposed
using one of the frameworks, which gives an age estimation
error more than 40% smaller than previous methods.
}
BIBTEX
{
@inproceedings{ guo.2009.iccv,
author = "Guo, G-D. and Mu, G. and Fu, Y. and Dyer, C. R. and Huang, T. S.",
title = "A study on automatic age estimation using a large database",
booktitle = "Proc. Int. Conf. Computer Vision",
year = 2009 }
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2009.hci }
TITLE { Is gender recognition influenced by age? }
AUTHORS { G-D. Guo, C. R. Dyer, Y. Fu, and T. S. Huang }
PUBLISHEDIN { Proc. Int. Workshop on Human-Computer Interaction, 2009. }
ABSTRACT
{
Gender recognition is important for many applications
including human computer interaction (HCI). This paper
shows that gender recognition accuracy is affected significantly
by the age of the person. Our empirical studies
on a large face database of 8,000 images with ages from
0 to 93 years show that gender classification accuracy on
adult faces can be 10% higher than that on young or senior
faces, evaluated using one of the state-of-the-art methods.
We examine aging effects on human faces, which motivates
us to investigate which features can incorporate shape and
texture variations on faces together with gender encoding.
Based on the aging effects, the local binary pattern (LBP)
and histograms of oriented gradients (HOG) methods are
evaluated for gender characterization with age variation.
We also investigate a biologically-inspired method for gender
recognition. Overall, no matter what methods are used,
the accuracies on adult faces are consistently higher than
on young or senior faces. This new finding suggests new
efforts in both psychological studies and computational visual
recognition for the purpose of HCI applications.
}
BIBTEX
{
@inproceedings{ guo.2009.hci,
author = "Guo, G-D. and Dyer, C. R. and Fu, Y. and Huang, T. S.",
title = "Is gender recognition influenced by age?",
booktitle = "Proc. Int. Workshop on Human-Computer Interaction",
year = 2009 }
}
NOPS {}
--------------------------------------------------------------
KEY { singh.2009.cvpr }
TITLE { Half-Integrality based Algorithms for Cosegmentation of Images }
AUTHORS { L. Mukherjee, V. Singh and C. R. Dyer }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., 2009. }
ABSTRACT
{
We study the cosegmentation problem where the objective
is to segment the same object (i.e., region) from a pair
of images. The segmentation for each image can be cast
using a partitioning/segmentation function with an additional
constraint that seeks to make the histograms of the
segmented regions (based on intensity and texture features)
similar. Using Markov Random Field (MRF) energy terms
for the simultaneous segmentation of the images together
with histogram consistency requirements using the squared
L2 (rather than L1) distance, after linearization and adjustments,
yields an optimization model with some interesting
combinatorial properties. We discuss these properties
which are closely related to certain relaxation strategies recently
introduced in computer vision. Finally, we show experimental
results of the proposed approach.
}
BIBTEX
{
@inproceedings{ Singh:2009:cvpr,
author = "L. Mukherjee and V. Singh and C. R. Dyer",
title = "Half-Integrality based Algorithms for Cosegmentation of Images",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
year = 2009 }
}
NOPS {}
--------------------------------------------------------------
KEY { arora.2009.icassp }
TITLE { Estimating Correspondence between Multiple Cameras using Joint Invariants }
AUTHORS { R. Arora, Y. H. Hu and C. R. Dyer }
PUBLISHEDIN { Proc. Int. Conf. Acoustics, Speech, and Signal Processing, 2009. }
ABSTRACT
{
The joint invariants of the projective group PSL(3,R) on
RP2, the five-point volume cross-ratios, are studied to address
the problem of correspondence in a camera network.
The distribution of cross-ratios over the unit square as well as
in a small local-neighborhood of a reference point are found
to have a heavy tail. No cross ratio value is unique but the
collection of five point cross ratios generated by taking all
possible combinations of five points completely prescribes the
curve. Sections of the signature submanifold that admit large
enough variation of cross ratios are found to be sufficient in
providing correspondence across wide perspectives. Such invariant
signatures may be collected independently at cameras
with different viewpoints and shared, thereby achieving the
registration of objects in the image. Experimental results with
a license plate database are provided.
}
BIBTEX
{
@inproceedings{ Arora:2009:icassp,
author = "R. Arora and Y. H. Hu and C. R. Dyer",
title = "Estimating Correspondence between Multiple Cameras using Joint Invariants",
booktitle = "Proc. Int. Conf. Acoustics, Speech, and Signal Processing",
year = 2009 }
}
NOPS {}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2008 }
--------------------------------------------------------------
KEY { guo.2008.icpr }
TITLE { Head Pose Estimation: Classification or Regression? }
AUTHORS { G-D. Guo, Y. Fu, C. R. Dyer and T. S. Huang }
PUBLISHEDIN { Proc. 19th Int. Conf. Pattern Recognition, 2008. }
ABSTRACT
{
Head pose estimation has many useful applications
in practice. How to estimate the head pose automatically
and robustly is still a challenging problem. In
pose estimation, different pose angles can be used as
regression values or viewed as different class labels.
Thus a question is raised in our study: which is proper
for pose estimation -- classification or regression? We
investigate representative classification and regression
methods on the same problem to see whether there is any difference. A
method that combines regression and classification approaches
is also examined. Preliminary experiments
show some interesting results which might prompt further
exploration of related issues in pose estimation.
}
BIBTEX
{
@inproceedings{ Guo:2008:icpr,
author = "G-D. Guo and Y. Fu and C. R. Dyer and T. S. Huang",
title = "Head Pose Estimation: Classification or Regression?",
booktitle = "Proc. 19th Int. Conf. Pattern Recognition",
year = 2008 }
}
NOPS {}
--------------------------------------------------------------
KEY { lai.2008.isvc }
TITLE { Efficient Schemes for Monte Carlo Markov Chain Algorithms in Global Illumination }
AUTHORS { Y.-C. Lai, F. Liu, L. Zhang, and C. R. Dyer }
PUBLISHEDIN { Proc. 4th International Symposium on Visual Computing, 2008. }
ABSTRACT
{
Current MCMC algorithms are limited from achieving high
rendering efficiency due to possibly high failure rates in caustics perturbations
and stratified exploration of the image plane. In this paper we
improve the MCMC approach significantly by introducing new lens perturbation
and new path-generation methods. The new lens perturbation
method simplifies the computation and control of caustics perturbation
and can increase the perturbation success rate. The new path-generation
methods aim to concentrate more computation on "high perceptual variance"
regions and "hard-to-find-but-important" paths. We implement
these schemes in the Population Monte Carlo Energy Redistribution
framework to demonstrate the effectiveness of these improvements. In
addition, we discuss how to add these new schemes into the Energy
Redistribution Path Tracing and Metropolis Light Transport algorithms.
Our results show that rendering efficiency is improved with these new
schemes.
}
BIBTEX
{
@inproceedings{ Lai:2008:isvc,
author = "Y.-C. Lai and F. Liu and L. Zhang and C. R. Dyer",
title = "Efficient Schemes for Monte Carlo Markov Chain Algorithms in Global Illumination",
booktitle = "Proc. 4th International Symposium on Visual Computing",
year = 2008 }
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2008.tip }
TITLE { Image-Based Human Age Estimation by Manifold Learning and Locally Adjusted Robust Regression }
AUTHORS { G-D. Guo, Y. Fu, C. R. Dyer and T. S. Huang }
PUBLISHEDIN { IEEE Trans. Image Processing 17(7), 2008, 1178-1188. }
ABSTRACT
{
Estimating human age automatically via facial image
analysis has lots of potential real-world applications, such as
human computer interaction and multimedia communication.
However, it is still a challenging problem for existing computer
vision systems to automatically and effectively estimate human
ages. The aging process is determined not only by a person's
genes, but also by many external factors, such as health, living style,
living location, and weather conditions. Males and females may
also age differently. The current age estimation performance is
still not good enough for practical use and more effort has to be
put into this research direction. In this paper, we introduce the
age manifold learning scheme for extracting face aging features
and design a locally adjusted robust regressor for learning and
prediction of human ages. The novel approach improves the age
estimation accuracy significantly over all previous methods. The
merit of the proposed approaches for image-based age estimation
is shown by extensive experiments on a large internal age database
and the publicly available FG-NET database.
}
BIBTEX
{
@article{ Guo:2008:tip,
author = "G-D. Guo and Y. Fu and C. R. Dyer and T. S. Huang",
title = "Image-Based Human Age Estimation by Manifold Learning and Locally Adjusted Robust Regression",
journal = "IEEE Trans. Image Processing",
volume = 17,
number = 7,
pages = {1178--1188},
year = 2008 }
}
NOPS {}
--------------------------------------------------------------
KEY { goldberg.2008.conll }
TITLE { Easy as ABC? Facilitating Pictorial Communication via Semantically Enhanced Layout}
AUTHORS { A. B. Goldberg, X. Zhu, C. R. Dyer, M. Eldawy and L. Heng }
PUBLISHEDIN { Proc. 12th Conf. Computational Natural Language Learning, 2008 }
ABSTRACT
{
Pictorial communication systems convert
natural language text into pictures to assist
people with limited literacy. We define
a novel and challenging problem: picture
layout optimization. Given an input sentence,
we seek the optimal way to lay out
word icons such that the resulting picture
best conveys the meaning of the input sentence.
To this end, we propose a family
of intuitive "ABC" layouts, which organize
icons in three groups. We formalize layout
optimization as a sequence labeling problem,
employing conditional random fields
as our machine learning method. Enabled
by novel applications of semantic role labeling
and syntactic parsing, our trained
model makes layout predictions that agree
well with human annotators. In addition,
we conduct a user study to compare our
ABC layout versus the standard linear layout.
The study shows that our semantically
enhanced layout is preferred by non-native
speakers, suggesting it has the potential to
be useful for people with other forms of
limited literacy, too.
}
BIBTEX
{
@inproceedings{ Goldberg:2008:conll,
author = "A. B. Goldberg and X. Zhu and C. R. Dyer and M. Eldawy and L. Heng",
title = "Easy as ABC? Facilitating Pictorial Communication via Semantically Enhanced Layout",
booktitle = "Proc. 12th Conf. Computational Natural Language Learning",
year = 2008
}
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2008.slam }
TITLE { A Probabilistic Fusion Approach to Human Age Prediction }
AUTHORS { G-D. Guo, Y. Fu, C. R. Dyer and T. S. Huang }
PUBLISHEDIN { Proc. 3rd International Workshop on Semantic Learning and Applications in Multimedia, 2008 }
ABSTRACT
{
Human age prediction is useful for many applications.
The age information could be used as a kind of semantic
knowledge for multimedia content analysis and understanding.
In this paper we propose a Probabilistic Fusion Approach
(PFA) that produces a high performance estimator
for human age prediction. The PFA framework fuses a regressor
and a classifier. We derive the predictor based on
Bayes’ rule without the mutual independence assumption
that is very common for traditional classifier combination
methods. Using a sequential fusion strategy, the predictor
reduces age estimation errors significantly. Experiments on
the large UIUC-IFP-Y aging database and the FG-NET aging
database show the merit of the proposed approach to
human age prediction.
}
BIBTEX
{
@inproceedings{ Guo:2008:slam,
author = "G-D. Guo and Y. Fu and C. R. Dyer and T. S. Huang",
title = "A Probabilistic Fusion Approach to Human Age Prediction",
booktitle = "Proc. 3rd International Workshop on Semantic Learning and Applications in Multimedia",
year = 2008
}
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2008.wacv }
TITLE { Locally Adjusted Robust Regression for Human Age Estimation }
AUTHORS { G-D. Guo, Y. Fu, T. S. Huang and C. R. Dyer }
PUBLISHEDIN { Proc. Workshop on Application of Computer Vision, 2008 }
ABSTRACT
{
Automatic human age estimation has considerable potential
applications in human computer interaction and
multimedia communication. However, the age estimation
problem is challenging. We design a locally adjusted robust
regressor (LARR) for learning and prediction of human
ages. The novel approach reduces the age estimation errors
significantly over all previous methods. Experiments on two
aging databases show the success of the proposed method
for human age estimation.
}
BIBTEX
{
@inproceedings{ Guo:2008:wacv,
author = "G-D. Guo and Y. Fu and T. S. Huang and C. R. Dyer",
title = "Locally Adjusted Robust Regression for Human Age Estimation",
booktitle = "Proc. Workshop on Application of Computer Vision",
year = 2008
}
}
NOPS {}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2007 }
--------------------------------------------------------------
KEY { guo.2007.cvprip }
TITLE { Face Cyclographs for Recognition }
AUTHORS { G-D. Guo and C. R. Dyer }
PUBLISHEDIN { Proc. 10th Joint Conference on Information Sciences, 2007, 923-929 }
ABSTRACT
{
A new representation of faces, called face cyclographs, is introduced for face
recognition that incorporates all views of a rotating face into a single image.
The main motivation for this representation comes from recent psychophysical
studies that show that humans use continuous image sequences in object
recognition. Face cyclographs are created by slicing spatiotemporal face
volumes that are constructed automatically based on real-time face detection. This
representation is a compact, multiperspective, spatiotemporal description. To
use face cyclographs for recognition, a dynamic programming based algorithm
is developed. The motion trajectory image of the eye slice is used to analyze
the approximate single-axis motion and normalize the face cyclographs. Using
normalized face cyclographs can speed up the matching process. Experimental
results on more than 100 face videos show that this representation efficiently
encodes the continuous views of faces.
}
BIBTEX
{
@inproceedings{ Guo:2007:cvprip,
author = "G-D. Guo and C. R. Dyer",
title = "Face Cyclographs for Recognition",
booktitle = "Proc. 10th Joint Conference on Information Sciences",
year = 2007,
pages = {923--929}
}
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2007.patches }
TITLE { Patch-based Image Correlation with Rapid Filtering }
AUTHORS { G-D. Guo and C. R. Dyer }
PUBLISHEDIN { Proc. 2nd Beyond Patches Workshop, 2007 }
ABSTRACT
{
This paper describes a patch-based approach for rapid
image correlation or template matching. By representing a
template image with an ensemble of patches, the method is
robust with respect to variations such as local appearance
variation, partial occlusion, and scale changes. Rectangle
filters are applied to each image patch for fast filtering
based on the integral image representation. A new method
is developed for feature dimension reduction by detecting
the "salient" image structures given a single image. Experiments
on a variety of images show the success of the method
in dealing with different variations in the test images. In
terms of computation time, the approach is faster than traditional
methods by up to two orders of magnitude and is at
least three times faster than a fast implementation of normalized
cross correlation.
}
BIBTEX
{
@inproceedings{ Guo:2007:patches,
author = "G-D. Guo and C. R. Dyer",
title = "Patch-based Image Correlation with Rapid Filtering",
booktitle = "Proc. 2nd Beyond Patches Workshop",
year = 2007
}
}
NOPS {}
--------------------------------------------------------------
KEY { zhu.2007.aaai }
TITLE { A Text-to-Picture Synthesis System for Augmenting Communication }
AUTHORS { X. Zhu, A. B. Goldberg, M. Eldawy, C. R. Dyer and B. Strock }
PUBLISHEDIN { Proc. 22nd AAAI Conf. on Artificial Intelligence, 2007, 1590-1595}
ABSTRACT
{
We present a novel Text-to-Picture system that synthesizes
a picture from general, unrestricted natural language
text. The process is analogous to Text-to-Speech
synthesis, but with pictorial output that conveys the gist
of the text. Our system integrates multiple AI components,
including natural language processing, computer
vision, computer graphics, and machine learning. We
present an integration framework that combines these
components by first identifying informative and "picturable"
text units, then searching for the most likely
image parts conditioned on the text, and finally optimizing
the picture layout conditioned on both the text and
image parts. The effectiveness of our system is assessed
in two user studies using children's books and news articles.
Experiments show that the synthesized pictures
convey as much information about children's stories as
the original artists' illustrations, and much more information
about news articles than their original photos
alone. These results suggest that Text-to-Picture synthesis
has great potential in augmenting human-computer
and human-human communication modalities, with applications
in education and health care, among others.
}
BIBTEX
{
@inproceedings{ Zhu:2007:aaai,
author = "X. Zhu and A. B. Goldberg and M. Eldawy and C. R. Dyer and B. Strock",
title = "A Text-to-Picture Synthesis System for Augmenting Communication",
booktitle = "Proc. 22nd AAAI Conf. on Artificial Intelligence",
year = 2007,
pages = {1590--1595}
}
}
NOPS {}
--------------------------------------------------------------
KEY { lai.2007.egsr }
TITLE { Photorealistic Image Rendering with Population Monte Carlo Energy Redistribution }
AUTHORS { Y.-C. Lai, S. Fan, S. Chenney, and C. R. Dyer }
PUBLISHEDIN { Proc. Eurographics Symposium on Rendering, 2007, 287-296}
ABSTRACT
{
This work presents a novel global illumination algorithm that concentrates computation on
important light transport paths and automatically adjusts the area over which energy is
distributed for each light transport path. We adapt the statistical framework of Population
Monte Carlo to global illumination to improve rendering efficiency. Information collected in
previous iterations is used to guide subsequent iterations by adapting the kernel function to
approximate the target distribution without introducing bias into the final result. Based on
this framework, our algorithm automatically adapts the amount of energy redistribution at
different pixels and the area over which energy is redistributed. Our results show that
efficiency can be improved by exploiting the correlated information among light transport
paths.
}
BIBTEX
{
@inproceedings{ Lai:2007:egsr,
author = "Y.-C. Lai and S. Fan and S. Chenney and C. R. Dyer",
title = "Photorealistic Image Rendering with Population Monte Carlo Energy Redistribution",
booktitle = "Proc. Eurographics Symposium on Rendering",
year = 2007,
pages = {287--296}
}
}
NOPS {}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2006 }
--------------------------------------------------------------
KEY { liu.2006.umb }
TITLE { Segmentation of Elastographic Images using a Coarse-to-Fine Active Contour Model }
AUTHORS { W. Liu, J. A. Zagzebski, T. Varghese, C. R. Dyer, U. Techavipoo and T. J. Hall }
PUBLISHEDIN { Ultrasound in Medicine and Biology **32**(3), 2006, 397-408. }
ABSTRACT
{
Delineation of radiofrequency-ablation-induced coagulation (thermal lesion) boundaries is an
important clinical problem that is not well addressed by conventional imaging modalities.
Elastography, which produces images of the local strain after small, externally applied compressions,
can be used for visualization of thermal coagulations. This paper presents an automated
segmentation approach for thermal coagulations on 3-D elastographic data to obtain both area
and volume information rapidly. The approach consists of a coarse-to-fine method for active
contour initialization and a gradient vector flow, active contour model for deformable contour
optimization with the help of prior knowledge of the geometry of general thermal coagulations.
The performance of the algorithm has been shown to be comparable to manual delineation of
coagulations on elastograms by medical physicists (r = 0.99 for volumes of 36 radiofrequency-induced
coagulations). Furthermore, the automatic algorithm applied to elastograms yielded results that
agreed with manual delineation of coagulations on pathology images (r = 0.96 for the same 36 lesions).
This algorithm has also been successfully applied on in vivo elastograms.
}
BIBTEX
{
@article{ Liu:2006:umb,
author = "W. Liu and J. A. Zagzebski and T. Varghese and C. R. Dyer and U. Techavipoo and T. J. Hall",
title = "Segmentation of Elastographic Images using a Coarse-to-Fine Active Contour Model",
journal = "Ultrasound in Medicine and Biology",
volume = 32,
number = 3,
pages = {397--408},
year = 2006 }
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2006.thesis }
TITLE { Face, Expression, and Iris Recognition Using Learning-based Approaches }
AUTHORS { Guodong Guo }
PUBLISHEDIN
{
Ph.D. Dissertation, University of Wisconsin - Madison, August 2006.
}
ABSTRACT
{
This thesis investigates the problem of facial image analysis.
Human faces contain a lot of information that is useful for many
applications. For instance, the face and iris are important
biometric features for security applications. Facial activity
analysis such as face expression recognition is helpful for
perceptual user interfaces. Developing new methods to improve
recognition performance is a major concern in this thesis.

In approaching the recognition problem of facial image analysis, the key idea is to use learning-based methods whenever possible. For face recognition, we propose a face cyclograph representation to encode continuous views of faces, motivated by psychophysical studies on human object recognition. For face expression recognition, we apply a machine learning technique to solve the feature selection and classifier training problems simultaneously, even in the small sample case.

Iris recognition has high recognition accuracy among biometric features; however, there are
still some issues to address to make more practical use of the iris. One major problem is how
to capture iris images automatically without user interaction, i.e., not asking users to
adjust their eye positions. Towards this goal, a two-camera system consisting of a face
camera and an iris camera is designed and implemented based on facial landmark detection.
Another problem is iris localization. A new type of feature based on texture difference is
incorporated into an objective function in addition to image gradient. By minimizing the
objective function, the iris localization performance can be improved significantly. Finally,
a method is proposed for iris encoding using a set of specially designed filters. These
filters can take advantage of efficient integral image computation methods so that the
filtering process is fast no matter how big the filters are.
}
BIBTEX
{
@phdthesis{ Guo:2006:phd,
author = "Guodong Guo",
title = "Face, Expression, and Iris Recognition Using Learning-based Approaches",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 2006 }
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2006.tr1555 }
TITLE { Face Cyclographs for Recognition }
AUTHORS { G-D. Guo and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1555,
University of Wisconsin - Madison, March 2006.
}
ABSTRACT
{
A new representation of faces, called face cyclographs, is introduced for face recognition
that incorporates all views of a rotating face into a single image. The main motivation for
this representation comes from recent psychophysical studies that show that humans use
continuous image sequences in object recognition. Face cyclographs are created by slicing
spatiotemporal face volumes that are constructed automatically based on real-time face
detection. This representation is a compact, multiperspective, spatiotemporal description.
To use face cyclographs for recognition, a dynamic programming based algorithm is developed.
The motion trajectory image of the eye slice is used to analyze the approximate single-axis
motion and normalize the face cyclographs. Using normalized face cyclographs can speed up
the matching process. Experimental results on more than 100 face videos show that this
representation efficiently encodes the continuous views of faces and improves face
recognition performance over view-based methods.
}
BIBTEX
{
@techreport{ Guo:2006:tr1555,
author = "Guodong Guo and Charles R. Dyer",
title = "Face Cyclographs for Recognition",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = 1555,
year = 2006 }
}
NOPS {}
--------------------------------------------------------------
KEY { fan.2006.thesis }
TITLE { Sequential Monte Carlo Methods for Physically Based Rendering }
AUTHORS { Shaohua Fan }
PUBLISHEDIN
{
Ph.D. Dissertation, University of Wisconsin - Madison, August 2006.
}
ABSTRACT
{
The goal of global illumination is to generate photo-realistic images by taking into account
all the light interactions in the scene. It does so by simulating light transport behaviors
based on physical principles. The main challenge of global illumination is that simulating
the complex light interreflections is very expensive. In this dissertation, a novel
statistical framework for physically based rendering in computer graphics is presented based
on sequential Monte Carlo (SMC) methods. This framework can substantially improve the
efficiency of physically based rendering by adapting and reusing the light path samples
without introducing bias. Applications of the framework to a variety of problems in global
illumination are demonstrated.

For the task of photo-realistic rendering, only light paths that reach the image plane are
important because only those paths contribute to the final image. A visual importance-driven
algorithm is proposed to generate visually important paths. The photons along those paths
are also cached in photon maps for further reuse. To handle difficult paths in the path space,
a technique is presented for including user-selected paths in the sampling process.
Then, a more general statistical method for light path sample adaptation and reuse is
studied in the context of sequential Monte Carlo. Based on the population Monte Carlo method,
an unbiased adaptive sampling method is presented that works on a population of samples.
The samples are sampled and resampled through distributions that are modified over time.
Information found at one iteration can be used to guide subsequent iterations without
introducing bias in the final result. After obtaining samples from multiple distributions,
an optimal control variate algorithm is developed that allows samples from multiple
distribution functions to be combined optimally.
}
BIBTEX
{
@phdthesis{ Fan:2006:phd,
author = "Shaohua Fan",
title = "Sequential Monte Carlo Methods for Physically Based Rendering",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 2006 }
}
NOPS {}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2005 }
--------------------------------------------------------------
KEY { guo.2005.smc }
TITLE { Learning from Examples in the Small Sample Case: Face Expression Recognition }
AUTHORS { G-D. Guo and C. R. Dyer }
PUBLISHEDIN { IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics **35**(3), 2005, 479-488. }
ABSTRACT
{
Example-based learning for computer vision can be difficult when a large number
of examples to represent each pattern or object class is not available.
In such situations, learning from a small number of samples is of practical value.
To study this issue, the task of face expression recognition with a small number
of training images of each expression is considered. A new technique based on
linear programming for both feature selection and classifier training is introduced.
A pairwise framework for feature selection, instead of using all classes
simultaneously, is presented. Experimental results compare the method with three
others: a simplified Bayes classifier, support vector machine, and AdaBoost.
Finally, each algorithm is analyzed and a new categorization of these algorithms
is given, especially for learning from examples in the small sample case.
}
BIBTEX
{
@article{ Guo:2005:smc,
author = "Guodong Guo and Charles R. Dyer",
title = "Learning from Examples in the Small Sample Case: Face Expression Recognition",
journal = "IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics",
volume = 35,
number = 3,
pages = {479--488},
year = 2005 }
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2005.cvpr }
TITLE { Linear Combination Representation for Outlier Detection in Motion Tracking }
AUTHORS { G-D. Guo, C. R. Dyer, and Z. Zhang }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., Vol. 2, 2005, 274-281. }
ABSTRACT
{
In this paper we show that Ullman and Basri’s linear
combination (LC) representation, which was originally proposed
for alignment-based object recognition, can be used
for outlier detection in motion tracking with an affine camera.
For this task LC can be realized either on image frames
or feature trajectories, and therefore two methods are developed
which we call linear combination of frames and
linear combination of trajectories. For robust estimation
of the linear combination coefficients, the support vector
regression (SVR) algorithm is used and compared with the
RANSAC method. SVR based on quadratic programming
optimization can efficiently deal with more than 50 percent
outliers and delivers more consistent results than RANSAC
in our experiments. The linear combination representation
can use SVR in a straightforward manner while previous
factorization-based or subspace separation methods cannot.
Experimental results are presented using real video
sequences to demonstrate the effectiveness of our LC + SVR
approaches, including a quantitative comparison of SVR
and RANSAC.
}
BIBTEX
{
@inproceedings{ Guo:2005:cvpr,
author = "Guodong Guo and Charles R. Dyer and Zhengyou Zhang",
title = "Linear Combination Representation for Outlier Detection in Motion Tracking",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
volume = 2,
pages = {274--281},
year = 2005 }
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2005.merl-tr2005-044 }
TITLE { A System for Automatic Iris Capturing }
AUTHORS { G-D. Guo, M. Jones and P. Beardsley }
PUBLISHEDIN
{
Technical Report TR2005-044
Mitsubishi Electric Research Laboratories
Cambridge, Massachusetts, June 2005.
}
ABSTRACT
{
Biometrics is increasingly important in security applications. Iris recognition provides
the greatest accuracy among known biometrics. The accuracy of iris recognition is, for example,
much greater than face recognition and fingerprint recognition. However, it is not trivial to
capture iris images in practice, and usually the users need to adjust their eye positions for
iris image acquisition (e.g., the classical Daugman's and Wildes systems). This paper describes
a new system to capture iris images automatically without user interaction. It works at a
distance of over l meter to the users. Experimental results demonstrate the performance of the system.
}
BIBTEX
{
@techreport{ Guo:2005:tr2005-044,
author = "Guodong Guo and Michael Jones and Paul Beardsley",
title = "A System for Automatic Iris Capturing",
institution = "Mitsubishi Electric Research Laboratories, Cambridge, Mass.",
number = "TR2005-044",
month = jun,
year = 2005 }
}
NOPS {}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2004 }
--------------------------------------------------------------
KEY { liu.2004.ultrasonics }
TITLE { Automated Thermal Coagulation Segmentation of Three-Dimensional Elastographic Imaging using an Active Contour Model }
AUTHORS { W. Liu, J. A. Zagzebski, T. Varghese, C. R. Dyer, and U. Techavipoo }
PUBLISHEDIN { Proc. IEEE Ultrasonics Symposium, 2004, 36-39. }
ABSTRACT
{
Delineation of RF-ablator induced coagulation
(thermal lesion) boundaries is an important clinical problem not
well addressed by conventional imaging modalities. Automation
of this process is certainly desirable. Elastography that estimates
and images the local strain corresponding to small, externally
applied, quasi-static compressions can be used for visualization of
thermal coagulations. Several studies have demonstrated that
coagulation volumes computed from multiple planar slices
through the region of interest are more accurate than volumes
estimated assuming simple shapes and incorporating single or
orthogonal diameter estimates. This paper presents an
automated segmentation approach for thermal coagulations on
three-dimensional elastographic data to obtain both area and
volume information. This approach consists of a coarse-to-fine
method for active contour initialization and a gradient vector
flow active contour model for deformable contour optimization
with the help of prior knowledge of the geometry of general
thermal coagulations. The performance of the proposed
algorithm is shown to be comparable to manual delineation by
medical physicists (r = 0.99 for 36 RF-induced coagulations). The
correlation coefficient of the coagulation volume between autosegmented
elastography and manually-delineated pathology is
0.96.
}
BIBTEX
{
@inproceedings{ Liu:2004:ultrasonics,
author = "Wu Liu and J. A. Zagzebski and T. Varghese and C. R. Dyer and U. Techavipoo",
title = "Automated Thermal Coagulation Segmentation of Three-Dimensional Elastographic Imaging using an Active Contour Model",
booktitle = "Proc. IEEE Ultrasonics Symposium",
pages = {36--39},
year = 2004 }
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2004.tr1501 }
TITLE { Recognizing Faces from Head Rotation }
AUTHORS { G-D. Guo and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1501,
University of Wisconsin - Madison, May 2004.
}
ABSTRACT
{
A new approach for recognizing human faces is presented that uses video sequences of natural,
uncontrolled head rotations to capture face motion and dynamic appearance characteristics.
Unlike traditional methods for face recognition that utilize one or a few static views,
video is used for both training the face recognition system and for recognizing test faces.
An algorithm is described that takes an uncalibrated video sequence and extracts the angular
rotation of the head in each frame relative to the initial frame. A cropped window of the moving
face is also computed, providing a dynamic appearance representation of the face together with
the head motion description. Face recognition accuracy using this representation of rotating
faces is shown for a small face video database, demonstrating the promise of the method.
}
BIBTEX
{
@techreport{ Guo:2004:tr1501,
author = "Guodong Guo and Charles R. Dyer",
title = "Recognizing Faces from Head Rotation",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = 1501,
year = 2004 }
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2004.tr1502 }
TITLE { Spatial Resolution Enhancement of Video Using Still Images }
AUTHORS { G-D. Guo and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1502,
University of Wisconsin - Madison, October 2004.
}
ABSTRACT
{
Images captured by digital video cameras usually have lower spatial resolution than digital
still cameras. This paper addresses the problem of combining images from digital still cameras
and video cameras to generate a video sequence with higher resolution than the original video.
A method is presented for accomplishing this goal and experimental results are shown that
demonstrate its effectiveness.
}
BIBTEX
{
@techreport{ Guo:2004:tr1502,
author = "Guodong Guo and Charles R. Dyer",
title = "Spatial Resolution Enhancement of Video Using Still Images",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = 1502,
year = 2004 }
}
NOPS {}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2003 }
--------------------------------------------------------------
KEY { fan.2003.miccai }
TITLE { An Automatic System for Classification of Nuclear Sclerosis from Slit-Lamp Photographs }
AUTHORS { S. Fan, C. R. Dyer, L. Hubbard, and B. Klein }
PUBLISHEDIN { Proc. 6th Int. Conf. on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2003), (Lecture Notes in Computer Science, Vol. 2878),
R. Ellis and T. Peters, eds., Springer, Berlin, 2003, 592-601. }
ABSTRACT
{
A robust and automatic system has been developed to detect the visual
axis and extract important feature landmarks from slit-lamp photographs,
and objectively grade the severity of nuclear sclerosis based on the
intensities of those landmarks. Using linear regression, we first
select the features that play important roles in classification, and
then fit a linear grading function. We evaluated the grading function
using human grades as error bounds for "ground truth" grades, and
compared the machine grades with the human grades. As expected, the
automatic system significantly speeds up the process of grading, and
grades computed are consistent and reproducible. Machine grading time
for one image is less than 2 seconds on a Pentium III 996MHz machine
while human grading takes about 2 minutes. Statistical results show
that the predicted grades by the system are very reliable. For the
testing set of 141 images, with correct grading defined by a tolerance
of one grade level difference from the human grade, the automated system
has a grading accuracy of 95.8% based on the AREDS grading scale.
}
BIBTEX
{
@incollection{ Fan:2003:miccai,
author = "Shaohua Fan and Charles R. Dyer and Larry Hubbard and Barbara Klein",
title = "An Automatic System for Classification of Nuclear Sclerosis from Slit-Lamp Photographs",
booktitle = "Proc. 6th Int. Conf. on Medical Image Computing and Computer-Assisted Intervention",
series = "Lecture Notes in Computer Science, Vol. 2878",
editor = "R. Ellis and T. Peters",
publisher = "Springer",
address = "Berlin",
pages = {592--601},
year = 2003 }
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2003.cvpr }
TITLE { Simultaneous Feature Selection and Classifier Training via Linear Programming: A Case Study for Face Expression Recognition }
AUTHORS { G-D. Guo and C. R. Dyer }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., Vol. I, 2003, 346-352. }
ABSTRACT
{
A linear programming technique is introduced that jointly performs
feature selection and classifier training so that a subset of features
is optimally selected together with the classifier. Because
traditional classification methods in computer vision have used a
two-step approach: feature selection followed by classifier training,
feature selection has often been ad hoc, using heuristics or requiring
a time-consuming forward and backward search process. Moreover, it is
difficult to determine which features to use and how many features to
use when these two steps are separated. The linear programming
technique used in this paper, which we call feature selection via
linear programming (FSLP), can determine the number of features and
which features to use in the resulting classification function based
on recent results in optimization. We analyze why FSLP can avoid the
curse of dimensionality problem based on margin analysis. As one
demonstration of the performance of this FSLP technique for computer
vision tasks, we apply it to the problem of face expression
recognition. Recognition accuracy is compared with results using
Support Vector Machines, the AdaBoost algorithm, and a Bayes
classifier.
}
BIBTEX
{
@inproceedings{ Guo:2003:cvpr,
author = "Guodong Guo and Charles R. Dyer",
title = "Simultaneous Feature Selection and Classifier Training via Linear Programming: A Case Study for Face Expression Recognition",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
volume = "I",
pages = {346--352},
year = 2003 }
}
NOPS {}
--------------------------------------------------------------
KEY { fan.2003.tr1495 }
TITLE { Quantification and Correction of Iris Color }
AUTHORS { S. Fan and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1495,
University of Wisconsin - Madison, December 2003.
}
ABSTRACT
{
A system has been developed that automatically extracts the iris region from photographs,
computes the iris color in the CIE u′v′ diagram color space, and corrects the color based on
a standard calibration target. This system has advantages over previous manual methods
in that it (1) is much less time-consuming because it is fully automatic,
(2) corrects for the variability inherent in a film-based photographic process,
(3) allows for greater objectivity and reproducibility, and (4) uses a mathematical
foundation that enables quantification and statistical analysis.
Preliminary experimental results show that color correction significantly
improves grading accuracy.
}
BIBTEX
{
@techreport{ Fan:2003:tr1495,
author = "Shaohua Fan and Charles R. Dyer",
title = "Quantification and Correction of Iris Color",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = 1495,
year = 2003 }
}
NOPS {}
--------------------------------------------------------------
KEY { manning.2003.thesis }
TITLE { Screw-Transform Manifolds for Camera Self Calibration }
AUTHORS { R. A. Manning }
PUBLISHEDIN
{
Ph.D. Dissertation, University of Wisconsin - Madison, September 2003.
}
ABSTRACT
{
This dissertation concerns the mathematical theory of screw-transform
manifolds and their use in camera self calibration. A camera's
calibration is the function that maps 3D scene points to 2D image
points, e.g., in photographs taken by the camera. Between every two
photographs taken from different positions there exists a pairwise
constraint called the "fundamental matrix" which can be computed
directly from the images. When the two photographs are captured by
the same camera, the fundamental matrix induces a surface in
calibration space called a "screw-transform manifold." This
manifold represents every possible internal calibration for the
camera. By acquiring several different pairwise fundamental matrices,
it is possible to compute several different screw-transform manifolds;
however, the internal calibration of the camera must be a member of
each such manifold and hence, by finding the intersection point of all
the manifolds, the camera's calibration can be determined. The
process of determining calibration directly from images taken by a
camera is called "self calibration."

The contributions of this dissertation include the theory of screw-transform manifolds and three original algorithms for determining the mutual intersection points of a collection of manifolds. While many papers have been written on self calibration, almost all previous methods posed their solutions as the global minima of an objective function. However, performing global optimization is problematic; it is easy to locate a local minimum without finding the global minimum, and in some cases the attraction basin of the global minimum is so small that the algorithm must essentially "guess" the solution in order to find it. One of the new approaches created as part of this dissertation, called STM-SURFIT, avoids optimization altogether but nevertheless locates all global minima (i.e., solutions to the internal calibration problem) in a single pass, running in under 1 second on a modern home computer. This general approach to avoiding the difficulties of optimization may have wider applicability beyond camera calibration.

A tutorial on multiview geometry that assumes only knowledge of linear algebra is included
to provide the necessary mathematical background. The related history and previous work on
self calibration and image-based rendering is also presented. As part of the theory of
screw-transform manifolds, a theorem is introduced that partitions monocular view pairs into
six categories based on the underlying screw displacement of the camera and provides a
simple test for determining category. In addition, some methods for self calibration and
image-based rendering for dynamic scenes are presented. These latter image-based rendering
techniques do not require camera calibration but are limited in applicability, thus adding
to the growing body of evidence that camera calibration is necessary for useful image-based
rendering.
}
BIBTEX
{
@phdthesis{ Manning:2003:phd,
author = "Russell A. Manning",
title = "Screw-Transform Manifolds for Camera Self Calibration",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 2003 }
}
--------------------------------------------------------------
KEY { manning.2003.tr1490 }
TITLE { Research on Self Calibration Without Minimization }
AUTHORS { R. A. Manning and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1490,
University of Wisconsin - Madison, February 2003.
}
ABSTRACT
{
In this paper we present a new metric camera self-calibration algorithm that does not
require the global minimization of an error function and can produce all legal solutions to
the three-camera self-calibration problem in a single pass. By contrast, virtually all
previous self-calibration algorithms rely on nonlinear global optimization unless special
assumptions are made about the camera or its motion. The key drawback to
global-optimization-based methods is that, for nontrivial error functions, they can run
indefinitely. Therefore, because our new algorithm produces all solutions quickly and in a
fixed amount of time, it is arguably the fastest self-calibration algorithm in existence.
In addition, our algorithm makes it possible to determine experimentally the number of
solutions to the three-camera self-calibration problem; an upper bound of 21 was given by
Schaffalitzky, but our experiments show this number is more typically 1 or 2. Finally,
because our algorithm runs very quickly and requires only the theoretical minimum of three
camera views, it can be used in conjunction with RANSAC for great robustness to noise when
more than three views are available.
}
BIBTEX
{
@techreport{ Manning:2003:tr:b,
author = "Russell A. Manning and Charles R. Dyer",
title = "Research on Self Calibration Without Minimization",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = 1490,
year = 2003 }
}
--------------------------------------------------------------
KEY { manning.2003.tr1482 }
TITLE { On Screw-Transform Manifolds }
AUTHORS { R. A. Manning and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1482,
University of Wisconsin - Madison, April 2003.
}
ABSTRACT
{
This paper describes the mathematical theory of screw-transform manifolds and their use in
camera self calibration. When a camera with fixed internal parameters views a scene from
two different locations, the physical transformation that moves the camera from the first
location to the second location is equivalent to a screw transformation. The fundamental
matrix between the two views has a representation in terms of this screw transformation.
The same fundamental matrix can be generated by different cameras undergoing different
screw transformations. The set of all cameras that could generate a particular fundamental
matrix in this way is the screw-transform manifold for the fundamental matrix. The
screw-transform manifold can be generated directly from the fundamental matrix by varying
the parameters of the underlying screw transformation. When several fundamental matrices
are generated using the same camera, each screw-transform manifold arising from these
fundamental matrices must contain the camera. Hence, by finding the mutual intersection
point of all the manifolds, the original camera can be recovered; this forms a technique
for self calibration.

We describe two types of screw-transform manifolds:
Kruppa-constraint manifolds and modulus-constraint manifolds. The
first type can be generated directly from fundamental matrices, but
these are three-dimensional manifolds embedded in a five-dimensional
space, making them more difficult to use. The latter type are
simpler two-dimensional manifolds embedded in a three-dimensional
space, but their use in self calibration requires an initial
projective reconstruction of the cameras, which is not always
possible or desirable to attain. We also describe
three algorithms for finding the mutual intersection point of a set of
manifolds and provide extensive experimental results for the
performance of these algorithms.
}
BIBTEX
{
@techreport{ Manning:2003:tr:a,
author = "Russell A. Manning and Charles R. Dyer",
title = "On Screw-Transform Manifolds",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = 1482,
year = 2003 }
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2002 }
--------------------------------------------------------------
KEY { manning.2002.eccv }
TITLE { Stratified Self Calibration From Screw-Transform Manifolds }
AUTHORS { R. A. Manning and C. R. Dyer }
PUBLISHEDIN { Proc. European Conf. on Computer Vision, 2002, IV:131-145. }
ABSTRACT
{
This paper introduces a new, stratified approach for the metric self
calibration of a camera with fixed internal parameters. The method
works by intersecting *modulus-constraint manifolds*, which are a
specific type of screw-transform manifold. Through the addition of a
single scalar parameter, a 2-dimensional modulus-constraint manifold
can become a 3-dimensional Kruppa-constraint manifold allowing for
direct self calibration from disjoint pairs of views. In this way, we
demonstrate that screw-transform manifolds represent a single, unified
approach to performing both stratified and direct self calibration.
This paper also shows how to generate the screw-transform manifold
arising from turntable (i.e., pairwise-planar) motion and discusses
some important considerations for creating a working algorithm from
these ideas.
}
BIBTEX
{
@inproceedings{ Manning:2002:eccv,
author = "Russell A. Manning and Charles R. Dyer",
title = "Stratified Self Calibration From Screw-Transform Manifolds",
booktitle = "Proc. European Conf. on Computer Vision",
year = 2002,
volume = 4,
pages = {131--145} }
}
--------------------------------------------------------------
KEY { guo.2002.tr1446 }
TITLE { Markov Information Propagation for Texture Synthesis }
AUTHORS { G-D. Guo and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1446,
University of Wisconsin - Madison, October 2002.
}
ABSTRACT
{
A patch-based approach called Markov information propagation (MIP) is
introduced for texture synthesis. The method is fast because it uses
a simple horizontal and vertical patch filling process that preserves
local structural similarity between the input texture and the synthesized
texture. Results on both artificial and natural textures are presented
and compared with some earlier methods.
}
BIBTEX
{
@techreport{ Guo:2002:tr1446,
author = "Guodong Guo and Charles R. Dyer",
title = "Markov Information Propagation for Texture Synthesis",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = 1446,
year = 2002 }
}
NOPS {}
--------------------------------------------------------------
KEY { guo.2002.tr1447 }
TITLE { An Evaluation of Bayes and Large Margin Classifiers for Face Expression Recognition }
AUTHORS { G-D. Guo and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1447,
University of Wisconsin - Madison, October 2002.
}
ABSTRACT
{
In this paper we investigate three representative methods for face expression recognition.
The first is the Bayes decision approach, the classical algorithm for general pattern
recognition. The second is support vector machine (SVM) classification, and the third is
the AdaBoost method. Both SVMs and AdaBoost are considered large margin classifiers.
We evaluate these three methods for face expression recognition on a common database.
To solve the multi-class (7 expressions) recognition problem, we use a voting scheme and
a binary tree scheme. For the Bayes and AdaBoost methods, we use a pairwise framework
for both feature selection and discrimination in order to simplify the problem, and obtain
good results. In contrast, with SVMs we use all the features without selection. We compare
linear and non-linear SVMs to see whether non-linear mapping yields any improvement. We
also find that normalization worsens recognition performance for SVMs but has no
influence on the Bayes and AdaBoost methods.
}
BIBTEX
{
@techreport{ Guo:2002:tr1447,
author = "Guodong Guo and Charles R. Dyer",
title = "An Evaluation of Bayes and Large Margin Classifiers for Face Expression Recognition",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = 1447,
year = 2002 }
}
NOPS {}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2001 }
--------------------------------------------------------------
KEY { yu.2001.icip }
TITLE { Observer Motion Estimation and Control from Optical Flow }
AUTHORS { Liangyin Yu and C. R. Dyer }
PUBLISHEDIN { Proc. Int. Conf. Image Processing, 2001, 941-944. }
ABSTRACT
{
In this paper, the information conveyed by optical flow is
analytically linked to observer motion by decomposing the optical
flow field into its vector field components. It is shown that the
observer can recover its ego-motion by interpreting the decomposed
optical flow field, and can further use its mobility to actively
control the shape of the optical flow field, which directly reflects
the surface shape of the object. Information about surface geometry
discontinuities can be derived more directly from the optical flow
field by segmenting the whole field. A new method for this
segmentation is proposed here, which combines both the magnitude and
phase parts of the optical flow. The integration of these two kinds
of information proves effective in making various surface geometry
boundaries explicit.
}
BIBTEX
{
@inproceedings{ yu:icip01,
author = "Liangyin Yu and Charles R. Dyer",
title = "Observer Motion Estimation and Control from Optical Flow",
booktitle = "Proc. Int. Conf. Image Processing",
year = 2001,
pages = {941--944},
url = "ftp://ftp.cs.wisc.edu/computer-vision/repository/PDF/yu.2001.icip.pdf" }
}
--------------------------------------------------------------
KEY { manning.2001.cvpr }
TITLE { Metric Self Calibration From Screw-Transform Manifolds }
AUTHORS { R. A. Manning and C. R. Dyer }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., 2001, I:590-597. }
ABSTRACT
{
This paper introduces a method for metric self calibration that is
based on a novel decomposition of the fundamental matrix between two
views taken by a camera with fixed internal parameters. The method
blends important advantages of the Kruppa constraints and the modulus
constraint: it works directly from fundamental matrices and uses a
reduced-parameter representation for stability. General properties of
the new decomposition are also developed, including an intuitive
interpretation of the three free parameters of internal calibration.
The approach is demonstrated on both real and synthetic data.
}
BIBTEX
{
@inproceedings{ Manning:2001:cvpr,
author = "Russell A. Manning and Charles R. Dyer",
title = "Metric Self Calibration From Screw-Transform Manifolds",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
year = 2001,
volume = 1,
pages = {590--597} }
}
--------------------------------------------------------------
KEY { manning.2001.iccv }
TITLE { Affine Calibration from Moving Objects }
AUTHORS { R. A. Manning and C. R. Dyer }
PUBLISHEDIN { Proc. 8th Int. Conf. Computer Vision, 2001, I:494-500. }
ABSTRACT
{
This paper introduces a novel linear algorithm for determining the
affine calibration between two camera views of a dynamic scene. The
affine calibration is computed directly from the fundamental matrices
associated with various moving objects in the scene, as well as from
the fundamental matrix for the static background if the cameras are at
different locations. A minimum of two fundamental matrices is
required, but any number of additional fundamental matrices can be
incorporated into the linear system to improve the stability of the
computation. The technique is demonstrated on both real and synthetic
data.
}
BIBTEX
{
@inproceedings{ Manning:2001:iccv,
author = "Russell A. Manning and Charles R. Dyer",
title = "Affine Calibration from Moving Objects",
booktitle = "Proc. Eighth Int. Conf. Computer Vision",
year = 2001,
volume = 1,
pages = {494--500} }
}
--------------------------------------------------------------
KEY { dyer.2001.fia }
TITLE { Volumetric Scene Reconstruction from Multiple Views }
AUTHORS { C. R. Dyer }
PUBLISHEDIN
{
Foundations of Image Understanding, L. S. Davis, ed.,
Kluwer, Boston, 2001, 469-489.
}
ABSTRACT
{
A review of methods for volumetric scene reconstruction from multiple
views is presented. Occupancy descriptions of the voxels in a scene
volume are constructed using shape-from-silhouette techniques for
binary images, and shape-from-photo-consistency combined with
visibility testing for color images.
}
BIBTEX
{
@incollection{ dyer:fia01,
author = "C.~R. Dyer",
title = "Volumetric Scene Reconstruction from Multiple Views",
booktitle = "Foundations of Image Understanding",
editor = "L.~S. Davis",
publisher = "Kluwer",
year = 2001,
pages = {469--489},
url = "ftp://ftp.cs.wisc.edu/computer-vision/repository/PDF/dyer.2001.fia.pdf"
}
}
--------------------------------------------------------------
KEY { yu.2001.iwvf4 }
TITLE { Perception-Based 2D Shape Modeling by Curvature Shaping }
AUTHORS { Liangyin Yu and C. R. Dyer }
PUBLISHEDIN
{
Proc. 4th Int. Workshop on Visual Form, C. Arcelli,
L. P. Cordella, and G. Sanniti di Baja, eds., Lecture Notes in
Computer Science, Springer-Verlag, 2001, 272-282.
}
ABSTRACT
{
2D curve representations usually take algebraic forms in ways not
related to visual perception. This poses great difficulties in
connecting curve representation with object recognition, where
information computed from raw images must be manipulated in a
perceptually meaningful way and compared to the representation. In
this paper we show that 2D curves can be represented compactly by
imposing shaping constraints in curvature space, which can be readily
computed directly from input images. The inverse problem of
reconstructing a 2D curve from the shaping constraints is solved by a
method using curvature shaping, in which the 2D image space is used in
conjunction with its curvature space to generate the curve
dynamically. The solution allows curve length to be determined
and used subsequently for curve modeling using polynomial basis
functions. Polynomial basis functions of high orders are shown to be
necessary to incorporate perceptual information commonly available at
the biological visual front-end.
}
BIBTEX
{
@incollection{yu:iwvf01,
author = "Liangyin Yu and C.~R. Dyer",
title = "Perception-Based 2D Shape Modeling by Curvature Shaping",
booktitle = "Proc. 4th Int. Workshop on Visual Form",
editor = "C. Arcelli and L. P. Cordella and G. Sanniti di Baja",
series = "Lecture Notes in Computer Science, Vol. 2059",
publisher = "Springer-Verlag",
year = 2001,
pages = {272--282},
url = "ftp://ftp.cs.wisc.edu/computer-vision/repository/PDF/yu.2001.iwvf4.pdf"
}
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 2000 }
--------------------------------------------------------------
KEY { manning.2000.tr1423 }
TITLE { Environment Map Morphing }
AUTHORS { R. A. Manning and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1423,
University of Wisconsin - Madison, December 2000.
}
ABSTRACT
{
The techniques of static and dynamic view morphing are extended to
allow for (1) interpolation between arbitrary, uncalibrated
environment maps (complete or partial), (2) camera motion directly
towards or away from the scene (so that the epipole is visible), and
(3) views of dynamic scenes in which an object's vanishing point is
visible inside the object. It is shown that, by using an *image
cylinder* instead of an image plane, the scanline property can be
preserved and the morphing process can be made continuous for
environment maps. Environment map interpolation together with
layering provides an uncalibrated, sprite-like graphics primitive
similar to ``billboarding'' techniques.
}
BIBTEX
{
@techreport{ Manning:2000:tr:b,
author = "Russell A. Manning and Charles R. Dyer",
title = "Environment Map Morphing",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = 1423,
year = 2000 }
}
--------------------------------------------------------------
KEY { manning.2000.tr1417 }
TITLE { Affine Calibration from Dynamic Scenes }
AUTHORS { R. A. Manning and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1417,
University of Wisconsin - Madison, March 2000.
}
ABSTRACT
{
In Computer Sciences Department Technical Report 1397, the authors
introduced a linear algorithm for determining the affine calibration
between two camera views of a dynamic scene. In this paper, we expand
upon the algorithm and investigate its performance experimentally. The
algorithm computes affine calibration directly from the fundamental
matrices associated with various moving objects in the scene, as well
as from the fundamental matrix for the static background if the cameras
are at different locations. A minimum of two fundamental matrices is
required, but any number of additional fundamental matrices can be
incorporated into the linear system to improve computational
stability. The technique is demonstrated on both real and synthetic
data.
}
BIBTEX
{
@techreport{ Manning:2000:tr:a,
author = "Russell A. Manning and Charles R. Dyer",
title = "Affine Calibration from Dynamic Scenes",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = 1417,
year = 2000 }
}
--------------------------------------------------------------
KEY { manning.2000.nato }
TITLE { Dynamic View Interpolation without Affine Reconstruction }
AUTHORS { R. A. Manning and C. R. Dyer }
PUBLISHEDIN
{
Confluence of Computer Vision and Computer Graphics,
A. Leonardis, F. Solina and R. Bajcsy, eds., Kluwer, Dordrecht,
The Netherlands, 2000, 123-142.
}
ABSTRACT
{
This chapter presents techniques for view interpolation between two
reference views of a dynamic scene captured at different times. The
interpolations produced portray one possible physically-valid version
of what transpired in the scene during the time between when the two
reference views were taken. We show how straight-line object motion,
relative to a camera-centered coordinate system, can be achieved, and
how the appearance of straight-line object motion relative to the
background can be created. The special case of affine cameras is also
discussed. The methods presented work with widely-separated,
uncalibrated cameras and sparse point correspondences. The approach
does not involve finding the camera-to-camera transformation and thus
does not implicitly perform affine reconstruction of the scene. For
circumstances in which the camera-to-camera transformation can be
found, we introduce a vector space of possible synthetic views that
follows naturally from the given reference views. It is assumed that
the motion of each object in the original scene consists of a series
of rigid translations.
}
BIBTEX
{
@incollection{ Manning:2000:nato,
author = "Russell A. Manning and Charles R. Dyer",
title = "Dynamic View Interpolation without Affine Reconstruction",
booktitle = "Confluence of Computer Vision and Computer Graphics",
publisher = "Kluwer",
address = "Dordrecht, The Netherlands",
editor = "A. Leonardis and F. Solina and R. Bajcsy",
pages = {123--142},
year = 2000 }
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 1999 }
--------------------------------------------------------------
KEY { manning.1999.cvpr }
TITLE { Interpolating View and Scene Motion by Dynamic View Morphing }
AUTHORS { R. A. Manning and C. R. Dyer }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., 1999, I:388-394. }
ABSTRACT
{
We introduce the problem of view interpolation for dynamic scenes. Our
solution to this problem extends the concept of view morphing and
retains the practical advantages of that method. We are specifically
concerned with interpolating between two reference views captured at
different times, so that there is a missing interval of time between
when the views were taken. The synthetic interpolations produced by
our algorithm portray one possible physically-valid version of what
transpired in the scene during the missing time. It is assumed that
each object in the original scene underwent a series of rigid
translations. Dynamic view morphing can work with widely-spaced
reference views, sparse point correspondences, and uncalibrated
cameras. When the camera-to-camera transformation can be determined,
the synthetic interpolation will portray scene objects moving along
straight-line, constant-velocity trajectories in world space.
}
BIBTEX
{
@inproceedings{ Manning:1999:cvpr,
author = "Russell A. Manning and Charles R. Dyer",
title = "Interpolating View and Scene Motion by Dynamic View Morphing",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
year = 1999,
volume = 1,
pages = {388--394} }
}
--------------------------------------------------------------
KEY { seitz.1999.ijcv }
TITLE { Photorealistic Scene Reconstruction by Voxel Coloring }
AUTHORS { S. M. Seitz and C. R. Dyer }
PUBLISHEDIN { Int. J. Computer Vision 35, No. 2, 1999, 151-173. }
ABSTRACT
{
A novel scene reconstruction technique is presented, different from
previous approaches in its ability to cope with large changes in
visibility and its modeling of intrinsic scene color and texture
information. The method avoids image correspondence problems by
working in a discretized scene space whose voxels are traversed in a
fixed visibility ordering. This strategy takes full account of
occlusions and allows the input cameras to be far apart and widely
distributed about the environment. The algorithm identifies a special
set of invariant voxels which together form a spatial and photometric
reconstruction of the scene, fully consistent with the input images.
The approach is evaluated with images from both inward- and
outward-facing cameras.
}
BIBTEX
{
@article{ Seitz:1999:ijcv,
author = "Steven M. Seitz and Charles R. Dyer",
title = "Photorealistic Scene Reconstruction by Voxel Coloring",
journal = "Int. J. Computer Vision",
volume = 35,
number = 2,
pages = {151--173},
year = 1999}
}
--------------------------------------------------------------
KEY { yu.1999.thesis }
TITLE { Active 3D Surface Modeling using Perception-Based, Differential Geometric Primitives }
AUTHORS { Liangyin Yu }
PUBLISHEDIN { Ph.D. Dissertation, Computer Sciences Department, University of Wisconsin - Madison, August 1999. }
ABSTRACT
{
Computational vision is about why a biological vision system functions
as it does and how to emulate its performance on computers. The central
topics of this thesis are how a differential geometry language can be
used to describe the essential elements of visual perception in both
2D and 3D domains, and how the components of this geometric language
can be computed in ways closely related to how the human visual system
performs similar functions.

The thesis starts by showing that at the earliest stage of vision, biological systems implement a mechanism that is computationally equivalent to computing local geometric invariants at the two-dimensional curve level. The availability of this information establishes the foundation for computing components of a differential geometry language from sensory inputs. The mathematical framework of scale space that makes this computational approach possible, likewise, has its biological basis.

On the other hand, visual perception is a global phenomenon that occurs generally in a 3D space. To understand this process and design computational systems that have comparable performance to humans requires specification of how a 2D local computational mechanism can be used in this global 3D environment. This goal is achieved through two steps. First, a global surface representation formulation is extended from the 2D framework. It is shown how local geometric features that are sparse and perceptually meaningful can be naturally used to represent global 3D surfaces. Second, active motion by an observer is introduced as an additional dimension to the data set so that the observer becomes mobile and can react to observations or verify hypotheses actively. This also makes dynamical data such as optical flow available to the observer. These added abilities enable the observer to perform tasks such as surface recovery and 3D navigation. In addition, the modeling process of 3D objects is naturally constrained by the computational resources available to the observer so that the model is inherently incremental.

This thesis contributes in the following areas: (1) direct computation
of 2D differential geometric invariants from images using methods
comparable to the human vision system, (2) perception-based global
representations of 2D and 3D objects using geometric invariants,
(3) novel methods for optical flow computation and segmentation,
and (4) active methods for global surface recovery and navigation
using stationary contours, apparent contours, and textured surfaces.
}
BIBTEX
{
@phdthesis{ Yu:1999:phd,
author = "Liangyin Yu",
title = "Active 3D Surface Modeling using Perception-Based, Differential Geometric Primitives",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 1999}
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 1998 }
--------------------------------------------------------------
KEY { dyer.1998.iuw }
TITLE { Image-Based Visualization from Widely-Separated Views }
AUTHORS { C. R. Dyer }
PUBLISHEDIN { Proc. Image Understanding Workshop, 1998, 101-105. }
ABSTRACT
{
This report describes image-based
visualization research in support of video surveillance
and monitoring systems.
Our primary goal is to develop
methods so a user can interactively visualize
a 3D environment from images captured
by a set of widely-separated cameras.
Results include view interpolation of dynamic scenes,
coarse-to-fine voxel coloring for efficient scene reconstruction,
and recovering scene structure and camera motion.
}
BIBTEX
{
@inproceedings{ Dyer:1998:iuw,
author = "Charles R. Dyer",
title = "Image-Based Visualization from Widely-Separated Views",
booktitle = "Proc. Image Understanding Workshop",
pages = "101--105",
year = 1998}
}
--------------------------------------------------------------
KEY { prock.1998.iuw }
TITLE { Towards Real-Time Voxel Coloring }
AUTHORS { A. C. Prock and C. R. Dyer }
PUBLISHEDIN { Proc. Image Understanding Workshop, 1998, 315-321. }
ABSTRACT
{
Techniques for constructing three-dimensional scene models from
two-dimensional images are often slow and unsuitable for
interactive, real-time applications. In this paper we explore
three methods of enhancing the performance of the voxel coloring
reconstruction method. The first approach uses texture mapping
to leverage hardware acceleration. The second approach uses
spatial coherence and a coarse-to-fine strategy to focus
computation on the filled parts of scene space. Finally, the
multi-resolution method is extended over time to enhance
performance for dynamic scenes.
}
BIBTEX
{
@inproceedings{ Prock:1998:iuw,
author = "Andrew C. Prock and Charles R. Dyer",
title = "Towards Real-Time Voxel Coloring",
booktitle = "Proc. Image Understanding Workshop",
year = 1998,
pages = {315--321} }
}
--------------------------------------------------------------
KEY { manning.1998.iuw }
TITLE { Interpolating View and Scene Motion by Dynamic View Morphing }
AUTHORS { R. A. Manning and C. R. Dyer }
PUBLISHEDIN { Proc. Image Understanding Workshop, 1998, 323-330 }
ABSTRACT
{
We present a novel technique for interpolating between two views of a
dynamic scene. Our approach extends the concept of view morphing
introduced in [Seitz and Dyer, 1996] and retains the relative
advantages of that method. The interpolation will portray one possible
physically valid version of what transpired in the scene during the
intervening time between views. The scene is assumed to consist of a
small number of objects. Each object can undergo any motion during the
time between views as long as the total movement is equivalent to a
single, rigid translation. The dynamic view morphing technique can
work with widely-spaced reference views, sparse point correspondences,
and uncalibrated cameras. When the camera-to-camera transformation can
be determined, the virtual objects can be portrayed moving along
straight-line, constant-velocity trajectories. Methods are developed
for determining the camera-to-camera transformation from information
available in the reference views. It is shown that each moving object
in a scene has a corresponding fundamental matrix and that the
camera-to-camera transformation can be determined from two distinct
fundamental matrices. Dynamic view morphing is developed for both
pinhole and orthographic cameras, and the use of three or more
reference views is discussed. Static view morphing is made more
versatile with respect to occlusion, and mosaicing is combined with
dynamic view morphing for the case when both reference views share the
same optical center. The resulting combination of techniques can be
used to fill-in missing gaps in movies, perform "view hand-offs"
between cameras at different locations, create movies from still
images, perform movie stabilization and compression, track objects
during periods of obstruction, and related tasks.
}
BIBTEX
{
@inproceedings{ Manning:1998:iuw,
author = "Russell A. Manning and Charles R. Dyer",
title = "Interpolating View and Scene Motion by Dynamic View Morphing",
booktitle = "Proc. Image Understanding Workshop",
year = 1998,
pages = {323--330} }
}
--------------------------------------------------------------
KEY { manning.1998.tr1387 }
TITLE { Dynamic View Morphing }
AUTHORS { R. A. Manning and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1387,
University of Wisconsin - Madison, September 1998.
}
ABSTRACT
{
We present a novel technique for interpolating between two views of a
dynamic scene. Our approach extends the concept of view morphing and
retains the relative advantages of that method. The interpolation
will portray *one possible* physically-valid version of what
transpired in the scene during the intervening time between views.
The scene is assumed to consist of a small number of *objects*.
Each object can undergo any motion during the time between views as
long as its total movement is equivalent to a single, rigid
translation. The dynamic view morphing technique can work with
widely-spaced reference views, sparse point correspondences, and
uncalibrated cameras. When the *camera-to-camera transformation*
can be determined, the virtual objects can be portrayed moving along
straight-line, constant-velocity trajectories. Methods are developed
for determining the camera-to-camera transformation from information
available in the reference views. It is shown that each moving object
in a scene has a corresponding fundamental matrix and that the
camera-to-camera transformation can be determined from two distinct
fundamental matrices. Dynamic view morphing is developed for both
pinhole and orthographic cameras, and the use of three or more
reference views is discussed. Static view morphing is made more
versatile with respect to occlusion, and mosaicing is combined with
dynamic view morphing for the case when both reference views share the
same optical center. The resulting combination of techniques can be
used to fill-in missing gaps in movies, perform ``view hand-offs''
between cameras at different locations, create movies from still
images, perform movie stabilization and compression, track objects
during periods of obstruction, and related tasks.
}
NOTES
{
*Very long, thorough coverage of the topic. Contains much information not found in the
CVPR99 paper.*
}
BIBTEX
{
@techreport{ Manning:1998:tr,
author = "Russell A. Manning and Charles R. Dyer",
title = "Dynamic View Morphing",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = 1387,
year = 1998 }
}
--------------------------------------------------------------
KEY { seitz.1998.iccv }
TITLE { Plenoptic Image Editing }
AUTHORS { S. M. Seitz and K. N. Kutulakos }
PUBLISHEDIN { Proc. 6th Int. Conf. Computer Vision, 1998, 17-24 }
ABSTRACT
{
This paper presents a new class of interactive image editing operations
designed to maintain physical consistency between multiple images of a
physical 3D object. The distinguishing feature of these operations is that
edits to any one image propagate automatically to all other images as if the
(unknown) 3D object had itself been modified. The approach is useful first
as a power-assist that enables a user to quickly modify many images by
editing just a few, and second as a means for constructing and editing
image-based scene representations by manipulating a set of photographs.
The approach works by extending operations like image painting,
scissoring, and morphing so that they alter an object's plenoptic function in
a physically-consistent way, thereby affecting object appearance from all
viewpoints simultaneously. A key element in realizing these operations is a
new volumetric decomposition technique for reconstructing an object's
plenoptic function from an incomplete set of camera viewpoints.
}
BIBTEX
{
@inproceedings{ Seitz:1998:iccv,
author = "Steven M. Seitz and Kyros N. Kutulakos",
title = "Plenoptic Image Editing",
booktitle = "Proc. 6th Int. Conf. Computer Vision",
pages = "17--24",
year = 1998 }
}
--------------------------------------------------------------
KEY { bestor.1998.thesis }
TITLE { Recovering Feature and Observer Position by Projected Error Refinement }
AUTHORS { G. S. Bestor }
PUBLISHEDIN
{
Ph.D. Dissertation, Computer Sciences Department Technical Report 1381,
University of Wisconsin - Madison, August 1998.
}
ABSTRACT
{
Recovering three-dimensional information from images is a principal goal of
computer vision. An approach called Structure From Motion (SFM) does so
without imposing strict requirements on the observer or scene. In
particular, SFM assumes camera motion is unknown and the scene is only
required to be static. This thesis describes a new SFM technique called
Projected Error Refinement that computes the positions of feature points
(i.e., structure) and the locations of the camera or observer (i.e., motion)
from a noisy image sequence. The technique addresses limitations of
existing SFM techniques that make them unsuitable except in controlled
environments; the approach presented in this thesis models perspective
projection, allows unconstrained camera motion, deals with outliers and
occlusion, and is scalable. This new technique is recursive and thus is
suitable for video image streams because new images can be added at any
time.

Projected Error Refinement views SFM as a geometric inverse projection
problem, with the goal of determining the positions of the cameras and
feature points such that the projectors defined by each image optimally
intersect (projectors are the lines of projection specifying the
direction of each feature point from the camera's optical center).
This is expressed as a global optimization problem with the objective
function minimizing the mean-squared angular projection error between
the solution and the observed images. Occlusion is dealt with naturally
in this approach because only visible feature points define projectors
that are considered during optimization - occluded features are
ignored. The technique models true perspective projection and is
scalable to an arbitrary number of feature points and images.
Projected Error Refinement is non-linear and uses an efficient parallel
iterative refinement algorithm that takes an initial estimate of the
structure and motion parameters and alternately refines the cameras'
poses and the positions of the feature points in parallel. The solution
can be refined to an arbitrary precision or refinement can be
terminated prematurely due to limited processing time. The solution
converges rapidly towards the global minimum even when started from a
poor initial estimate. Experimental results are given for both 2D and
3D perspective projection using real and synthetic image sequences.
}
BIBTEX
{
@phdthesis{ Bestor:1998:phd,
author = "Gareth Bestor",
title = "Recovering Feature and Observer Position by Projected Error Refinement",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 1998}
}
--------------------------------------------------------------
KEY { yu.1998.icip }
TITLE { Direct Computation of Differential Invariants of Image Contours from Shading }
AUTHORS { L-Y. Yu and C. R. Dyer }
PUBLISHEDIN { Proc. 5th Int. Conf. Image Processing, 1998. }
ABSTRACT
{
In this paper we present a framework combining differential geometry
and scale-space to show that local geometric invariants of image
contours such as tangent, curvature, and derivative of curvature can be
computed directly and stably from the raw image itself.

To solve the problem of noise amplification by differential operations,
scale-parameterized local kernels are used to replace differential
operations by integral operations, which can be carried out accurately
when we adopt a continuous image model. We also show that tangent
estimation along contours can be made quite accurate using only eight
tangent estimators (a \pi/4 quantization) when contour location is
known, and high precision and efficiency in computation can be achieved
for each of the invariants regardless of the differential order
involved.
}
BIBTEX
{
@inproceedings{ Yu:1998:icip,
author = "L-Y. Yu and Charles R. Dyer",
title = "Direct Computation of Differential Invariants of Image Contours from Shading",
booktitle = "Proc. 5th Int. Conf. Image Processing",
year = 1998}
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 1997 }
--------------------------------------------------------------
KEY { seitz.1997.rochester }
TITLE { Plenoptic Image Editing }
AUTHORS { S. M. Seitz and K. N. Kutulakos }
PUBLISHEDIN
{
Computer Science Department Technical Report 647, University of
Rochester, Rochester, NY, January 1997.
}
ABSTRACT
{
This paper presents a new class of interactive image editing operations
designed to maintain consistency between multiple images of a physical
3D object. The distinguishing feature of these operations is that edits
to any one image propagate automatically to all other images as if the
(unknown) 3D object had itself been modified. The approach is useful
first as a power-assist that enables a user to quickly modify many
images by editing just a few, and second as a means for constructing
and editing image-based scene representations by manipulating a set of
photographs.

The approach works by extending operations like image painting,
scissoring, and morphing so that they alter an object's plenoptic
function in a physically-consistent way, thereby affecting object
appearance from all viewpoints simultaneously. A key element in
realizing these operations is a new volumetric decomposition technique
for reconstructing an object's plenoptic function from an incomplete
set of camera viewpoints.
}
##LOCATION { ftp://ftp.cs.rochester.edu/pub/papers/robotics/97.tr647.Plenoptic_image_editing.ps.gz }
BIBTEX
{
@techreport{ Seitz:1997:rochester,
author = "S. M. Seitz and K. N. Kutulakos",
title = "Plenoptic Image Editing",
institution = "Computer Sciences Department, University of Rochester",
number = "647",
month = "January",
year = 1997}
}
--------------------------------------------------------------
KEY { seitz.1997.thesis }
TITLE { Image-Based Transformation of Viewpoint and Scene Appearance }
AUTHORS { S. M. Seitz }
PUBLISHEDIN
{
Ph.D. Dissertation, Computer Sciences Department Technical Report 1354,
University of Wisconsin - Madison, October 1997.
}
ABSTRACT
{
This thesis addresses the problem of synthesizing images of real scenes
under three-dimensional transformations in viewpoint and appearance.
Solving this problem enables interactive viewing of remote scenes on a
computer, in which a user can move a virtual camera through the
environment and virtually paint or sculpt objects in the scene. It is
demonstrated that a variety of three-dimensional scene transformations
can be rendered on a video display device by applying simple
transformations to a set of basis images of the scene. The virtue of
these transformations is that they operate directly on images and
recover only the scene information that is required in order to
accomplish the desired effect. Consequently, they are applicable in
situations where accurate three-dimensional models are difficult or
impossible to obtain.

A central topic is the problem of *view synthesis*, i.e.,
rendering images of a real scene from different camera viewpoints by
processing a set of basis images. Towards this end, two algorithms
are described that warp and resample pixels in a set of basis images
to produce new images that are physically-valid, i.e., they correspond
to what a real camera would see from the specified viewpoints.
Techniques for synthesizing other types of transformations, e.g.,
non-rigid shape and color transformations, are also discussed. The
techniques are found to perform well on a wide variety of real and
synthetic images.

A basic question is uniqueness, i.e., for which views is the
appearance of the scene uniquely determined from the information
present in the basis views. An important contribution is a uniqueness
result for the no-occlusion case, which proves that all views on the
line segment between the two camera centers are uniquely determined
from two uncalibrated views of a scene. Importantly, neither dense
pixel correspondence nor camera information is needed. From this
result, a *view morphing* algorithm is derived that produces
high quality viewpoint and shape transformations from two uncalibrated
images.

To treat the general case of many views, a novel *voxel coloring*
framework is introduced that facilitates the analysis of ambiguities in
correspondence and scene reconstruction. Using this framework, a new type
of scene invariant, called *color invariant*, is derived, which provides
intrinsic scene information useful for correspondence and view synthesis.
Based on this result, an efficient voxel-based algorithm is introduced to
compute reconstructions and dense correspondence from a set of basis views.
This algorithm has several advantages, most notably its ability to easily
handle occlusion and views that are arbitrarily far apart, and its
usefulness for *panoramic* visualization of scenes. These factors
also make the voxel coloring approach attractive as a means for obtaining
high-quality three-dimensional reconstructions from photographs.
}
BIBTEX
{
@phdthesis{ Seitz:1997:phd,
author = "Steven M. Seitz",
title = "Image-Based Transformation of Viewpoint and Scene Appearance",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 1997}
}
--------------------------------------------------------------
KEY { seitz.1997.cvpr }
TITLE { Photorealistic Scene Reconstruction by Voxel Coloring }
AUTHORS { S. M. Seitz and C. R. Dyer }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., 1997, 1067-1073. }
ABSTRACT
{
A novel scene reconstruction technique is presented, different from
previous approaches in its ability to cope with large changes in
visibility and its modeling of intrinsic scene color and texture
information. The method avoids image correspondence problems by
working in a discretized scene space whose voxels are traversed in a
fixed visibility ordering. This strategy takes full account of
occlusions and allows the input cameras to be far apart and widely
distributed about the environment. The algorithm identifies a special
set of invariant voxels which together form a spatial and photometric
reconstruction of the scene, fully consistent with the input
images. The approach is evaluated with images from both inward- and
outward-facing cameras.
}
BIBTEX
{
@inproceedings{seitz:cvpr97,
author = "Steven M. Seitz and Charles R. Dyer",
title = "Photorealistic Scene Reconstruction by Voxel Coloring",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
pages = {1067-1073},
year = 1997}
}
--------------------------------------------------------------
KEY { seitz.1997.iuw.b }
TITLE { Photorealistic Scene Reconstruction by Voxel Coloring }
AUTHORS { S. M. Seitz }
PUBLISHEDIN { Proc. Image Understanding Workshop, 1997, 935-942. }
ABSTRACT
{
A novel scene reconstruction technique is presented, different from
previous approaches in its ability to cope with large changes in
visibility and its modeling of intrinsic scene color and texture
information. The method avoids image correspondence problems by
working in a discretized scene space whose voxels are traversed in a
fixed visibility ordering. This strategy takes full account of
occlusions and allows the input cameras to be far apart and widely
distributed about the environment. The algorithm identifies a special
set of invariant voxels which together form a spatial and photometric
reconstruction of the scene, fully consistent with the input images.
}
BIBTEX
{
@inproceedings{seitz:iuw97b,
author = "Steven M. Seitz and Charles R. Dyer",
title = "Photorealistic Scene Reconstruction by Voxel Coloring",
booktitle = "Proc. Image Understanding Workshop",
year = 1997,
pages = {935-942} }
}
--------------------------------------------------------------
KEY { dyer.1997.iuw }
TITLE { Image-Based Scene Rendering and Manipulation Research at the University of Wisconsin }
AUTHORS { C. R. Dyer }
PUBLISHEDIN { Proc. Image Understanding Workshop, 1997, 63-67. }
ABSTRACT
{
This report summarizes the research effort at the University of
Wisconsin in support of the VSAM Program. Our primary goal is to
develop technologies so a user can interactively visualize and
virtually modify a 3D environment from a set of images. Current
approaches are described for image-based scene rendering, scene
manipulation, and appearance modeling.
}
BIBTEX
{
@inproceedings{ Dyer:1997:iuw,
author = "Charles R. Dyer",
title = "Image-Based Scene Rendering and Manipulation Research at the University of Wisconsin",
booktitle = "Proc. Image Understanding Workshop",
pages = "63--67",
year = 1997}
}
--------------------------------------------------------------
KEY { seitz.1997.imagina }
TITLE { Bringing Photographs to Life with View Morphing }
AUTHORS { S. M. Seitz }
PUBLISHEDIN { Proc. Imagina 97, 1997, 153-158. }
ABSTRACT
{
Photographs and paintings are limited in the amount of information they
can convey due to their inherent lack of motion and depth. Using image
morphing methods, it is now possible to add 2D motion to photographs
by moving and blending image pixels in creative ways. We have taken this
concept a step further by adding the ability to convey three-dimensional
motions, such as scene rotations and viewpoint changes, by manipulating one
or more photographs of a scene. The effect transforms a photograph or
painting into an interactive visualization of the underlying object or scene
in which the world may be rotated in 3D. Several potential
applications of this technology are discussed, in areas such
as virtual reality, image databases, and special effects.
}
BIBTEX
{
@inproceedings{seitz:ima97,
author = "Steven M. Seitz",
title = "Bringing Photographs to Life with View Morphing",
booktitle = "Proc. Imagina 97 Conf.",
address = "Monaco",
year = 1997,
pages = {153-158} }
}
--------------------------------------------------------------
KEY { seitz.1997.iuw.a }
TITLE { View Morphing: Uniquely Predicting Scene Appearance from Basis Images }
AUTHORS { S. M. Seitz and C. R. Dyer }
PUBLISHEDIN { Proc. Image Understanding Workshop, 1997, 881-887. }
ABSTRACT
{
This paper analyzes the conditions when a discrete set of images
implicitly describes scene appearance for a continuous range of
viewpoints. It is shown that two basis views of a static scene
uniquely determine the set of all views on the line between their
optical centers when a visibility constraint is satisfied. Additional
basis views extend the range of predictable views to 2D or 3D regions
of viewpoints. A simple scanline algorithm called *view
morphing* is presented for generating these views from a set of
basis images. The technique is applicable to both calibrated and
uncalibrated images.
}
BIBTEX
{
@inproceedings{ Seitz:1997:iuw,
author = "Steven M. Seitz and Charles R. Dyer",
title = "View {M}orphing: Uniquely Predicting Scene Appearance from Basis Images",
booktitle = IUW,
pages = {881--887},
year = 1997}
}
--------------------------------------------------------------
KEY { seitz.1997.ijcv }
TITLE { View-Invariant Analysis of Cyclic Motion }
AUTHORS { S. M. Seitz and C. R. Dyer }
PUBLISHEDIN { Int. J. Computer Vision, **25(3)**, 1997, 231-251. }
ABSTRACT
{
This paper presents a general framework for image-based analysis of 3D
repeating motions that addresses two limitations in the state of the art.
First, the assumption that a motion be perfectly even from one cycle to
the next is relaxed. Real repeating motions tend not to be perfectly even,
i.e., the length of a cycle varies through time because of physically important
changes in the scene. A generalization of {\em period} is defined for
repeating motions that makes this temporal variation explicit. This
representation, called the period trace, is compact and purely temporal,
describing the evolution of an object or scene without reference to spatial
quantities such as position or velocity. Second, the requirement that the
observer be stationary is removed. Observer motion complicates image analysis
because an object that undergoes a 3D repeating motion will generally not
produce a repeating sequence of images. Using principles of affine
invariance, we derive necessary and sufficient conditions for an image
sequence to be the projection of a 3D repeating motion, accounting for
changes in viewpoint and other camera parameters.
Unlike previous work in visual invariance, however, our approach is
applicable to objects and scenes whose motion is highly non-rigid.
Experiments on real image sequences demonstrate how the approach may be
used to detect several types of purely temporal motion features, relating to
motion trends and irregularities.
Applications to athletic and medical motion analysis are discussed.
}
BIBTEX
{
@article{seitz:ijcv97,
author = "Steven M. Seitz and Charles R. Dyer",
title = "View-Invariant Analysis of Cyclic Motion",
journal = "Int. J. of Computer Vision",
volume = 25,
number = 3,
year = 1997,
pages = {231--251} }
}
--------------------------------------------------------------
KEY { seitz.1997.mbr }
TITLE { Cyclic Motion Analysis using the Period Trace }
AUTHORS { S. M. Seitz and C. R. Dyer }
PUBLISHEDIN { Motion-Based Recognition, M. Shah and R. Jain, eds., Kluwer, Boston, 1997, 61-85. }
ABSTRACT
{
A new technique is presented for computing 3D scene structure from point
and line features in monocular image sequences. Unlike previous methods,
the technique guarantees the completeness of the recovered scene, ensuring
that every scene feature that is detected in each image is reconstructed.
The approach relies on the presence of four or more reference features
whose correspondences are known in all the images. Under an orthographic
or affine camera model, the parallax of the reference features provides
constraints that simplify the recovery of the rest of the visible scene.
An efficient recursive algorithm is described that uses a unified framework
for point and line features. The algorithm integrates the tasks of feature
correspondence and structure recovery, ensuring that all reconstructible
features are tracked. In addition, the algorithm is immune to outliers and
feature-drift, two weaknesses of existing structure-from-motion techniques.
Experimental results are presented for real images.
}
BIBTEX
{
@incollection{seitz:mbr97,
author = "Steven M. Seitz and Charles R. Dyer",
title = "Cyclic Motion Analysis Using the Period Trace",
booktitle = "Motion-Based Recognition (M. Shah and R. Jain, Eds.)",
publisher = "Kluwer Academic Publishers",
address = "Boston, MA",
pages = {61-85},
year = 1997}
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 1996 }
--------------------------------------------------------------
KEY { seitz.1996.sigg }
TITLE { View Morphing }
AUTHORS { S. M. Seitz and C. R. Dyer }
PUBLISHEDIN { Proc. SIGGRAPH 96, 1996, 21-30. }
ABSTRACT
{
Image morphing techniques can generate compelling 2D transitions
between images. However, differences in object pose or viewpoint
often cause unnatural distortions in image morphs that are difficult
to correct manually. Using basic principles of projective geometry,
this paper introduces a simple extension to image morphing that
correctly handles 3D projective camera and scene transformations. The
technique, called *view morphing*, works by prewarping two
images prior to computing a morph and then postwarping the
interpolated images. Because no knowledge of 3D shape is required,
the technique may be applied to photographs and drawings, as well as
rendered scenes. The ability to synthesize changes both in viewpoint
and image structure affords a wide variety of interesting 3D effects
via simple image transformations.
}
BIBTEX
{
@inproceedings{ Seitz:1996:siggraph,
author = "Steven M. Seitz and Charles R. Dyer",
title = "View Morphing",
booktitle = SIGGRAPH96,
pages = {21--30},
year = 1996}
}
--------------------------------------------------------------
KEY { seitz.1996.icpr }
TITLE { Toward Image-Based Scene Representation Using View Morphing }
AUTHORS { S. M. Seitz and C. R. Dyer }
PUBLISHEDIN { Proc. 13th Int. Conf. Pattern Recognition, Vol. I, Track A: Computer Vision, 1996, 84-89. }
ABSTRACT
{
The question of which views may be inferred from a set of basis images
is addressed. Under certain conditions, a discrete set of images
implicitly describes scene appearance for a continuous range of viewpoints.
In particular, it is demonstrated that two basis views of a static scene
determine the set of all views on the line between their optical centers.
Additional basis views further extend the range of predictable views to a
two- or three-dimensional region of viewspace. These results are shown to
apply under perspective projection subject to a generic visibility
constraint called monotonicity. In addition, a simple scanline algorithm is
presented for actually generating these views from a set of basis images.
The technique, called *view morphing*, may be applied to both calibrated
and uncalibrated images. At a minimum, two basis views and their
fundamental matrix are needed. Experimental results are presented on
real images. This work provides a theoretical foundation for image-based
representations of 3D scenes by demonstrating that perspective view
synthesis is a theoretically well-posed problem.
}
BIBTEX
{
@inproceedings{ Seitz:1996:icpr,
author = "Steven M. Seitz and Charles R. Dyer",
title = "Toward Image-Based Scene Representation Using View Morphing",
booktitle = "Proc. 13th Int. Conf. on Pattern Recognition, Vol. I",
pages = {84--89},
year = 1996}
}
--------------------------------------------------------------
KEY { yu.1996.fest }
TITLE { Shape Recovery from Stationary Surface Contours by Controlled Observer Motion }
AUTHORS { L-Y. Yu and C. R. Dyer }
PUBLISHEDIN { Advances in Image Understanding: A Festschrift for Azriel Rosenfeld, IEEE Computer Society Press, Los Alamitos, Ca., 1996, 177-193. }
ABSTRACT
{
The projected deformation of stationary contours and markings on
object surfaces is analyzed in this paper. It is shown that given a
marked point on a stationary contour, an active observer can move
deterministically to the osculating plane for that point by observing
and controlling the deformation of the projected contour. Reaching the
osculating plane enables the observer to recover the object surface
shape along the contour as well as the Frenet frame of the
contour. Complete local surface recovery requires either two
intersecting surface contours and the knowledge of one principal
direction, or more than two intersecting contours. To reach the
osculating plane, two strategies involving both pure translation and a
combination of translation and rotation are analyzed. Once the Frenet
frame for the marked point on the contour is recovered, the same
information for all points on the contour can be recovered by staying
on osculating planes while moving along the contour. It is also shown
that occluding contours and stationary contours deform in a
qualitatively different way and the problem of discriminating between
these two types of contours can be resolved before the recovery of
local surface shape.
}
BIBTEX
{
@inbook{ Yu:1996:chapter,
author = "Liangyin Yu and Charles R. Dyer",
title = "Shape Recovery from Stationary Surface Contours by Controlled Observer Motion",
booktitle = "Advances in Image Understanding: A Festschrift for Azriel Rosenfeld (K. Bowyer and N. Ahuja, Eds.)",
publisher = "IEEE Computer Society Press",
pages = {177--193},
year = 1996}
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 1995 }
--------------------------------------------------------------
KEY { seitz.1995.rvs }
TITLE { Physically-Valid View Synthesis by Image Interpolation }
AUTHORS { S. M. Seitz and C. R. Dyer }
PUBLISHEDIN { Proc. Workshop on Representation of Visual Scenes, 1995, 18-25. }
ABSTRACT
{
Image warping is a popular tool for smoothly transforming one image to
another. ``Morphing'' techniques based on geometric image
interpolation create compelling visual effects, but the validity of
such transformations has not been established. In particular, does 2D
interpolation of two views of the same scene produce a sequence of
physically valid in-between views of that scene? In this paper, we
describe a simple image rectification procedure which guarantees that
interpolation does in fact produce valid views, under generic
assumptions about visibility and the projection process. Towards this
end, it is first shown that two basis views are sufficient to predict
the appearance of the scene within a specific range of new viewpoints.
Second, it is demonstrated that interpolation of the rectified basis
images produces exactly this range of views. Finally, it is shown
that generating this range of views is a theoretically well-posed
problem, requiring neither knowledge of camera positions nor 3D scene
reconstruction. A scanline algorithm for view interpolation is
presented that requires only four user-provided feature
correspondences to produce valid orthographic views. The quality of
the resulting images is demonstrated with interpolations of real
imagery.
}
BIBTEX
{
@inproceedings{ Seitz:1995:wrvs,
author = "Steven M. Seitz and Charles R. Dyer",
title = "Physically-Valid View Synthesis by Image Interpolation",
booktitle = "Proc. IEEE Workshop on Representations of Visual Scenes",
pages = {18--25},
year = 1995}
}
--------------------------------------------------------------
KEY { seitz.1995.iccv }
TITLE { Complete Scene Structure from Four Point Correspondences }
AUTHORS { S. M. Seitz and C. R. Dyer }
PUBLISHEDIN { Proc. 5th Int. Conf. Computer Vision, 1995, 330-337. }
ABSTRACT
{
A new technique is presented for computing 3D scene structure from
point and line features in monocular image sequences. Unlike previous
methods, the technique guarantees the completeness of the recovered
scene, ensuring that every scene feature that is detected in each
image is reconstructed. The approach relies on the presence of four or
more reference features whose correspondences are known in all the
images. Under an orthographic or affine camera model, the parallax of
the reference features provides constraints that simplify the recovery
of the rest of the visible scene. An efficient recursive algorithm is
described that uses a unified framework for point and line
features. The algorithm integrates the tasks of feature correspondence
and structure recovery, ensuring that all reconstructible features are
tracked. In addition, the algorithm is immune to outliers and
feature-drift, two weaknesses of existing structure-from-motion
techniques. Experimental results are presented for real images.
}
BIBTEX
{
@inproceedings{seitz:iccv95,
author = "Steven M. Seitz and Charles R. Dyer",
title = "Complete Scene Structure from Four Point Correspondences",
booktitle = "Proc. Fifth Int. Conf. on Computer Vision",
pages = {330-337},
year = 1995}
}
--------------------------------------------------------------
KEY { lai.1995.pami }
TITLE { Deformable Contours: Modeling and Extraction }
AUTHORS { K. F. Lai and R. T. Chin }
PUBLISHEDIN { IEEE Trans. Pattern Analysis and Machine Intell. **17**, 1995, 1084-1090. }
ABSTRACT
{
This paper considers the problem of modeling and extracting arbitrary deformable
contours from noisy images. We propose a global contour model based on a stable
and regenerative shape matrix, which is invariant and unique under rigid
motions. Combined with Markov random field to model local deformations, this
yields prior distribution that exerts influence over a global model while
allowing for deformations. We then cast the problem of extraction into posterior
estimation and show its equivalence to energy minimization of a generalized
active contour model. We discuss pertinent issues in shape training, energy
minimization, line search strategies, minimax regularization and initialization
by generalized Hough transform. Finally, we present experimental results and
compare its performance to rigid template matching.
}
BIBTEX
{
@article{ Lai:1995:pami,
author = "K. F. Lai and R. T. Chin",
title = "Deformable Contours: Modeling and Extraction",
journal = "IEEE Trans. Pattern Analysis and Machine Intelligence",
volume = "17",
year = "1995",
pages = "1084--1090"}
}
--------------------------------------------------------------
KEY { hibbard.1995.thesis }
TITLE { Visualizing Scientific Computations: A System based on Lattice-Structured Data and Display Models }
AUTHORS { W. L. Hibbard }
PUBLISHEDIN
{
Ph.D. Dissertation, Computer Sciences Department Technical Report 1226,
University of Wisconsin - Madison, May 1995.
}
ABSTRACT
{
In this thesis we develop a system that makes scientific
computations visible and enables physical scientists to perform visual
experiments with their computations. Our approach is unique in the way
it integrates visualization with a scientific programming language. Data
objects of any user-defined data type can be displayed, and can be
displayed in any way that satisfies broad analytic conditions, without
requiring graphics expertise from the user. Furthermore, the system is
highly interactive.

In order to achieve generality in our architecture, we first analyze the nature of scientific data and displays, and the visualization mappings between them. Scientific data and displays are usually approximations to mathematical objects (i.e., variables, vectors and functions) and this provides a natural way to define a mathematical lattice structure on data models and display models. Lattice-structured models provide a basis for integrating certain forms of scientific metadata into the computational and display semantics of data, and also provide a rigorous interpretation of certain expressiveness conditions on the visualization mapping from data to displays. Visualization mappings satisfying these expressiveness conditions are lattice isomorphisms. Applied to the data types of a scientific programming language, this implies that visualization mappings from data aggregates to display aggregates can always be decomposed into mappings of data primitives to display primitives.

These results provide very flexible data and display models, and
provide the basis for flexible and easy-to-use visualization of data objects
occurring in scientific computations.
}
BIBTEX
{
@phdthesis{ Hibbard:1995:phd,
author = "William L. Hibbard",
title = "Visualizing Scientific Computations: A System based on Lattice-Structured Data and Display Models",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 1995}
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 1994 }
--------------------------------------------------------------
KEY { kutulakos.1994.ijcv }
TITLE { Recovering Shape by Purposive Viewpoint Adjustment }
AUTHORS { K. N. Kutulakos and C. R. Dyer }
PUBLISHEDIN { Int. J. Computer Vision **12**, 1994, 113-136. }
ABSTRACT
{
We present an approach for recovering surface shape from the occluding
contour using an active (i.e., moving) observer. It is based on a
relation between the geometries of a surface in a scene and its
occluding contour: If the viewing direction of the observer is along a
principal direction for a surface point whose projection is on the
contour, surface shape (i.e., curvature) at the surface point can be
recovered from the contour. Unlike previous approaches for recovering
shape from the occluding contour, we use an observer that
*purposefully* changes viewpoint in order to achieve a
well-defined geometric relationship with respect to a 3D shape prior
to its recognition. We show that there is a simple and efficient
viewing strategy that allows the observer to align the viewing
direction with one of the two principal directions for a point on the
surface. This strategy depends on only curvature measurements on the
occluding contour and therefore demonstrates that recovering
quantitative shape information from the contour does not require
knowledge of the velocities or accelerations of the observer.
Experimental results demonstrate that our method can be easily
implemented and can provide reliable shape information from the
occluding contour.
}
BIBTEX
{
@article{kutu:ijcv94,
author = "Kiriakos N. Kutulakos and Charles R. Dyer",
title = "Recovering shape by purposive viewpoint adjustment",
journal = "Int. J. of Computer Vision",
volume = 12,
number = 2,
pages = {113-136},
year = 1994}
}
--------------------------------------------------------------
KEY { kutulakos.1994.ai }
TITLE { Global Surface Reconstruction by Purposive Control of Observer Motion }
AUTHORS { K. N. Kutulakos and C. R. Dyer }
PUBLISHEDIN { Artificial Intelligence **78**, No. 1-2, 1995, 147-177. }
ABSTRACT
{
What viewpoint-control strategies are important for performing global
visual exploration tasks such as searching for specific surface
markings, building a global model of an arbitrary object, or
recognizing an object? In this paper we consider the task of
purposefully controlling the motion of an active, monocular observer
in order to recover a global description of a smooth,
arbitrarily-shaped object. We formulate global surface reconstruction
as the task of controlling the motion of the observer so that the
visible rim slides over the maximal, connected, reconstructible
surface regions intersecting the visible rim at the initial
viewpoint. We show that these regions are bounded by a subset of the
visual event curves defined on the surface.

By studying the epipolar parameterization, we develop two basic
strategies that allow reconstruction of a surface region around any
point in a reconstructible surface region. These strategies control
viewpoint to achieve and maintain a well-defined geometric
relationship with the object's surface, rely only on information
extracted directly from images (e.g., tangents to the occluding
contour), and are simple enough to be performed in real time. We
then show how global surface reconstruction can be provably achieved
by (1) appropriately integrating these strategies to iteratively
``grow'' the reconstructed regions, and (2) obeying four simple
rules.
}
BIBTEX
{
@article{kutu:ai95,
author = "Kiriakos N. Kutulakos and Charles R. Dyer",
title = "Global surface reconstruction by purposive control of observer motion",
journal = "Artificial Intelligence",
pages = {147-177},
volume = 78,
number = "1-2",
year = 1995}
}
--------------------------------------------------------------
KEY { kutulakos.1994.cvpr.a}
TITLE { Occluding Contour Detection using Affine Invariants and Purposive Viewpoint Control }
AUTHORS { K. N. Kutulakos and C. R. Dyer }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., 1994, 323-330. }
ABSTRACT
{
We present an approach for identifying the occluding contour and
determining its sidedness using an active (i.e., moving) observer. It
is based on the *non-stationarity property* of the visible rim:
When the observer's viewpoint is changed, the visible rim is a
collection of curves that ``slide,'' rigidly or non-rigidly, over the
surface. We show that the observer can deterministically choose three
views on the tangent plane of selected surface points to distinguish
such curves from stationary surface curves (i.e., surface
markings). Our approach demonstrates that the occluding contour can be
identified *directly*, i.e., without first computing surface
shape (distance and curvature).
}
BIBTEX
{
@inproceedings{kutu:cvpr94a,
author = "Kiriakos N. Kutulakos and Charles R. Dyer",
title = "Occluding Contour Detection Using Affine Invariants and Purposive Viewpoint Control",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
year = 1994,
pages = {323-330}}
}
--------------------------------------------------------------
KEY { kutulakos.1994.cvpr.b }
TITLE { Global Surface Reconstruction by Purposive Control of Observer Motion }
AUTHORS { K. N. Kutulakos and C. R. Dyer }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., 1994, 331-338 }
ABSTRACT
{
What real-time, qualitative viewpoint-control behaviors are important
for performing global visual exploration tasks such as searching for
specific surface markings, building a global model of an arbitrary
object, or recognizing an object? In this paper we consider the task
of purposefully controlling the motion of an active, monocular
observer in order to recover a global description of a smooth,
arbitrarily-shaped object using the occluding contour. By studying the
epipolar parameterization, we develop two basic behaviors that allow
reconstruction of a patch around any point in a reconstructible
surface region. These behaviors rely only on information extracted
directly from images (e.g., tangents to the occluding contour), and
are simple enough to be executed in real time. We then show how global
surface reconstruction can be provably achieved by (1) integrating
these behaviors to iteratively "grow" the reconstructed regions, and
(2) obeying four simple rules.
}
BIBTEX
{
@inproceedings{kutu:cvpr94b,
author = "Kiriakos N. Kutulakos and Charles R. Dyer",
title = "Global Surface Reconstruction by Purposive Control of Observer Motion",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
year = 1994,
pages = {331-338}}
}
--------------------------------------------------------------
KEY { kutulakos.1994.cbvw }
TITLE { Building Global Object Models by Purposive Viewpoint Control }
AUTHORS { K. N. Kutulakos, W. B. Seales, and C. R. Dyer }
PUBLISHEDIN { Proc. 2nd CAD-Based Vision Workshop, 1994, 169-182. }
ABSTRACT
{
We present an approach for recovering a global surface model of an
object from the deformation of the occluding contour using an active
(i.e., mobile) observer able to control its motion. In particular, we
consider two problems: (1) How can the observer's viewpoint be
controlled in order to generate a dense sequence of images that allows
incremental reconstruction of an unknown surface, and (2) how can we
construct a global surface model from the generated image sequence?
Solving these two problems is crucial for automatically constructing
models of objects whose surface is non-convex and self-occludes. We
achieve the first goal by *purposefully* and *qualitatively*
controlling the observer's instantaneous direction of motion in order
to control the motion of the visible rim over the surface. We achieve
the second goal by using a calibrated trinocular camera rig and a
mechanism for controlling the relative position and orientation of the
viewed surface with respect to the trinocular rig.
}
BIBTEX
{
@inproceedings{ Kutulakos:1994:cbvw,
author = "Kiriakos N. Kutulakos and W. Brent Seales and Charles R. Dyer",
title = "Building Global Object Models by Purposive Viewpoint Control",
booktitle = "Proc. 2nd CAD-Based Vision Workshop",
year = 1994,
pages = {169--182} }
}
--------------------------------------------------------------
KEY { kutulakos.1994.thesis }
TITLE { Exploring Three-Dimensional Objects by Controlling the Point of Observation }
AUTHORS { K. N. Kutulakos }
PUBLISHEDIN
{
Ph.D. Dissertation, Computer Sciences Department Technical Report 1251,
University of Wisconsin - Madison, October 1994.
}
ABSTRACT
{
In this thesis we study how controlled movements of a camera can be
used to infer properties of a curved object's three-dimensional shape.
The unknown geometry of an environment's objects, the effects of
self-occlusion, the depth ambiguities caused by the projection
process, and the presence of noise in image measurements are a few of
the complications that make object-dependent movements of the camera
advantageous in certain shape recovery tasks. Such movements can
simplify local shape computations such as curvature estimation, allow
use of weaker camera calibration assumptions, and enable the
extraction of global shape information for objects with complex
surface geometry. The utility of object-dependent camera movements is
studied in the context of three tasks, each involving the extraction
of progressively richer information about an object's unknown shape:
(1) detecting the occluding contour, (2) estimating surface curvature
for points projecting to the contour, and (3) building a
three-dimensional model for an object's entire surface. Our main
result is the development of three distinct active vision strategies
that solve these three tasks by controlling the motion of a camera.

Occluding contour detection and surface curvature estimation are achieved by exploiting the concept of a special viewpoint: For any image there exist special camera positions from which the object's view trivializes these tasks. We show that these positions can be deterministically reached, and that they enable shape recovery even when few or no markings and discontinuities exist on the object's surface, and when differential camera motion measurements cannot be accurately obtained.

A basic issue in building three-dimensional global object models is how to control the camera's motion so that previously-unreconstructed regions of the object become reconstructed. A fundamental difficulty is that the set of reconstructed points can change unpredictably (e.g., due to self-occlusions) when ad hoc motion strategies are used. We show how global model-building can be achieved for generic objects of arbitrary shape by controlling the camera's motion on automatically-selected surface tangent and normal planes so that the boundary of the already-reconstructed regions is guaranteed to "slide" over the object's entire surface.

Our work emphasizes the need for (1) controlling camera motion
through efficient processing of the image stream, and (2) designing
provably-correct strategies, i.e., strategies whose success can be
accurately characterized in terms of the geometry of the viewed
object. For each task, efficiency is achieved by extracting from
each image only the information necessary to move the camera
differentially, assuming a dense sequence of images, and using 2D
rather than 3D information to control camera motion. Provable
correctness is achieved by controlling camera motion based on the
occluding contour's dynamic shape and maintaining specific
task-dependent geometric constraints that relate the camera's motion
to the differential geometry of the object.
}
BIBTEX
{
@phdthesis{ Kutulakos:1994:phd,
author = "Kiriakos N. Kutulakos",
title = "Exploring Three-Dimensional Objects by Controlling the Point of Observation",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 1994}
}
--------------------------------------------------------------
KEY { kutulakos.1994.icra }
TITLE { Provable Strategies for Vision-Guided Exploration in Three Dimensions }
AUTHORS { K. N. Kutulakos, C. R. Dyer, and V. J. Lumelsky }
PUBLISHEDIN { Proc. 1994 IEEE Int. Conf. Robotics and Automation, 1994, 1365-1372. }
ABSTRACT
{
An approach is presented for exploring an unknown, arbitrary surface
in three-dimensional (3D) space by a mobile robot. The main
contributions are (1) an analysis of the capabilities a robot must
possess and the trade-offs involved in the design of an exploration
strategy, and (2) two provably-correct exploration strategies that
exploit these trade-offs and use visual sensors (e.g., cameras and
range sensors) to plan the robot's motion. No such analysis existed
previously for the case of a robot moving freely in 3D space. The
approach exploits the notion of the *occlusion boundary*, i.e.,
the points separating the visible from the occluded parts of an
object. The occlusion boundary is a collection of curves that
``slide'' over the surface when the robot's position is continuously
controlled, inducing the visibility of surface points over which they
slide. The paths generated by our strategies force the occlusion
boundary to slide over the entire surface. The strategies provide a
basis for integrating motion planning and visual sensing under a
common computational framework.
}
BIBTEX
{
@inproceedings{ Kutulakos:1994:icra,
author = "K. N. Kutulakos and Charles R. Dyer and V. J. Lumelsky",
title = "Provable Strategies for Vision-Guided Exploration in Three Dimensions",
booktitle = "Proc. 1994 IEEE Int. Conf. Robotics and Automation",
pages = "1365-1372",
year = 1994}
}
--------------------------------------------------------------
KEY { lai.1994.cvpr }
TITLE { Deformable Contours: Modeling and Extraction }
AUTHORS { K. F. Lai and R. T. Chin }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., 1994, 601-608. }
ABSTRACT
{
This paper considers the problem of modeling and extracting arbitrary
deformable contours from noisy images. We propose a global contour
model based on a stable and regenerative shape matrix, which is
invariant and unique under rigid motions. Combined with a Markov random
field to model local deformations, this yields a prior distribution that
exerts influence over a global model while allowing for
deformations. We then cast the problem of extraction into posterior
estimation and show its equivalence to energy minimization of a
generalized active contour model. We discuss pertinent issues in shape
training, energy minimization, line search strategies, minimax
regularization, and initialization by the generalized Hough
transform. Finally, we present experimental results and compare the
model's performance to rigid template matching.
}
BIBTEX
{
@inproceedings{ Lai:1994:cvpr,
author = "K. F. Lai and R. T. Chin",
title = "Deformable Contours: Modeling and Extraction",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
pages = "601--608",
year = 1994}
}
--------------------------------------------------------------
KEY { seitz.1994.nram }
TITLE { Detecting Irregularities in Cyclic Motion }
AUTHORS { S. M. Seitz and C. R. Dyer }
PUBLISHEDIN { Proc. Workshop on Motion of Non-Rigid and Articulated Objects, 1994, 178-185. }
ABSTRACT
{
Real cyclic motions tend not to be perfectly even, i.e., the period varies
slightly from one cycle to the next, because of physically important changes
in the scene. A generalization of period is defined for cyclic motions
that makes periodic variation explicit. This representation, called the
period trace, is compact and purely temporal, describing the evolution
of an object or scene without reference to spatial quantities such as
position or velocity. By delimiting cycles and identifying correspondences
across cycles, the period trace provides a means of temporally registering
a cyclic motion. In addition, several purely temporal motion features are
derived, relating to the nature and location of irregularities. Results
are presented using real image sequences and applications to athletic and
medical motion analysis are discussed.
}
BIBTEX
{
@inproceedings{seitz:mnao94,
author = "Steven M. Seitz and Charles R. Dyer",
title = "Detecting Irregularities in Cyclic Motion",
booktitle = "Proc. Workshop on Motion of Non-Rigid and Articulated Objects",
pages = {178-185},
year = 1994}
}
--------------------------------------------------------------
KEY { seitz.1994.cvpr }
TITLE { Affine Invariant Detection of Periodic Motion }
AUTHORS { S. M. Seitz and C. R. Dyer }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., 1994, 970-975. }
ABSTRACT
{
Current approaches for detecting periodic motion assume a stationary camera
and place limits on an object's motion. These approaches rely on the
assumption that a periodic motion projects to a set of periodic image
curves, an assumption that is invalid in general.
Using affine-invariance, we
derive necessary and sufficient conditions for an image sequence to be
the projection of a periodic motion. No restrictions are placed on
either the motion of the camera or the object.
Our algorithm is shown to be provably-correct for
noise-free data and is extended to be robust with respect to
occlusions and noise. The extended algorithm is evaluated with real and
synthetic image sequences.
}
BIBTEX
{
@inproceedings{seitz:cvpr94,
author = "Steven M. Seitz and Charles R. Dyer",
title = "Affine Invariant Detection of Periodic Motion",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
pages = {970-975},
year = 1994}
}
--------------------------------------------------------------
KEY { lai.1994.icarcv }
TITLE { On Classifying Deformable Contours Using the Generalized Active Contour Model }
AUTHORS { K. F. Lai and R. T. Chin }
PUBLISHEDIN { Proc. Int. Conf. Automation, Robotics and Computer Vision, Singapore, 1994. }
ABSTRACT
{
Recently, we proposed the generalized active contour model (g-snake) to model
and extract deformable contours from noisy images. This paper demonstrates the
usefulness of the g-snake in classifying among several candidate deformable
contours. The g-snake is suitable for this task because its shape
representation is unique and affine invariant, and possesses
metric properties. We derive the
optimal classification test and show that this requires marginalization of the
distribution. However, as the summation is peaked around the posterior estimate
in most practical applications, only small regions need to be considered.
Finally, we performed extensive experiments and report a significant
improvement over matched templates in handwritten numeral recognition.
}
BIBTEX
{
@inproceedings{ Lai:1994:icarcv,
author = "K. F. Lai and R. T. Chin",
title = "On Classifying Deformable Contours Using the Generalized Active Contour Model",
booktitle = "Proc. Int. Conf. Automation, Robotics and Computer Vision",
year = 1994}
}
--------------------------------------------------------------
KEY { lai.1994.thesis }
TITLE { Deformable Contours: Modeling, Extraction, Detection and Classification }
AUTHORS { K. F. Lai }
PUBLISHEDIN
{
Ph.D. Dissertation, Electrical and Computer Engineering Department, August 1994.
}
ABSTRACT
{
This thesis presents an integrated approach to modeling, extracting,
detecting and classifying deformable contours directly from noisy
images. We begin by conducting a case study on regularization,
formulation and initialization of active contour models
(snakes). Using the minimax principle, we derive a regularization
criterion whereby the values can be automatically and implicitly
determined along the contour. Furthermore, we formulate a set of
energy functionals which yield snakes that contain the Hough transform
as a special case. Subsequently, we consider the problem of modeling and
extracting arbitrary deformable contours from noisy images. We
combine a stable, invariant and unique contour model with a Markov
random field to yield a prior distribution that exerts influence over
an arbitrary global model while allowing for deformation. Under the
Bayesian framework, contour extraction turns into posterior
estimation, which is in turn equivalent to energy minimization in a
generalized active contour model. Finally, we integrate these lower
level visual tasks with the pattern recognition processes of detection
and classification. Based on the Neyman-Pearson lemma, we derive the
optimal detection and classification tests. As the summation is
peaked in most practical applications, only small regions need to be
considered in marginalizing the distribution. The validity of our
formulation has been confirmed by extensive and rigorous
experiments.
}
BIBTEX
{
@phdthesis{ Lai:1994:phd,
author = "K. F. Lai",
title = "Deformable Contours: Modeling, Extraction, Detection and Classification",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 1994}
}
--------------------------------------------------------------
KEY { hibbard.1994.computer }
TITLE { Interactive Visualization of Earth and Space Science Computations }
AUTHORS { W. L. Hibbard, B. E. Paul, A. L. Battaiola, D. A. Santek, M-F. Voidrot-Martinez, and C. R. Dyer }
PUBLISHEDIN { Computer **27**, No. 7, July 1994, 65-72. }
ABSTRACT
{
We describe techniques that enable Earth and space scientists
to interactively visualize and experiment with their computations.
Numerical simulations of the Earth's atmosphere and oceans generate
large and complex data sets, which we visualize in a highly interactive
virtual Earth environment. We use data compression and distributed
computing to maximize the size of simulations that can be explored,
and a user interface tuned to the needs of environmental modelers.
For the broader class of computations used by scientists we have
developed more general techniques, integrating visualization with an
environment for developing and executing algorithms. The key is
providing a flexible data model that lets users define data types
appropriate for their algorithms, and also providing a display model
that lets users visualize those data types without placing a substantial
burden of graphics knowledge on them.
}
BIBTEX
{
@article{ Hibbard:1994:computer,
author = "W. L. Hibbard and B. E. Paul and A. L. Battaiola and D. A. Santek and M-F. Voidrot-Martinez and C. R. Dyer",
title = "Interactive Visualization of Earth and Space Science Computations",
journal = "Computer",
volume = "27",
number = "7",
month = "July",
year = "1994",
pages = "65--72"}
}
--------------------------------------------------------------
KEY { hibbard.1994.vis }
TITLE { A Lattice Model for Data Display }
AUTHORS { W. L. Hibbard, C. R. Dyer, and B. E. Paul }
PUBLISHEDIN { Proc. Visualization '94, 1994, 310-317. }
ABSTRACT
{
In order to develop a foundation for visualization, we develop
lattice models for data objects and displays that focus on the fact that
data objects are approximations to mathematical objects and real
displays are approximations to ideal displays. These lattice models
give us a way to quantize the information content of data and displays
and to define conditions on the visualization mappings from data to
displays. Mappings satisfy these conditions if and only if they are
lattice isomorphisms. We show how to apply this result to scientific
data and display models, and discuss how it might be applied to
recursively defined data types appropriate for complex information
processing.
}
BIBTEX
{
@inproceedings{ Hibbard:1994:vis,
author = "W. L. Hibbard and Charles R. Dyer and B. E. Paul",
title = "A Lattice Model for Data Display",
booktitle = "Proc. Visualization '94",
pages = "310--317",
year = 1994}
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 1993 }
--------------------------------------------------------------
KEY { kutulakos.1993.cvpr }
TITLE { Toward Global Surface Reconstruction by Purposive Viewpoint Adjustment }
AUTHORS { K. N. Kutulakos and C. R. Dyer }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., 1993, 726-727. }
ABSTRACT
{
We consider the following problem: How should an observer change viewpoint
in order to generate a dense image sequence of an arbitrary smooth surface
so that it can be incrementally reconstructed using the occluding contour
and the epipolar parameterization? We present a collection of qualitative
behaviors that, when integrated appropriately, purposefully control
viewpoint based on the appearance of the surface in order to provably solve
this problem.
}
BIBTEX
{
@inproceedings{ Kutulakos:1993:cvpr,
author = "K. N. Kutulakos and C. R. Dyer",
title = "Toward Global Surface Reconstruction by Purposive Viewpoint Adjustment",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
year = 1993,
pages = {726--727} }
}
--------------------------------------------------------------
KEY { kutulakos.1993.icra }
TITLE { Vision-Guided Exploration: A Step toward General Motion Planning in Three Dimensions }
AUTHORS { K. N. Kutulakos, V. J. Lumelsky, and C. R. Dyer }
PUBLISHEDIN { Proc. 1993 IEEE Int. Conf. on Robotics and Automation, 1993, 289-296. }
ABSTRACT
{
We present an approach for solving the path planning problem for a mobile
robot operating in an unknown, three dimensional environment containing
obstacles of arbitrary shape. The main contributions of this paper are (1)
an analysis of the type of sensing information that is necessary and
sufficient for solving the path planning problem in such environments, and
(2) the development of a framework for designing a provably-correct
algorithm to solve this problem. Working from first principles, without any
assumptions about the environment of the robot or its sensing capabilities,
our analysis shows that the ability to explore the obstacle surfaces (i.e.,
to make all their points visible) is intrinsically linked with the ability
to plan the motion of the robot. We argue that current approaches to the
path planning problem with incomplete information simply do not extend to
the general three-dimensional case, and that qualitatively different
algorithms are needed.
}
BIBTEX
{
@inproceedings{ Kutulakos:1993:icra,
author = "K. N. Kutulakos and V. J. Lumelsky and Charles R. Dyer",
title = "Vision-Guided Exploration: A Step toward General Motion Planning in Three Dimensions",
booktitle = "Proc. 1993 IEEE Int. Conf. on Robotics and Automation",
pages = "289--296",
year = 1993}
}
--------------------------------------------------------------
KEY { allmen.1993.cvgip }
TITLE { Computing Spatiotemporal Relations for Dynamic Perceptual Organization }
AUTHORS { M. Allmen and C. R. Dyer }
PUBLISHEDIN { Computer Vision, Graphics and Image Processing: Image Understanding **58**, 1993, 338-351 }
ABSTRACT
{
To date, the overwhelming use of motion in computational vision has
been to recover the three-dimensional structure of the scene. We
propose that there are other, more powerful, uses for motion. Toward
this end, we define dynamic perceptual organization as an extension of
the traditional (static) perceptual organization approach. Just as
static perceptual organization groups coherent features in an image,
dynamic perceptual organization groups coherent motions through an
image sequence. Using dynamic perceptual organization, we propose a
new paradigm for motion understanding and show why it can be done
independently of the recovery of scene structure and scene motion.
The paradigm starts with a spatiotemporal cube of image data and
organizes the paths of points so that interactions between the paths
and perceptual motions such as common, relative and cyclic are made
explicit. The results of this can then be used for high-level motion
recognition tasks.
}
BIBTEX
{
@article{ Allmen:1993:cvgip,
author = "M. Allmen and C. R. Dyer",
title = "Computing Spatiotemporal Relations for Dynamic Perceptual Organization",
journal = "Computer Vision, Graphics and Image Processing: Image Understanding",
volume = "58",
year = "1993",
pages = "338--351"}
}
--------------------------------------------------------------
KEY { waldon.1993.wqv }
TITLE { Dynamic Shading, Motion Parallax and Qualitative Shape }
AUTHORS { S. Waldon and C. R. Dyer }
PUBLISHEDIN { Proc. IEEE Workshop on Qualitative Vision, 1993, 61-70. }
ABSTRACT
{
We address the problem of qualitative shape
recovery from moving surfaces. Our analysis is unique in that we
consider specular inter-reflections and explore the effects of both
motion parallax and changes in shading. To study this situation we
define an image flow field called the reflection flow field,
which describes the motion of reflection points and the motion of the
surface. From a kinematic analysis, we show that the reflection flow
is qualitatively different from the motion parallax because it is
discontinuous at or near parabolic curves. We also show that when the
gradient of the reflected image is strong, gradient-based flow
measurement techniques approximate the reflection flow field and not
the motion parallax. We conclude from these analyses that reliable
qualitative shape information is generally available only at
discontinuities in the image flow field.
}
BIBTEX
{
@inproceedings{ Waldon:1993:wqv,
author = "S. Waldon and Charles R. Dyer",
title = "Dynamic Shading, Motion Parallax and Qualitative Shape",
booktitle = "Proc. IEEE Workshop on Qualitative Vision",
pages = "61--70",
year = 1993}
}
--------------------------------------------------------------
KEY { eggert.1993.pami }
TITLE { The Scale Space Aspect Graph }
AUTHORS { D. W. Eggert, K. W. Bowyer, C. R. Dyer, H. I. Christensen, and D. B. Goldgof }
PUBLISHEDIN { IEEE Trans. Pattern Analysis and Machine Intelligence **15**, 1993, 1114-1130. }
ABSTRACT
{
Currently the aspect graph is computed from the theoretical standpoint
of perfect resolution in object shape, the viewpoint and the projected image.
This means that the aspect graph may include details that an observer could
never see in practice. Introducing the notion of scale into the aspect graph
framework provides a mechanism for selecting a level of detail that is
"large enough" to merit explicit representation. This effectively allows
control over the number of nodes retained in the aspect graph. This paper
introduces the concept of the scale space aspect graph, defines three
different interpretations of the scale dimension, and presents a detailed
example for a simple class of objects, with scale defined in terms of the
spatial extent of features in the image.
}
BIBTEX
{
@article{ Eggert:1993:pami,
author = "D. W. Eggert and K. W. Bowyer and C. R. Dyer and H. I. Christensen and D. B. Goldgof",
title = "The Scale Space Aspect Graph",
journal = "IEEE Trans. Pattern Analysis and Machine Intelligence",
volume = "15",
year = "1993",
pages = "1114--1130"}
}
--------------------------------------------------------------
KEY { lai.1993.accv }
TITLE { On Regularization, Formulation and Initialization of Active Contour Models (Snakes) }
AUTHORS { K. F. Lai and R. T. Chin }
PUBLISHEDIN { Proc. 1st Asian Conf. on Computer Vision, 1993, 542-545. }
ABSTRACT
{
In the snake formulation, large regularization enhances robustness against noise
and incomplete data, while small values increase the accuracy in capturing
boundary variations. We present a local minimax criterion which automatically
determines the optimal regularization at every location along the boundary with
no added computational cost. We also modify existing energy formulations to repair
deficiencies in the internal energy and improve performance of the external energy.
This yields snakes that contain the Hough transform as a special case. We can
therefore initialize the snake efficiently and reliably using the Hough transform.
}
BIBTEX
{
@inproceedings{ Lai:1993:accv,
author = "K. F. Lai and R. T. Chin",
title = "On Regularization, Formulation and Initialization of Active Contour Models (Snakes)",
booktitle = "Proc. 1st Asian Conf. on Computer Vision",
pages = "542--545",
year = 1993}
}
--------------------------------------------------------------
KEY { kutulakos.1993.spie }
TITLE { Building Global Object Models by Purposive Viewpoint Control }
AUTHORS { K. N. Kutulakos, W. B. Seales, and C. R. Dyer }
PUBLISHEDIN { Proc. SPIE: Sensor Fusion VI, 1993, 368-383. }
ABSTRACT
{
We present an approach for recovering a global surface model of an
object from the deformation of the occluding contour using an active
(i.e., mobile) observer able to control its motion. In particular, we
consider two problems: (1) How can the observer's viewpoint be
controlled in order to generate a dense sequence of images that allows
incremental reconstruction of an unknown surface, and (2) how can we
construct a global surface model from the generated image sequence? We
achieve the first goal by purposefully and qualitatively controlling
the observer's instantaneous direction of motion in order to control
the motion of the visible rim over the surface. We achieve the second
goal by using a stationary calibrated trinocular camera rig and a
mechanism for controlling the relative position and orientation of the
viewed surface with respect to the trinocular rig. Unlike previous
shape-from-motion approaches which derive quantitative shape
information from an arbitrarily generated sequence of images, we
develop a collection of simple and efficient viewing strategies that
allow the observer to achieve the global reconstruction goal by
maintaining specific geometric relationships with the viewed
surface. These relationships depend only on tangent computations on
the occluding contour. To demonstrate the feasibility and
effectiveness of our approach we apply the developed algorithms to
synthetic and real scenes.
}
BIBTEX
{
@inproceedings{ Kutulakos:1993:spie,
author = "K. N. Kutulakos and W. B. Seales and Charles R. Dyer",
title = "Building Global Object Models by Purposive Viewpoint Control",
booktitle = "Proc. SPIE: Sensor Fusion VI",
pages = "368--383",
year = 1993}
}
--------------------------------------------------------------
KEY { kutulakos.1993.tr1141 }
TITLE { Global Surface Reconstruction by Purposive Control of Observer Motion }
AUTHORS { K. N. Kutulakos and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1141,
University of Wisconsin - Madison, April 1993.
}
ABSTRACT
{
What real-time, qualitative viewpoint-control behaviors are important
for performing global visual exploration tasks such as searching for
specific surface markings, building a global model of an arbitrary
object, or recognizing an object? In this paper we consider the task
of purposefully controlling the motion of an active, monocular
observer in order to recover a global description of a smooth,
arbitrarily-shaped object.

We formulate global surface reconstruction as the qualitative task of controlling the motion of the observer so that the visible rim slides over the maximal, connected, reconstructible surface regions intersecting the visible rim at the initial viewpoint. We show that these regions are bounded by a subset of the visual event curves defined on the surface.

By studying the epipolar parameterization, we develop four basic
behaviors that allow reconstruction of a surface patch around any
point in a reconstructible surface region. These behaviors control
viewpoint to achieve and maintain a well-defined geometric
relationship with the object's surface, rely only on information
extracted directly from images (e.g., tangents to the occluding
contour), and are simple enough to be executed in real time. We then
show how global surface reconstruction can be provably achieved by (1)
appropriately integrating these behaviors to iteratively "grow" the
reconstructed regions, and (2) obeying four simple rules.
}
NOTES { *A longer version.* }
BIBTEX
{
@techreport{ Kutulakos:1993:tr,
author = "K. N. Kutulakos and Charles R. Dyer",
title = "Global Surface Reconstruction by Purposive Control of Observer Motion",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = "1141",
month = "April",
year = 1993}
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 1992 }
--------------------------------------------------------------
KEY { kutulakos.1992.cvpr }
TITLE { Recovering Shape by Purposive Viewpoint Adjustment }
AUTHORS { K. N. Kutulakos and C. R. Dyer }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., 1992, 16-22. }
ABSTRACT
{
We present an approach for recovering surface shape from the occluding
contour using an active (i.e., moving) observer. It is based on a
relation between the geometries of a surface in a scene and its
occluding contour: If the viewing direction of the observer is along a
principal direction for a surface point whose projection is on the
contour, surface shape (i.e., curvature) at the surface point can be
recovered from the contour. Unlike previous approaches for recovering
shape from the occluding contour, we use an observer that purposefully
changes viewpoint in order to achieve a well-defined geometric
relationship with respect to a 3D shape prior to its recognition. We
show that there is a simple and efficient viewing strategy that allows
the observer to align the viewing direction with one of the two
principal directions for a point on the surface. Experimental results
demonstrate that our method can be easily implemented and can provide
reliable shape information.
}
BIBTEX
{
@inproceedings{ Kutulakos:1992:cvpr,
author = "K. N. Kutulakos and Charles R. Dyer",
title = "Recovering Shape by Purposive Viewpoint Adjustment",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
pages = "16--22",
year = 1992}
}
--------------------------------------------------------------
KEY { kutulakos.1992.tr1124 }
TITLE { Object Exploration By Purposive, Dynamic Viewpoint Adjustment }
AUTHORS { K. N. Kutulakos, C. R. Dyer, and V. J. Lumelsky }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1124,
University of Wisconsin - Madison, November 1992.
}
ABSTRACT
{
We present a viewing strategy for exploring the surface of an unknown
object (i.e., making all of its points visible) by purposefully
controlling the motion of an active observer. It is based on a simple
relation between (1) the instantaneous direction of motion of the
observer, (2) the visibility of points projecting to the occluding
contour, and (3) the surface normal at those points: If the dot product of
the surface normal at such points and the observer's velocity is positive,
the visibility of the points is guaranteed under an infinitesimal
viewpoint change. We show that this leads to an object exploration
strategy in which the observer *purposefully* controls its motion
based on the occluding contour in order to impose structure on the set of
surface points explored, make its representation simple and qualitative,
and provably solve the exploration problem for smooth generic surfaces of
arbitrary shape. Unlike previous approaches where exploration is cast as a
discrete process (i.e., asking where to look next?) and where the
successful exploration of arbitrary objects is not guaranteed, our
approach demonstrates that dynamic viewpoint control through directed
observer motion leads to a qualitative exploration strategy that is
provably-correct, depends only on the dynamic appearance of the
occluding contour, and does not require the recovery of detailed
three-dimensional shape descriptions from every position of the observer.
}
BIBTEX
{
@techreport{ Kutulakos:1992:tr,
author = "K. N. Kutulakos and Charles R. Dyer",
title = "Object Exploration By Purposive, Dynamic Viewpoint Adjustment",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = "1124",
month = "November",
year = 1992}
}
--------------------------------------------------------------
KEY { eggert.1992.cvpr }
TITLE { The Scale Space Aspect Graph }
AUTHORS { D. W. Eggert, K. W. Bowyer, C. R. Dyer, H. I. Christensen, and D. B. Goldgof }
PUBLISHEDIN { Proc. Computer Vision and Pattern Recognition Conf., 1992, 335-340. }
ABSTRACT
{
Currently the aspect graph is computed from the theoretical standpoint
of perfect resolution in the viewpoint, the projected image and the
object shape. This means that the aspect graph may include details
that an observer could never see in practice. Introducing the notion
of scale into the aspect graph framework provides a mechanism for
selecting a level of detail that is "large enough" to merit explicit
representation. This effectively allows control over the number of
nodes retained in the aspect graph. This paper introduces the concept
of the scale space aspect graph, defines an interpretation of the
scale dimension in terms of the spatial extent of features in the
image and presents a detailed example for a simple class of objects.
}
BIBTEX
{
@inproceedings{ Eggert:1992:cvpr,
author = "D. W. Eggert and K. W. Bowyer and Charles R. Dyer and H. I. Christensen and D. B. Goldgof",
title = "The Scale Space Aspect Graph",
booktitle = "Proc. Computer Vision and Pattern Recognition Conf.",
pages = "335--340",
year = 1992}
}
--------------------------------------------------------------
KEY { seales.1992.cvgip }
TITLE { Viewpoint from Occluding Contour }
AUTHORS { W. B. Seales and C. R. Dyer }
PUBLISHEDIN { Computer Vision, Graphics and Image Processing: Image Understanding 55, 1992, 198-211. }
ABSTRACT
{
In this paper we present the geometry and the algorithms for organizing a
viewer-centered representation of the occluding contour of polyhedra.
The contour is computed from a polyhedral boundary model as it would appear
under orthographic projection into the image plane from every viewpoint
on the view sphere.
Using this representation, we show how to derive constraints on regions in
viewpoint space from the relationship between detected image features and
our precomputed contour model.
Such constraints are based on both qualitative (viewpoint extent) and
quantitative (angle measurements and relative geometry) information that has
been precomputed about how the contour appears in the image plane as a set
of projected curves and T-junctions from self-occlusion.
The results we show from an experimental system demonstrate that features
of the occluding contour can be computed in a model-based framework,
and their geometry constrains the viewpoints from which a model will project
to a set of occluding contour features in an image.
}
BIBTEX
{
@article{ Seales:1992:cvgip,
author = "W. B. Seales and C. R. Dyer",
title = "Viewpoint from Occluding Contour",
journal = "Computer Vision, Graphics and Image Processing: Image Understanding",
volume = "55",
year = "1992",
pages = "198--211"}
}
--------------------------------------------------------------
KEY { seales.1992.ecai }
TITLE { An Occlusion-Based Representation of Shape for Viewpoint Recovery }
AUTHORS { W. B. Seales and C. R. Dyer }
PUBLISHEDIN { Proc. 10th European Conf. on Artificial Intelligence, 1992, 816-820. }
ABSTRACT
{
In this paper we present the geometry and the algorithms for
organizing and using a viewer-centered representation of the occluding
contour of polyhedra. The representation is computed from a
polyhedral model under orthographic projection for all viewing
directions. Using this representation, we derive constraints on
viewpoint correspondences between image features and model contours.
Our results show that the occluding contour, computed in a model-based
framework, can be used to strongly constrain the viewpoints where a 3D
model matches the occluding contour features of the image.
}
BIBTEX
{
@inproceedings{ Seales:1992:ecai,
author = "W. B. Seales and Charles R. Dyer",
title = "An Occlusion-Based Representation of Shape for Viewpoint Recovery",
booktitle = "Proc. 10th European Conf. on Artificial Intelligence",
pages = "816--820",
year = 1992}
}
--------------------------------------------------------------
KEY { hibbard.1992.vis }
TITLE { Display of Scientific Data Structures for Algorithm Visualization }
AUTHORS { W. Hibbard, C. R. Dyer, and B. Paul }
PUBLISHEDIN { Proc. Visualization '92, 1992, 139-146. }
ABSTRACT
{
We present a technique for defining graphical depictions for all
the data types defined in an algorithm. The ability to display arbitrary
combinations of an algorithm's data objects in a common frame of
reference, coupled with interactive control of algorithm execution,
provides a powerful way to understand algorithm behavior. Type
definitions are constrained so that all primitive values occurring in data
objects are assigned scalar types. A graphical display, including user
interaction with the display, is modeled by a special data type.
Mappings from the scalar types into the display model type provide a
simple user interface for controlling how all data types are depicted,
without the need for type-specific graphics logic.
}
BIBTEX
{
@inproceedings{ Hibbard:1992:vis,
author = "W. Hibbard and Charles R. Dyer and B. Paul",
title = "Display of Scientific Data Structures for Algorithm Visualization",
booktitle = "Proc. Visualization '92",
pages = "139--146",
year = 1992}
}
--------------------------------------------------------------
KEY { allmen.1992.tr1130 }
TITLE { Computing Spatiotemporal Relations for Dynamic Perceptual Organization }
AUTHORS { M. Allmen and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1130,
University of Wisconsin - Madison, December 1992.
}
ABSTRACT
{
To date, the overwhelming use of motion in computational vision has
been to recover the three-dimensional structure of the scene. We
propose that there are other, more powerful, uses for motion. Toward
this end, we define dynamic perceptual organization as an extension of
the traditional (static) perceptual organization approach. Just as
static perceptual organization groups coherent features in an image,
dynamic perceptual organization groups coherent motions through an
image sequence. Using dynamic perceptual organization, we propose a
new paradigm for motion understanding and show why it can be done
independently of the recovery of scene structure and scene motion. The
paradigm starts with a spatiotemporal cube of image data and organizes
the paths of points so that interactions between the paths and
perceptual motions such as common, relative and cyclic are made
explicit. The results of this can then be used for high-level motion
recognition tasks.
}
BIBTEX
{
@techreport{ Allmen:1992:tr,
author = "M. Allmen and Charles R. Dyer",
title = "Computing Spatiotemporal Relations for Dynamic Perceptual Organization",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = "1130",
month = "December",
year = 1992}
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
YEAR { 1991 }
--------------------------------------------------------------
KEY { kutulakos.1991.tr1035 }
TITLE { Recovering Shape by Purposive Viewpoint Adjustment }
AUTHORS { K. N. Kutulakos and C. R. Dyer }
PUBLISHEDIN
{
Computer Sciences Department Technical Report 1035,
University of Wisconsin - Madison, August 1991.
}
ABSTRACT
{
We present an approach for recovering surface shape from the occluding
contour using an active (i.e., moving) observer. It is based on a
relation between the geometries of a surface in a scene and its
occluding contour: If the viewing direction of the observer is along a
principal direction for a surface point whose projection is on the
contour, surface shape (i.e., curvature) at the surface point can be
recovered from the contour. Unlike previous approaches for recovering
shape from the occluding contour, we use an observer that purposefully
changes viewpoint in order to achieve a well-defined geometric
relationship with respect to a 3D shape prior to its recognition. We
show that there is a simple and efficient viewing strategy that allows
the observer to align the viewing direction with one of the two
principal directions for a point on the surface. This strategy depends
on only curvature measurements on the occluding contour and therefore
demonstrates that recovering quantitative shape information from the
contour does not require knowledge of the velocities or accelerations
of the observer. Experimental results demonstrate that our method can
be easily implemented and can provide reliable shape information from
the occluding contour.
}
BIBTEX
{
@techreport{ Kutulakos:1991:tr,
author = "K. N. Kutulakos and Charles R. Dyer",
title = "Recovering Shape by Purposive Viewpoint Adjustment",
institution = "Computer Sciences Department, University of Wisconsin-Madison",
number = "1035",
month = "August",
year = 1991}
}
--------------------------------------------------------------
KEY { allmen.1991.thesis }
TITLE { Image Sequence Description using Spatiotemporal Flow Curves: Toward Motion-Based Recognition }
AUTHORS { M. C. Allmen }
PUBLISHEDIN
{
Ph.D. Dissertation, Computer Sciences Department Technical Report 1040,
University of Wisconsin - Madison, August 1991.
}
ABSTRACT
{
Recovering a hierarchical motion description of a long image sequence
is one way to recognize objects and their motions. Intermediate-level
and high-level motion analysis, i.e., recognizing a coordinated
sequence of events such as walking and throwing, has been formulated
previously as a process that follows high-level object
recognition. This thesis develops an alternative approach to
intermediate-level and high-level motion analysis. It does not depend
on complex object descriptions and can therefore be computed prior to
object recognition. Toward this end, a new computational framework for
low and intermediate-level processing of long sequences of images is
presented.

Our new computational framework uses spatiotemporal (ST) surface flow and ST flow curves. As contours move, their projections into the image also move. Over time, these projections sweep out ST surfaces. Thus, these surfaces are direct representations of object motion. ST surface flow is defined as the natural extension of optical flow to ST surfaces. For every point on an ST surface, the instantaneous velocity of that point on the surface is recovered. It is observed that arc length of a rigid contour does not change if that contour is moved in the direction of motion on the ST surface. Motivated by this observation, a function measuring arc length change is defined. The direction of motion of a contour undergoing motion parallel to the image plane is shown to be perpendicular to the gradient of this function.

ST surface flow is then used to recover ST flow curves. ST flow curves are defined such that the tangent at a point on the curve equals the ST surface flow at that point. ST flow curves are then grouped so that each cluster represents a temporally-coherent structure, i.e., structures that result from an object or surface in the scene undergoing motion. Using these clusters of ST flow curves, separate moving objects in the scene can be hypothesized and occlusion and disocclusion between them can be identified.

The problem of detecting cyclic motion, while recognized by the psychology
community, has received very little attention in the computer vision
community. In order to show the representational power of ST flow curves,
cyclic motion is detected using ST flow curves without prior recovery of
complex object descriptions.
}
BIBTEX
{
@phdthesis{ Allmen:1991:phd,
author = "Mark C. Allmen",
title = "Image Sequence Description using Spatiotemporal Flow Curves: Toward Motion-Based Recognition",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 1991}
}
--------------------------------------------------------------
KEY { seales.1991.thesis }
TITLE { Appearance Models of Three-Dimensional Shape for Machine Vision and Graphics }
AUTHORS { W. B. Seales }
PUBLISHEDIN
{
Ph.D. Dissertation, Computer Sciences Department Technical Report 1042,
University of Wisconsin - Madison, August 1991.
}
ABSTRACT
{
A fundamental problem common to both computer graphics and model-based
computer vision is how to efficiently model the appearance of a shape.
Appearance is obtained procedurally by applying a projective transformation
to a three-dimensional object-centered shape representation. This thesis
presents a viewer-centered representation that is based on the visual
event, a viewpoint where a specific change in the structure of the
projected model occurs. We present and analyze the basis of this
viewer-centered representation and the algorithms for its construction.
Variations of this visual-event-based representation are applied to two
specific problems: hidden line/surface display, and the solution for model
pose given an image contour.

The problem of how to efficiently display a polyhedral scene over a path of viewpoints is cast as a problem of computing visual events along that path. A visual event is a viewpoint that causes a change in the structure of the image structure graph, a model's projected line drawing. The information stored with a visual event is sufficient to update a representation of the image structure graph. Thus the visible lines of a scene can be displayed as viewpoint changes by first precomputing and storing visual events, and then using those events at display time to interactively update the image structure graph. Display rates comparable to wire-frame display are achieved for large polyhedral models.

The rim appearance representation is a new, viewer-centered, exact
representation of the occluding contour of polyhedra. We present an
algorithm based on the geometry of polyhedral self-occlusion and on visual
events for computing a representation of the exact appearance of occluding
contour edges. The rim appearance representation, organized as a
multi-level model of the occluding contour, is used to constrain the
viewpoints of a three-dimensional model that can produce a set of detected
occluding-contour features. Implementation results demonstrate that
precomputed occluding-contour information efficiently and tightly
constrains the pose of a model while consistently accounting for detected
occluding-contour features.
}
BIBTEX
{
@phdthesis{ Seales:1991:phd,
author = "William Brent Seales",
title = "Appearance Models of Three-Dimensional Shape for Machine Vision and Graphics",
school = "University of Wisconsin - Madison",
address = "Madison, WI",
year = 1991}
}
--------------------------------------------------------------
##############################################################
##############################################################
##############################################################
THREAD { manning.2000.tr1417, manning.2001.iccv }
--------------------------------------------------------------
THREAD { seitz.1999.ijcv, seitz.1997.cvpr, seitz.1997.iuw.b }
--------------------------------------------------------------
THREAD { manning.1999.cvpr, manning.1998.iuw, manning.1998.tr1387 }
--------------------------------------------------------------
THREAD { kutulakos.1993.spie, kutulakos.1994.cbvw }
--------------------------------------------------------------
THREAD { seitz.1998.iccv, seitz.1997.rochester }
--------------------------------------------------------------
THREAD { lai.1994.cvpr, lai.1995.pami }
--------------------------------------------------------------
THREAD { kutulakos.1994.ijcv, kutulakos.1992.cvpr, kutulakos.1991.tr1035 }
--------------------------------------------------------------
THREAD { kutulakos.1994.ai, kutulakos.1994.cvpr.b, kutulakos.1993.tr1141 }
--------------------------------------------------------------
THREAD { allmen.1993.cvgip, allmen.1992.tr1130 }
--------------------------------------------------------------
THREAD { eggert.1993.pami, eggert.1992.cvpr }