Author
Carlo Tomasi
Title
Shape and Motion from Image Streams: a Factorization Method
Date
Sep 1991
Abstract
We propose a method for estimating the three-dimensional shape of objects and the motion of the camera from a stream of images. The goal is to give a robot the ability to localize itself with respect to the environment, draw a map of its own surroundings, and perceive the shape of objects in order to recognize or grasp them. Solutions proposed in the past were so sensitive to noise as to be of little use in practical applications. This sensitivity is closely related to the viewer-centered representation of scene geometry known as a depth map, and to the use of stereo triangulation to infer depth from the images. In fact, when objects are more than a few focal lengths away from the camera, parallax effects become subtle, and even a small amount of noise in the images produces large errors in the final shape and motion results. In our formulation, we represent shape in object-centered coordinates, and model image formation by orthographic, rather than perspective, projection. In this way, depth, the distance between viewer and scene, plays no role, and the problem's sensitivity to noise is critically reduced. We collect the image coordinates of P feature points tracked through F frames into a 2F × P measurement matrix. If these coordinates are measured with respect to their centroid, we show that the measurement matrix can be written as the product of two matrices that represent the camera rotation and the positions of the feature points in space. The bilinear nature of this model, and its matrix formulation, lead to a factorization method for the computation of shape and motion, based on the Singular Value Decomposition. Previous solutions assumed motion to be smooth, in one form or another, in an attempt to constrain the solution and achieve reliable convergence.
The factorization method, on the other hand, makes no assumption about the camera motion, and can deal with the large jumps from frame to frame found, for instance, in sequences taken with a hand-held camera. To make the factorization method into a working system, we solve several corollary problems: how to select image features, how to track them from frame to frame, how to deal with occlusions, and how to cope with the noise and artifacts that corrupt images recorded with ordinary equipment. We test the entire system with a series of experiments on real images taken both in the lab, for an accurate performance evaluation, and outdoors, to demonstrate the applicability of the method in real-life situations.
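The factorization idea in the abstract can be illustrated with a short NumPy sketch. It is not the report's implementation. It assumes noise-free orthographic projection, fully tracked points (no occlusions), and a measurement matrix that stacks the F rows of horizontal coordinates above the F rows of vertical coordinates; the function names `factorize` and `synthetic_measurements` are hypothetical. The sketch stops at the rank-3 factorization, so the two factors are recovered only up to an invertible 3 × 3 transform; the further step that would make the motion factor a true set of rotations is omitted.

```python
import numpy as np


def factorize(W):
    """Rank-3 factorization of a 2F x P measurement matrix (illustrative sketch)."""
    # Measure coordinates relative to the centroid of the P points in each
    # frame: subtract the mean of every row.
    W_reg = W - W.mean(axis=1, keepdims=True)

    # Thin SVD. Under noise-free orthography W_reg has rank at most 3.
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)

    # Keep the three largest singular values and split them between factors.
    root = np.sqrt(s[:3])
    motion = U[:, :3] * root          # 2F x 3, camera axes up to a 3x3 transform
    shape = root[:, None] * Vt[:3]    # 3 x P, point positions up to its inverse
    return motion, shape, s


def synthetic_measurements(F=20, P=30, seed=0):
    """Build a synthetic 2F x P matrix (assumed layout: x rows, then y rows)."""
    rng = np.random.default_rng(seed)
    points = rng.normal(size=(3, P))
    rows_x, rows_y = [], []
    for _ in range(F):
        # Independent random orthonormal image axes per frame, so consecutive
        # frames are unrelated (no smooth-motion assumption).
        Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
        t = rng.normal(size=2)  # arbitrary image translation for this frame
        rows_x.append(Q[0] @ points + t[0])
        rows_y.append(Q[1] @ points + t[1])
    return np.vstack(rows_x + rows_y)


W = synthetic_measurements()
motion, shape, s = factorize(W)
print("singular values 1-4:", s[:4])
```

On synthetic data of this kind, the fourth singular value should be negligible next to the first three, which is the rank property the factorization relies on. With real, noisy tracks the trailing singular values would not vanish, and truncating to rank 3 acts as a least-squares fit.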
Category
CMUTR
Category: CMUTR
Institution: Department of Computer Science, Carnegie
        Mellon University
Number: CMU-CS-91-172
Bibtype: TechReport
Month: Sep
Author: Carlo Tomasi		
Title: Shape and Motion from Image Streams: a Factorization Method
Year: 1991
Address: Pittsburgh, PA
Super: @CMUTR