Difference between revisions of "Real-Time Geometry Scanning System"

From Immersive Visualization Lab Wiki

Revision as of 18:35, 26 April 2011

Project Overview

Structure from motion is an active and evolving field within computer vision. Existing approaches for using cameras to obtain the 3D structure of a scene rely on visual correspondence and tracking across multiple views to triangulate the positions of points in the scene. This is a well-studied problem, and entire textbooks cover the various stages of its solution, such as An Invitation to 3-D Vision: From Images to Geometric Models, by Yi Ma, Stefano Soatto, Jana Kosecka, and Shankar Sastry.
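As a rough illustration of the triangulation step mentioned above, the sketch below recovers a 3D point from two viewing rays using the midpoint method. It assumes the camera centers and the ray directions through a pair of corresponding image points are already known; the function name `triangulate_midpoint` and its parameters are hypothetical and are not taken from this project.

```python
def triangulate_midpoint(c1, d1, c2, d2, eps=1e-12):
    """Estimate a 3D point from two viewing rays (midpoint method).

    c1, c2: camera centers; d1, d2: ray directions through the matched
    image points. All are 3-element sequences (assumed inputs).
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)

    denom = a * c - b * b
    if abs(denom) < eps:
        # Parallel rays never converge, so no point can be triangulated.
        raise ValueError("rays are (nearly) parallel")

    # Parameters of the closest points on each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]

    # With noisy correspondences the rays do not intersect exactly,
    # so return the midpoint of the shortest connecting segment.
    return [(x + y) / 2.0 for x, y in zip(p1, p2)]


if __name__ == "__main__":
    point = triangulate_midpoint((0, 0, 0), (0.5, 0, 2), (1, 0, 0), (-0.5, 0, 2))
    print(point)
```

In practice the ray directions come from matched features, so any matching error shifts the recovered point.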

However, purely vision-based approaches for using camera images to calculate the 3D geometry of a scene suffer from a number of well-known drawbacks. High-quality visual features must exist, and correspondences between them must be established across multiple views. The process of matching correspondences is subject to noise, which depends on each view. Views without visual features, like images of the floors or walls of a building, are not suitable for use at all. In addition, alignment of the views and triangulation of the geometry involve considerable computational expense. For many applications, this expense is acceptable, but once the geometry is constructed, it may be incomplete due to the presence of "holes" in places where the user forgot to scan.

Recently introduced geometry cameras, which use structured patterns of infrared light to construct a camera-space depth map in hardware, solve a number of these problems. The infrared light projection and reconstruction occur outside of the visible light spectrum, so the system does not depend on visible features at all.
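To make the idea of a camera-space depth map concrete, the following minimal sketch back-projects a depth map into 3D points under a pinhole camera model. The intrinsic parameters `fx`, `fy`, `cx`, `cy`, the convention that a depth of zero marks an invalid pixel, and the function name `depth_map_to_points` are all assumptions made for illustration; they do not describe any particular device or this project's code.

```python
def depth_map_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map into camera-space 3D points.

    depth: list of rows, each a list of depth values (same unit as output).
    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    Pixels with depth <= 0 are treated as invalid and skipped (assumption).
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no depth reading for this pixel
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    tiny_depth = [[0.0, 2.0],
                  [1.0, 0.0]]
    for p in depth_map_to_points(tiny_depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5):
        print(p)
```

Because every valid pixel yields a 3D point directly, no feature matching between views is needed to recover the geometry of a single frame.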