3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

3D Scanning Technology Milwaukee 3D Printing Meetup & Voxel Metric

Upload: voxelmetric

Post on 15-May-2015


Category:

Technology



DESCRIPTION

Primesense depth cameras are the new standard in 3D scanning technology. Since the debut of Microsoft Kinect, which uses Primesense infrared LightCoding structured light technology, the sensors have been mass-produced and sold at a much lower price. In this slide deck, we describe the basics of Primesense-based 3D scanning technology from a physical and computational viewpoint.

TRANSCRIPT

Page 1: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

3D Scanning Technology

Milwaukee 3D Printing Meetup & Voxel Metric

Page 2: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Agenda
• Demo
• Overview of scanning technologies
• Primesense sensors in-“depth”
• Freehand camera reconstruction core algorithms
• 3D freehand reconstruction algorithm, step-by-step

Page 3: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Demo

Page 4: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Overview of scanning technologies

Contact
• Precise
• Slow
• Touch not always practical

Line Laser
• Line appears distorted from camera’s point-of-view
• Geometry inferred from distortion

Stereoscopic
• Two offset cameras, like human vision
• Computationally expensive

Page 5: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Time-of-flight

Single-point
• Not a 3D scanner, but underlies other technology

LIDAR
• Like single point, but uses mirrors to rapidly take many measurements across scene

ToF Camera
• Modulated light & phase detection
• Gated/Shuttered

Page 6: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

More scanning technologies

Photogrammetry
• Like stereoscopic, but with many “eyes” in undefined positions
• Images stitched together and depth is inferred
• Doesn’t work well with concave surfaces

Volumetric
• Many see-through images
• Complex reconstruction algorithms

Structured Light
• Like a laser scanner on steroids
• Patterns projected on object
• Camera positioned at offset captures images of projection
• Distortions in captured pattern used to infer geometry

Page 7: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Primesense’s LightCoding™ Structured Light Scanners

IR pattern projector
• IR laser light reflects off of hologram to paint pattern on scene

IR pattern
• Does not change over time
• Looks random, but contains a few markers

Pattern grid
• The pattern is repeated in a 3 x 3 grid

Page 8: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Primesense LightCoding™

The dot pattern, or information about it, is hardcoded on the Primesense chip.

The sensor’s IR camera takes video of the pattern as it is projected on the scene.

Objects in the scene distort the way the pattern looks to the camera. Depth is inferred from these distortions.

Page 9: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

How exactly is depth determined?
• It’s proprietary!
• But that won’t stop people from guessing.
• There are at least two likely methods…

Page 10: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Pattern shifting with distance
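One way to picture this guess is ordinary triangulation: because the projector and the IR camera sit a fixed distance apart, a dot appears shifted sideways in the image by an amount that shrinks as the surface gets farther away. The sketch below shows only that idealized relationship. The focal length and baseline values are illustrative assumptions, not Primesense specifications, and the real (proprietary) method may differ.

```python
def depth_from_disparity(disparity_px, focal_px=580.0, baseline_m=0.075):
    """Idealized triangulation: depth = focal length * baseline / disparity.

    focal_px and baseline_m are illustrative assumptions, not published
    Primesense values.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# A dot that shifts half as far in the image is twice as distant.
near = depth_from_disparity(40.0)
far = depth_from_disparity(20.0)
print(near, far)
```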

Page 11: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Astigmatic Optics
• Camera has lens with different focal lengths in X and Y direction
• Dots become blurry as the target surface moves away from the camera’s focal point, but in a specific way.
• They may change shape or apparent orientation with distance.

Page 12: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

What is computed on the sensor chip?
• The depth map is computed within the sensor; the host computer only needs to read the depth data.
• This reduces required computation for the host device.
• It also probably helps to keep the Primesense algorithms secret.

• Skeletal tracking is run on the host computer
• Algorithms created through machine learning
• Designed to be fast for minimal stress on Xbox 360
• Many, many processor hours were spent “learning” these algorithms.
• The hard work is done ahead of time; the game only needs to run the depth data through the process and get skeletal data out.

Page 13: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

What is sent from sensor to host?

IR Stream
• What the chip “sees” to produce the depth image
• Not often used by host device

Color Stream
• Not used by host or device for depth
• Can be “registered” with depth image for correspondence between depth and RGB pixels
• Used to produce XYZRGB point data

Depth Stream
• Result of computation on IR camera video
• Like a normal image/video stream, except pixel intensity represents distance from sensor, not color and brightness

Page 14: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Reconstruction algorithm used with Primesense cameras

How we get from a series of depth images to representations of real-world objects.

Examples: Kinect Fusion, KinFu, ReconstructMe, Skanect, Digifii

Page 15: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

SLAM
• S.L.A.M. – Simultaneous Localization And Mapping
• Track where the camera is located, while building a model of the scene at the same time
• Done simultaneously since there is much overlap in the types of information that are calculated.

Page 16: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

ICP – Iterative Closest Point
• The core of camera location and point cloud alignment
• Algorithm summary:
• As the camera moves, it sees a different perspective of the scene at every frame, but there is some overlap.
• ICP repeatedly rotates and translates what the camera sees this frame until it finds the best overlap with what the camera saw in the last frame.
• When the best matching rotation and translation is found, we not only know how to stitch the frames together, but also know how the camera moved between frames.
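The summary above can be sketched in a few lines. This is a minimal 2D point-to-point ICP in plain Python, assuming small motion between frames and brute-force nearest-neighbor matching. The function names are made up for this illustration, and real scanners work in 3D on the GPU, usually with a different error measure.

```python
import math


def best_rigid_2d(src, dst):
    """Closed-form least-squares rotation + translation mapping paired
    2D points src[i] onto dst[i]."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation moves the rotated source centroid onto the target centroid.
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)


def transform(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(current, previous, iterations=10):
    """Return (theta, tx, ty) that aligns `current` onto `previous`."""
    moved = list(current)
    for _ in range(iterations):
        # Pair every point with its closest point in the previous frame.
        matches = [min(previous,
                       key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
                   for p in moved]
        theta, tx, ty = best_rigid_2d(moved, matches)
        moved = transform(moved, theta, tx, ty)
    # `moved` is a rigid copy of `current`, so this recovers the total motion.
    return best_rigid_2d(current, moved)


# An L-shaped "scene", seen again after the camera moved slightly.
previous_frame = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (0, 1), (0, 2)]
current_frame = transform(previous_frame, 0.05, 0.1, -0.05)
theta, tx, ty = icp_2d(current_frame, previous_frame)
aligned = transform(current_frame, theta, tx, ty)
```

The returned rotation and translation serve both purposes named on the slide: applying them stitches the new frame onto the old one, and the same values describe how the camera moved between frames.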

Page 17: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

SLAM on GPU
• The full SLAM process should occur 30 times per second, at the rate the depth frames are sent from the camera.
• If the process is too slow, frames must be skipped.
• Skipped frames make it harder for ICP to be successful.
• To have the algorithm run as fast as possible, the problem is broken up into chunks and run in parallel on a standard graphics card.
• Graphics cards contain thousands of processor units, and excel at tasks which can be parallelized.

Page 18: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Reconstruction, step-by-step

Page 19: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Step 1: Receive a depth frame

The software receives a depth frame from the camera.

Raw depth data from the camera is often noisy.

Page 20: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Step 2: Bilateral filtering

Bilateral filtering removes noise from the image, while maintaining sharp transitions between pixels.
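As a rough illustration of the idea, each output pixel below is a weighted average of its neighbors, where a neighbor counts only if it is both close in the image and similar in depth. The function name, window radius, and sigma values are assumptions chosen for this example. Production code would run on the GPU or use a library routine.

```python
import math


def bilateral_filter(depth, radius=2, sigma_space=1.5, sigma_depth=30.0):
    """Smooth a depth image (list of rows, e.g. in millimetres) while
    preserving edges. Parameter values are illustrative assumptions."""
    rows, cols = len(depth), len(depth[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            center = depth[r][c]
            total = weight_sum = 0.0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < rows and 0 <= cc < cols):
                        continue
                    value = depth[rr][cc]
                    # Nearby pixels count more (spatial term) ...
                    w_space = math.exp(-(dr * dr + dc * dc)
                                       / (2 * sigma_space ** 2))
                    # ... but only if their depth is similar (range term).
                    w_depth = math.exp(-((value - center) ** 2)
                                       / (2 * sigma_depth ** 2))
                    total += w_space * w_depth * value
                    weight_sum += w_space * w_depth
            out[r][c] = total / weight_sum
    return out


# Noisy wall at about 1000 mm next to a flat wall at 2000 mm.
noisy = [[1000 + (5 if (r + c) % 2 else -5) if c < 4 else 2000
          for c in range(8)] for r in range(6)]
smooth = bilateral_filter(noisy)
```

In the example the 5 mm ripple on the near wall is averaged away, while the 1000 mm jump between the walls stays sharp because pixels across the edge get almost no weight.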

Page 21: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Step 3: Downsampling

The full size depth image is scaled down to half- and quarter-sized copies.

The copies are used to do 3 levels of ICP alignment.

The small image enables quick rough alignment.

As we repeat ICP with the more detailed images, the alignment is refined.
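A minimal sketch of building the three copies, assuming plain 2x2 block averaging. The helper names are invented here, and real implementations may treat pixels on depth edges or with missing data more carefully.

```python
def halve(depth):
    """Return a half-size copy by averaging each 2x2 block of pixels."""
    rows, cols = len(depth) // 2, len(depth[0]) // 2
    return [[(depth[2 * r][2 * c] + depth[2 * r][2 * c + 1] +
              depth[2 * r + 1][2 * c] + depth[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)]
            for r in range(rows)]


def build_pyramid(depth, levels=3):
    """Full-, half- and quarter-size copies, finest first."""
    pyramid = [depth]
    for _ in range(levels - 1):
        pyramid.append(halve(pyramid[-1]))
    return pyramid


# A tiny 8 x 8 frame keeps the example readable.
frame = [[1500.0] * 8 for _ in range(8)]
pyramid = build_pyramid(frame)
sizes = [(len(level), len(level[0])) for level in pyramid]
```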

Page 22: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Step 4: Map the 3 depth frames
• Images of depth pixels are not easy to work with mathematically
• They visualize depth in parts of an image, but not explicit geometric coordinates
• To run ICP, we need to rotate and translate the scene
• Before ICP, convert full and resized images from pixel-based to 3D coordinate-based representation
• Result: a vertex and a normal map for each of the 3 depth images, that is, a pair of maps at 3 different levels of accuracy

Page 23: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Step 4: Map the 3 depth frames

Vertex Map (point cloud) Normal Map

Page 24: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Step 4: Map the 3 depth frames

Vertex mapping
• Pixel brightness = distance
• Pixel position in image -> X/Y
• Also known: camera field-of-view
• Have angles, distance: do trigonometry to get 3D coordinate for pixel
• Repeat for every pixel
• Do in parallel on GPU

Normal mapping
• Find the orientation of the surface for each vertex just mapped
• Useful for ICP
• Look at closest neighbors
• Implementations vary, but a common way is to compute the cross product of the vectors from a vertex to two of its neighbors.
• Repeat for each vertex.
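Both mappings can be sketched as follows, assuming a simple pinhole camera and depth measured along the optical axis. The field-of-view defaults are illustrative assumptions, and the normals here use the right and lower neighbors, which is one possible choice among several.

```python
import math


def vertex_map(depth, fov_x_deg=57.0, fov_y_deg=43.0):
    """Convert a depth image (list of rows) into (x, y, z) points.
    The field-of-view defaults are illustrative assumptions."""
    rows, cols = len(depth), len(depth[0])
    fx = (cols / 2.0) / math.tan(math.radians(fov_x_deg) / 2.0)
    fy = (rows / 2.0) / math.tan(math.radians(fov_y_deg) / 2.0)
    cx, cy = (cols - 1) / 2.0, (rows - 1) / 2.0
    return [[((u - cx) * z / fx, (v - cy) * z / fy, z)
             for u, z in enumerate(row)]
            for v, row in enumerate(depth)]


def normal_map(vertices):
    """Unit normal per vertex from the cross product of the vectors to its
    right and lower neighbours. The last row and column stay None."""
    rows, cols = len(vertices), len(vertices[0])
    normals = [[None] * cols for _ in range(rows)]
    for v in range(rows - 1):
        for u in range(cols - 1):
            p = vertices[v][u]
            a = [vertices[v][u + 1][i] - p[i] for i in range(3)]
            b = [vertices[v + 1][u][i] - p[i] for i in range(3)]
            n = (a[1] * b[2] - a[2] * b[1],
                 a[2] * b[0] - a[0] * b[2],
                 a[0] * b[1] - a[1] * b[0])
            length = math.sqrt(sum(k * k for k in n))
            if length > 0:
                normals[v][u] = tuple(k / length for k in n)
    return normals


# A flat wall 2 m in front of the camera.
flat_wall = [[2.0] * 4 for _ in range(4)]
verts = vertex_map(flat_wall)
norms = normal_map(verts)
```

For the flat wall every computed normal lies along the camera's viewing axis. Which way it points (toward or away from the camera) depends only on the order of the two vectors in the cross product.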

Page 25: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Step 5: Do ICP
• Vertex and normal maps are used together so each vertex has a position and orientation
• First, align the low-res maps to get a rough estimate of alignment and camera position. Iterate 4 times.
• Next, align the medium-res maps to get a better estimate. Iterate 5 times.
• Finally, align the full-scale maps to get the final estimate for alignment and camera orientation. Iterate 10 times.
• Total of 19 iterations per frame – the most time-consuming step.
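The coarse-to-fine schedule is just a nested loop. In this sketch `icp_step` is a hypothetical callback standing in for one real ICP iteration, and the level names are labels chosen for the example.

```python
# (pyramid level, ICP iterations) from coarsest to finest, as on the slide.
SCHEDULE = [("quarter", 4), ("half", 5), ("full", 10)]


def coarse_to_fine(pose, icp_step, schedule=SCHEDULE):
    """Refine `pose` level by level. `icp_step(pose, level)` is a
    hypothetical callback that performs one ICP iteration at the given
    level and returns the improved pose."""
    for level, iterations in schedule:
        for _ in range(iterations):
            pose = icp_step(pose, level)
    return pose


# Dummy step that only records which level it was called on.
calls = []
final_pose = coarse_to_fine(0, lambda pose, level: calls.append(level) or pose + 1)
print(len(calls))  # 19 iterations in total
```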

Page 26: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Questions so far?

Page 27: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Representing the surface in memory
• Truncated Signed Distance Function (TSDF)
• Makes handheld scanning on personal computers feasible
• Faster than other high-accuracy methods
• Allows for continuous refinement of the model

• When scanning, a (usually cubic) virtual volume is defined. The real-world target object is reconstructed within this volume.
• The volume is subdivided into a grid of many smaller cubes, called voxels.
• Voxels are volumetric picture elements.

Page 28: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

TSDF representation

Each voxel is assigned a distance to the surface

Negative is behind, positive is in front

Each distance is also assigned a weight (not shown)

Weights represent an estimate of accuracy for a voxel’s distance

For example: a surface facing the camera is likely more accurate than a surface at an angle, so those measurements are given a higher weight

Page 29: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Step 6: Calculating TSDF Values

Line from each vertex to camera through voxel grid.

Intersected voxels near the surface are updated

Distance from vertex to voxel center is distance value for a given voxel

Assign a weight for the measurement based on the voxel orientation

Use the measurement weight and distance to update the voxel’s current value.

Repeat for other intersected voxels

Repeat for each vertex.
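The per-voxel update can be sketched as a weighted running average. The function name, truncation distance, and weight cap below are assumptions for illustration. How the measurement weight is derived from surface orientation is left to the caller.

```python
def update_voxel(old_distance, old_weight, new_distance, new_weight,
                 truncation=0.03, max_weight=100.0):
    """Fuse one new measurement into a voxel (distances in metres).
    truncation and max_weight are illustrative assumptions."""
    if new_weight <= 0:
        return old_distance, old_weight
    # Truncate: distances far from the surface are clamped.
    d = max(-truncation, min(truncation, new_distance))
    total = old_weight + new_weight
    fused = (old_distance * old_weight + d * new_weight) / total
    # Capping the weight lets the model keep adapting to new data.
    return fused, min(total, max_weight)


# An empty voxel sees the surface 2 cm away, then 1 cm away.
distance, weight = update_voxel(0.0, 0.0, 0.02, 1.0)
distance, weight = update_voxel(distance, weight, 0.01, 1.0)
```

After the two equally weighted measurements the voxel stores their average, which is how repeated noisy observations settle toward a stable value.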

Page 30: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

TSDF Refinement

As the camera moves around and captures more vertices, the TSDF is continuously updated and refined.

Since TSDF voxels contain distances and are not just representative of the edge of a surface, the surface is represented far more accurately than the actual voxel resolution.
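A small sketch of why stored distances beat a simple occupied/empty grid: when two neighboring voxels have distances of opposite sign, the surface position between them can be interpolated. The function name and the 1 cm voxel spacing are assumptions for the example.

```python
def surface_position(pos_a, dist_a, pos_b, dist_b):
    """Linearly interpolate where the signed distance crosses zero between
    two neighbouring voxel centers along one axis."""
    if dist_a * dist_b > 0:
        raise ValueError("no sign change: the surface is not between these voxels")
    if dist_a == dist_b:  # both are exactly zero
        return pos_a
    t = dist_a / (dist_a - dist_b)
    return pos_a + t * (pos_b - pos_a)


# Voxel centers 1 cm apart: 3 mm in front of the surface and 7 mm behind it.
crossing = surface_position(0.10, 0.003, 0.11, -0.007)
```

Here the surface is placed 3 mm past the first voxel center, a precision well below the 1 cm voxel size.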

Page 31: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Step 7: Raycasting

The TSDF volume is raycast and converted to an image.

The image gives the user feedback on how the scan is going and areas which should be refined.
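Raycasting a single pixel can be sketched as marching along a ray until the stored distance flips from positive to negative. Here an analytic sphere stands in for a real voxel grid, and the step size and distances are assumptions for the example.

```python
import math


def raycast(tsdf, origin, direction, step=0.01, max_distance=3.0):
    """March along a ray until the TSDF changes sign (in front -> behind),
    then interpolate the hit distance. Returns None if nothing is hit."""
    t_prev = 0.0
    d_prev = tsdf(*origin)
    t = step
    while t <= max_distance:
        point = tuple(o + t * k for o, k in zip(origin, direction))
        d = tsdf(*point)
        if d_prev > 0 and d <= 0:
            return t_prev + (t - t_prev) * d_prev / (d_prev - d)
        t_prev, d_prev = t, d
        t += step
    return None


def sphere_tsdf(x, y, z, center=(0.0, 0.0, 1.0), radius=0.25, truncation=0.03):
    """Stand-in for a voxel grid: truncated signed distance to a sphere."""
    d = math.sqrt((x - center[0]) ** 2 + (y - center[1]) ** 2
                  + (z - center[2]) ** 2) - radius
    return max(-truncation, min(truncation, d))


# One ray straight ahead hits the sphere; one ray to the side misses it.
hit = raycast(sphere_tsdf, origin=(0.0, 0.0, 0.0), direction=(0.0, 0.0, 1.0))
miss = raycast(sphere_tsdf, origin=(0.0, 0.0, 0.0), direction=(1.0, 0.0, 0.0))
```

Repeating this for one ray per screen pixel, and shading each hit, gives the preview image shown to the user.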

Page 32: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Step 8: Extract points for next ICP
• The previous ICP alignment had a tiny amount of error
• If it were used for the next ICP round, errors would compound.
• Instead, in the final step of SLAM, we extract vertices and normals from the TSDF volume itself, so all ICP iterations have a common reference.

Page 33: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Do it again, 30x per sec

“Repeat as necessary”

Page 34: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Step 9: Export and Use the Data

Extract points from TSDF

Convert to mesh if desired

View

Measure

Print!
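As one possible export path, extracted points can be written to an ASCII PLY file, which many mesh viewers and tools can open. The slides do not say which format the tools use, so PLY is an assumption here, and `to_ply` is a helper named for this sketch.

```python
def to_ply(points):
    """Serialize (x, y, z) points as an ASCII PLY point cloud."""
    header = [
        "ply",
        "format ascii 1.0",
        "element vertex {}".format(len(points)),
        "property float x",
        "property float y",
        "property float z",
        "end_header",
    ]
    body = ["{:.6f} {:.6f} {:.6f}".format(x, y, z) for x, y, z in points]
    return "\n".join(header + body) + "\n"


ply_text = to_ply([(0, 0, 1), (0.1, 0, 1)])
```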

Page 35: 3D Scanning Technology Overview: Kinect Reconstruction Algorithms Explained

Questions?