The NASA Vision Workbench: Reflections on Image Processing in C++

Upload: matt-hancher

Post on 10-May-2015

DESCRIPTION

Slides that accompanied a discussion of the NASA Vision Workbench, an open-source C++ image processing library developed by the Intelligent Robotics Group at the NASA Ames Research Center.

TRANSCRIPT

Page 1: The NASA Vision Workbench: Reflections on Image Processing in C++

Intelligent Systems Division NASA Ames Research Center

The NASA Vision Workbench

Matt Hancher & Michael Broxton
Intelligent Robotics Group
January 7, 2009, Willow Garage

Reflections on Image Processing in C++

Page 2: The NASA Vision Workbench: Reflections on Image Processing in C++

Talk Overview

• Overview and Background

• Introduction to the Vision Workbench

• Vision Workbench Modules and Applications

• Under the Hood: Templates, Views, and Lazy Evaluation

• Lessons Learned and Future Directions

Page 3: The NASA Vision Workbench: Reflections on Image Processing in C++

NASA Ames Research Center

• NASA’s Silicon Valley research center

• Small spacecraft

• Supercomputers

• Lunar & Planetary Science

• Intelligent Systems

• Human Factors

• Thermal protection systems

• Aeronautics

• Astrobiology

Page 4: The NASA Vision Workbench: Reflections on Image Processing in C++

Intelligent Robotics Group (IRG)

• Areas of expertise
  • Applied computer vision
  • Human-robot interaction
  • Instrument deployment & placement
  • Interactive 3D visualization
  • Robot software architectures

• Science-driven exploration
  • Instrument placement, resource mapping, analysis support
  • Low speed, deliberative operation

• Fieldwork-driven operations
  • Precursor missions (site survey, deployment, etc.)
  • Manned missions (human-paced interaction, inspection, etc.)

Page 5: The NASA Vision Workbench: Reflections on Image Processing in C++

The NASA Vision Workbench

• Open-source image processing and machine vision library in C++

• Developed as a foundation for unifying image processing work at NASA Ames

• A “second-generation” C++ image processing library, drawing on lessons learned by VXL, GIL, VIGRA, etc.

• Designed for easy, expressive coding of efficient image processing algorithms

Page 6: The NASA Vision Workbench: Reflections on Image Processing in C++

Obtaining the Vision Workbench

• Available under the NASA Open Source Agreement (NOSA), an OSI-approved non-viral open source license.

• VW version 2.0 alpha snapshots currently being released for the brave. (We use it.)

http://ti.arc.nasa.gov/visionworkbench/

Page 7: The NASA Vision Workbench: Reflections on Image Processing in C++

Image Module Basics

Page 8: The NASA Vision Workbench: Reflections on Image Processing in C++

API Philosophy

• Simple, natural, mathematical, expressive

• Treat images as first-class mathematical data types whenever possible

• Example: IIR filtering for background subtraction

background += alpha * ( image - background );

• Direct, intuitive function calls

• Example: A Gaussian smoothing filter

result = gaussian_filter( image, 3.0 );

Page 9: The NASA Vision Workbench: Reflections on Image Processing in C++

The Core Image Type

ImageView<PixelT>

• Stores a reference-counted array of pixels.

• Templatized on the pixel type; e.g.

ImageView<PixelRGB<uint8> >

• Supports an arbitrary number of image planes.

Page 10: The NASA Vision Workbench: Reflections on Image Processing in C++

The ImageView Public Interface

Constructing:
    ImageView<...> img;
    ImageView<...> img(cols,rows);
    ImageView<...> img(cols,rows,planes);

Changing dimensions:
    img.set_size(cols,rows);
    img.set_size(cols,rows,planes);

Getting dimensions:
    img.cols()
    img.rows()
    img.planes()

Accessing pixels:
    img(col,row)
    img(col,row,plane)

STL iterator:
    ImageView<...>::iterator
    img.begin()
    img.end()

Pixel accessor:
    ImageView<...>::pixel_accessor
    img.origin()

Page 11: The NASA Vision Workbench: Reflections on Image Processing in C++

Built-In Pixel Types

Grayscale:
    PixelGray<float32>
    PixelGrayA<uint8>

RGB:
    PixelRGB<double>
    PixelRGBA<int16>

Color spaces:
    PixelHSV<float32>
    PixelXYZ<float32>
    PixelLuv<float32>
    PixelLab<float32>

Vectors:
    Vector<float64,4>

Unitless (e.g. kernels):
    float32, float64, and 8-, 16-, 32-, and 64-bit signed and unsigned integers

Masked pixels:
    PixelMask<float>
    PixelMask<PixelRGBA<uint8> >

• Try something like this at the top of your code:

typedef ImageView<PixelRGB<double> > Image;

Page 12: The NASA Vision Workbench: Reflections on Image Processing in C++

Simple ImageView Operations

transpose(img) rotate_180(img)

flip_vertical(img) flip_horizontal(img)

rotate_90cw(img) rotate_90ccw(img)

crop(img,x,y,cols,rows)

subsample(img,factor)

subsample(img,xfactor,yfactor)

• Operations like these are inexpensive and “shallow” or “lazy.”

• Use copy() to make a deep copy if you need one.

copy(img)

Page 13: The NASA Vision Workbench: Reflections on Image Processing in C++

Slicing and Dicing

• Select an individual plane or channel “slice”:

select_plane(img,plane)

select_channel(img,channel)

• Interpret pixel channels as image planes:

channels_to_planes(img)

• Example: making a PixelRGBA<float32> image opaque:

fill( select_channel(img,3), 1.0 );

Page 14: The NASA Vision Workbench: Reflections on Image Processing in C++

ImageView Filtering Operations

convolution_filter(img,kernel)

separable_convolution_filter(img,xkernel,ykernel)

gaussian_filter(img,sigma)

derivative_filter(img,xderiv,yderiv)

laplacian_filter(img)

threshold_filter(img,thresh,hi,lo)

...

• There are several options, including edge extensions:

img = gaussian_filter(img, 3.0, ZeroEdgeExtension());

Page 15: The NASA Vision Workbench: Reflections on Image Processing in C++

Some Simple Filtering Examples

[Figure panels: Original, Gaussian, X Derivative, Laplacian]

Page 16: The NASA Vision Workbench: Reflections on Image Processing in C++

ImageView Operators

• Mathematical operators on images work as you’d like.

• Add, subtract, multiply, and divide images (per-pixel).

• Add or subtract a constant pixel value offset.

• Multiply or divide by scalars.

• Operators are the best way to do image arithmetic with the Vision Workbench.

• Example: IIR filtering for background subtraction.

bkg_img += 0.02 * (src_img - bkg_img);
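The one-line update above is a first-order IIR (exponential) filter applied per pixel. As a minimal sketch of what that expression computes, here is the same update written out on a flat grayscale buffer in plain C++. The function name and the use of std::vector are assumptions for illustration only; this is not Vision Workbench code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-pixel exponential background update, equivalent in spirit to
//   bkg_img += alpha * (src_img - bkg_img);
// Each background pixel moves a fraction `alpha` of the way toward the
// corresponding source pixel. (Hypothetical helper, not a VW function.)
void update_background(std::vector<float>& bkg,
                       const std::vector<float>& src,
                       float alpha) {
  assert(bkg.size() == src.size());
  for (std::size_t i = 0; i < bkg.size(); ++i) {
    bkg[i] += alpha * (src[i] - bkg[i]);
  }
}
```

A small alpha such as the 0.02 on the slide makes the background adapt slowly, so briefly visible foreground objects barely disturb it.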

Page 17: The NASA Vision Workbench: Reflections on Image Processing in C++

More ImageView Math

• Most standard math functions work on images too.

abs exp log
sqrt pow hypot
sin cos tan
asin acos atan
sinh cosh tanh
asinh acosh atanh
...and more!

• Example: Computing gradient orientation.

orientation = atan2(grad_y, grad_x);

Page 18: The NASA Vision Workbench: Reflections on Image Processing in C++

ImageView Math Examples

[Figure panels: Gradient Orientation, Gradient Magnitude, Absolute Difference of Gaussians, Logarithmic Map]

Page 19: The NASA Vision Workbench: Reflections on Image Processing in C++

Per-Pixel ImageView Operations

• Cast to a new pixel type or channel type:

pixel_cast<NewPixelT>(img)

channel_cast<NewChannelT>(img)

• Explicit casts are generally not needed to convert between color spaces.

• Apply an arbitrary function to each pixel, or to each channel of each pixel:

per_pixel_filter(img,func)

per_pixel_channel_filter(img,func)

Page 20: The NASA Vision Workbench: Reflections on Image Processing in C++

Example: Color Detection

ImageView<PixelRGB<double> > input = ...;

double hue_ref = 0.54;

ImageView<PixelHSV<double> > hsv_im = gaussian_filter( input, 1.0 );

ImageView<double> hue = select_channel( hsv_im, 0 );

ImageView<double> sat = select_channel( hsv_im, 1 );

ImageView<double> match_im = ( 1.0 - 20.0*abs(hue-hue_ref) ) * sat*sat;

• E.g. in color fiducial tracking and object tracking

Page 21: The NASA Vision Workbench: Reflections on Image Processing in C++

Image Transformation

• Arbitrary image transformations via transform “functors” that define a mapping.

warped = transform( image, my_txform );

• Simple wrappers for common cases.

resample(img,xscale,yscale) resize(img,xsize,ysize)

translate(img,xoff,yoff) rotate(img,angle)

• Customizable interpolation and image edge extension via optional arguments.

Page 22: The NASA Vision Workbench: Reflections on Image Processing in C++

Transformation Examples

[Figure panels: Rotation, Homography, Radial Distortion, Arbitrary Transformation]

Page 23: The NASA Vision Workbench: Reflections on Image Processing in C++

Modules & Applications

Page 24: The NASA Vision Workbench: Reflections on Image Processing in C++

Interest Point & Alignment Module

Page 25: The NASA Vision Workbench: Reflections on Image Processing in C++

Interest Point & Alignment Module

Page 26: The NASA Vision Workbench: Reflections on Image Processing in C++

Interest Point & Alignment Module

Page 27: The NASA Vision Workbench: Reflections on Image Processing in C++

Interest Point & Alignment Module

Page 28: The NASA Vision Workbench: Reflections on Image Processing in C++

Interest Point & Alignment Module

Original Images

Aligned Images

Page 29: The NASA Vision Workbench: Reflections on Image Processing in C++

Mosaic Module Basics

Page 30: The NASA Vision Workbench: Reflections on Image Processing in C++

Mosaic Module Basics

Page 31: The NASA Vision Workbench: Reflections on Image Processing in C++

Mosaic Module Basics

Page 32: The NASA Vision Workbench: Reflections on Image Processing in C++

CTX Polar Mosaic

• Based on pre-release polar data captured by CTX on Mars Reconnaissance Orbiter

• Two weeks of development time

• Stats:

• 1610 source images

• 305 GB of source imagery

• 40.3 Gigapixels

Page 33: The NASA Vision Workbench: Reflections on Image Processing in C++

Cartography Module

Page 34: The NASA Vision Workbench: Reflections on Image Processing in C++

High Dynamic Range Module

LDR HDR

• Merge multiple exposures of the same scene to increase dynamic range.

• Closely related to photometric calibration of orbital imagery.

Page 35: The NASA Vision Workbench: Reflections on Image Processing in C++

HDR Module

Page 36: The NASA Vision Workbench: Reflections on Image Processing in C++

HDR Module

Page 37: The NASA Vision Workbench: Reflections on Image Processing in C++

HDR Module

Page 38: The NASA Vision Workbench: Reflections on Image Processing in C++

Application: Image Matching

• Problem: Given an image, find others like it.

Example database: Apollo Metric Camera images

Page 39: The NASA Vision Workbench: Reflections on Image Processing in C++

Texture-Based Image Matching

[Flowchart from a model image to a matched image.]

Stage labels: Filtering, Post-processing, Segmentation, Summarization, Output Representation, Vector Comparison

Step annotations:

• Texture bank filtering (Gaussian 1st derivative and LOG)

• Grouping to remove orientation

• Energy in a window

• E-M Gaussian mixture model

• Iterative tryouts, MDL

• Max vote

• Grouping

• Mean energy in segment

• Euclidean distance

Page 40: The NASA Vision Workbench: Reflections on Image Processing in C++

Texture Matching Filter Bank

Page 41: The NASA Vision Workbench: Reflections on Image Processing in C++

Image Matching: Results

Page 42: The NASA Vision Workbench: Reflections on Image Processing in C++

Stereo Module

1. Discrete Correlation

• Find the integer offset (disparity) that minimizes the sum of absolute differences between the template region and the right image.

For speed:

• Coarse-to-fine processing.

• Disparity search sub-regioning [1]

• Box filter-optimized correlator.

2. Sub-pixel Refinement

• Fit a 2D convex quadratic surface to the nine nearest points in correlation fitness space.

3. Consistency Checks

• Left/Right Cross Check

• Median Filtering

Other methods to be added soon:

• Epipolar, photometric, continuity/smoothness constraints.

• Robust Cost Function

[Diagram labels: Template Region (from Left Image), Right Image, Search Area, Candidate Disparity (dx, dy), Discrete Correlation, Sub-pixel Correlation]

[1] Changming Sun. Rectangular Subregioning and 3-D Maximum-Surface Techniques for Fast Stereo Matching. In Proceedings of the IEEE Workshop on Stereo and Multi-Baseline Vision (2001).
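The two correlation steps on this slide can be illustrated with a deliberately simplified sketch. It works on a single 1-D scanline instead of a 2-D search area, and it refines the integer match with a parabola through three cost samples instead of the 2-D quadratic surface through nine points that the slide describes. All names are hypothetical, and none of the speed optimizations listed above are included.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sum of absolute differences between `tmpl` and the window of `row`
// that starts at `offset`.
double sad(const std::vector<double>& tmpl,
           const std::vector<double>& row,
           std::size_t offset) {
  assert(offset + tmpl.size() <= row.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < tmpl.size(); ++i) {
    sum += std::fabs(tmpl[i] - row[offset + i]);
  }
  return sum;
}

// Step 1: exhaustive integer search for the offset with the lowest SAD.
// Step 2: sub-pixel refinement from a parabola fitted to the costs at
// best-1, best, best+1 (a 1-D stand-in for the 2-D quadratic fit).
double match_offset(const std::vector<double>& tmpl,
                    const std::vector<double>& row) {
  assert(!tmpl.empty() && tmpl.size() <= row.size());
  const std::size_t n = row.size() - tmpl.size() + 1;  // candidate offsets
  std::vector<double> cost(n);
  std::size_t best = 0;
  for (std::size_t d = 0; d < n; ++d) {
    cost[d] = sad(tmpl, row, d);
    if (cost[d] < cost[best]) best = d;
  }
  // No neighbours on both sides: keep the integer result.
  if (best == 0 || best + 1 == n) return static_cast<double>(best);
  const double c0 = cost[best - 1], c1 = cost[best], c2 = cost[best + 1];
  const double denom = c0 - 2.0 * c1 + c2;
  // A non-convex fit has no minimum: keep the integer result.
  if (denom <= 0.0) return static_cast<double>(best);
  return static_cast<double>(best) + 0.5 * (c0 - c2) / denom;
}
```

The convexity check mirrors the slide's insistence on a convex quadratic: when the fitted curve does not open upward, the refinement is skipped.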

Page 43: The NASA Vision Workbench: Reflections on Image Processing in C++

Improved Stereo Matching: Affine-adaptive Sub-pixel Correlation

[Figure: AS15-M-1134 and AS15-M-1135]

• Foreshortening is the geometric effect that gives rise to stereo processing. However, the change in perspective on a sloped surface can confuse an area-based stereo correlator.

• The solution is to use an iterative algorithm to adapt the correlation window (e.g. affine).

Page 44: The NASA Vision Workbench: Reflections on Image Processing in C++

Improved Stereo Matching: Handling “Noise”

• The occasional speck of dust or lint on the Apollo scans can throw off our stereo correlator.

• We have shown that we can mitigate this effect somewhat by using robust statistics.

Dust and lint on AS15-M-1134

Page 45: The NASA Vision Workbench: Reflections on Image Processing in C++

Improved Stereo Matching: Handling “Noise”

DEM (Note error due to dust...)

Page 46: The NASA Vision Workbench: Reflections on Image Processing in C++

Improved Stereo Matching: Handling “Noise”

DEM (with error corrected using Cauchy robust weighting)
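The slides name Cauchy robust weighting but do not show a formula. A common form, assumed here, is the weight w(r) = 1 / (1 + (r/c)^2), used inside iteratively reweighted least squares so that large residuals (such as those caused by dust) count for very little. The sketch below applies it to the simplest possible estimation problem, a robust mean; how the Vision Workbench applies it inside the correlator is not shown in the source, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Cauchy weight: 1 for a zero residual, falling toward 0 for outliers.
// `scale` sets the residual size at which the weight drops to 1/2.
double cauchy_weight(double residual, double scale) {
  assert(scale > 0.0);
  const double t = residual / scale;
  return 1.0 / (1.0 + t * t);
}

// Robust mean by iteratively reweighted least squares: start from the
// ordinary mean, then repeatedly re-average with Cauchy weights computed
// from the current residuals.
double robust_mean(const std::vector<double>& x, double scale, int iterations) {
  assert(!x.empty());
  double mu = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) mu += x[i];
  mu /= static_cast<double>(x.size());
  for (int it = 0; it < iterations; ++it) {
    double wsum = 0.0, wxsum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
      const double w = cauchy_weight(x[i] - mu, scale);
      wsum += w;
      wxsum += w * x[i];
    }
    mu = wxsum / wsum;  // weights are strictly positive, so wsum > 0
  }
  return mu;
}
```

With four samples at 1 and one outlier at 100, the ordinary mean is 20.8, while the reweighted estimate settles very close to 1.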

Page 47: The NASA Vision Workbench: Reflections on Image Processing in C++

The Ames Stereo Pipeline

• Problem: Given multiple images, compute the 3D terrain.

NASA Ames has been developing surface reconstruction techniques for planetary exploration since the mid 1990s.

Mars Exploration Rovers (MER) & Viz

Mars Polar Lander & Viz

Mars Pathfinder & MarsMap

Page 48: The NASA Vision Workbench: Reflections on Image Processing in C++

Architectural Overview

The Stereo Pipeline is a relatively thin application built upon the open source ARC Vision Workbench and USGS ISIS toolkits.

Vision Workbench Overview

• Modular, extensible, C++ machine vision and image processing library (Linux, OS-X, Win32)

• Developed as a framework for unifying image processing work at NASA Ames

• Designed for easy, expressive coding of efficient image processing algorithms.

Vision Workbench Modules

• Core (abstract datatypes & utilities)

• Camera (models & calibration)

• Cartography (geospatial images)

• GPU (HW accelerated processing)

• HDR (high-dynamic range images)

• Interest Point (tracking & matching)

• Mosaic (composite & blend huge images)

• Stereo Processing (high-quality DEMs & 3D models)

[Architecture diagram: Stereo Pipeline (mission-specific code) layered on the Vision Workbench and ISIS. Box labels: Image, Image Processing, FileIO, Image File I/O, ISIS File I/O, InterestPoint, Image Alignment, Stereo, Dense Stereo Correlation, Stereo Camera Geometry, Cartography, DEM Generation, Georeferenced File I/O, Camera, VW Camera Models, ISIS Camera Models]

http://ti.arc.nasa.gov/visionworkbench/

Page 49: The NASA Vision Workbench: Reflections on Image Processing in C++

Mars Stereo: MOC NA

MGS MOC-Narrow Angle

• Malin Space Science Systems

• Altitude: 388.4 km (typical)

• Line Scan Camera: 2048 pixels

• Focal length: 3.437m

• Resolution: 1.5-12m / pixel

• FOV: 0.5 deg

Page 50: The NASA Vision Workbench: Reflections on Image Processing in C++

Galaxius Fluctus Channel

This VRML model was generated from MOC image pair M01-00115 and E02-01461 (34.66°N, 141.29°E). The complete stereo reconstruction process takes approximately five minutes on a 3.0GHz workstation (1024x8064 pixels). This model is shown without vertical elevation exaggeration.

Page 51: The NASA Vision Workbench: Reflections on Image Processing in C++

Warrego Vallis System

Lower Left: This 3D model was generated from MOC-NA images E01-02032 and M07-02071 (42.66°S, 93.55°E). Upper Right: Ortho-image overlay. Areas of interpolated data are colored red.

Page 52: The NASA Vision Workbench: Reflections on Image Processing in C++

NE Terra Meridiani

Upper Left: This DTM was generated from MOC images E04-01109 and M20-01357 (2.38°N, 6.40°E). The contour lines (20m spacing) overlay an ortho-image generated from the 3D terrain model. Lower Right: An oblique view of the corresponding VRML model.

Page 53: The NASA Vision Workbench: Reflections on Image Processing in C++

Lunar Stereo: Apollo Orbiter Cameras

ITEK Panoramic Camera

• Focal length: 610 mm (24”)

• Optical bar camera

• Apollo 15,16,17 Scientific Instrument Module (SIM)

• Film image: 1.149 x 0.1149 m

• Resolution: 108-135 lines/mm

Page 54: The NASA Vision Workbench: Reflections on Image Processing in C++

Apollo 17 Landing Site

Top: Stereo reconstruction

Right: Handheld photo taken by an orbiting Apollo 17 astronaut

Page 55: The NASA Vision Workbench: Reflections on Image Processing in C++

Public Outreach: Hayden Planetarium

Page 56: The NASA Vision Workbench: Reflections on Image Processing in C++

Public Outreach: Hayden Planetarium

Page 57: The NASA Vision Workbench: Reflections on Image Processing in C++

Recent Developments: Processing Large Satellite Imagery

• The Vision Workbench handles arbitrarily large images via intelligent caching and a flexible abstraction of an image called an “image view.”

• Image operations are evaluated lazily, allowing for optimization down the line.

• Processing occurs one tile at a time, and is usually driven by the output operation (i.e. writing a tile to disk).

• DiskImageView, BlockCacheView, ImageViewRef, block_rasterize(), and block-savvy write_image()/FileIO

• Scalable performance on multi-threaded machines (soon to include Columbia, NASA’s supercomputer)

• Thread and ThreadPool/WorkQueue objects

• Specifically targeting the stereo correlator, outlier rejection, and stereo intersection algorithms.

Nominal resolutions for various imagers (all sizes in pixels):

Imager                    Size
HiRISE                    20,000 x 40,000
Apollo Metric Camera      16,000 x 16,000
CTX                       5064 x 16000
LROC                      10000 x 50000
HRSC                      5184 x 16000
MOC-NA                    2048 x 4800
MER                       1024 x 1024

The Apollo Panoramic Camera is not shown (25400 x 244000 pixels)!

Page 58: The NASA Vision Workbench: Reflections on Image Processing in C++

Recent Developments: Least Squares Bundle Adjustment

Refining Apollo SPICE Kernels

• Camera position and pose in “historical” SPICE kernels provided by ASU provide a good initial solution, but they will require refinement.

• Incorporate new Apollo Metric Camera tie-points into ULCN 2005 - or - tie these points to the preliminary LOLA control network in late 2009.

• This work will be carried out as part of a USGS/ARC LASER proposal during FY09/FY10.

Automating Bundle Adjustment

• Automate tie-point matching using the SIFT and SURF algorithms.

• Experimenting with reducing sensitivity to outliers using robust statistics (i.e. error models with “heavy-tailed” probability distributions)

[Figure] Top: Partial view of Orbit 33 stereo reconstruction. Note the discontinuities in the colored, hillshaded terrain. Bottom: KSU “Bundlevis” visualization of bundle adjustment for AS15-M-113[5-7]

Page 59: The NASA Vision Workbench: Reflections on Image Processing in C++

A Peek Under the Hood

Page 60: The NASA Vision Workbench: Reflections on Image Processing in C++

Problem: Intermediate Results

• What happens when you chain operations?

result = image1 + image2 + image3;

result = transpose( crop(image,x,y,31,31) );

• Normally those would be the same as these:

Image tmp = image1 + image2;

result = tmp + image3;

Image tmp = crop(image,x,y,31,31);

result = transpose(tmp);

• That would be terribly inefficient! Computing the intermediate requires an extra pass over the data.

Page 61: The NASA Vision Workbench: Reflections on Image Processing in C++

Solution: Lazy Evaluation

• The + operator returns a special image sum object.

• The actual computation is only performed when you set an ImageView equal to one of these objects.

• The entire operation is performed in the inner loop, once per pixel.

• No intermediate image is needed!

• No second pass over the data is needed, either!
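The mechanism described above is commonly implemented with C++ expression templates. The following is a minimal, self-contained sketch of the idea, not the Vision Workbench implementation: ViewBase, SumView, and this toy 1-D Image class are hypothetical names, and the image is a flat float buffer for brevity. The + operator only builds a small SumView object; the pixels are computed in a single loop when the result is assigned to or constructed as an Image.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// CRTP tag: anything deriving from ViewBase<T> may appear in an expression.
template <class Derived>
struct ViewBase {
  const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// Lazy per-pixel sum. Operands are held by value; Image copies are shallow
// (reference-counted), so this is cheap and avoids dangling references.
template <class L, class R>
class SumView : public ViewBase<SumView<L, R> > {
  L lhs_;
  R rhs_;
public:
  SumView(const L& l, const R& r) : lhs_(l), rhs_(r) {
    assert(l.size() == r.size());
  }
  std::size_t size() const { return lhs_.size(); }
  float operator[](std::size_t i) const { return lhs_[i] + rhs_[i]; }
};

// Concrete image with a reference-counted pixel buffer.
class Image : public ViewBase<Image> {
  std::shared_ptr<std::vector<float> > data_;
  // "Rasterize" any view into a fresh buffer in one pass over the pixels.
  template <class V>
  void rasterize(const V& v) {
    std::shared_ptr<std::vector<float> > out(new std::vector<float>(v.size()));
    for (std::size_t i = 0; i < v.size(); ++i) (*out)[i] = v[i];
    data_ = out;
  }
public:
  explicit Image(std::size_t n, float value = 0.0f)
      : data_(new std::vector<float>(n, value)) {}
  template <class V>
  Image(const ViewBase<V>& v) { rasterize(v.self()); }
  template <class V>
  Image& operator=(const ViewBase<V>& v) { rasterize(v.self()); return *this; }
  std::size_t size() const { return data_->size(); }
  float operator[](std::size_t i) const { return (*data_)[i]; }
  float& operator[](std::size_t i) { return (*data_)[i]; }
};

// Builds the lazy sum object; no pixel arithmetic happens here.
template <class L, class R>
SumView<L, R> operator+(const ViewBase<L>& l, const ViewBase<R>& r) {
  return SumView<L, R>(l.self(), r.self());
}
```

With this in place, `Image result = image1 + image2 + image3;` creates a `SumView<SumView<Image, Image>, Image>` and evaluates `image1[i] + image2[i] + image3[i]` once per pixel, with no temporary image.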

Page 62: The NASA Vision Workbench: Reflections on Image Processing in C++

Generalizing the View Concept

• An image view is any object that you can access just like a regular old ImageView object.

• The data can be anywhere, or it can be computed.

Type definitions:
    Image::pixel_type
    Image::result_type

Getting dimensions:
    img.cols()
    img.rows()
    img.planes()

Accessing pixels:
    img(col,row)
    img(col,row,plane)

Pixel accessor:
    Image::pixel_accessor
    img.origin()

Rasterization:
    Image::prerasterize_type
    prerasterize(bbox)
    template <DestT> rasterize(dest,bbox)

Page 63: The NASA Vision Workbench: Reflections on Image Processing in C++

The Pixel Accessor Public Interface

• Pixel accessors are the most efficient way to move around the pixels in an image, and are typically used to implement rasterization functions.

• They behave somewhat like standard C++ iterators.

Iteration:
    acc.prev_col()
    acc.next_col()
    acc.prev_row()
    acc.next_row()
    acc.prev_plane()
    acc.next_plane()

Advancement:
    acc.advance(cols,rows)
    acc.advance(cols,rows,planes)

Pixel access:
    *acc
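To make the accessor interface concrete, here is a minimal sketch of an accessor over a row-major, single-plane float buffer, together with the usual nested-loop pattern for walking an image with it. The class below is a hypothetical stand-in written for this illustration; it borrows the method names from the table above but is not the Vision Workbench type, and it omits planes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical accessor: a pointer plus the distance between rows.
class PixelAccessor {
  float* ptr_;
  std::ptrdiff_t row_stride_;  // elements between vertically adjacent pixels
public:
  PixelAccessor(float* origin, std::ptrdiff_t row_stride)
      : ptr_(origin), row_stride_(row_stride) {}
  void next_col() { ++ptr_; }
  void prev_col() { --ptr_; }
  void next_row() { ptr_ += row_stride_; }
  void prev_row() { ptr_ -= row_stride_; }
  void advance(std::ptrdiff_t cols, std::ptrdiff_t rows) {
    ptr_ += cols + rows * row_stride_;
  }
  float& operator*() const { return *ptr_; }
};

// Typical traversal: one accessor steps down the rows, and a copy of it
// steps across the columns of each row.
void fill_image(std::vector<float>& buf, std::size_t cols, std::size_t rows,
                float value) {
  assert(buf.size() == cols * rows);
  if (buf.empty()) return;
  PixelAccessor row_acc(&buf[0], static_cast<std::ptrdiff_t>(cols));
  for (std::size_t r = 0; r < rows; ++r) {
    PixelAccessor col_acc = row_acc;
    for (std::size_t c = 0; c < cols; ++c) {
      *col_acc = value;
      col_acc.next_col();
    }
    row_acc.next_row();
  }
}
```

Because each step is a pointer increment, this avoids recomputing `row * cols + col` for every pixel, which is the efficiency argument the slide makes.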

Page 64: The NASA Vision Workbench: Reflections on Image Processing in C++

Views, Views, Everywhere!

• None of the functions we’ve seen so far do anything.

• Instead, they immediately return view objects that represent processed views of the underlying data.

• Nested function calls produce nested view types.

• The computation happens in either the assignment operator or the constructor of the destination.

• We call this final step the “rasterization” of one view into another view.

Page 65: The NASA Vision Workbench: Reflections on Image Processing in C++

Block Rasterization

• Ultra-large (larger than memory) images are easily supported.

• All image views natively support block-by-block computation (“rasterization”).

• write_image() computes per-block or -line

• QuadTreeGenerator computes per-block

• BlockCacheView allows you to manually control block computation in a nested view.

template <DestT> Image::rasterize(DestT const& dest, BBox2i const& bbox);
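A rough sketch of block-by-block rasterization follows, under several assumptions: BBox is a hypothetical stand-in for BBox2i, the "view" is any callable taking (col, row), and the destination is an in-memory buffer where a real writer would hand each finished tile to the file driver. The helper names are invented for this example and are not Vision Workbench functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct BBox { int x, y, width, height; };  // hypothetical stand-in for BBox2i

// Split a cols x rows image into tiles no larger than block x block.
std::vector<BBox> make_blocks(int cols, int rows, int block) {
  assert(cols >= 0 && rows >= 0 && block > 0);
  std::vector<BBox> out;
  for (int y = 0; y < rows; y += block) {
    for (int x = 0; x < cols; x += block) {
      BBox b = { x, y, std::min(block, cols - x), std::min(block, rows - y) };
      out.push_back(b);
    }
  }
  return out;
}

// Example procedural view: the pixel value is computed, not stored.
struct RampView {
  float operator()(int col, int row) const {
    return static_cast<float>(col + 10 * row);
  }
};

// Evaluate the view one block at a time. Only the pixels inside the
// current bounding box are computed during each step.
template <class View>
void rasterize_by_blocks(const View& view, int cols, int rows, int block,
                         std::vector<float>& dest) {
  assert(dest.size() ==
         static_cast<std::size_t>(cols) * static_cast<std::size_t>(rows));
  const std::vector<BBox> blocks = make_blocks(cols, rows, block);
  for (std::size_t i = 0; i < blocks.size(); ++i) {
    const BBox& b = blocks[i];
    for (int y = b.y; y < b.y + b.height; ++y) {
      for (int x = b.x; x < b.x + b.width; ++x) {
        dest[static_cast<std::size_t>(y) * cols + x] = view(x, y);
      }
    }
  }
}
```

Since the loop over blocks is driven by the destination, a lazily defined source of any size only ever needs one tile's worth of pixels in flight.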

Page 66: The NASA Vision Workbench: Reflections on Image Processing in C++

A Trivial First Example

• SLOG: Sign of Laplacian of Gaussian

Image slog =

threshold_filter( laplacian_filter( gaussian_filter( img, 1.5 ) ) );

Page 67: The NASA Vision Workbench: Reflections on Image Processing in C++

Generic View Types Can be Complicated!

• The type of the resulting view object becomes complex very quickly.

Image slog =

threshold_filter( laplacian_filter( gaussian_filter( img, 1.5 ) ) );

UnaryPerPixelView<ConvolutionView<SeparableConvolutionView<ImageView<PixelRGB<float> >,
                                                           double, ConstantEdgeExtension>,
                                  double, ConstantEdgeExtension>,
                  UnaryCompoundFunctor<ChannelThresholdFunctor<PixelRGB<float> > > >

Page 68: The NASA Vision Workbench: Reflections on Image Processing in C++

Other Advantages to Views

• Generalized views emerged as the solution to several problems at once.

• On-disk images can be supported cleanly.

• Procedurally-generated images can be, too.

• If you only want a small number of processed pixel values, e.g. near interest points, make the view and just ask it for those values.

• Lazy evaluation permits more sophisticated algorithmic optimizations down the road.

Page 69: The NASA Vision Workbench: Reflections on Image Processing in C++

Naïve Laziness can be Very Bad™

• What happens when you chain convolutions?

result = convolution_filter(convolution_filter(image,kern1),kern2);

• Now the intermediate result is an important cache:

Image tmp = convolution_filter(image,kern1);

result = convolution_filter(tmp,kern2);

• Without this cache, performance will be terrible.

• In the Vision Workbench, intermediate results are computed and cached when necessary.
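The cost of naive laziness can be made visible with a small counting experiment. The sketch below is a 1-D illustration with hypothetical types: CountingSource counts how often the source is read, and LazyBox is a lazy box filter of width k with zero edge extension. Nesting two lazy filters re-evaluates the inner filter for every tap of the outer one, so source reads grow roughly as k*k per output pixel; rasterizing the inner result first brings that back to roughly k.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// 1-D source view that counts how many times a sample is requested.
class CountingSource {
  const std::vector<float>* data_;
  mutable long reads_;
public:
  explicit CountingSource(const std::vector<float>& d) : data_(&d), reads_(0) {}
  std::ptrdiff_t size() const { return static_cast<std::ptrdiff_t>(data_->size()); }
  long reads() const { return reads_; }
  float operator()(std::ptrdiff_t i) const {
    ++reads_;
    return (i < 0 || i >= size()) ? 0.0f : (*data_)[static_cast<std::size_t>(i)];
  }
};

// Lazy box filter of odd width k: nothing is computed until a pixel is
// requested. The wrapped view must outlive this object.
template <class View>
class LazyBox {
  const View* src_;
  int k_;
public:
  LazyBox(const View& src, int k) : src_(&src), k_(k) {
    assert(k > 0 && k % 2 == 1);
  }
  std::ptrdiff_t size() const { return src_->size(); }
  float operator()(std::ptrdiff_t i) const {
    if (i < 0 || i >= size()) return 0.0f;  // zero edge extension
    float sum = 0.0f;
    for (int j = -(k_ / 2); j <= k_ / 2; ++j) sum += (*src_)(i + j);
    return sum / static_cast<float>(k_);
  }
};

// Evaluate every pixel of a view into a concrete buffer (the "cache").
template <class View>
std::vector<float> rasterize(const View& v) {
  std::vector<float> out(static_cast<std::size_t>(v.size()));
  for (std::ptrdiff_t i = 0; i < v.size(); ++i) {
    out[static_cast<std::size_t>(i)] = v(i);
  }
  return out;
}
```

To compare the two strategies, rasterize a `LazyBox<LazyBox<CountingSource> >` directly, then rasterize the inner filter into a temporary buffer and filter that buffer, and compare `reads()` on the two original sources. Both paths produce the same pixels.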

Page 70: The NASA Vision Workbench: Reflections on Image Processing in C++

Generic vs. Abstract Views

• Views could be either template-based (generic) or virtual-function-based (abstract).

• Because pixel access often appears in tight inner loops, the template-based solution performs better.

• Templates are also more flexible. Virtualization can only recover one hidden type at a time.

• Alas, keeping track of complex types can be annoying. Fortunately, the end user usually doesn’t have to.

Page 71: The NASA Vision Workbench: Reflections on Image Processing in C++

Virtualizing Image Views

• Sometimes the abstract base class approach is better.
  • Run-time polymorphism.
  • Hiding complex types altogether.

• The ImageViewRef class wraps an arbitrary view in a veil of abstraction.
  • Templatized only on the pixel type.
  • Contains a pointer to a special abstract base class.
  • Has reference semantics (but re-bindable).

• Great for keeping a lazy view around if you only want to evaluate it at select points.

ImageViewRef<float> img_ref = My(Complex(Image(View(Type(img)))));
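The wrapping technique described here is usually called type erasure. Below is a minimal sketch of how such a wrapper can be built, assuming a view is anything with cols(), rows(), and operator()(col,row). ViewRef, ConstantView, and RampView are hypothetical names for this illustration; the real ImageViewRef has a richer interface that the slides do not spell out.

```cpp
#include <cassert>
#include <memory>

// Wrapper templatized only on the pixel type. The concrete view type is
// hidden behind an abstract base class and reached through a shared pointer,
// which gives cheap copies and lets the reference be re-bound by assignment.
template <class PixelT>
class ViewRef {
  struct Base {
    virtual ~Base() {}
    virtual int cols() const = 0;
    virtual int rows() const = 0;
    virtual PixelT at(int col, int row) const = 0;
  };
  template <class ViewT>
  struct Impl : Base {
    ViewT view;
    explicit Impl(const ViewT& v) : view(v) {}
    int cols() const { return view.cols(); }
    int rows() const { return view.rows(); }
    PixelT at(int col, int row) const { return view(col, row); }
  };
  std::shared_ptr<Base> impl_;
public:
  template <class ViewT>
  ViewRef(const ViewT& v) : impl_(new Impl<ViewT>(v)) {}
  int cols() const { return impl_->cols(); }
  int rows() const { return impl_->rows(); }
  // One virtual call per pixel: convenient, but slower in tight loops.
  PixelT operator()(int col, int row) const {
    assert(col >= 0 && col < cols() && row >= 0 && row < rows());
    return impl_->at(col, row);
  }
};

// Two unrelated example view types.
struct ConstantView {
  int c, r; float v;
  ConstantView(int cols_, int rows_, float value) : c(cols_), r(rows_), v(value) {}
  int cols() const { return c; }
  int rows() const { return r; }
  float operator()(int, int) const { return v; }
};

struct RampView {
  int c, r;
  RampView(int cols_, int rows_) : c(cols_), r(rows_) {}
  int cols() const { return c; }
  int rows() const { return r; }
  float operator()(int col, int row) const { return static_cast<float>(col + row); }
};
```

A single `ViewRef<float>` variable can hold a ConstantView and later be re-bound to a RampView, at the price of a virtual call per pixel access. That trade-off is why the slide recommends it for evaluating a lazy view at select points.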

Page 72: The NASA Vision Workbench: Reflections on Image Processing in C++

Image Resources

• Image resources, such as image files on disk, may have unknown pixel/channel types.

Getting type info:
    PixelFormatEnum pixel_format()
    ChannelTypeEnum channel_type()

Getting dimensions:
    int32 img.cols()
    int32 img.rows()
    int32 img.planes()

Accessing pixel data:
    void read( ImageBuffer buf, BBox2i bbox )
    void write( ImageBuffer buf, BBox2i bbox )

Other:
    Vector2i native_block_size()
    void flush()

• ImageBuffer is a simple struct describing a block of contiguous pixels in memory.

• Read/write functions call helper functions to convert to/from the desired pixel type.

Page 73: The NASA Vision Workbench: Reflections on Image Processing in C++

Lessons Learned and Thoughts for the Future

Page 74: The NASA Vision Workbench: Reflections on Image Processing in C++

Templates and Laziness Revisited

• The image view framework currently serves multiple purposes:

• Lazy evaluation of pixels on demand

• Block rasterization of gigantic images

• Eliminating unwanted temporaries

• This sometimes results in confused design.

• Lazy views need not be fully statically defined: that is a premature optimization that complicates design.

Page 75: The NASA Vision Workbench: Reflections on Image Processing in C++

Example: Image Transformation

• This simple expression:

rotate( image, 45*M_PI/180 )

• Returns this complex type (assuming an RGB8 image):

TransformView< InterpolationView< EdgeExtensionView< ImageView<PixelRGB<uint8> >,
                                                     ZeroEdgeExtension >,
                                  BilinearInterpolation >,
               RotateTransform >

• Nested views are very powerful, but the resulting view is needlessly complex.

• Virtualizing the edge extension step has negligible impact on performance. Virtualizing the interpolation step is impossible.

Page 76: The NASA Vision Workbench: Reflections on Image Processing in C++

Template Pitfalls

• A common and frustratingly terrible idiom for supporting multiple pixel types:

template <class PixelT>
int do_something_useful(…) {
  // Your actual program code
}

int main(int argc, char *argv[]) {
  // Parse the arguments...
  DiskImageResource *resource = DiskImageResource::open(image_filename);
  ChannelTypeEnum channel_type = resource->channel_type();
  PixelFormatEnum pixel_format = resource->pixel_format();
  // Dispatch on the run-time pixel format and channel type
  switch(pixel_format) {
    case VW_PIXEL_GRAY:
      switch(channel_type) {
        case VW_CHANNEL_UINT8: return do_something_useful<PixelGray<uint8> >(…);
        case VW_CHANNEL_UINT16: return do_something_useful<PixelGray<uint16> >(…);
        // And so on...
      }
    // And so on...
  }
}

• Annoying to write, takes forever to compile, and results in huge executables.

Page 77: The NASA Vision Workbench: Reflections on Image Processing in C++

A More Pythonic Way

• Process an image using its native pixel type, as long as it’s a standard type:

>>> import vw
>>> input = vw.read_image( 'my_image.jpg' )
>>> filtered = vw.gaussian_filter( input, 3 )
>>> vw.write_image( 'filtered_image.jpg', filtered )

• Coercion to a specific pixel type:

>>> input = vw.read_image( 'my_image.jpg', ptype=vw.PixelRGB, ctype=vw.uint8 )

• Successfully implemented in the Python bindings.

• It’s great to use, and terrible to implement.

• Results in huge Python bindings, especially due to SWIG limitations on multiple compilation units.

Page 78: The NASA Vision Workbench: Reflections on Image Processing in C++

Proliferation of Image Concepts

• ImageView : Static pixel type, pixels stored contiguously in memory.

• ImageViewRef : Static pixel type, abstracts arbitrary block image computation.

• ImageResource : Dynamic pixel type, block image access with conversion.

• ImageBuffer : Dynamic pixel type, pixels stored in a block in memory.

• A dynamically typed version of ImageViewRef?

Page 79: The NASA Vision Workbench: Reflections on Image Processing in C++

A Dynamic View Abstraction?

• ImageView needs to be templatized on the pixel type for fast and easy pixel access, but this does not prevent it from also adhering to a dynamically typed view abstraction.

• Automatic pixel type casting/coercion is needed to avoid a combinatorial explosion.

• Existing ImageResource interface may be close.... (for 3.0?)

• Currently exploring an intermediate solution (essentially a dynamic version of ImageViewRef) for 2.0 release.

Getting type info:
    PixelFormatEnum pixel_format()
    ChannelTypeEnum channel_type()

Getting dimensions:
    int32 img.cols()
    int32 img.rows()
    int32 img.planes()

Rasterization:
    void rasterize( ImageBuffer buf, BBox2i bbox )

Page 80: The NASA Vision Workbench: Reflections on Image Processing in C++

An OpenCV – VW Bridge?

• OpenCV contains many algorithms that Vision Workbench users would love to use.

• The simplest approach would be a direct bridge between ImageView and IplImage.

• A more powerful approach would be to produce Vision Workbench views whose rasterizers invoke OpenCV algorithms.

• This would automatically support applying many OpenCV algorithms to gigantic images, and fit naturally into the VW view ecosystem.

Page 81: The NASA Vision Workbench: Reflections on Image Processing in C++

Questions / Discussion

http://ti.arc.nasa.gov/visionworkbench/