V-PANE: Virtual Perspectives Augmenting Natural Experiences

Kerry Moffitt, Scientist

[email protected]

The views, opinions and/or findings expressed are those of the author(s) and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

Distribution Statement “A” (Approved for Public Release, Distribution Unlimited). DARPA DISTAR #27938

GTC 2017

Contents

• Background: DARPA, GXV-T

• V-PANE Overview and Architecture

• Geometry

• Video

• Ongoing Work

DARPA Ground X-Vehicle Technologies (GXV-T)

Goal: Improved vehicle survivability and mobility

Development Areas:

• Increased agility

• Enhanced mobility

• Crew augmentation

• Signature management

Source: http://www.darpa.mil/program/ground-x-vehicle-technologies

Objective: Develop the next generation of ground-vehicle human-machine interfaces that fuse real-time sensor feeds with video data projected onto a 3D geometric model of the environment. The program aims to develop a fully user-controlled, multiple-perspective, live virtual representation of the vehicle’s surroundings.

V-PANE Overview

[Figure: V-PANE display concept. Labels: IED; Boomerang Shot Detection; Networked IED Report; ATAK Route Plan; Secondary View & Touchscreen; Synthetic 1st-Person View]

Popular Press: Wired Magazine, National Defense, American Security Today, Product Design and Development

High-Level V-PANE Vision

[Diagram labels: Lidar; Point Cloud; Video Image Array; Color; LWIR; LWIR Video; Additional Onboard Sensors; Ground X-Vehicle; Boomerang Shot Detection; Static Imagery & Maps; SA System (e.g., ATAK); Path Plan; ‘Live’ 3D Model; Imagery Projected onto 3D Geometry; 3D View Renderer; Gunner View & Controls; Driver/Commander View & Controls; Future]

1. The real-time fusion of:

A. Lidar point clouds into a 3D model, and

B. Multiple 2D video streams onto that model, and

C. Other 2D or 3D threat, position or mapping data

2. The real-time rendering of that model with video into 2D displays from multiple perspectives

3. The real-time control of the multiple perspectives

V-PANE User Capabilities

360˚ Visualization

Arbitrary Visualization Perspectives

Multi-Spectral Image Fusion

Location Probing

Cue, Slew, Track a Location

Fused Semantic Information

Real-time Route Analysis

Object Detection

Range, Bearing, Elevation

Slope (pitch/roll)

Position

Reconnaissance

After-Action Reports

Obstacles, Obstructions

Slope

Shot Reports

Blue Force Locations

Routing

Offline Viewing

Include Pre-existing Data

Dynamic Controls

People, Vehicles

Sensor Array

Real-Time Scanning, Modeling, Rendering

• Lidar + Video + IMU = The World

V-PANE Workstation Architecture

[Diagram: single-server workstation. Components as labeled:]

• Lidar: 2 x Velodyne HDL-32E

• IMU/GPS: 1 NovAtel, 1 VectorNav

• Cameras: 5 BMD HD, 2 LWIR NTSC, 1 IOI 4K SDI

• Video grabbers: 2 x BMD Q2

• Sensor connections: USB, LAN

• CPU1, CPU2

• M6000: Video Cache, Rendering

• P100: Fusion

• P100: Raycasting

• P100: Video Frusta Projection

• K80: Compression

• SSD

V-PANE Data Pipeline

[Diagram: the workstation components (Lidar, IMU/GPS, Cameras, USB, LAN, CPU1, CPU2, M6000 Video Cache and Rendering, P100 Fusion, P100 Raycasting, P100 Video Frusta Projection, SSD, 5 BMD HD, 2 LWIR NTSC, 1 IOI 4K SDI) annotated with data rates. Rates as labeled:]

• Position, Orientation: 800 Hz, 2 x 40 KB/s

• Raw IMU: 80 KB/s

• Raw Lidar: 1400k pts/s, 3 MB/s

• Processed Lidar: 18 MB/s

• Live Video: 2.9 GB/s

• HD Video: 500 MB/s

• Compressed Video: 500 MB/s

• Video: 3.3 GB/s

• Pixel Positions: 3 GB/s

• Depths, Indices: 4 GB/s

• Voxels: 1 GB/s; 5 GB/s

• Video, Voxels: 600 MB/s

NB:

1 - P2P links require no transfer to or from CPU

2 - Non-P2P links count twice

3 - QPI link hits both CPUs

Link types: Inter-GPGPU (P2P); CPU-Driven; QPI; DisplayPort

Processing and Recording in a Single Server – Maximum-Load Analysis with PEX 8747 PCIe Switches

[Diagram labels and data rates:]

• K80: Compression

• Live Video: 2.9 GB/s

• HD Video: 500 MB/s

• Depths, Indices: 4 GB/s

• Voxels: 5 GB/s; 100 MB/s

• Video, Voxels: 1.4 GB/s

• Link type: Inter-GPGPU (non-P2P)
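The three NB rules for the maximum-load analysis (P2P links bypass the CPU, non-P2P links count twice, the QPI link hits both CPUs) can be read as a simple tally. The sketch below is one possible reading, not the analysis used for the slide: the function name, the link tuples, and the assignment of links to CPUs are all hypothetical, and counting a CPU-driven link once is an assumption.

```python
def cpu_loads(links):
    """Tally the GB/s that each CPU has to carry.

    links: list of (name, kind, gb_per_s, cpu) tuples, where kind is one of
    'p2p', 'non_p2p', 'cpu', 'qpi' and cpu is 1 or 2 (ignored for p2p/qpi).
    """
    loads = {1: 0.0, 2: 0.0}
    for name, kind, gb_per_s, cpu in links:
        if kind == "p2p":
            continue                      # rule 1: no transfer to or from CPU
        elif kind == "non_p2p":
            loads[cpu] += 2 * gb_per_s    # rule 2: counted twice
        elif kind == "qpi":
            loads[1] += gb_per_s          # rule 3: hits both CPUs
            loads[2] += gb_per_s
        elif kind == "cpu":
            loads[cpu] += gb_per_s        # assumption: CPU-driven counts once
        else:
            raise ValueError("unknown link kind: " + kind)
    return loads


# Hypothetical link placement; only the rates are taken from the slide.
example_links = [
    ("voxels", "p2p", 5.0, None),
    ("depths_indices", "non_p2p", 4.0, 2),
    ("video", "qpi", 3.3, None),
    ("hd_video", "cpu", 0.5, 1),
]
example_loads = cpu_loads(example_links)
```

With real link placements, the per-CPU totals could then be compared against the available PCIe and QPI bandwidth.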

Lidar Projection: Geometry Fusion

• Lidar lasers (1) reflect back from surfaces in the world (2), sampling depth

• During each fusion update, every voxel reverse-projects (3) to the lidar focal point to determine the distance from the voxel to the surface

• Depth samples are stored as a rectangular array with a fixed distance between samples (4) to optimize lookup (this requires resampling the fixed-angle-delta lidar data)
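The two steps described here, resampling fixed-angle lidar data onto a fixed-spacing array and reverse-projecting each voxel into that array, can be sketched in a 2D slice. This is a minimal illustration under stated assumptions, not the V-PANE implementation: the lidar sits at the origin looking along +x, the array is uniform in u = y/x (the tangent of the beam angle), linear interpolation is used for resampling, and the truncated signed distance is an assumed representation. All function and parameter names are hypothetical.

```python
import math


def resample_to_plane(range_by_angle, angle_min, angle_step, u_min, u_step, n_out):
    """Resample ranges taken at fixed angle steps onto a grid uniform in u = tan(angle)."""
    last = len(range_by_angle) - 1
    out = []
    for i in range(n_out):
        u = u_min + i * u_step
        f = (math.atan(u) - angle_min) / angle_step   # fractional angle index
        if f < 0 or f > last:
            out.append(None)                          # outside the scanned arc
            continue
        j = min(int(math.floor(f)), last - 1)
        t = f - j
        out.append((1 - t) * range_by_angle[j] + t * range_by_angle[j + 1])
    return out


def voxel_signed_distance(x, y, plane, u_min, u_step, trunc):
    """Reverse-project a voxel centre to the lidar and compare ranges.

    Positive: voxel is in front of the surface. Negative: behind it.
    """
    if x <= 0:
        return None                                   # behind the sensor
    i = int(round((y / x - u_min) / u_step))          # one divide, no atan
    if i < 0 or i >= len(plane) or plane[i] is None:
        return None
    sd = plane[i] - math.hypot(x, y)
    return max(-trunc, min(trunc, sd))


# Usage: a flat wall at x = 10 m scanned from -45 to +45 degrees in 1 degree steps.
angles = [math.radians(a) for a in range(-45, 46)]
ranges = [10.0 / math.cos(a) for a in angles]
plane = resample_to_plane(ranges, math.radians(-45), math.radians(1), -0.5, 0.1, 11)
```

The point of the fixed-spacing array is visible in `voxel_signed_distance`: the lookup index comes from a single division rather than a per-voxel arctangent.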

Beam Modeling

[Figure: adjacent lidar beams 1.33° apart are separated by 2.32 m at 100 m range]
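The figure's numbers are consistent with simple trigonometry: two beams 1.33° apart diverge to roughly 2.32 m at 100 m. A short check, with a hypothetical function name:

```python
import math


def beam_gap_m(range_m, separation_deg):
    """Gap between two adjacent beams at a given range."""
    return range_m * math.tan(math.radians(separation_deg))


gap_at_100_m = beam_gap_m(100.0, 1.33)   # about 2.32 m
```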

Wobbler

• Fill in the vertical gaps between lasers

Ray Casting

• Over pixels in current view

– Project into voxel array

– Find nearest zero-crossing

Raycast: Determine 3D Point per Pixel

• For every pixel to be rendered on screen (1), project from virtual camera through pixel (2) into voxel space, to find intersection point with nearest surface on that ray (3)

• Output is a 3-space point per pixel to be rendered
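A minimal sketch of this per-pixel ray cast follows. It assumes a pinhole virtual camera looking along +z, fixed-step marching, and a signed-distance lookup supplied as a callable `sdf` that stands in for the voxel array; these choices and all names are illustrative assumptions rather than the V-PANE ray caster.

```python
import math


def pixel_ray(px, py, width, height, fov_y_deg):
    """Unit direction through the centre of pixel (px, py), camera looking along +z."""
    half_h = math.tan(math.radians(fov_y_deg) / 2)
    half_w = half_h * width / height
    x = ((px + 0.5) / width * 2 - 1) * half_w
    y = (1 - (py + 0.5) / height * 2) * half_h
    n = math.sqrt(x * x + y * y + 1)
    return (x / n, y / n, 1 / n)


def raycast(origin, direction, sdf, step, max_dist):
    """March along the ray and return the first zero-crossing as a 3D point, or None."""
    prev_t, prev_d = 0.0, sdf(origin)
    t = step
    while t <= max_dist:
        p = tuple(o + t * k for o, k in zip(origin, direction))
        d = sdf(p)
        if prev_d > 0 and d <= 0:
            # Sign change: interpolate linearly between the two samples.
            hit_t = prev_t + (t - prev_t) * prev_d / (prev_d - d)
            return tuple(o + hit_t * k for o, k in zip(origin, direction))
        prev_t, prev_d = t, d
        t += step
    return None


# Usage: a wall at z = 5 seen through the centre pixel of a 641 x 481 view.
wall = lambda p: 5.0 - p[2]
centre_dir = pixel_ray(320, 240, 641, 481, 60.0)
hit = raycast((0.0, 0.0, 0.0), centre_dir, wall, 0.3, 20.0)
```

Running `raycast` once per screen pixel yields the 3-space point per pixel that the video projection step consumes.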

Video Projection: Color Index per Pixel

• For every pixel to be rendered on screen (1), reverse-project from 3D point to real-world camera frame (2), to find intersection point with image captured in that frame (3)

• Any given scene may involve 100s of frames
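The reverse projection can be sketched with a pinhole camera model. The version below assumes each captured frame carries a camera position, a world-to-camera rotation, and pinhole intrinsics, and it picks the nearest camera that sees the point. That selection rule is an assumption for illustration, occlusion is ignored, and the dictionary keys and function names are hypothetical.

```python
def project_to_camera(point, cam):
    """Project a world point into a camera frame; return (u, v) or None if not visible."""
    rel = [p - c for p, c in zip(point, cam["pos"])]
    xc, yc, zc = (sum(r * v for r, v in zip(row, rel)) for row in cam["rot"])
    if zc <= 0:
        return None                                   # behind the camera
    u = cam["cx"] + cam["f"] * xc / zc
    v = cam["cy"] + cam["f"] * yc / zc
    if 0 <= u < cam["width"] and 0 <= v < cam["height"]:
        return (u, v)
    return None                                       # outside the image


def color_index(point, frames):
    """Return (frame index, (u, v)) for the nearest frame that sees the point."""
    best = None
    for i, cam in enumerate(frames):
        uv = project_to_camera(point, cam)
        if uv is None:
            continue
        dist2 = sum((p - c) ** 2 for p, c in zip(point, cam["pos"]))
        if best is None or dist2 < best[0]:
            best = (dist2, i, uv)
    return None if best is None else (best[1], best[2])


# Usage: two frames captured 3 m apart along the viewing direction.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
base = {"rot": IDENTITY, "f": 500.0, "cx": 960.0, "cy": 540.0,
        "width": 1920, "height": 1080}
frames = [dict(base, pos=(0.0, 0.0, 0.0)), dict(base, pos=(0.0, 0.0, 3.0))]
index = color_index((1.0, 0.0, 5.0), frames)
```

With hundreds of candidate frames per scene, a linear scan like this would likely need a spatial index or cache in practice.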

V-PANE – Data Processing Frequency and Latency

[Timing diagram. Rows: Grabber GPU 1, GPGPU 1, GPGPU 2, GPGPU 3. Stages: Video Capture, Geometry Fusion, Copy, Transfer Voxels, Ray Cast, Video Project, Render. Per-stage bounds shown: < 16 ms, < 50 ms, < 100 ms]

Update Frequency

• Geometry: 20 Hz

• Video Capture: 60 Hz

• Rendering: 60 Hz

Latency

• Geometry: < 148 ms

• Video: < 132 ms
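The per-stage bounds line up with the update frequencies: a stage that must run at a given rate has at most one period to finish. A small check, with a hypothetical function name:

```python
def cycle_budget_ms(rate_hz):
    """Longest a stage may take if it must complete once per update period."""
    return 1000.0 / rate_hz


budgets = {
    "geometry": cycle_budget_ms(20),        # 50 ms
    "video_capture": cycle_budget_ms(60),   # about 16.7 ms
    "rendering": cycle_budget_ms(60),       # about 16.7 ms
}
```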

Profiling Geometry Fusion and Ray Casting

Geometry Fusion Cycle Time: < 50 ms

Ray Casting: < 16 ms

Image Processing in V-PANE

[Diagram: image-processing stages split between CPU (Host) and GPU (CUDA, OpenGL). Labels by path:]

• From Live Cameras: Copy Frame (YUV to Pinned Host Memory), cudaMemcpy, Color Convert, Undistort, [Cache], Fusion Projection, Build Frame, Render

• From Disk: Copy Frame (DXT5 to Pinned Host Memory), Upload, Decompress, Undistort, [Cache], Fusion Projection, Build Frame

• Compression: Copy Bits, Compress, Download

Thread Activity (Video I/O)

[Timing diagram over one display cycle (16.6 ms, labeled "Cycle: 17 ms"). Labels by thread:]

• Video Load/Receive Threads: Enqueue, Enqueue, …

• Main Thread (M6000): Dequeue, Decompress, Process, Upload, Render, VBLANK, …

• Compression Thread (K80): Dequeue, Upload, Process, Compress, Store, …
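The Enqueue and Dequeue labels suggest a producer-consumer hand-off between the video load/receive threads and the main thread. The sketch below shows that pattern with Python's standard `queue` and `threading` modules. It is an assumption about the general shape only: the bounded queue size, the sentinel, and the function names are hypothetical, and the real system works with GPU uploads and a VBLANK-paced loop rather than a plain Python loop.

```python
import queue
import threading


def run_video_pipeline(frames, process, max_pending=4):
    """Load frames on a worker thread and process them in order on the caller's thread."""
    pending = queue.Queue(maxsize=max_pending)   # bounded, so the loader cannot run far ahead
    done = object()                              # sentinel marking end of stream

    def loader():
        for frame in frames:
            pending.put(frame)                   # Enqueue
        pending.put(done)

    worker = threading.Thread(target=loader)
    worker.start()

    rendered = []
    while True:
        item = pending.get()                     # Dequeue
        if item is done:
            break
        rendered.append(process(item))           # stands in for decompress/upload/render
    worker.join()
    return rendered


doubled = run_video_pipeline(range(10), lambda frame: frame * 2)
```

A single FIFO queue keeps frames in capture order, and the bound keeps memory use predictable if the consumer stalls.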

Video Latency: Best Observed Case

Single 1080p60 Stream

[Timeline, 0 to 100 ms, from "Event in World" to "Visible on Display". Rows: Camera Capture, Grabber Capture, Host Processing, Display Hardware. Step labels: DMA, Upld, Draw, Swap, Wait for VB (Cycle: 17 ms). Legend: Hardware *, Software; Capture Thread, Render Thread; Captured Signal, Output Signal. Annotation: After Capture, Render Can Upload]

* Total latency and host processing latency are observed; hardware latency is inferred

LWIR Integration

Ongoing Work: Object Detection

• Image classification

Ongoing Work: Level of Detail

• Add a second voxel array:

– 10x range, to 1 km radius

– 1/10th voxel resolution per dimension

• Ray caster uses it per pixel when there is no hit in the primary (hi-res) voxels

• Requires only 0.1% compute to update given 100 m lidar range

[Figure labels: array widths of 2,000 m and 200 m]
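The 0.1% figure follows from volume scaling: lidar that reaches 100 m only touches (100 / 1000)^3 of an array with a 1 km radius. The same scaling shows that 10x the range at 1/10th the resolution per dimension keeps the voxel count unchanged. A brief check, with hypothetical function names:

```python
def lod_update_fraction(lidar_range_m, array_radius_m):
    """Fraction of the coarse array's volume that the lidar can update."""
    return (lidar_range_m / array_radius_m) ** 3


def relative_voxel_count(range_scale, resolution_scale):
    """Voxel count of the second array relative to the primary one."""
    return (range_scale * resolution_scale) ** 3


fraction = lod_update_fraction(100.0, 1000.0)     # 0.001, i.e. 0.1%
count_ratio = relative_voxel_count(10.0, 0.1)     # 1.0: same number of voxels
```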

Ongoing Work and Challenges

• Voxel cache, Voxel LOD

• GPS/altitude

• Occlusion testing for video projection

– Bandwidth vs. render quality

• Timestamps

– GPS timestamps from IMU and lidar, but not from cameras

– Off by 100 ms = 2.5 m

• What if video but no geometry? (Skybox)

• Voxel precision

– Image quality vs. geometry update rate (20 cm voxels? 10? 5?)
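Two of the numbers in this list can be unpacked with short arithmetic. A 2.5 m error from a 100 ms timestamp offset implies a vehicle speed of 25 m/s, which is inferred from the slide rather than stated on it. The voxel-size question can be made concrete by counting voxels. The sketch assumes a cubic volume 200 m wide purely for illustration (the actual array shape is not given), and the function names are hypothetical.

```python
def position_error_m(speed_m_s, timestamp_error_s):
    """Distance the vehicle travels during a timestamp error."""
    return speed_m_s * timestamp_error_s


def voxels_in_cube(width_cm, voxel_cm):
    """Voxel count for a cube of the given width (integer centimetres)."""
    per_axis = width_cm // voxel_cm
    return per_axis ** 3


error = position_error_m(25.0, 0.100)             # 2.5 m at an assumed 25 m/s

# Assumed 200 m cube at the three candidate voxel sizes from the slide.
counts = {size_cm: voxels_in_cube(20000, size_cm) for size_cm in (20, 10, 5)}
```

Each halving of the voxel size multiplies the voxel count by eight, which is the cost side of the image quality versus geometry update rate trade-off.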

The End

Questions?
