© Copyright Khronos Group, 2011 - Page 1
SC24/WG9 Liaison Meeting Seoul, November 2011
Neil Trevett Vice President Mobile Content, NVIDIA
President, The Khronos Group
Thank you for the invitation to be here!
• Khronos warmly welcomes the opportunity to discuss
liaison opportunities regarding Augmented Reality with SC24/WG9
• Topics in this presentation:
- Brief introduction to Khronos
- Khronos Native API standards relevant to Augmented Reality
- Industry Adoption of Khronos APIs - examples in Android and HTML5
- Existing liaisons and possible future liaisons
Khronos - Connecting Software to Silicon
• Creating open, royalty-free acceleration API standards
- Focus on graphics, dynamic media, compute and sensor hardware
• Low-level - just above raw silicon
- “Foundation” functionality needed on every platform
• Safe forum for industry cooperation
- “By the industry for the industry”
- Open to any company to join
- IP framework to protect members and industry
• Khronos APIs designed to enable healthy implementation-level innovation and competition in the open market
APIs enable software developers to turn silicon functionality into rich end user experiences
Board of Promoters
Over 100 members – any company worldwide is welcome to join
Apple
Khronos Family of Standards
Embedded and Mobile 3D
Cross platform desktop 3D
3D Digital Asset Exchange format
Surface Management
Authoring and accessibility
Application Acceleration
Parallel Computing
Plugin-free 3D Web Content
Streaming Media
A coordinated ecosystem of compute, graphics and media standards and APIs
Advanced Audio
Unified Sensor and Input Processing
StreamInput
Khronos creates royalty-free specifications to meet real market needs and helps drive industry adoption across multiple platforms
Web Compute
Vector 2D
Responding to Industry Trends
As platforms diversify (mobile, TV, embedded), HTML5 will become increasingly important as a universal app platform
Breakthrough apps will embrace mobility's strengths rather than treat phones as small PCs, and will need complex, interoperating APIs, e.g. Augmented Reality
Mobile is the new platform for app innovation. Mobile APIs are needed to unlock hardware potential while conserving battery life
High-end API technology is created on high-end platforms
AR – Key Use Case driving Khronos APIs
• Many APIs need to work closely together - great goal!
• Commercial uses of AR-related technologies continuing to evolve…
• … but - the technologies needed here will enable amazing apps!!
Camera video stream sent to the compositor
3D Augmentation Rendering
3D augmentations composited with video stream
Camera Tracking
Camera images used to track the camera's location and orientation
Camera-to-scene transform locks the 3D rendering to the real world
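As a rough illustration of how a camera-to-scene transform "locks" an augmentation to the real world, the sketch below moves a world-space anchor point into camera space with a tracked pose and projects it to a pixel. It is a minimal sketch, not a Khronos API: the function names, the +Z-forward camera convention, and the focal length and image centre values are all assumptions made for the example.

```python
import math

def rotation_y(angle_rad):
    """3x3 rotation about the Y (up) axis, as nested lists."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def world_to_camera(point, rotation, translation):
    """Apply p_cam = R * p_world + t (the pose reported by camera tracking)."""
    return [sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
            for i in range(3)]

def project(point_cam, focal_px, center_px):
    """Pinhole projection of a camera-space point to pixel coordinates.

    Assumes the camera looks along +Z; returns None for points behind it.
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (focal_px * x / z + center_px[0], focal_px * y / z + center_px[1])

# Hypothetical values: a 640x480 image, 500-pixel focal length.
anchor = [0.0, 0.0, 0.0]                      # augmentation anchored at the world origin
pose_a = (rotation_y(0.0), [0.0, 0.0, 2.0])   # anchor 2 units in front of the camera
pose_b = (rotation_y(0.0), [0.5, 0.0, 2.0])   # camera has moved sideways

pixel_a = project(world_to_camera(anchor, *pose_a), 500.0, (320.0, 240.0))
pixel_b = project(world_to_camera(anchor, *pose_b), 500.0, (320.0, 240.0))
```

When the tracked pose changes, the same world point lands on a different pixel, which is what keeps the rendered augmentation attached to the scene rather than to the screen.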
OpenGL ES – 3D Graphics
• OpenGL for embedded and mobile devices
- Eliminates redundant and legacy features
- Adds extensions to make it mobile-friendly
• The dominant 3D API for mobile devices
- Widely adopted for STB, DTV, automotive,…
- Hundreds and hundreds of millions shipped
• Runs high-end content and engines
- UE3, Unity, Unigine, Rage
OpenCL – Heterogeneous Computing
• Framework for programming diverse parallel computing resources in a system
• Platform Layer API
- Query, select and initialize compute devices
• Runtime API
- Execute compute kernels – gather results
• Kernel Language Specification
- Subset of ISO C99 with language extensions
• OpenCL has an Embedded profile
- No need for a separate “ES” spec
OpenMAX AL - Object Oriented Media
• Object oriented processing of camera, images and video with AV sync
- Connect to a variety of input and output objects to PLAY and RECORD media
• Object control interfaces
- Sources: Mix control, Seek, Rate, Metadata Extraction, Camera Controls
- Sinks: Encode control, Tuning, MIDI, Metadata Insertion
• Video and image stream routing to other APIs
- To CPU and GPU subsystems
[Diagram: OpenMAX AL Media Object. Sources (DSrc): URI, Memory, Camera, Audio Input, Analog Radio. Sinks (DSnk): URI, Memory, Audio Mix, Display Window. Outputs to other APIs: EGLStream to ES, Data Tap to CPU.]
Camera Controls for AR
• AR needs extensive camera controls
- OpenMAX AL extensions currently in design
• Query camera information
- Focal length (fx, fy), principal point (cx, cy), skew (s), image resolution (h, w)
- Spatial information of how cameras and sensors are placed on device
- Calibration and lens distortion
• ROI extraction
- From wide angle and fish-eye lenses
• Extensive exposure parameters in single or burst mode
- Shutter speed, aperture, ISO, white balance, frame rate, focus modes, resolution
- Synchronization with other system sensors
• Data output format control
- Grayscale, RGB(A), YUV
- Access to the raw data e.g. Bayer pattern
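The queried camera parameters listed above are the entries of the standard pinhole intrinsic matrix. As a sketch, assuming the slide's symbols (fx, fy, cx, cy, s) carry their usual computer vision meaning and ignoring lens distortion, a camera-space point (X, Y, Z) maps to pixel (u, v) as follows:

```latex
K =
\begin{pmatrix}
f_x & s   & c_x \\
0   & f_y & c_y \\
0   & 0   & 1
\end{pmatrix},
\qquad
\begin{pmatrix} u \\ v \\ 1 \end{pmatrix}
= K \begin{pmatrix} X/Z \\ Y/Z \\ 1 \end{pmatrix}
\;\Longrightarrow\;
u = f_x \frac{X}{Z} + s \frac{Y}{Z} + c_x,
\quad
v = f_y \frac{Y}{Z} + c_y
```

An AR renderer needs these values to build a projection that matches the physical camera, which is one reason the query interface matters.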
EGLStream – Video/Graphics Interop
[Diagram: Camera, File, URL, etc. → OpenMAX AL Media Player object → EGLStream → OpenGL ES GL_TEXTURE_EXTERNAL]
OpenMAX AL Media Player is the EGLStream “Producer” and controls production of frames.
OpenGL ES GL_TEXTURE_EXTERNAL is the EGLStream “Consumer” and converts the video format into an RGB OpenGL ES texture.
EGLStream enables video frame transport and hides its details. It enables multiple buffering modes for different use cases, e.g. FIFO and explicit latch/release.
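The difference between the buffering modes can be pictured with a toy producer/consumer model. This is only a conceptual sketch in Python: the class and method names are hypothetical and are not the EGL API, and "latest frame wins" is an assumed behaviour for the non-FIFO case.

```python
from collections import deque

class ToyStream:
    """Toy producer/consumer frame stream (hypothetical; not the EGL API)."""

    def __init__(self, fifo_length=0):
        # fifo_length == 0 models a "latest frame wins" mailbox;
        # fifo_length > 0 models a bounded FIFO that keeps every queued frame.
        self.fifo_length = fifo_length
        self.pending = deque()
        self.latched = None

    def post(self, frame):
        """Producer side: returns False if a full FIFO rejects the frame."""
        if self.fifo_length == 0:
            self.pending.clear()      # overwrite any unconsumed frame
        elif len(self.pending) >= self.fifo_length:
            return False              # a real producer would block or drop here
        self.pending.append(frame)
        return True

    def latch(self):
        """Consumer side: take ownership of the next frame, if any."""
        if self.latched is None and self.pending:
            self.latched = self.pending.popleft()
        return self.latched

    def release(self):
        """Consumer side: hand the latched frame back to the stream."""
        self.latched = None

# Mailbox mode: a slow consumer only ever sees the newest frame.
mailbox = ToyStream()
for name in ("f0", "f1", "f2"):
    mailbox.post(name)
newest = mailbox.latch()

# FIFO mode: frames arrive in order, and the producer is throttled when full.
fifo = ToyStream(fifo_length=2)
accepted = [fifo.post(name) for name in ("f0", "f1", "f2")]
first = fifo.latch()
fifo.release()
second = fifo.latch()
```

A mailbox suits live camera preview, where stale frames are worthless; a FIFO suits playback or recording, where every frame must be seen in order.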
OpenSL ES – Advanced Audio
• OpenSL ES does for audio what OpenGL ES does for graphics
- Advanced audio functionality from simple playback to 3D spatialized audio
• Object-based native audio API for simplicity and high performance
- Reduces development time
• Same API regardless of underlying implementation
- Software or hardware accelerated
• Cross OS portability
- Preserves application investment
StreamInput Connects Sensors to Apps
Advanced Sensors Everywhere: standard cameras, depth cameras, motion and position, touch, microphones, wireless controllers
Apps Need Sophisticated Access to Sensor Data, without coding to specific systems or sensor hardware
Universal Timestamps
Apps request semantic sensor information: StreamInput defines a list of possible semantic requests, e.g. “Am I in an elevator?” or “Give me gestures and face position”
Sensor graph created to provide sensor information: StreamInput defines the graph creation API and node interconnects
Low-level sensor processing encapsulated in nodes, which unleashes fusion innovation. Apps gain “magical” situational awareness
Standardized Node Intercommunication
[Diagram: three Input Devices feed a graph of Filter Nodes that delivers data to the App]
SHOULD NOT FORCE APPLICATIONS TO ACCESS INDIVIDUAL SENSORS
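The sensor-graph idea can be sketched in a few lines of Python. Everything here is hypothetical: StreamInput had not published a specification at the time of this talk, so the class names, the sample data and the thresholds are invented purely to show input devices, filter nodes and shared timestamps working together.

```python
class InputDevice:
    """Leaf node: supplies (timestamp, value) samples from one sensor."""

    def __init__(self, samples):
        self.samples = list(samples)

    def read(self):
        return list(self.samples)

class FilterNode:
    """Interior node: aligns its inputs on shared timestamps and applies fn."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def read(self):
        streams = [dict(node.read()) for node in self.inputs]
        # Keep only the timestamps that every input can supply.
        common = set(streams[0]).intersection(*streams[1:])
        return [(t, self.fn(*(s[t] for s in streams))) for t in sorted(common)]

# Made-up samples: vertical acceleration and rate of pressure change.
accelerometer = InputDevice([(0, 0.0), (1, 0.4), (2, 0.5)])
barometer = InputDevice([(1, -0.3), (2, -0.4), (3, 0.0)])

# A semantic node answering "Am I in an elevator?" with invented thresholds.
in_elevator = FilterNode(lambda accel, dp: accel > 0.3 and dp < -0.2,
                         accelerometer, barometer)
answer = in_elevator.read()   # [(1, True), (2, True)]
```

The application only talks to the `in_elevator` node; which physical sensors sit underneath, and how they are fused, stays inside the graph.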
Current StreamInput Participants
• Aiming for production implementations in September 2012
OpenCV as Potential Khronos Standard
• OpenCV is a widely used open source project for COMPUTER VISION
• Khronos Hardware Abstraction Layer
- Would enable hardware vendors to provide accelerated imaging and vision modules
• Being sponsored by NVIDIA, Itseez and Willow Garage
- Decision to initiate in a few weeks' time
[Diagram: Application → High-level CV algorithms library → Hardware Abstraction Layer → Open source sample implementation or Hardware vendor implementations (can optionally use OpenCL to implement)]
Separation of Vision Stack Layers
StreamInput Semantics and fusion of camera and positional sensors
Computer Vision and tracking
Parallel computation
Augmented Reality Functionality
Camera Processing
3D Rendering and Video Composition
Audio Rendering
Application on CPU
Positional and GPS Sensor Data
Computer Vision and Tracking
Position and Tracking Semantics
Control Camera, Preprocess and generate video streams
Video TAP to CPU
Synchronization and sensor fusion
Video stream to GPU
Positional Sensors
Camera
StreamInput
EGLStream
OpenCV
Much more flexibility than just “overlay augmentations over background”
Android Native API Adoption
API | Version | Status on Android
OpenGL ES | OpenGL ES 2.0 | Shipping - Android 2.2
OpenSL ES | OpenSL ES 1.0 | Shipping - Android 2.3
OpenMAX AL | OpenMAX AL 1.0 | Shipping - Android 4.0
EGL | EGL 1.4 | Shipping under SDK
OpenCL | (none) | Not yet adopted
StreamInput | (none) | Working group will ship spec in 2012
OpenCV | (none) | Khronos voting to establish WG
StreamInput ?
http://developer.android.com/resources/dashboard/platform-versions.html
Leveraging Native API Investment into HTML5
• HTML5 evolving into a cross-platform programming platform
- Gradually exposing complete system capabilities
• Opportunity to synergize Web and native API development
- Leverage native API investments, reduce developer learning cycles
• Khronos and W3C creating close liaison
Native APIs shipping or working group underway
JavaScript API shipping or working group underway
HTML and Browser Composition
WebAudio: Advanced JavaScript Audio
WebMAX? Camera control and video processing
Possible future JavaScript APIs
Device and Sensor APIs, Device Orientation Working Groups
StreamInput Native
JavaScript
WebGL and HTML Interaction
• 3D is not trapped in a rectangular window
- 3D can overlay and underlay HTML content
- Easy to make 2D HTML HUDs or 3D user interfaces
• Strong ties with other advanced HTML5 features
- WebGL can use HTML5 <video> or canvas as a texture
• Can use 3D for core Web UI – as well as content
- Advanced transforms and special effects
• Render HTML DOM sub-tree as texture
- Mozilla and Google prototyping as extension
- Support user interaction when in 3D
WebGL Deployment
• WebGL 1.0 Released at GDC March 2011
- Mozilla, Apple, Google and Opera working closely with GPU vendors
• Typed array 1.0 spec ratified by Khronos in May
- Supporting bulk data transfer between threads (workers)
- Many use cases - background mesh loading, generation, deformation, physics ...
• 1.0.1 release of WebGL spec and conformance suite imminent
- 100% robust stance on security
- Fixing bugs in 1.0.0 conformance suite
- Conformant implementations will expose the context as getContext("webgl") (no longer experimental)
http://caniuse.com/#search=webgl WebGL is not enabled by default in Safari
WebCL – Parallel Computing for the Web
• Khronos launching new WebCL initiative
- First announced in March 2011
- API definition already underway
• JavaScript binding to OpenCL
- Security is top priority
• Many use cases
- Physics engines to complement WebGL
- Image and video editing in browser
• Stay close to the OpenCL standard
- Maximum flexibility
- Foundation for higher-level middleware
Augmented Reality in the Web
Camera Processing
3D Rendering and Video Composition
Audio Rendering
Application on CPU
Positional and GPS Sensor Data
Computer Vision and Tracking
Position and Tracking Semantics
Control Camera, Preprocess and generate video streams
Video TAP to CPU
Synchronization and sensor fusion
Video stream to GPU
Positional Sensors
Camera
StreamInput
EGLStream
WebAudio?
Need to explore whether HTML composition can handle all AR use cases
Sufficiently sophisticated camera control
Semantic, synchronized sensor fusion
WebGL and Declarative 3D
• Very synergistic
- WebGL can accelerate Declarative 3D
• Declarative 3D seems to be a major opportunity for X3D community
- Re-using existing HTML5 machinery, rather than duplicating it, gives a chance of very wide adoption
- Name Declarative 3D = X3D 4.0?
• Cameras and sensors will need to be integrated into HTML5
- Other applications will need access to sensors and cameras
- Use Canvas as interop hub
DOM and events
Canvas
JavaScript WebGL
Cameras
Sensors
Don't duplicate any of this
Canvas
2D 3D
Scenegraph
Immediate
Need for 3D Transmission Standard?
• Approaching chaos in data encodings used in WebGL browsers and apps
- We need to enable native decompression in browsers rather than JavaScript
• Possible key requirements
- Full-scene - Geometry, textures, materials, animations, physics etc.
- Compression of textures and geometry
- Streaming support with LOD flexibility
• Some formats in use – but no widespread consensus
- COLLADA, KML, MPEG-4, VRML, JSON, X3D binary, PowerVR POD, GZIP etc. etc.
• Is the industry ready to work on this? Is the need real?
Audio: MP3 | Video: H.264 | Images: PNG/JPEG | 3D: ?
The audio and video formats above are defined by MPEG (PNG and JPEG come from other bodies) – should 3D be an MPEG standard too? Must be royalty free! Leverage MPEG-4 Part 16 “AFX”, Part 25 – a lot of investment – but why no momentum?
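To make the "compression of geometry" requirement concrete, here is a minimal sketch of one common building block: quantizing floating-point vertex coordinates to 16-bit integers relative to their bounding range. It is not taken from COLLADA, MPEG-4 AFX or any other format named above; the function names and the 16-bit choice are assumptions for illustration.

```python
import struct

LEVELS = 65535  # largest value of an unsigned 16-bit integer

def quantize_positions(coords):
    """Map floats onto 16-bit steps across [min, max]; returns (header, payload)."""
    lo, hi = min(coords), max(coords)
    scale = (hi - lo) or 1.0          # avoid dividing by zero for flat data
    ints = [round((c - lo) / scale * LEVELS) for c in coords]
    return (lo, hi), struct.pack(f"<{len(ints)}H", *ints)

def dequantize_positions(header, payload):
    """Invert quantize_positions, up to half a quantization step of error."""
    lo, hi = header
    ints = struct.unpack(f"<{len(payload) // 2}H", payload)
    return [lo + i / LEVELS * (hi - lo) for i in ints]

coords = [-1.0, 0.0, 0.25, 1.0]
header, payload = quantize_positions(coords)
restored = dequantize_positions(header, payload)
```

Each coordinate shrinks from four bytes (as a 32-bit float) to two, at the cost of a bounded rounding error. A real transmission format would layer entropy coding, connectivity compression and progressive LOD on top of a step like this.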
Relevant Current Khronos Liaisons
• W3C
- HTML5/JavaScript bindings: WebGL, WebCL, sensors, video and camera
- Browser Acceleration: CSS shaders, composition, audio, SVG, Canvas
• Web3D
- Liaison agreement regarding X3D and COLLADA
- Declarative 3D community group at W3C - 3D scene graph in the DOM - WebGL
• ISO/IEC JTC1 SC4/TC184
- 3D Visualization Specifications - Copyright License for COLLADA to ISO Executed
• OMA and OGC
- Augmented reality cooperation - complementary activities
Summary
• Khronos is working hard to create hardware acceleration standards that are relevant to Augmented Reality and promoting their industry adoption
- Khronos is focused on one very small piece of the total AR picture – how applications on various platforms access acceleration and sensor hardware
• Khronos welcomes liaison opportunities
- To assist in leveraging Khronos standards in larger ecosystems
- To cooperate to ensure Khronos APIs interoperate with other standards initiatives
- To help identify and help solve issues and gaps in the AR ecosystem
• Possible areas for cooperation
- ISO Standardization of OpenGL ES and other Khronos APIs?
- Transmission standard at ISO?
- Cooperative camera and sensor integration into Web for AR?
- QUESTION – cooperate over X3D or Declarative 3D or both?