Paris Video Tech #2 - Presentation by Jean-Yves Avenard

Behind the scenes: The Media stack in Firefox

Upload: erica-beavers

Post on 19-Jan-2017



What we support

Out of the box, Firefox can handle the following codecs:
• Video: VP8 (ffvp8), VP9 (ffvp9) and Theora (libtheora)
• Audio: Vorbis (libvorbis), Opus (libopus) and FLAC (ffmpeg)

Relying on installed system frameworks for:
• H264, AAC and MP3

• Windows: Media Foundation Transform (MFT), supports hardware acceleration in combination with D3D9 and D3D11. Not available on XP. The N and KN editions (sold in Europe and Korea) require installing extra packages.

• Mac: Video Toolbox, supports hardware acceleration; CoreMedia.

• Linux and others: FFmpeg. Software decoding only.
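Because H264, AAC and MP3 depend on what the system provides, a page cannot assume they are present. A minimal sketch of probing support from script follows. It assumes the answers come from HTMLMediaElement.canPlayType(), which returns "probably", "maybe" or an empty string. The candidate MIME strings and the helper name pickPlayableType are illustrative choices, not something taken from the slides.

```typescript
// canPlayType() answers with "probably", "maybe" or "" (not supported).
type CanPlayAnswer = "probably" | "maybe" | "";

// Hypothetical candidate list; the MIME strings are illustrative only.
const candidateTypes: string[] = [
  'video/webm; codecs="vp9, opus"',
  'video/webm; codecs="vp8, vorbis"',
  'video/mp4; codecs="avc1.42E01E, mp4a.40.2"',
  'video/ogg; codecs="theora, vorbis"',
];

// Returns the first "probably" candidate, else the first "maybe" candidate,
// else null when nothing is playable.
function pickPlayableType(
  types: string[],
  canPlayType: (type: string) => CanPlayAnswer,
): string | null {
  let firstMaybe: string | null = null;
  for (const type of types) {
    const answer = canPlayType(type);
    if (answer === "probably") return type;
    if (answer === "maybe" && firstMaybe === null) firstMaybe = type;
  }
  return firstMaybe;
}

// In a browser the probe would be the element's own method, for example:
//   const video = document.createElement("video");
//   pickPlayableType(candidateTypes, (t) => video.canPlayType(t));
```

The probe is passed in as a function so the selection logic can be exercised outside a browser.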


Media Source Extensions

Everything in the current draft specification is supported except:
• MPEG-TS
• Raw AAC and MP3 streams
• Anything related to the Track elements

Limitations:
• All multi-channel audio tracks are downmixed to stereo.
• Only one source buffer type (audio or video) at once.


Media Source Extensions

Always supported when we have local decoders:
video/mp4: H264, AAC, MP3. Soon Opus and FLAC.
video/webm: VP8, VP9, Vorbis and Opus.

Note for WebM: the VP8 and VP9 codecs are only available by default if one of the following conditions is true:
• No H264 decoder found
• No hardware acceleration (typically blacklisted drivers)
• Machine is deemed fast enough
• The media.mediasource.webm.enabled preference is set to true
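The WebM rule above is a plain "any one of four" test. The sketch below restates it as a predicate. It is only a reading of the slide, not Gecko code: the interface, its field names and the function name are hypothetical.

```typescript
// One field per condition on the slide. The names are hypothetical and do
// not correspond to Gecko identifiers.
interface WebmMseConditions {
  h264DecoderFound: boolean;
  hardwareAccelerationAvailable: boolean;
  machineFastEnough: boolean;
  webmPrefEnabled: boolean; // the media.mediasource.webm.enabled preference
}

// True when at least one of the four listed conditions holds.
function webmMseEnabledByDefault(c: WebmMseConditions): boolean {
  return (
    !c.h264DecoderFound ||
    !c.hardwareAccelerationAvailable ||
    c.machineFastEnough ||
    c.webmPrefEnabled
  );
}
```

Under this reading, a slow machine with a working, hardware-accelerated H264 decoder and the preference left off is the only case where VP8 and VP9 are not offered through MSE.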


HTML5 Media Element Architecture (Plain)

All operations between the media element and the media stack are asynchronous and use a Promise-like communication mechanism.

[Diagram] Components shown:
• JS
• HTML Media Element (manages events and user operations)
• Media Stack (loading, demuxing, decoding)
• Video Compositor
• Audio Renderer
Labels on the diagram: currentTime, readyState; Load, Play / Pause, Seek


Media Stack (plain)

Asynchronous. Heavily multi-threaded.

[Diagram] Components shown:
• MediaResource
• MediaCache
• MediaDecoderStateMachine
• MediaFormatReader
• MediaDataDemuxer
• MediaDataDecoder (shown three times)
• Platform Module


HTML5 Media Element Architecture (MSE)

All operations between the media element and the media stack are asynchronous and use a Promise-like communication mechanism.

[Diagram] Components shown:
• JS
• HTML Media Element (manages events and user operations)
• Media Stack (loading, demuxing, decoding)
• Video Compositor
• Audio Renderer
• MediaSource
• SourceBuffer (shown twice)
Labels on the diagram: currentTime, readyState; Load, Play / Pause, Seek


Media Stack (MSE)

Asynchronous. Heavily multi-threaded.

[Diagram] Components shown:
• MediaSourceResource
• TrackBuffer
• MediaDecoderStateMachine
• MediaFormatReader
• MediaDataDemuxer
• MediaDataDecoder (shown three times)
• Platform Module


Implementation Notes

• Mostly written in C++
• All demuxers are written in house. While we often use external libraries to provide core features, we control the entire demuxing chain.


MSE Implementation Notes

Eviction strategies:
• In Firefox 50 and earlier: 100MB video source buffer, 30MB audio source buffer (both were 100MB in 48 and earlier).
• In Firefox 51 and later: 100MB video, 10MB audio.

First, attempt to evict data located prior to currentTime.
Second, attempt to evict future data found after a discontinuity.

In the future, we are considering dropping the fixed size limit and basing eviction on the duration of buffered data instead (e.g. 30s for both audio and video).
Combined maximum buffer size shared across all source buffers.
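The two-pass eviction order can be sketched as follows. This is an illustration under stated assumptions, not Gecko's implementation: buffered data is modelled as whole chunks with a byte size, a discontinuity is any chunk that starts after the previous one ends, and future data is evicted furthest-first. The slides do not specify any of those details, and the names Chunk and planEviction are hypothetical.

```typescript
// A buffered chunk of media: start/end in seconds, size in bytes.
// Hypothetical model; real track buffers are not organised this way.
interface Chunk {
  start: number;
  end: number;
  bytes: number;
}

// Returns the chunks to evict, in eviction order, to free bytesToFree bytes.
function planEviction(chunks: Chunk[], currentTime: number, bytesToFree: number): Chunk[] {
  const sorted = chunks.slice().sort((a, b) => a.start - b.start);
  const evicted: Chunk[] = [];
  let freed = 0;

  // Pass 1: data located entirely before currentTime, oldest first.
  for (const chunk of sorted) {
    if (freed >= bytesToFree) return evicted;
    if (chunk.end <= currentTime) {
      evicted.push(chunk);
      freed += chunk.bytes;
    }
  }

  // Pass 2: future data found after a discontinuity.
  // Locate the first gap at or after the playback position.
  const future = sorted.filter((c) => c.end > currentTime);
  let gapIndex = -1;
  for (let i = 1; i < future.length; i++) {
    if (future[i].start > future[i - 1].end) {
      gapIndex = i;
      break;
    }
  }
  if (gapIndex === -1) return evicted; // contiguous data is left alone

  // Assumption: evict the furthest data first.
  for (let i = future.length - 1; i >= gapIndex && freed < bytesToFree; i--) {
    evicted.push(future[i]);
    freed += future[i].bytes;
  }
  return evicted;
}
```

Note that data contiguous with the playback position is never selected, so the function may free less than requested. In that situation an append can still fail with a buffer-full error.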


Media Most Common Issues

• Buggy video drivers
Solutions: blacklisting, out of process decoding.

• Unsupported media file
Solutions: Decoder: tough luck. Demuxer: fix it.

• Security
Solutions: rewriting some components in Rust.


MSE Most Common Issues

• Bad muxing, in particular invalid tagging of keyframes.

• Invalid timestamps or gaps in the data (Firefox 51 and earlier will not play across a gap larger than 125ms; Firefox 52 raises this to 500ms).

• Having to rely on platform decoder limitations or unique behaviour, especially on Windows.

• Chrome-centric code, or code relying on invalid Chrome behaviour.

• Not listening to appendBuffer events, especially buffer full.
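Two of the issues above, gaps in the data and ignoring appendBuffer results, can be checked from page script. The sketch below is one hedged way to do that. findLargeGaps, AppendQueue and SourceBufferLike are hypothetical names. SourceBufferLike is a hand-written subset of the SourceBuffer members used here (updating, appendBuffer, the updateend event) so the logic can run without a browser. The buffer-full case relies on appendBuffer throwing a QuotaExceededError exception, as the MSE specification describes.

```typescript
interface TimeRange {
  start: number; // seconds
  end: number; // seconds
}

// Returns the holes between consecutive ranges that are wider than
// maxGapSeconds (for example 0.125 or 0.5, the values quoted above).
function findLargeGaps(ranges: TimeRange[], maxGapSeconds: number): TimeRange[] {
  const sorted = ranges.slice().sort((a, b) => a.start - b.start);
  const gaps: TimeRange[] = [];
  for (let i = 1; i < sorted.length; i++) {
    if (sorted[i].start - sorted[i - 1].end > maxGapSeconds) {
      gaps.push({ start: sorted[i - 1].end, end: sorted[i].start });
    }
  }
  return gaps;
}

// Subset of SourceBuffer used by this sketch.
interface SourceBufferLike {
  updating: boolean;
  appendBuffer(data: ArrayBuffer): void;
  addEventListener(type: "updateend", listener: () => void): void;
}

// Appends one segment at a time, waits for "updateend" before the next one,
// and reports a full buffer instead of silently dropping the segment.
class AppendQueue {
  private pending: ArrayBuffer[] = [];
  private sb: SourceBufferLike;
  private onBufferFull: () => void;

  constructor(sb: SourceBufferLike, onBufferFull: () => void) {
    this.sb = sb;
    this.onBufferFull = onBufferFull;
    sb.addEventListener("updateend", () => this.flush());
  }

  push(data: ArrayBuffer): void {
    this.pending.push(data);
    this.flush();
  }

  pendingCount(): number {
    return this.pending.length;
  }

  private flush(): void {
    if (this.sb.updating || this.pending.length === 0) return;
    try {
      this.sb.appendBuffer(this.pending[0]);
      this.pending.shift(); // accepted: drop it from the queue
    } catch (e) {
      // Buffer full: keep the segment queued and let the caller free space.
      if ((e as { name?: string }).name === "QuotaExceededError") this.onBufferFull();
      else throw e;
    }
  }
}
```

A typical onBufferFull callback would remove already played data with SourceBuffer.remove() and let the following updateend event retry the queued segment.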


HTML5 Media Element Architecture (EME)

EME only works in combination with MSE.

[Diagram] Components shown:
• JS
• HTML Media Element (manages events and user operations)
• Media Stack (loading, demuxing, decoding)
• Video Compositor
• Audio Renderer
• MediaKeys
• MediaKey session (shown twice)
Labels on the diagram: currentTime, readyState; Load, Play / Pause, Seek


Media Stack (EME)

The CDM runs in its own child process, within a sandbox. Decrypted and decoded data is fed back into our media stack for rendering.

[Diagram] Components shown:
• MediaSourceResource
• TrackBuffer
• MediaDecoderStateMachine
• MediaFormatReader
• MediaDataDemuxer
• EMEVideoDataDecoder
• EMEAudioDataDecoder
• Platform Module
• MediaKey session (shown twice)


EME Support

• Currently only supporting Google’s Widevine, Adobe’s Primetime and the ClearKey CDMs.

• No access to Microsoft PlayReady or Apple FairPlay. This prevents us from having access to hardware decoding for encrypted content.

• Netflix only delivers 720p; the same goes for Amazon with some content.


Gecko Future Improvements

• Out of process GPU decoding. When a driver crashes we can immediately recover with zero visible consequences

• Suspend decoding for videos when in the background to reduce CPU usage and increase battery life

• E10S: increasing the number of content processes


How can you help yourselves

• Test using Firefox!

• MSE implementation is very rigorous and 100% per spec.

• If it works in Firefox it will work with other compliant browsers. It’s also more likely to work with all other browsers.

• You’re better off testing with Firefox
