APAN 2014 Bandung HDWG Session presentation: Video Encoding for Web and Archive
Video encoding for Web and Archive
High Definition Video Working Group APAN Bandung 2014
Andrew Howard - The Australian National University
Background
• Significant Digital Humanities media asset library stored on film, video, audio tapes, portable hard drives, CD, DVD and Blu-Ray
• Some assets are reaching end of life and require ongoing preservation activity
• Also need to handle media stored on online services like YouTube and Vimeo
• Build on experience with encoding for the Digital Lecture Delivery system and iTunesU
Problem Space
• Media degradation
• Maintain quality and original format, encapsulating the proprietary playback system, application and OS environment in an emulator and/or virtual machine
• Encode to an industry standard archive format at high bitrate to support re-encoding using evolving compression standards
• Deliver to a range of target playback systems
• Organise, Identify and Describe assets and the content of assets
• Cost of ingest, conversion, classification, storage and delivery
Some actual collections
Seismology data (586 DAT tapes)
Media Degradation
• Digital Media
• DVD
• Physical damage
• Dye process
• Storage
• Hard drives
• Physical damage
• Magnetic coherence
Media Degradation
• Tape media (Video, Reel to Reel, DAT, Magtape)
• Storage
• Physical
• Replay devices
Content Management
• Organisation, Classification and Description of assets and the content of the assets
• E-Culture WG Session on Linked Data
Media preparation Video
• Video tape
• Format: PAL,NTSC,SECAM,HDV
• Aspect ratio: 4:3, 16:9, 3:2, 8:5, Anamorphic
• Frame type: Interlaced or Progressive
• Pixel format: Rectangular or Square
Media preparation Video
• Video tape - General Information
• Clean the VCR heads regularly
• Use a video enhancer hardware device like the Canopus to provide additional signal stabilisation, chroma correction and retiming
• Adjust VCR tracking
• Use the highest available device resolution for capture
• Use the highest-quality available device connection for capture
• DV
• S-Video
• Composite
Media preparation DVD
• DVD
• Format: PAL,NTSC,SECAM,HDV
• Aspect ratio: 4:3, 16:9, Anamorphic
• Frame type: Interlaced or Progressive
• Region coding and DRM
DVD Encoding Tools
• Older tools:
• (Windows)
• DVD Decrypter
• DVD Shrink
DVD ingest
• Experienced problems with both commercial and user-created DVD media from both controlled and uncontrolled environments
• Best results came from using a Blu-Ray drive to read media that standard DVD drives failed to read
Encoding Tools
• Contemporary tools (OSX & Windows):
• Handbrake
• DVD decoding
• DVD and file Encoding into many formats
• VLC
• The “Swiss Army Knife” for media
Cataloging, Tagging and Identification
• XMP:Description
• MP3 tags
• iTunes tags
• Tools
• exiftool
• read and write asset metadata
• mkvinfo
• Face and Object recognition with CoreImage and OpenCV
Command line tools
• ffmpeg
• VLC
• vpxenc
• MKVToolNix
ffmpeg recipes
Generate a JPEG poster frame from the video at #SECONDS from the start (15-20 is typical).
ffmpeg -i {INPUT} -y -f mjpeg -vf scale="320:trunc(ow/a/2)*2" -vframes 1 -ss {#SECONDS} {OUTPUT}
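The scale expression in this recipe fixes the output width at 320 and derives the height from the input aspect ratio, truncated to an even number (even dimensions are generally expected by encoders that use 4:2:0 chroma subsampling). A minimal shell sketch of the same arithmetic, assuming integer source dimensions; the function name even_height is hypothetical and not part of ffmpeg:

```shell
#!/bin/sh
# even_height WIDTH SRC_W SRC_H
# Mirrors ffmpeg's trunc(ow/a/2)*2, where a = SRC_W/SRC_H, using integer maths.
# (even_height is an illustrative name, not an ffmpeg feature.)
even_height() {
    width=$1 src_w=$2 src_h=$3
    # ow/a = width * src_h / src_w; halve, truncate, then double to get an even value.
    echo $(( width * src_h / src_w / 2 * 2 ))
}

even_height 320 1280 720   # prints 180
even_height 320 854 480    # prints 178 (179.86 truncated down to an even value)
```

This can help when checking in advance what size a poster frame or scaled encode will come out at.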
ffmpeg recipes
Theora Video @1.2M, Vorbis Audio @128k
ffmpeg -i {INPUT} -y -codec:v libtheora -b:v 1200k -qscale:v 6 -codec:a libvorbis -qscale:a 5 -b:a 128k -ar 22050 {OUTPUT}
ffmpeg recipes
H.264 Video @10 Mb/s, AAC Audio @384k, lossless, width: preserved
ffmpeg -i {INPUT} -metadata media_type=10 -metadata hd_video=0 -threads 0 -acodec libfaac -ac:a 2 -b:a 384000 -vcodec libx264 -pix_fmt yuv420p -b:v 10240k -preset veryslow -tune film -qp 0 -movflags +faststart {OUTPUT}
ffmpeg recipes
H.264 Video @1.2 Mb/s, AAC Audio @128k, scaled to width: 320, height: matching input ratio
ffmpeg -i {INPUT} -metadata media_type=10 -metadata hd_video=0 -threads 0 -acodec libfaac -ac:a 2 -b:a 128000 -vcodec libx264 -pix_fmt yuv420p -b:v 1200k -vf scale="320:trunc(ow/a/2)*2" -profile:v main -preset medium -crf 18 -level 3.1 -movflags +faststart {OUTPUT}
ffmpeg recipes
WebM (VP8) Video @1.2 Mb/s, Vorbis Audio @128k
ffmpeg -i {INPUT} -y -threads 8 -codec:v libvpx -qscale:v 6 -b:v 1.2M -codec:a libvorbis -crf 10 -qscale:a 5 -b:a 128k -ar 22050 {OUTPUT}
• -threads 8: an explicit thread count is required for multithreading
• -threads 0 doesn’t work
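Since the WebM recipe needs an explicit thread count rather than -threads 0, the count can be looked up at run time instead of being hard-coded. A minimal sketch, assuming a POSIX shell on Linux or OS X; cpu_threads is a hypothetical helper name, and the fallback order is only one reasonable choice:

```shell
#!/bin/sh
# cpu_threads: print the number of online CPUs, falling back to 1.
# (cpu_threads is an illustrative helper, not part of ffmpeg or libvpx.)
cpu_threads() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) ||
        n=$(nproc 2>/dev/null) ||
        n=$(sysctl -n hw.ncpu 2>/dev/null) ||
        n=1
    case $n in
        ''|*[!0-9]*) n=1 ;;   # anything that is not a plain number
    esac
    echo "$n"
}

# Usage with the WebM recipe: pass the count explicitly instead of -threads 0.
# ffmpeg -i {INPUT} -y -threads "$(cpu_threads)" -codec:v libvpx -b:v 1.2M -codec:a libvorbis -b:a 128k {OUTPUT}
```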
Playback
• Target: Web browsers & mobile devices using HTML5 <VIDEO> and <AUDIO>
Firefox IE Chrome Safari/Webkit
H.264/AAC x x
WebM x x x
Theora/Vorbis x x
FLV x x
MP3 x x x
YouTube html5
Web playback HTML5 and flash fallback for H.264
• Examined a range of open source players
• Projekktor, osmplayer, JWplayer and MediaElement
• Selected MediaElement for quality of API, documentation, support of SRT subtitles and plugin support
• mediaelementjs.com
Archive
• Maintain original format to provide for re-coding at a later time
• Generate a high quality 2-pass H.264 version
• Generate a high quality DVD version
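The 2-pass H.264 archive encode can be sketched with ffmpeg's -pass option. This is an illustration under stated assumptions, not the exact command used for the collection: the 10240k and 384k bitrates are borrowed from the H.264 recipe in these slides, the native aac encoder is assumed to be available in the ffmpeg build, and run and DRY_RUN are hypothetical helpers for printing the commands without executing them.

```shell
#!/bin/sh
# run: print the command when DRY_RUN=1, otherwise execute it.
# (run and DRY_RUN are illustrative helpers, not ffmpeg features.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# two_pass_h264 INPUT OUTPUT VIDEO_BITRATE
two_pass_h264() {
    input=$1 output=$2 vbitrate=$3
    # Pass 1: analysis only. Audio is disabled and the encoded video is discarded;
    # ffmpeg writes its statistics log to the current directory.
    run ffmpeg -y -i "$input" -c:v libx264 -b:v "$vbitrate" -preset veryslow \
        -pass 1 -an -f null /dev/null || return 1
    # Pass 2: the real encode, using the statistics gathered in pass 1.
    run ffmpeg -y -i "$input" -c:v libx264 -b:v "$vbitrate" -preset veryslow \
        -pass 2 -pix_fmt yuv420p -c:a aac -b:a 384k -movflags +faststart "$output"
}

# Example (prints the two commands without encoding anything):
# DRY_RUN=1 two_pass_h264 master.mov archive.mp4 10240k
```

Both passes must use the same video settings, which is why they are kept together in one function.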
Future Codecs
• Increased range of macro block forms
• Larger inter frame comparison
• Smaller files allow better bandwidth utilisation for existing assets and delivery of higher-definition, clearer video over existing transmission systems
• H.265/HEVC
• VP9
• Jan 2014 code release
VP9
• Google next generation codec
• libvpx code available
• Latest VLC and Chrome will play
• YouTube is a significant market driver
VP9
• Google next generation codec
• Original video size: 108,887,661 bytes (108.9 MB)
• x264 encode fps:
• VP8
• Single pass ffmpeg encode size: 122,716,927 bytes (122.7 MB), includes audio
• vpxenc 2-pass size: 24,488,846 bytes (24.5 MB), video only
• encode fps:
• Pass 1/2 frame 3857/3858 555552B 1152b/f 27327b/s 131187 ms (29.40 fps)
• Pass 2/2 frame 3857/3857 24452079B 50717b/f 1202781b/s 118922 ms (32.43 fps)
• VP9
• Single pass encode: vpxenc --codec=vp9 -t 7 -o APAN_demo_nasa.vp9.webm -w 1280 -h 720 --cpu-used=4 -p 1 --target-bitrate=1200 --kf-max-dist=360 APAN_demo_nasa.vp8_1.y4m
• Pass 1/1 frame 3857/3857 24813041B 51465b/f 1220537b/s 998347 ms (3.86 fps)
• vpxenc 2 pass encode fps:
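The vpxenc progress lines on this slide report total bytes (B) and bits per frame (b/f). The two are related by bits per frame = bytes × 8 / frames, which gives a quick sanity check on the figures. A minimal sketch; bits_per_frame is a hypothetical helper name, and the inputs are the byte and frame counts quoted above:

```shell
#!/bin/sh
# bits_per_frame BYTES FRAMES -> average bits per frame (truncated)
# (bits_per_frame is an illustrative name, not a vpxenc feature.)
bits_per_frame() {
    echo $(( $1 * 8 / $2 ))
}

bits_per_frame 555552 3857      # VP8 pass 1,      prints 1152
bits_per_frame 24452079 3857    # VP8 pass 2,      prints 50717
bits_per_frame 24813041 3857    # VP9 single pass, prints 51465
```

The results match the b/f values in the vpxenc output, so the VP8 2-pass and VP9 single-pass encodes landed at nearly the same average bits per frame for the same 1200 target bitrate.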
File size comparison: preliminary testing results
Video Sample | MOV (bytes) | MOV (MB) | AVI (bytes) | AVI (MB) | Encode FPS
Original | 108,887,661 | 108 | 591,552,512 | 591.5 |
vp8 | 39,153,628 | 39.1 | 444,498,772 | 444.4 | ~30
vp9 | 31,741,303 | 31.7 | 216,773,028 | 216.7 | ~3-4
x264 | 97,624,542 | 97.6 | 379,207,254 | 379.2 | ~450
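To compare the encodes at a glance, each size can be expressed as a percentage of the original. A minimal sketch using the MOV figures from the table, assuming they are byte counts and reading the vp9 size as 31,741,303; percent_of is a hypothetical helper name:

```shell
#!/bin/sh
# percent_of ORIGINAL_BYTES ENCODED_BYTES -> whole-number percentage (truncated)
# (percent_of is an illustrative name.)
percent_of() {
    echo $(( $2 * 100 / $1 ))
}

original=108887661
percent_of "$original" 39153628   # vp8,  prints 35
percent_of "$original" 31741303   # vp9,  prints 29
percent_of "$original" 97624542   # x264, prints 89
```

On these preliminary figures VP9 produces the smallest file, at the cost of the much lower encode rate shown in the last column.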
VP8 and VP9 tools
• vpxdec
• Extract a yuv4 uncompressed video
• vpxdec --progress --postproc --mfqe -t 7 -o APAN_demo_nasa.vp8.y4m APAN_demo_nasa.vp8.webm
• mkvextract tracks
• mkvextract tracks "APAN_demo_nasa.vp8.webm" 1:APAN_demo_nasa.vp8.ogg
• mkvmerge
HEVC/H.265
• svn checkout https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-1.0/ HM-1.0
• Still testing encoding
HM software: Encoder Version [1.0][Mac OS X][GCC 4.2.1][64 bit]

Input File : APAN_demo_nasa.vp8_1.y4m
Bitstream File : APAN_demo_nasa.vp8_1.bin
Reconstruction File : APAN_demo_nasa.vp8_1_enc.yuv
Real Format : 1280x720 30Hz
Internal Format : 1280x720 30Hz
Frame index : 0 - 8 (9 frames)
Number of Ref. frames (P) : 1
Number of Ref. frames (B_L0) : 1
Number of Ref. frames (B_L1) : 1
Number of Reference frames : 1
CU size / depth : 128 / 5
RQT trans. size (min / max) : 4 / 32
Max RQT depth inter : 2
Max RQT depth intra : 1
Motion search range : 128
Intra period : 32
QP : 32.00
GOP size : 8
Rate GOP size : 8
Bit increment : 4
Luma interpolation : Samsung 12-tap filter
Chroma interpolation : Bi-linear filter
Entropy coder : CABAC

TOOL CFG: ALF:1 IBD:1 HAD:1 SRD:1 RDQ:1 SQP:0 ASR:0 PAD:0 LDC:0 NRF:1 BQP:0 GPB:0 FEN:0 RQT:1 MRG:1

POC 0 ( I-SLICE, QP 32 ) 928 bits [Y 68.0431 dB U 71.3615 dB V 99.9900 dB] [ET 32 ] [L0 ] [L1 ]
POC 8 ( P-SLICE, QP 33 ) 123656 bits [Y 39.3042 dB U 43.1461 dB V 45.9746 dB] [ET 77 ] [L0 0 ] [L1 ]
POC 4 ( B-SLICE, QP 34 ) 17248 bits [Y 40.1223 dB U 44.9601 dB V 47.3997 dB] [ET 412 ] [L0 0 ] [L1 8 ]
POC 2 ( B-SLICE, QP 35 ) 6576 bits [Y 44.9149 dB U 48.0378 dB V 51.2781 dB] [ET 351 ] [L0 0 ] [L1 4 ]
POC 6 ( B-SLICE, QP 35 ) 10208 bits [Y 37.7147 dB U 42.0799 dB V 45.0517 dB] [ET 479 ] [L0 4 ] [L1 8 ]
POC 1 ( B-SLICE, QP 36 ) 984 bits [Y 63.3660 dB U 50.5459 dB V 51.4867 dB] [ET 195 ] [L0 0 ] [L1 2 ]
POC 3 ( B-SLICE, QP 36 ) 2376 bits [Y 40.5029 dB U 45.3647 dB V 48.8884 dB] [ET 398 ] [L0 2 ] [L1 4 ]
POC 5 ( B-SLICE, QP 36 ) 1864 bits [Y 37.1593 dB U 42.6453 dB V 45.4117 dB] [ET 322 ] [L0 4 ] [L1 6 ]
POC 7 ( B-SLICE, QP 36 ) 5856 bits [Y 35.2758 dB U 40.2356 dB V 43.8896 dB] [ET 291 ] [L0 6 ] [L1 8 ]

SUMMARY --------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR
9 a 565.6533 45.1559 47.5974 53.2634

I Slices--------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR
1 i 27.8400 68.0431 71.3615 99.9900

P Slices--------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR
1 p 3709.6800 39.3042 43.1461 45.9746

B Slices--------------------------------------------------------
Total Frames | Bitrate Y-PSNR U-PSNR V-PSNR
7 b 193.3371 42.7223 44.8385 47.6294

Total Time: 2558.456 sec.
Next Generation Codecs
• Trade increased encoding time and CPU for decreased bandwidth
• Promise of significant gains in compression
• Reference code and specifications now available
• Still tuning for performance
• Google VP9 developer videos on YouTube
VP9 test sequence
• Convert input video to vp8
• Encode to vp9
• 2 pass
vpxenc --codec=vp9 -t 7 -o APAN_demo_nasa.vp9_2_pass_clang.webm -w 1280 -h 720 --cpu-used=4 -p 2 --target-bitrate=1200 --kf-max-dist=360 APAN_demo_nasa.vp8_1.y4m
Summary
• Media preparation
• ffmpeg recipes for media encoding for Web and Archive
• Next Generation Codecs