©2020, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
Tutorial on Video Modeling
Decord: an efficient video reader for deep learning
https://cvpr20-video.mxnet.io/
Yi Zhu and Zhi Zhang
06/14/2020
Motivation
- Videos have redundant frames, so an efficient video reader is needed
Videos ---> Raw frames ----> Network training
Videos: 450 GB
Frames: 6.8 TB
- Pre-processing takes time
- Data storage is huge
- IO bottleneck during training
Videos ---> Network training
Motivation
- Slowness in random access
Wang et al., Temporal Segment Networks: Towards Good Practices for Deep Action Recognition, ECCV 2016
Segment 1: index 9
Segment 2: index 51
Segment 3: index 102
Random access > sequential read
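The segment indices above come from sparse sampling in the style of Temporal Segment Networks: the video is split into equal segments and one frame is drawn from each. A minimal sketch of that sampling, assuming a 120-frame video and three segments (the function name and the frame count are illustrative, not from the slides):

```python
import random


def sample_segment_indices(num_frames, num_segments=3, rng=None):
    """Pick one random frame index from each of `num_segments` equal segments."""
    if num_frames < num_segments:
        raise ValueError("video has fewer frames than segments")
    if rng is None:
        rng = random.Random()
    seg_len = num_frames // num_segments  # trailing remainder frames are ignored
    return [seg * seg_len + rng.randrange(seg_len) for seg in range(num_segments)]


# Assumed 120-frame video: segments cover [0, 40), [40, 80) and [80, 120),
# so indices such as 9, 51 and 102 are one possible draw.
indices = sample_segment_indices(num_frames=120, num_segments=3, rng=random.Random(0))
print(indices)
```

Because the sampled indices are far apart, a reader that can only decode sequentially has to walk through many frames it will discard, which is why random access speed matters here.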
Motivation
- Lack of flexibility and poor user experience in video handling
Decord
frame = vr[99]
OpenCV
Decord
Decord: provides a smooth experience for deep learning, similar to a random-access image loader.
Installation
Supports Windows/Mac/Linux
Need to build from source to enable GPU support
Ease of Usage
Usage
Pythonic interface
Easy to get video duration
Direct access to any frames by list indexing
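The slide's code image did not survive extraction. A minimal sketch of what these three points might look like in practice, assuming Decord's `VideoReader` with its `len()`, indexing and `get_avg_fps()` support; the helper name `summarize` and the file name `example.mp4` are hypothetical:

```python
def summarize(vr):
    """Return (frame count, duration in seconds) for a Decord-style reader.

    `vr` only needs to support len() and get_avg_fps(), as decord.VideoReader does.
    """
    num_frames = len(vr)
    fps = vr.get_avg_fps()
    return num_frames, num_frames / fps


# Usage with Decord (assumes Decord is installed and a local file example.mp4):
# from decord import VideoReader, cpu
# vr = VideoReader("example.mp4", ctx=cpu(0))
# num_frames, seconds = summarize(vr)
# frame = vr[99]          # direct access to the 100th frame by list-style indexing
# print(frame.shape)      # (height, width, 3)
```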
Usage
Batch read frames
Usage
Segment 1: index 9
Segment 2: index 51
Segment 3: index 102
frames = vr.get_batch([9, 51, 102])
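A slightly fuller sketch of the same batch read, with a bounds check added for illustration. The wrapper name `read_segment_frames` is hypothetical; only `get_batch` and `len()` are taken from Decord itself:

```python
def read_segment_frames(vr, indices):
    """Decode several frames with a single get_batch call.

    `vr` is any Decord-style reader exposing len() and get_batch().
    """
    indices = list(indices)
    last = len(vr) - 1
    out_of_range = [i for i in indices if not 0 <= i <= last]
    if out_of_range:
        raise IndexError(f"frame indices out of range: {out_of_range}")
    return vr.get_batch(indices)


# Usage with Decord (assumes a local file example.mp4 with more than 102 frames):
# from decord import VideoReader, cpu
# vr = VideoReader("example.mp4", ctx=cpu(0))
# frames = read_segment_frames(vr, [9, 51, 102]).asnumpy()  # shape (3, H, W, 3)
```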
Usage
3D CNNs, loading clips instead of frames
Segment 1: index [1,2,3,4,5,6,7,8,9,10,11,12]
Segment 2: index [5,6,7,8,9,10,11,12,13,14,15,16,17]
Segment 3: index [9,10,11,12,13,14,15,16,17,18,19,20,21]
Duplication! (OpenCV: slow; Lintel: not supported)
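To see how much duplication overlapping clips create, the sketch below plans a decode in which every distinct frame is read once and then reused by each clip. This illustrates the idea only; it is not a description of how Decord handles duplicates internally, and the clip length of 12 and the function names are assumptions:

```python
def make_clips(starts, clip_len):
    """Build one list of consecutive frame indices per clip start."""
    return [list(range(s, s + clip_len)) for s in starts]


def plan_unique_decode(clips):
    """Return (sorted unique indices, per-clip positions into that unique list)."""
    unique = sorted({i for clip in clips for i in clip})
    position = {idx: pos for pos, idx in enumerate(unique)}
    gather = [[position[i] for i in clip] for clip in clips]
    return unique, gather


clips = make_clips(starts=[1, 5, 9], clip_len=12)
unique, gather = plan_unique_decode(clips)
print(len(unique), "unique frames for", sum(len(c) for c in clips), "requested")
# prints: 20 unique frames for 36 requested
```

With a reader in hand, the unique frames could be decoded once with `vr.get_batch(unique).asnumpy()` and each clip rebuilt by indexing that array with the matching entry of `gather`.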
Usage
Batch read frames
Efficient handling of duplication
Usage
Drop-in replacement of OpenCV
Usage
Resize videos while reading
Usage
Batch reading using range, to reduce Python overhead
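One way to read this point: rather than looping in Python and indexing the reader once per frame, build the index range up front and hand it to a single `get_batch` call. A minimal sketch, where the helper name `read_every_nth` and the step of 5 are illustrative choices:

```python
def read_every_nth(vr, step):
    """Read every `step`-th frame with one get_batch call instead of a Python loop."""
    indices = list(range(0, len(vr), step))
    return indices, vr.get_batch(indices)


# Usage with Decord (assumes a local file example.mp4):
# from decord import VideoReader, cpu
# vr = VideoReader("example.mp4", ctx=cpu(0))
# indices, frames = read_every_nth(vr, step=5)
# print(frames.shape)  # (len(indices), H, W, 3)
```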
Usage
Get all the key frames
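The code on this slide was an image and is not in the transcript. A sketch of how key frames might be fetched, assuming Decord's `get_key_indices()` method on `VideoReader`; the wrapper name `read_key_frames` is hypothetical:

```python
def read_key_frames(vr):
    """Return (key-frame indices, decoded key frames) for a Decord-style reader."""
    key_indices = list(vr.get_key_indices())
    return key_indices, vr.get_batch(key_indices)


# Usage with Decord (assumes a local file example.mp4):
# from decord import VideoReader, cpu
# vr = VideoReader("example.mp4", ctx=cpu(0))
# key_indices, key_frames = read_key_frames(vr)
# print(len(key_indices), key_frames.shape)
```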
Deep learning framework
Efficiency Comparison
2x faster than OpenCV
Speed Comparison
Comparison to other video readers
- OpenCV and PyAV: slow in random access pattern
- Lintel (https://github.com/dukebw/lintel): can't handle duplication, no key-frame handling
- DALI (https://github.com/NVIDIA/DALI): complicated pipeline and usage, requires an NVIDIA GPU
We will provide interfaces to other video readers!
Previewed Features
- GPU decoding and data augmentation: complete, but needs further optimization
- Video Loader: an all-in-one solution
https://github.com/dmlc/decord
Conclusion
- Easy to use and flexible, Pythonic
- Efficient
- Notebook (https://github.com/dmlc/decord/blob/master/examples/video_reader.ipynb)
- Please try Decord at https://github.com/dmlc/decord