Advances in Computer Vision

Advances in Computer Vision / Multimedia Processing

Petr Pulc¹, Martin Holeňa¹

¹Institute of Computer Science, Czech Academy of Sciences

Prague, Czech Republic

December 20, 2018

Outline

1 Primary Methods of Computer Vision
  Based Directly on Pixel Data
  Transfer to Time Series
  Based on Time Series

2 Joint Detection, Tracking and Description
  On Limited Scope
  With Shot-based Metalearning
  With Metalearning on Areas of Interest

3 Discussion

Primary Methods of Computer Vision

Based Directly on Pixel Data

Outline

Based directly on pixel colour data (not necessarily RGB)

Individual frames as independent pictures

Very often using (D)*(C)NN, i.e. (deep) (convolutional) neural networks

Choose Your (Deep) Weapon! YOLO

[Figure: YOLO network architecture: a 448×448×3 input, convolutional layers (from 7×7×64-s-2 to 3×3×1024) interleaved with 2×2-s-2 max-pooling layers, feature maps labelled 112×112×192, 56×56×256, 28×28×512, 14×14×1024 and 7×7×1024, a 4096-unit connected layer, and a 7×7×30 output.]

Excels at object detection, good at object description.
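The sizes in the YOLO diagram can be checked with simple arithmetic. The following is a minimal sketch, not the reference implementation: it assumes six stride-2 stages (the layers marked `-s-2` in the diagram) and, taken from the original YOLO formulation rather than from this slide, 2 boxes per grid cell with 5 values each and 20 classes. The function names are hypothetical.

```python
def grid_size(input_size, strides):
    """Spatial size left after applying each downsampling stride in turn."""
    size = input_size
    for stride in strides:
        size //= stride
    return size


def output_depth(boxes_per_cell, num_classes):
    """Each box predicts x, y, w, h and a confidence; class scores are shared per cell."""
    return boxes_per_cell * 5 + num_classes


# Assumed: two stride-2 convolutions and four 2x2 max-pooling layers.
STRIDES = [2, 2, 2, 2, 2, 2]

cells = grid_size(448, STRIDES)
depth = output_depth(boxes_per_cell=2, num_classes=20)
print(cells, cells, depth)  # 7 7 30
```

Under these assumptions the 448-pixel input is reduced by a factor of 64, which gives the 7×7 grid, and each cell carries 30 values.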

Choose Your (Deep) Weapon! VGG16

Good at object detection, commonly used for face description.

Choose Your (Deep) Weapon! ResNet

Much deeper, comparable model size, used for face description.

However...

[Figure: six panels: (a) Person (low), (b) Sports ball (low), (c) Untargeted (low), (d) Person (high), (e) Sports ball (high), (f) Untargeted (high).]

Chen, S. T., Cornelius, C., Martin, J., & Chau, D. H. P.: Physical Adversarial Attack on Object Detectors.

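[Table: columns Dist., Angle, Target: person, Target: sports ball, Untargeted; rows shown for 10′ at 0° and 10′ at 30°. The image cells are not preserved in this transcript.]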

Chen, S. T., Cornelius, C., Martin, J., & Chau, D. H. P.: Physical Adversarial Attack on Object Detectors.

Use of Networks on Statistical Data

[Diagram: Data (InputDim) → Linear Layer (InputDim × 100) → Sigmoid → Linear Layer (100 × 50) → Sigmoid → Linear Layer (50 × OutputDim) → Softmax → Negative log likelihood loss. The diagram also contains a Linear SVM Trainer and a Linear SVM. Legend: SVM learning, Back propagation, Prediction.]

Šabata, T., Pulc, P., & Holeňa, M.: Semi-supervised and Active Learning in Video Scene Classification from Statistical Features. In Workshop on Interactive Adaptive Learning.
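The layer sizes named in the diagram (InputDim → 100 → 50 → OutputDim, sigmoid activations, softmax, negative log likelihood loss) can be illustrated with a forward pass in plain Python. This is only a sketch under stated assumptions: the weight initialisation, the example dimensions and all function names are hypothetical, softmax and the logarithm are fused into one log-softmax step for numerical stability, and the linear SVM part of the diagram is left out because the slide does not show how it is connected.

```python
import math
import random


def linear(x, weights, bias):
    """Affine map: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(x):
    """Element-wise logistic function."""
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


def log_softmax(x):
    """Numerically stable logarithm of the softmax."""
    m = max(x)
    log_sum = m + math.log(sum(math.exp(v - m) for v in x))
    return [v - log_sum for v in x]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (assumed initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """InputDim -> 100 -> 50 -> OutputDim with a sigmoid after the first two layers."""
    (w1, b1), (w2, b2), (w3, b3) = layers
    h1 = sigmoid(linear(x, w1, b1))
    h2 = sigmoid(linear(h1, w2, b2))
    return log_softmax(linear(h2, w3, b3))


def nll_loss(log_probs, target):
    """Negative log likelihood of the target class."""
    return -log_probs[target]


rng = random.Random(0)
input_dim, output_dim = 8, 3  # hypothetical sizes, not from the slides
layers = [
    init_layer(input_dim, 100, rng),
    init_layer(100, 50, rng),
    init_layer(50, output_dim, rng),
]
x = [rng.random() for _ in range(input_dim)]
log_probs = forward(x, layers)
loss = nll_loss(log_probs, target=1)
```

Training by back propagation, as marked in the legend, would adjust the weights to reduce `loss`; that step is omitted here.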

[Figure: (a) Room 4A, (b) Room 4B, (c) Histogram comparison of 4A and 4B; x-axis: Histogram bin (0 to 1200), y-axis: Relative pixel count (0.00 to 0.20).]

Šabata, T., Pulc, P., & Holeňa, M.: Semi-supervised and Active Learning in Video Scene Classification from Statistical Features. In Workshop on Interactive Adaptive Learning.
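A histogram of relative pixel counts, as plotted in the comparison of rooms 4A and 4B, can be sketched as follows. The binning (16 bins per 8-bit channel, concatenated over three channels), the similarity measure (histogram intersection) and the toy pixel data are all assumptions for illustration; the slides do not state which binning or comparison the authors used.

```python
def relative_histogram(pixels, bins_per_channel=16):
    """Concatenated per-channel histogram of relative pixel counts.

    `pixels` is a list of 3-tuples with 8-bit values; `bins_per_channel`
    is assumed to divide 256.
    """
    bin_width = 256 // bins_per_channel
    hist = [0.0] * (3 * bins_per_channel)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins_per_channel + value // bin_width] += 1.0
    total = len(pixels)
    return [count / total for count in hist]


def histogram_intersection(h1, h2):
    """Overlap of two histograms, scaled so that identical histograms give 1.0."""
    return sum(min(a, b) for a, b in zip(h1, h2)) / sum(h1)


# Toy "frames": mostly one colour each, sharing a dark region.
room_a = [(200, 180, 160)] * 90 + [(30, 30, 30)] * 10
room_b = [(60, 80, 200)] * 70 + [(30, 30, 30)] * 30

hist_a = relative_histogram(room_a)
hist_b = relative_histogram(room_b)
similarity = histogram_intersection(hist_a, hist_b)
```

In this toy data the two frames overlap only in the dark bins, so the similarity is low even though both contain the same dark region.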

Transfer to Time Series

Detect, Align, Describe, Match

Marcetic, D., & Ribaric, S.: An Online Multi-Face Tracker for Unconstrained Videos.

Based on Time Series

Optical Flow

Lamba, S., & Nain, N.: Oriented Tracklets Approach for Anomalous Scene Detection in High Density Crowd.

Optical Flow with Camera Motion Compensation

Beumier, C., & Neyt, X.: Trajectories and Camera Motion Compensation in Aerial Videos.

[Diagram: points A, A′, a, b, c, d, e with motion vectors ΔA, ΔA′, Δa, Δa − ΔA and Δa′.]

Information on motion from upper octave (ΔA, ΔA′)

Feature motion relative to upper octave (Δa − ΔA)

Estimation of new position of the feature (Δa′)
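One plausible reading of the three steps above is that a feature keeps its motion relative to the upper octave, so the new feature motion is estimated as Δa′ ≈ ΔA′ + (Δa − ΔA). The slides do not give this formula explicitly; it is an assumption made for this sketch, and the function names and numbers are hypothetical.

```python
def add(u, v):
    """Component-wise sum of two 2-D vectors."""
    return (u[0] + v[0], u[1] + v[1])


def sub(u, v):
    """Component-wise difference of two 2-D vectors."""
    return (u[0] - v[0], u[1] - v[1])


def relative_motion(delta_a, delta_A):
    """Feature motion relative to the upper octave: delta_a - delta_A."""
    return sub(delta_a, delta_A)


def estimate_feature_motion(delta_a, delta_A, delta_A_next):
    """Assumed model: the relative motion is carried over to the next step."""
    return add(delta_A_next, relative_motion(delta_a, delta_A))


delta_A = (4.0, 0.0)       # upper-octave motion, previous step
delta_A_next = (5.0, 1.0)  # upper-octave motion, current step
delta_a = (6.0, -1.0)      # feature motion, previous step

delta_a_next = estimate_feature_motion(delta_a, delta_A, delta_A_next)
predicted_position = add((100.0, 50.0), delta_a_next)
print(delta_a_next, predicted_position)  # (7.0, 0.0) (107.0, 50.0)
```

If the upper-octave motion mostly reflects camera motion, the relative term Δa − ΔA isolates the motion of the feature itself.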

Joint Detection, Tracking and Description

On Limited Scope

Face Detect, Track, Describe

Tapu, R., Mocanu, B., & Zaharia, T.: Face recognition in video streams for mobile assistive devices dedicated to visually impaired.

With Shot-based Metalearning

Based on Type of Content: Training

1 Split content into individual shots
2 Based on statistical properties, pick a set of processing methods
3 Store the best-performing methods in a database
4 Train a method selection model based on statistical properties
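The training steps above can be sketched in a few lines. Everything concrete here is an assumption for illustration: the shot statistics, the method names and their scores are made up, the "database" is a plain list, and a 1-nearest-neighbour lookup stands in for the method selection model, which the slides do not specify.

```python
import math


def best_methods(scores, k=1):
    """Names of the k highest-scoring processing methods for one shot."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def build_database(shots, k=1):
    """Store (shot statistics, best-performing methods) pairs."""
    return [(shot["stats"], best_methods(shot["scores"], k)) for shot in shots]


def select_methods(database, stats):
    """Stand-in selection model: methods of the nearest stored shot."""
    nearest = min(database, key=lambda entry: math.dist(entry[0], stats))
    return nearest[1]


# Hypothetical shots: statistics are (mean brightness, motion magnitude).
shots = [
    {"stats": (0.8, 0.1),
     "scores": {"face_tracker": 0.9, "optical_flow": 0.4, "object_detector": 0.6}},
    {"stats": (0.3, 0.9),
     "scores": {"face_tracker": 0.2, "optical_flow": 0.8, "object_detector": 0.5}},
]

database = build_database(shots, k=2)
chosen = select_methods(database, (0.7, 0.2))
print(chosen)  # ['face_tracker', 'object_detector']
```

A real selection model would be trained on many shots and could be any classifier over the statistical properties; the nearest-neighbour lookup only keeps the sketch short.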

Based on Type of Content: Testing

1 Split content into individual shots
2 Based on statistical properties, pick a set of processing methods
3 Run k best methods simultaneously
4 Provide result with highest support
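The last two testing steps can be sketched as a vote among the selected methods. This is a minimal illustration that assumes "support" means the share of methods agreeing on a result; the slides do not define it. The methods are hypothetical callables that each return a label for a shot, and they are run sequentially here rather than simultaneously.

```python
from collections import Counter


def highest_support(results):
    """Most frequent result and the fraction of methods that agree on it."""
    label, votes = Counter(results).most_common(1)[0]
    return label, votes / len(results)


def process_shot(shot, methods):
    """Run every selected method on the shot and keep the best-supported result."""
    results = [method(shot) for method in methods]
    return highest_support(results)


# Hypothetical methods; real ones would analyse the frames of the shot.
def method_a(shot):
    return "person"


def method_b(shot):
    return "person"


def method_c(shot):
    return "sports ball"


label, support = process_shot({"frames": []}, [method_a, method_b, method_c])
print(label, support)
```

Returning the support alongside the label lets a caller reject results on which too few methods agree.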

With Metalearning on Areas of Interest

Based on Domains of Interest: Training

1 Split content into individual shots and “layers”
2 Based on statistical properties, pick a set of processing methods
3 Store the best-performing methods in a database
4 Train a method selection model based on statistical properties

Based on Domains of Interest: Layer Splitting

Discussion

Thank you
