![Page 1: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/1.jpg)
Csaba Beleznai
Senior Scientist
Video- and Safety Technology
Safety & Security Department
AIT Austrian Institute of Technology GmbH
Vienna, Austria
Task-oriented Computer Vision in 2D and
3D: from video text recognition to 3D
human detection and tracking Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng,
Andreas Zoufal, Julia Simon, Daniel Steininger,
Markus Hofstätter und Andreas Kriechbaum
![Page 2: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/2.jpg)
Austrian Institute of Technology
Short intro – who are we in 20 seconds
![Page 3: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/3.jpg)
2D
Optical flow driven motion
analysis
Motivate & stimulate
Algorithms through applied examples
Contents
3D
Queue length and
waiting time estimation
3D
Left item detection
2D
Video text recognition
![Page 4: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/4.jpg)
4 11.07.2016
A frequently asked question
Introduction
![Page 5: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/5.jpg)
Why is Computer Vision difficult?
Primary challenge in case of Vision Systems (incl. biological ones):
uncertainty/ ambiguity
Image formation (2D)
?
?
?
? Low-level
Mid-level
High-level
?
?
?
Features
Groupings
Concepts
Visual analysis
Complementary
cues
(depth, more views,
more frames)
Complementary
groupings
(spatial, temporal -
across frames)
Complementary
high-level
information
(user, learnt)
Motivation
Prior knowledge
Parameters, off-
line and
incrementally
learned
information
? ?
(from a Bayesian perspective)
Shadow
Texture True boundary
![Page 6: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/6.jpg)
6
Example: Crop detection
Radial symmetry
Near regular structure
Example for robust vision
![Page 7: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/7.jpg)
IDEA
branch & bound
research methodology
PRODUCT APPLICATION
RESEARCH DEVELOPMENT
Alg. A
Alg. B
Alg. C
MATLAB C++
Motivation
Introduction
Challenges when developing Vision Systems:
Complexity Algorithmic, Systemic, Data
Non-linear search for a solution
PROBLEM
SOLUTION
TIME
Process of problem solving
A deeper understanding towards the problem is
developed during the search for a solution
![Page 8: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/8.jpg)
Visual Surveillance - Motivating example
Object detection and
classification
Tracking
Activity recognition
Typical surveillance scenario:
Who : people, vehicle, objects, …
Where is their location, movement?
What is the activity?
When does an action occur?
Motivation
Algorithmic units:
![Page 9: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/9.jpg)
Object detection and
classification
Counting, Queue length,
Density, Overcrowding
Abandoned objects
Intruders
Tracking
Single objects
Video search
Flow
Activity recognition
Near-field (articulation)
Far-field (motion path)
Motivation
Algorithmic units:
Typical surveillance scenario:
Who : people, vehicle, objects, …
Where is their location, movement?
What is the activity?
When does an action occur?
Visual Surveillance - Motivating example
![Page 10: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/10.jpg)
10
Real-time optical flow based particle advection
2D
![Page 11: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/11.jpg)
Optical flow driven advection
11
ti ti+1
Dense optical flow field
Advection: transport mechanism induced by a force field
Vx,i
Vy,i A particle trajectory
induced by the OF field
![Page 12: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/12.jpg)
Particle advection with FW-BW consistency
A simple but powerful test
Forward:
Backward:
Successful
Failure
< x x : mean offset Consistency check:
![Page 13: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/13.jpg)
Pedestrian Flow Analysis
Public dataset: Grand Central Station, NYC: 720x480 pixels, 2000 particles, runs at 35 fps
![Page 14: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/14.jpg)
Other examples: wide area surveillance (small objects, nuisance, clutter)
Wide-area Flow Analysis
![Page 15: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/15.jpg)
15
End-to-end video text recognition
2D
![Page 16: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/16.jpg)
Overview
INPUT
Presence (y/n)
OUTPUT
Text Detection Localization Propagation Segmentation Recognition, Propagation
The End-to-End Video Recognition Process
Location (single frames)
(x, y, w, h)
Location (frame span)
(x, y, w, h)
Binary image regions
Text (e.g. in ASCII)
Characterizing dynamic elements: running text
Evaluation: High accuracy at each stage is necessary
Very high recall throughout the chain
Increasing Precision toward the end of the chain
![Page 17: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/17.jpg)
Algorithmic chain - Motivation
Main strategies for text detection:
What is text (when appearing in images)?:
An oriented sequence of characters in close proximity, obeying a certain regularity
(spatial offset, character type, color). Sample text region + complex background
![Page 18: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/18.jpg)
Algorithmic chain - Motivation
To detect Representing text appearance:
• Region based:
• Binary morphology (outdated technique: trying to find nearby characters and
segmenting lines)
• Statistics
• Edge density, frequency, orientation (popular: HOG), …
• Texture representation: filter banks, co-occurrence, …
Discriminative classifier relatively fast, but some hard-to-discriminate
cases (vegetation, dense regular patterns /grids, gravel/) + poor region segmentation
• Analysis at character-level
• Requires a full or partial segmentation (a challenge itself) character or
stroke
• Highly specific (stroke width is uniform, shape is very specific)
Segmentation rather slow, but yields accurate segmentation
• Analysis at grouped-character-level: a sequence of similar characters is
specific
• Analysis at OCR-level: comparison to a pre-trained alphanumeric set highly
specific (slow!!)
![Page 19: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/19.jpg)
(a)
(b)
(c)
Improved text detection – synthetic text generation (Classification using Aggregated Channel Features)
![Page 20: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/20.jpg)
20 11.07.2016
![Page 21: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/21.jpg)
Convolutional Neural Network based OCR - Training Generated single characters (0-9, A-Z, a-z): include spatial jitter, font variations
6000 „0“
6000 „A“
6000 „Z“
role of jitter: characters can be recognized despite an offset at detection time
![Page 22: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/22.jpg)
22
Convolutional Neural Network based OCR - Results Analysis window is scanned along the textline, and likelihood ration (score1/score2) is
plotted in the row (below textline) belonging to the maximum classification score.
Optimum partitioning
![Page 23: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/23.jpg)
23
Left-item detection using depth and
intensity information
2D + 3D
Composite task:
Static object detection
Human detection and tracking
![Page 24: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/24.jpg)
24
What is a static object?
“non-human” foreground which keeps still
over a certain period of time
Two fundamentally different approaches:
1. Background modeling (foreground regions
becoming static)
• +: simple, pixel-based
• -: object removal, ghosts
2. Tracking detected foreground regions
• +: many adequate tracking approaches
(blob-based, correlation-based)
• -: crowd, occlusion failure
Both techniques experience problems with
illumination variations motivation for depth-
based sensing
Detecting stationary objects
![Page 25: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/25.jpg)
25
A common approach
Temporal sub-sampling and combination procedure Liao,H-H.; Chang,J-Y.; Chen, L-G. “A localized Approach
to abandoned luggage detection with Foreground –Mask
sampling”, Proc. of AVSS 2008, pp. 132-139.
Detecting stationary objects
![Page 26: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/26.jpg)
Obtaining stereo depth information
![Page 27: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/27.jpg)
Passive stereo based depth measurement
• Depth ordering of people
• Robustness against illumination,
shadows,
• Enables scene analysis
Advantage:
3D stereo-camera system developed by AIT
Area-based, local-optimizing, correlation-
based stereo matching algorithm
Specialized variant of the Census Transform
Resolution: typically ~1 Mpixel
Run-time: ~ 14 fps (Core-i7, multithreaded, SSE-optimized)
Excellent “depth-quality-vs.-computational-costs” ratio
USB 2 interface
![Page 28: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/28.jpg)
Stereo camera characteristics
28
Trinocular setup:
3 baselines possible
3 stereo computations with
results fused into one disparity
image
large baseline
near-range
far-range
small medium
![Page 29: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/29.jpg)
29
Planar surface in 3D space
(x,y) image coordinates, d disparity
d(x,y)
Data characteristics
Intensity image Disparity image
y
d
y
![Page 30: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/30.jpg)
Height (world)
Ground plane (world)
correct measurement noisy measurement
Stereo setup
Computed top view of the 3D point cloud
2.5D approach
3D approach
2.5D vs. 3D algorithmic approaches 2.5D == using disparity as an intensity image
![Page 31: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/31.jpg)
Additional knowledge (compared to existing video analytics solutions):
• Stationary object (Geometry introduced to a scene)
• Object geometric properties (Volume, Size)
• Spatial location (on the ground)
Left Item Detection
![Page 32: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/32.jpg)
Methodology
Background
model
Stereo disparity Combination
of proposals
+ Validation
Final
candidates Input images
Ground plane
estimation
Change detection INTENSITY
DEPTH
Processing intensity and depth data
Ortho-map
generation
Object
detection and
validation in
the ortho-
map
Ortho-transform
![Page 33: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/33.jpg)
33 11.07.2016
Left Item Detection – Demos
![Page 34: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/34.jpg)
34
Quantitative evaluation
Detection results Ground truth Depth-based proposals Motion-based proposals
![Page 35: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/35.jpg)
35 11.07.2016 (a)
Human/Object detection as clustering
![Page 36: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/36.jpg)
A Frequently Occurring Task
Analysis of discrete two-dimensional distributions
… … LEARNED
CODEBOOK
![Page 37: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/37.jpg)
37
Task definition
Intermediate probabilistic representations 2D distributions
prior, structure-specific knowledge
Local grouping generate consistent object
window hypotheses
arbitrarily shaped distributions
multiple nearby modes
noise, clutter
Challenge:
![Page 38: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/38.jpg)
38
Related State-of-the-Art
Weakly constrained structural prior:
Using structure information:
Non-maximum suppression A. Neubeck, Van Gool. Efficient non-maximum suppression. ICPR 2006
R. Rothe et al., Non-maximum suppression for object detection by
passing messages between windows, ACCV2014
Mean Shift, CAMShift
D. Comaniciu, P. Meer. Mean shift: A robust approach toward
feature space analysis., 2002
G. R. Bradski, Computer vision face tracking for use in a perceptual user
interface, 1998 Local structural elements such as bricks, shapelets
local intensity and color distribution
Implicit Shape Model
Neubeck & Van Gool, 2006 R. Rothe et al., 2014
B. Leibe et al. 2005
Comaniciu & Meer, 2002 Bradski 1998
Jin&Geman2006
Structured random forests edge structure
semantic label distribution within local patches Dollar & Zitnick 2013
Kontschieder et al. 2011
Chua et al., 2012 B. Leibe et al. 2005
![Page 39: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/39.jpg)
Shape learning – Case: Compact clusters
2. Sampling using an analysis window
discretized into a ni×ni grid
1. Binary mask from manual annotation or
from synthetic data
M
Off-the-mode
samples
Mode-centered
samples
Spatial resolution of local structure
3. Building a codebook of binary shapes
with a coarse-to-fine spatial resolution
Codebook:
Shape learning
![Page 40: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/40.jpg)
Example Codebook – Case: Compact clusters
FULL TREE
ZOOM LEVEL 1
ZOOM LEVEL 2
Shape learning
![Page 41: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/41.jpg)
Shape learning – Case: Line structures
Spatial resolution of local structure
low mid high Binary mask from manually annotated
text lines
Codebook:
Shape learning
![Page 42: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/42.jpg)
Shape delineation – I.
Shape delineation
Density measure for each resolution level for the binary structure
Step 1: Fast Mode Seeking
Step 2: Local density analysis
Three integral images: and
Mode location:
COMPACT CLUSTERS LINE STRUCTURES
Enumerating all binary shapes at each resolution level
Finding best matching entry:
![Page 43: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/43.jpg)
Shape delineation – II.
Shape delineation
Recursive search for end points, starting from
mode locations:
Line-centered structures Off-the-line
structures
… …
END POINT
CANDIDATES
VOTE MAP
Relative line end locations define: - Search direction
- Line end positions
![Page 44: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/44.jpg)
Experimental results - Case: Compact clusters
Human detection by occupancy map clustering:
13 m
Passive stereo depth sensing depth data projected orthogonal to the
ground plane
Occupancy map (1246×728 pix.) clustering: 56 fps, overall system (incl. stereo computation): 6 fps
![Page 45: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/45.jpg)
Experimental results - Case: Compact clusters
![Page 46: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/46.jpg)
Input image
Probability distribution for text
Binarization is very sensitive to employed threshold
Our scheme has no threshold, only local structural priors
Simple binarization
Proposed scheme
Experimental results - Case: Line structures (Text line segmentation)
![Page 47: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/47.jpg)
Experimental results - Case: Text line segmentation
![Page 48: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/48.jpg)
48
Queue length detection using depth and
intensity information
2D + 3D
![Page 49: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/49.jpg)
Queue Length + Waiting Time estimation
What is waiting time in a queue?
Ch
eckp
oin
t
Waiting time
Time measurement relating to last
passenger in the queue
Example: Announcement of waiting times (App) customer satisfaction
Why interesting?
Example: Infrastructure operator load balancing
![Page 50: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/50.jpg)
Queue analysis
Challenging problem
Waiting time =
Shape
No predefined shape (context/situation-dependent and time-varying)
Motion not a pure translational pattern
Propagating stop-and-go behaviour with a noisy „background“
Signal-to-noise ratio depends on the observation distance
Length
Velocity
1. What is the shape and extent of the queue?
2. What is the velocity of the propagation?
DEFINITION: Collective goal-oriented motion pattern of multiple humans
exhibiting spatial and temporal coherence
simple complex
![Page 51: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/51.jpg)
How can we detect (weak) correlation?
Much data is necessary Simulating crowding phenomena in Matlab
Social force model (Helbing 1998)
Source: Parameswaran et al. Design and Validation of a
System for People Queue Statistics Estimation, Video
Analytics for Business Intelligence, 2012
t
x
y
Correlation in space and time
goal-driven kinematics – force field repulsion by walls repulsion by „preserving privacy “
Visual queue analysis - Overview
![Page 52: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/52.jpg)
52
Simulation tool Creating infinite number of possible queueing zones
Two simulated examples (time-accelerated view):
Queue analysis
![Page 53: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/53.jpg)
Queue analysis (length, dynamics)
Staged scenarios, 1280x1024 pixels, computational speed: 6 fps
Straight line Meander style
![Page 54: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/54.jpg)
Estimated configuration
(top-view) Detection results
Adaptive estimation of the spatial extent of the queueing zone
Left part of the image is intentionally blurred
for protecting the privacy of by-standers,
who were not part of the experimental setup. stereo sensor
![Page 55: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/55.jpg)
Scene-aware heatmap
![Page 56: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/56.jpg)
Implementation details and strategy
![Page 57: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/57.jpg)
Our development concept
Implementation details
MATLAB C/C++
Method, Prototype,
Demonstrator
Real-time
prototype
MATLAB:
Broad spectrum of algorithmic libraries,
Well-suited for image analysis,
Visualisation, debugging,
Rapid development Method, Prototype, Demonstrator
C/C++
Real-time capability
Porting
mex
shared library
Computationally
intensive
methods
Verification
Matlab engine
![Page 58: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/58.jpg)
PC GPU FPGA / DSP
Standard
Methods
Advanced
Methods
Products
Innovations
Static Objects
Moving Objects
Moving Objects
Advanced Background Model
Person Detection
Person Detection
and Tracking
Automatic Calibration
Multi-Camera Tracking
Soft Biometrics
Our development concept
![Page 59: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/59.jpg)
Research methodology
Thematic areas and trends in Computer Vision also distributed branch-and-bound
Balance: becoming a domain expert vs. being a „globalist“
Researchers tend to favour certain paradigms - Learn to outline trends, look upstream
Revisit old problems to see them under new light
Specialize the general & Generalize the specific
Factorize your know-how (code, topics, …) into components sustainable, scalable
GRAND
CHALLENGES
RECOGNITION
SEGMENTATION
RECONSTRUCTION
TIME
![Page 60: Task-oriented Computer Vision in 2D and 3D: from video ...ssip/2016/wp-content/uploads/... · Csaba Beleznai Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia](https://reader033.vdocument.in/reader033/viewer/2022050523/5fa688e087a3fa7c977fbae7/html5/thumbnails/60.jpg)
Thank you for your attention!
CSABA BELEZNAI
Senior Scientist
Digital Safety & Security Department
Video- and Security Technology
AIT Austrian Institute of Technology GmbH
Donau-City-Straße 1 | 1220 Vienna | Austria
T +43(0) 664 825 1257 | F +43(0) 50550-4170
[email protected] | http://www.ait.ac.at
http://www.d-sens.eu/