Online Security Analytics on Large Scale Video Surveillance System, by Yu Cao and Xiaoyan Guo

Post on 08-Jan-2017



Online Security Analytics On Large Scale Video Surveillance System

Yu Cao, Xiaoyan Guo

EMC Corporation

Security Analytics On Video Surveillance

-- Retail Industry Customer Cases

• Search across video systems in all store locations to identify the customer of a fraudulent card transaction and his/her other transactions

• Correlate register transactions and surveillance video to identify fraudulent employee transactions where there is no customer present

• If multiple stores in a region are robbed, identify any faces that were in all of those stores in the weeks leading up to the events

Challenges In Big & Fast Data Era

• Cloud Integration

• M&O

• Fast Data Ingestion

• Multi-Latency Analytics

• Scalable Data Storage

EMC Video Analytics Data Lake


Where Spark Resides @ VADL

[Architecture diagram; recoverable labels:]

• Workloads: Offline Video Analytics & Model Training; Online Video Processing and Detection; Ad-Hoc Video Content Search; Video & Feature Storage Analytics

• Functions: Object Detection, Feature Extraction, Classification, Abnormal Detection, Face Recognition, Feature Indexing

• Spark components: Streaming, MLlib & GraphX, Core & SQL

• Other labels: Model; Deep Learning on Deep Learning Framework

Enable Spark to Process Raw Video Data

• Spark has no built-in video processing capability

• Combine the Spark program (Scala, Java) with a video processing library (C++)

PipedRDD: Invoking External Programs

• PipedRDD[T]: T => Linux Command(T) => String, where T is a text line

• Spark Pipe
– The pipe interface takes an external command and executes it outside Spark. The content of the RDD is written to the program's input stream, one element per line, and the lines the program prints form a new RDD.

• Java API
– JavaRDD<String> pipe(String command)
– JavaRDD<String> pipe(java.util.List<String> command)
– JavaRDD<String> pipe(java.util.List<String> command, java.util.Map<String,String> env)
– Returns an RDD created by piping elements to a forked external process
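To make the pipe contract concrete, here is a minimal plain-Scala sketch of what it does for the elements of a single partition: write each element as one line to the command's stdin, then read the command's stdout back as lines. It needs no Spark. `PipeSketch` and `pipeLines` are hypothetical names used only for illustration, and `tr` and `cat` stand in for real video processing programs.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets.UTF_8
import scala.sys.process._

object PipeSketch {
  // Emulates the pipe contract for one partition (illustrative helper, not a Spark API):
  // every input element becomes one stdin line, every stdout line becomes one output element.
  def pipeLines(lines: Seq[String], command: Seq[String]): List[String] = {
    val stdin = new ByteArrayInputStream(lines.map(_ + "\n").mkString.getBytes(UTF_8))
    // `#<` redirects the stream into the process; `!!` runs it and returns stdout,
    // throwing an exception if the command exits with a non-zero status.
    val stdout = (Process(command) #< stdin).!!
    stdout.linesIterator.toList
  }
}
```

Usage: `PipeSketch.pipeLines(Seq("road", "lobby"), Seq("tr", "a-z", "A-Z"))` upper-cases each line. Because the output is again a list of text lines, calls can be chained the same way successive `pipe` calls are chained on an RDD.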

Video Processing Function Implementation

• OpenCV
– Popular open source computer vision library

• Home-grown algorithms, e.g. CNN

• Video Processing Functions
– video file => video transcoding => list of frame images
– frame image => background extraction => background image
– frame image => object detection => list of objects
– object => feature extraction => object features
– ……

Pipeline Video Processing Tasks

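• Steps
– Implement all required video processing sub-components as external programs
– Pipeline these processing units by utilizing PipedRDD in Spark jobs

• Pseudo-code (Chaining & Pipeline)

// fromCameraStream and writeToHBase are pseudo-code placeholders, not Spark APIs.
// pipe(String) splits its argument on whitespace, so each command name is a single token.
sc.fromCameraStream("rtsp://10.67.89.10/road?fps=15")
  .pipe("video_transcoding")
  .pipe("object_detection")
  .pipe("feature_extraction")
  .writeToHBase()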

Online Video Processing During Ingestion

[Diagram: Video Ingestion System]

[Diagram: video streams => Object Detection => Feature Extraction => Classification/Recognition => Indexing/Storing. Other labels: Deep Learning Platform, Model, Real-time Detection, Real-time Dashboard]

Video Processing in Spark Streaming

• Receive Video Stream
– val snapList = stream.queueStream(rddQueue)
– Read the video stream at a fixed time interval and put the data into msgQueue
– rddQueue += sc.makeRDD(msgQueue)
– Then process snapList

• Spark Job (inside Spark Streaming)

rdd.pipe("video_transcoding").pipe("object_detection").pipe("feature_extraction").writeToHBase()

[Diagram labels: Feature & object store; Online Video Analytics App]

Online Video Analytics App

Video Content Search

• Content-based video object search
– Search for similar objects, given an object instance
– E.g. search historical video records for a suspect, given the suspect's identification photo

• Semantic-based video object search
– Search for objects that match a given semantic description
– E.g. given keywords: search "Red Porsche", "a woman sitting and smoking", etc.

Video Content Search Workflow

• Video Pre-processing and Feature Extraction

• Scalable Storage

• Object-based Indexing

• Similar Object Search Engine

[Workflow diagram; recoverable labels:]

– Ingestion: camera streams; video pre-processing (object detection and feature extraction); Object Detection; Feature Extraction; Index Building; HBase Ingestion; Index Ingestion

– Storage: HDFS; Multi-Tier Video Storage; HBase; Feature & Index; Object Info; Original Video Data

– Query: Web Dashboard; Web Backend; Search Engine; HBase Client; Feature Extraction

– Edge labels: query image; features; similar object search; top-k objects; object information query; similar objects

Video Object Similarity

• Similarity of Features == Similarity of Video Objects
– Color, Texture, Shape
– SIFT: 160 features, each a vector of 128 dimensions
– Deep Learning Features …

[Figure labels: Local Binary Pattern (LBP); Deep Learning Features in Different Layers]
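As an illustration of one such feature, the following is a minimal sketch of the basic 3x3 Local Binary Pattern. The slides do not say which LBP variant is used, so this is only the textbook form: each interior pixel gets an 8-bit code from comparing its eight neighbours with the centre, and the histogram of codes serves as a texture feature. `LbpSketch` and its functions are hypothetical names, and the image is assumed to be a grayscale matrix of integer intensities.

```scala
object LbpSketch {
  // Neighbour offsets (row, col), visited clockwise starting at the top-left corner.
  private val offsets =
    Seq((-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1))

  // 8-bit LBP code of the interior pixel at (r, c): a neighbour contributes a 1 bit
  // when its intensity is >= the centre intensity; the first neighbour is the top bit.
  def lbpCode(img: Array[Array[Int]], r: Int, c: Int): Int =
    offsets.foldLeft(0) { case (code, (dr, dc)) =>
      (code << 1) | (if (img(r + dr)(c + dc) >= img(r)(c)) 1 else 0)
    }

  // 256-bin histogram of the LBP codes of all interior pixels (border pixels are skipped).
  def lbpHistogram(img: Array[Array[Int]]): Array[Int] = {
    val hist = new Array[Int](256)
    for (r <- 1 until img.length - 1; c <- 1 until img(r).length - 1)
      hist(lbpCode(img, r, c)) += 1
    hist
  }
}
```

For the 3x3 image with rows (6, 5, 2), (7, 6, 1), (9, 8, 7), the centre pixel 6 yields the bit string 10001111, i.e. code 143. Two objects can then be compared by comparing their histograms.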

Feature Dimensionality Reduction

• PCA
– MLlib version PCA (when D is small)

• Scalable PCA
– Distributed PPCA implemented atop Spark (when D is large)

• LSH (Locality-Sensitive Hashing)
– LSH hashes input items so that similar items map to the same "buckets" with high probability

[Figure: Resize => Grayscale => SIFT => PCA]
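The bucket idea behind LSH can be sketched with random hyperplanes, a common LSH family for cosine similarity. The slides do not state which LSH family VADL uses, so this choice is an assumption, and `LshSketch`, `hyperplanes` and `bucket` are hypothetical names. Each hyperplane contributes one bit of the bucket key, so vectors pointing in similar directions tend to agree on most bits.

```scala
import scala.util.Random

object LshSketch {
  // Draw `bits` random hyperplanes in `dim` dimensions, each given by a Gaussian normal vector.
  // A fixed seed keeps the hash function reproducible across index build and query time.
  def hyperplanes(bits: Int, dim: Int, seed: Long): Array[Array[Double]] = {
    val rng = new Random(seed)
    Array.fill(bits, dim)(rng.nextGaussian())
  }

  // Bucket key of a feature vector: one character per hyperplane,
  // '1' when the vector lies on the positive side of that hyperplane, '0' otherwise.
  def bucket(v: Array[Double], planes: Array[Array[Double]]): String =
    planes.map { p =>
      val dot = p.zip(v).map { case (a, b) => a * b }.sum
      if (dot >= 0) '1' else '0'
    }.mkString
}
```

A query then only needs to be compared against the features that share its bucket (or nearby buckets), instead of against every stored feature. More bits give smaller buckets at the cost of missing some true neighbours.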

Spark Top-K Query Pipeline

[Diagram; recoverable content:]

• workers: each worker holds a partition of the stored features f(i)

• map: each f(i) is compared with the query feature qf

• order: each worker produces a sorted score array, Array[s1, s2, …, sn]

• top-k: the top-k most similar features are returned
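The map / order / top-k stages can be emulated on plain Scala collections. In this sketch each inner sequence plays the role of one worker's partition, and the global result is obtained by merging the per-worker top-k lists. `TopKSketch` and its members are hypothetical names, and squared Euclidean distance is only a stand-in for whichever similarity measure is actually used.

```scala
object TopKSketch {
  type Feature  = (String, Array[Double])                    // (object id, feature vector)
  type Distance = (Array[Double], Array[Double]) => Double   // smaller means more similar

  // Stand-in distance for the example; any Distance can be plugged in.
  val squaredEuclidean: Distance =
    (a, b) => a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // "order" + "top-k": sort by score and keep the k smallest.
  private def kSmallest(scored: Seq[(Double, String)], k: Int): Seq[(Double, String)] =
    scored.sortWith((l, r) => java.lang.Double.compare(l._1, r._1) < 0).take(k)

  // "map": score every feature of one partition against the query feature qf,
  // then keep only that partition's k best candidates.
  def localTopK(partition: Seq[Feature], qf: Array[Double], k: Int, d: Distance): Seq[(Double, String)] =
    kSmallest(partition.map { case (id, f) => (d(qf, f), id) }, k)

  // Merge step: the global top-k is the top-k of the union of all local top-k lists.
  def globalTopK(partitions: Seq[Seq[Feature]], qf: Array[Double], k: Int, d: Distance): Seq[(Double, String)] =
    kSmallest(partitions.flatMap(p => localTopK(p, qf, k, d)), k)
}
```

Keeping only k candidates per partition bounds the data sent back to the driver to k entries per worker, regardless of how many features each worker stores.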

Scala Code Example

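import org.apache.spark.SparkContext

def topRankScore(sc: SparkContext, top: Int, queryInput: String, trainPath: String,
                 useMethod: (Array[Double], Array[Double]) => Double) = {
  // Query feature: space-separated doubles
  val query = queryInput.split(" ").map(_.toDouble)
  // The slide uses featureFile without defining it; reading trainPath is assumed here.
  // Each line has the form "<name> <v1> <v2> ..."
  val featureFile = sc.textFile(trainPath)
  featureFile.filter(_.length > 0).map { line =>
    val parts = line.split(" ", 2)
    (useMethod(query, parts(1).split(" ").map(_.toDouble)), parts(0))
  }.takeOrdered(top)          // keeps the `top` smallest scores
    .map(i => (i._2, i._1))   // (name, score)
}

topRankScore(sc, topNumber.toInt, imageFeaturesStr, names, cosScore).foreach(println)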

Parameter: (sc, top-k, queried feature, HDFS feature file, similarity computing method)
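The slides do not show `cosScore`. Because `takeOrdered` returns the smallest elements first, the scoring function passed as the last argument has to behave like a distance (smaller means more similar) for the result to be the most similar features. A minimal sketch under that assumption, with `CosineSketch` and `cosDistance` as hypothetical names:

```scala
object CosineSketch {
  // Cosine distance 1 - cos(a, b): 0 for the same direction, 1 for orthogonal vectors,
  // 2 for opposite directions. A zero vector is treated as orthogonal to everything.
  def cosDistance(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "feature vectors must have the same dimension")
    val dot   = a.zip(b).map { case (x, y) => x * y }.sum
    val norms = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norms == 0.0) 1.0 else 1.0 - dot / norms
  }
}
```

A function with this shape matches the `(Array[Double], Array[Double]) => Double` parameter type, so it could be passed as the similarity computing method.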

Deep Learning @ VADL

• Feature extraction for detected video objects
– faces
– humans

• Classification of video objects
– With trained model

• Suspect detection and recognition

[Figure caption: Training neural networks with many layers]

Deep Learning With Spark

• DeepLearning4J (DL4J)
– Open source
– Variety of NNs & Flexibility
– Cross-platform & Scale
– Java Implementation
– Parallelization (Yarn, Spark)
– GPU support (also supports multi-GPU per host node)

• DeepDist
– Open source
– Deep belief networks
– Asynchronous stochastic gradient descent for data stored on HDFS / Spark
– Python Implementation

THANK YOU.

yu.cao@emc.com
