mining data from images and video for indexing and analysis

01/14/14 1

Mining Data from Images and Video for Indexing and Analysis

Bill Brouwer 01/14/13

wjb19@psu.edu

Current Role at PSU

Computational Scientist, Research Computing and Cyberinfrastructure (RCC), Penn State, 06/2011-present

- Consultant, High Performance Computing (HPC)
- Teaching & Personal Research
- CUDA, C/C++ programming, code profiling/optimization
- Co-writer/recipient of awards
- Local XSEDE Campus Champion
- Publications & Presentations
- Maintain/use ~100 open source examples in software stack

Overview

Objective
- Knowledge Discovery & Data Mining (KDD)
- Machines vs Humans

Example Problem
- Quantification in root structures

Methods
- Computer Vision Algorithms
- H.264/AVC codec

Solution
- avpipe

Knowledge Discovery & Data Mining (KDD)

Goal: simply put, to learn things from data; the data first needs to be in a database or other usable state

This is hard enough for text documents, and much harder for images and video because they are binary data

Even when metadata from tagging allows indexing and retrieval, large amounts of image data remain difficult to analyze

We want to make both indexing and analysis easier through software; useful data can be created from binary data by machines or by humans

Machine: Examples

SKYTree
- Startup that recently secured ~18M in Series A funding
- Provides solutions to 'big data' problems, deriving value from disparate data using machine learning (ML)

Roistr
- Startup dedicated to 'meaning discovery'
- Good for product recommendation problems, e.g., take a customer's Twitter feed and recommend some books to read on that basis

Plot2txt
- Personal start-up devoted to mining technical content from images using unsupervised ML
- Works well on spectroscopic and oil and gas data

Humans: Amazon Mechanical Turk

Crowd-sourced solution to problems that are hard for machines; the tasks are referred to as Human Intelligence Tasks (HITs)

Turkers are the masses, to whom other users can submit tasks via a web interface

Task examples include image tagging, comparison, and writing product descriptions

Not really scalable; humans are expensive and bad at accurate measurement, e.g., extracting quantitative data from images

Quantifying Root Structure

Extract frames and for each:
- Detect edges for structures of interest
- Create VTK of volumes for subsequent visualization & measurement

Problem provided by J. Yang (Brown/Lynch lab)
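The "create VTK of volumes" step can be illustrated with a minimal sketch. It assumes each processed frame is a binary mask and that frames are stacked along z to form a volume; the slides do not state this layout. The functions `volume_to_vtk` and `foreground_volume` are hypothetical names for this example and are not part of avpipe or of the VTK library mentioned under Project Status. The output follows the legacy VTK ASCII format for structured points.

```python
def volume_to_vtk(frames, spacing=(1.0, 1.0, 1.0), name="mask"):
    """Serialize a stack of equally sized 2D masks as legacy-VTK ASCII text.

    Assumption for this sketch: frames[z][y][x] holds 0 or 1, with one
    z-slice per video frame.
    """
    nz, ny, nx = len(frames), len(frames[0]), len(frames[0][0])
    header = [
        "# vtk DataFile Version 3.0",
        "binary masks stacked along z",
        "ASCII",
        "DATASET STRUCTURED_POINTS",
        f"DIMENSIONS {nx} {ny} {nz}",
        "ORIGIN 0 0 0",
        f"SPACING {spacing[0]} {spacing[1]} {spacing[2]}",
        f"POINT_DATA {nx * ny * nz}",
        f"SCALARS {name} unsigned_char 1",
        "LOOKUP_TABLE default",
    ]
    # Legacy VTK expects x to vary fastest, then y, then z.
    rows = [" ".join(str(v) for v in row) for frame in frames for row in frame]
    return "\n".join(header + rows) + "\n"


def foreground_volume(frames, spacing=(1.0, 1.0, 1.0)):
    """Estimate volume as (number of foreground voxels) x (voxel size)."""
    voxels = sum(v for frame in frames for row in frame for v in row)
    return voxels * spacing[0] * spacing[1] * spacing[2]
```

Writing the returned string to a `.vtk` file should give something a viewer such as ParaView can open; the voxel count is only a crude measurement and ignores any calibration between frames and physical distance.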

Methods

- Edge Detection
- Connected Components
- Binarization/thresholding
- Threaded computation & synchronization

Ubiquitous H.264/AVC codec, common to HD format playback and transmission
- Associated IP issues made development/deployment of software tricky/expensive
- Cisco recently open-sourced an implementation: http://blogs.cisco.com/collaboration/open-source-h-264-removes-barriers-webrtc/
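As a rough illustration of the image operations listed on this slide, the sketch below applies thresholding, 4-connected component labeling, and a simple boundary test to a small grayscale image stored as nested lists. The function names are hypothetical, the boundary test is a simplified stand-in for a real edge detector, and nothing here is taken from the avpipe source.

```python
from collections import deque


def neighbours(y, x):
    """4-connected neighbours of pixel (y, x)."""
    return ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))


def binarize(image, threshold):
    """Return a 0/1 mask: 1 where the pixel value exceeds the threshold."""
    return [[1 if px > threshold else 0 for px in row] for row in image]


def label_components(mask):
    """Label 4-connected foreground regions with 1, 2, ...; return (labels, count)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                for ny, nx in neighbours(y, x):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def boundary_pixels(mask):
    """Mark foreground pixels that touch background or the image border."""
    rows, cols = len(mask), len(mask[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for ny, nx in neighbours(r, c):
                inside = 0 <= ny < rows and 0 <= nx < cols
                if not inside or not mask[ny][nx]:
                    edges[r][c] = 1
                    break
    return edges
```

Counting the pixels that carry each label gives a per-structure area, which is the kind of quantitative measurement that is hard to obtain from human taggers.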

Solution: avpipe

Takes an AVI stream from stdin, decodes it, and sends frames to threads

Data extracted from frames may be saved to file or sent to stderr

Frames may be re-encoded after the operation and sent to stdout

Chain avpipe instances together using pipes

[Diagram labels: decode, threads, encode(?), out, stdout]
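The stdin-to-threads-to-stdout flow described on this slide can be sketched as a stream filter. This is an illustration under strong assumptions, not the avpipe implementation: raw fixed-size 8-bit grayscale frames stand in for decoded video (no AVI or H.264 handling), and `FRAME_BYTES`, `read_frames`, `process_frame`, and `run_filter` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

FRAME_BYTES = 4 * 4  # hypothetical frame size: 4x4 pixels, one byte each


def read_frames(stream, frame_bytes):
    """Yield fixed-size frames from a binary stream until it is exhausted."""
    while True:
        frame = stream.read(frame_bytes)
        if len(frame) < frame_bytes:
            return  # end of stream; a trailing partial frame is dropped
        yield frame


def process_frame(frame, threshold=128):
    """Binarize one frame and count its foreground pixels."""
    out = bytes(255 if b > threshold else 0 for b in frame)
    return out, sum(1 for b in out if b)


def run_filter(src, dst, log, frame_bytes=FRAME_BYTES, workers=4):
    """Process frames from src on worker threads.

    Processed frames go to dst (binary) and one measurement line per
    frame goes to log (text), mirroring the stdout/stderr split.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(process_frame, read_frames(src, frame_bytes))
        # map() returns results in input order, so frame order is preserved.
        for index, (out, count) in enumerate(results):
            dst.write(out)
            log.write(f"{index}\t{count}\n")
```

Calling `run_filter(sys.stdin.buffer, sys.stdout.buffer, sys.stderr)` from a script's entry point would let several such stages be joined with shell pipes, since each stage emits frames in the same format it consumes. Note that `ThreadPoolExecutor.map` collects its input eagerly, so a long stream would need bounded submission instead.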

Project Status

Basic framework released on GitHub: https://github.com/wjb19/avpipe

Currently incorporating:
- Codec
- Binarization & connected-component labeling (CCL)
- VTK output using a library developed by Burak Korkut http://liberlocus.blogspot.com/

Other applications?
