Symposium on Future Trends in Service-Oriented Computing, HPI Potsdam, 20-21.06.2013
Semantic Multimedia Analysis and Search
Dr. Harald Sack
Hasso-Plattner-Institut for IT-Systems Engineering
University of Potsdam
Potsdam, 21/06/2013
Friday, 21 June 2013
Harald Sack, Hasso-Plattner-Institute for IT-Systems Engineering, LDW 2011, Magdeburg, 30. Sep. 2011
• Searching Multimedia: Web vs. Archive
• How to Open Up Multimedia Data? Automated Multimedia Analysis
• How to Determine the Meaning of (Multimedia) Metadata? Context-Driven Semantic Analysis
• How to Make Use of Semantic Metadata? Exploratory Search and Intelligent Recommendations
Semantic Multimedia Analysis and Search
Lecture: Semantic Web, Dr. Harald Sack, Hasso-Plattner-Institut, University of Potsdam
Searching the Web
Google Knowledge Graph
= “search results with semantic-search information gathered from a wide variety of sources”
Harald Sack, Hasso-Plattner-Institute for IT-Systems Engineering, Workshop ‘Corporate Semantic Web’, XInnovations 2011, Berlin, 19. Sep. 2011
Google Multimedia Search
How does Google find Multimedia?
‣ Google Multimedia Search relies on text-based metadata and link context
Search by Media Content
The Ordinary Archive is a Small World...
Neil Armstrong
But wouldn’t it be nice if ...
Neil Armstrong
Sorry, no results found for ‘Neil Armstrong’...
...but maybe you are also interested in
- Buzz Aldrin (1 video)
- John Glenn (1 video)
- Yuri Gagarin (2 videos)
- Richard Nixon (3 videos)
- Apollo 11 (1 video)
- NASA (20 videos)
- Moon (14 videos)
- space exploration (34 videos)
- technology (1,205 videos)
How to Search in Multimedia Archives?
vfm - Seminar: Metadatenmanagement in Medienunternehmen, 05. September 2012, Bonn Jörg Waitelonis, Hasso-Plattner-Institut Potsdam
Content-Based Search in Multimedia Archives relies on text-based Metadata
Current Solution: Manual Annotation
(Selected) Automated Media Analysis
Visual Analysis (text / images)
• Visual Concept Detection
• Text Recognition
• Face Detection
• Logo Detection
Structural Analysis (audio-visual)
Audio Mining (audio)
• Automated Speech Recognition
• Audio Event Detection
Structural Video Analysis
• Decomposition of time-based media into meaningful media fragments of coherent content that can be used as basic elements for indexing and classification
Figure labels (coarse to fine): video, scenes, shots, subshots, frames; keyframes
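A minimal sketch of how a video could be cut into shots, assuming hard cuts show up as large changes in the gray-level histogram of consecutive frames. Histogram differencing is a common baseline and not necessarily the method used in the talk; the function names, the bin count, and the threshold are illustrative assumptions.

```python
def gray_histogram(frame, bins=8):
    """Normalized gray-level histogram of a frame given as rows of 0..255 ints."""
    hist = [0] * bins
    for row in frame:
        for value in row:
            hist[min(value * bins // 256, bins - 1)] += 1
    total = sum(hist) or 1
    return [count / total for count in hist]


def histogram_distance(h1, h2):
    """Half the L1 distance between two normalized histograms (range 0..1)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_shots(frames, threshold=0.5):
    """Split a frame sequence into shots; returns inclusive (start, end) index pairs."""
    if not frames:
        return []
    shots, start = [], 0
    previous = gray_histogram(frames[0])
    for index in range(1, len(frames)):
        current = gray_histogram(frames[index])
        if histogram_distance(previous, current) > threshold:  # assumed hard cut
            shots.append((start, index - 1))
            start = index
        previous = current
    shots.append((start, len(frames) - 1))
    return shots


def keyframe(shot):
    """Pick the middle frame of a shot as its keyframe (one simple convention)."""
    start, end = shot
    return (start + end) // 2


if __name__ == "__main__":
    dark = [[10] * 4 for _ in range(4)]
    bright = [[240] * 4 for _ in range(4)]
    shots = detect_shots([dark, dark, dark, bright, bright])
    print(shots, [keyframe(s) for s in shots])
```

Gradual transitions such as fades or dissolves would need a more tolerant comparison over a window of frames; this sketch only covers abrupt cuts.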
Video Optical Character Recognition (OCR)
Fig. 1. Workflow of the proposed text detection method. (b) is the vertical edge map of (a). (c) is the vertical dilation map of (b). (d) is the binary map of (c). (e) is the result map of subsequent connected component analysis. (f) shows the binary map after the adaptive projection profile refinement. (g) is the final detection result.
for text detection of natural scene images. The operator computes for each pixel the width of the most likely stroke containing the pixel. The output of the operator is a stroke-feature map, which has the same size as the input image, while each pixel represents the corresponding stroke width value of the input image.
3. TEXT DETECTION IN VIDEO IMAGES
Text detection is the first task of video OCR. Our approach determines whether a single frame of a video file contains text lines, for which a tight bounding box is returned. In order to manage detected text lines efficiently, we have defined a class "text line object" with the following properties: bounding box location (the top-left corner position) and bounding box size. After the first round of text detection, the refinement and the verification procedures ensure the validity of the detection results in order to reduce false alarms.
3.1. Text detector
Before performing the text detection process, a Gaussian smoothing filter is applied to the images that have an entropy value larger than a predefined threshold T_entr. For our purpose, T_entr = 5.25 has proven to work best.
We have developed an edge-based text detector, subsequently referred to as the edge text detector. The advantage of our detector is its computational efficiency compared to other machine-learning-based approaches, because no computationally expensive training period is required. However, for visually different video sequences a parameter adaptation has to be performed. The best-suited parameter combination of our method was learned from test runs on the given test data.
Fig. 2. Workflow of the proposed adaptive text line refinement procedure
The processing workflow for a single frame is depicted in Fig. 1 (a-e). First, a vertical edge map is produced using a Sobel filter [8] (cf. Fig. 1 (b)). Then, the morphological dilation operation is adopted to link the vertical character edges together (cf. Fig. 1 (c)). Let MinW denote the detected minimal text line width. A rectangle kernel 1 × MinW is defined for the vertical dilation operator. Subsequently, a binary mask is generated by using Otsu’s thresholding method [9]. Ultimately, we create a binary map after Connected Component
• Video OCR is much more difficult than traditional print OCR
• fast detection/filtering of text candidates
• verification of text candidates
• script separation from background
• visual quality enhancement
• application of standard OCR software
• spell correction w.r.t. context and temporal redundancy
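The first steps of the edge-based text detector described above (Sobel edge map, dilation, Otsu thresholding) can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the authors' implementation: frames are lists of rows with integer gray values 0..255, the 1 × MinW kernel is interpreted as a horizontal window, and the helper names are hypothetical. Connected component analysis and the refinement steps are left out.

```python
def sobel_vertical_edges(img):
    """Absolute horizontal Sobel response (strong at vertical edges), capped at 255."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            out[y][x] = min(255, abs(gx))
    return out


def dilate_rows(img, width):
    """Grayscale dilation with a 1 x width window: maximum over neighbours in a row."""
    half, w = width // 2, len(img[0])
    return [[max(row[max(0, x - half):min(w, x + half + 1)]) for x in range(w)]
            for row in img]


def otsu_threshold(img):
    """Otsu's method: threshold that maximizes the between-class variance."""
    hist = [0] * 256
    for row in img:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))
    best_t, best_var, w_bg, sum_bg = 0, -1.0, 0, 0
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg, mean_fg = sum_bg / w_bg, (sum_all - sum_bg) / w_fg
        variance = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if variance > best_var:
            best_var, best_t = variance, t
    return best_t


def text_mask(img, min_w=3):
    """Binary mask of text candidate pixels for one frame."""
    linked = dilate_rows(sobel_vertical_edges(img), min_w)
    threshold = otsu_threshold(linked)
    return [[value > threshold for value in row] for row in linked]


def bounding_box(mask):
    """Tight box (x, y, width, height) around all mask pixels, or None if empty."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    if not ys:
        return None
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)
```

On a real frame, the mask would contain several candidate regions; separating them requires the connected component step that the excerpt mentions.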
Video Face Detection, Tracking & Clustering
• Face Detection: detect candidate image regions in a video frame that depict a human face
• Face Tracking: track a detected face in video over consecutive frames within shot boundaries
• Face Clustering: group faces detected and tracked in videos into visually similar sets within a single video
• Face Recognition/Identification: reliable identification of detected faces
Example detector output (figure labels): person, frontal face: 90%; person, profile face: 70%; not a person
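As an illustration of the tracking step, the sketch below links face detections across consecutive frames by bounding box overlap (intersection over union). This is one simple, commonly used strategy and an assumption here, not the tracker used in the talk. Boxes are (x1, y1, x2, y2) tuples, all frames are assumed to belong to one shot, and the names and the `min_iou` value are illustrative.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def track_faces(frames, min_iou=0.3):
    """Greedy tracking: frames is a list of per-frame box lists.

    Returns a list of tracks; each track is a list of (frame_index, box).
    A track ends as soon as no sufficiently overlapping box follows.
    """
    tracks, active = [], []
    for index, boxes in enumerate(frames):
        unmatched = list(boxes)
        still_active = []
        for track_id in active:
            last_box = tracks[track_id][-1][1]
            best = max(unmatched, key=lambda box: iou(last_box, box), default=None)
            if best is not None and iou(last_box, best) >= min_iou:
                tracks[track_id].append((index, best))
                unmatched.remove(best)
                still_active.append(track_id)
        for box in unmatched:  # every unmatched detection starts a new track
            tracks.append([(index, box)])
            still_active.append(len(tracks) - 1)
        active = still_active
    return tracks
```

The resulting tracks could then be the input for face clustering, where one representative descriptor per track is compared instead of every single detection.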
Visual Concept Detection
• Adaptation of the traditional ‘Bag of Words’ approach from text retrieval
• Image is expressed as a vector (histogram) of dictionary codeword frequencies
• Concept assignment via machine learning (here: Support Vector Machines)
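The histogram step of the Bag of Words approach can be sketched as follows: every local descriptor of an image is assigned to its nearest codeword, and the image is represented by the normalized codeword counts. The descriptors, the codebook, and the function names are illustrative assumptions; in practice the codebook is usually learned by clustering descriptors from training images, and the histograms are then passed to a classifier such as an SVM, which is not shown here.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[i])),
    )


def bow_histogram(descriptors, codebook):
    """Normalized histogram of codeword frequencies for one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts) or 1
    return [count / total for count in counts]


if __name__ == "__main__":
    # Toy 2-D descriptors; real local features have far more dimensions.
    codebook = [(0, 0), (10, 10), (0, 10)]
    descriptors = [(1, 0), (9, 9), (10, 11), (0, 9)]
    print(bow_histogram(descriptors, codebook))
```

Because every image maps to a vector of the same length, images with different numbers of local features become directly comparable.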
Annotation of Audiovisual Data
Metadata Extraction
Metadata (e.g. MPEG-7)
...
<SpatialDecomposition>
  <TextAnnotation>
    <KeywordAnnotation>
      <Keyword>Astronaut</Keyword>
    </KeywordAnnotation>
  </TextAnnotation>
  <SpatialMask>
    <SubRegion>
      <Polygon>
        <Coords> 480 150 620 480 </Coords>
      </Polygon>
    </SubRegion>
  </SpatialMask>
  ...
</SpatialDecomposition>
...
• Multimedia data with spatiotemporal annotations
Neil Armstrong
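A minimal sketch of reading such an annotation with Python's standard library. It parses only the simplified fragment shown on the slide (with the elided parts left out); real MPEG-7 documents use XML namespaces and a richer schema, and reading the `Coords` values as consecutive x/y pairs is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified fragment from the slide, without the elided "..." parts.
FRAGMENT = """
<SpatialDecomposition>
  <TextAnnotation>
    <KeywordAnnotation>
      <Keyword>Astronaut</Keyword>
    </KeywordAnnotation>
  </TextAnnotation>
  <SpatialMask>
    <SubRegion>
      <Polygon>
        <Coords> 480 150 620 480 </Coords>
      </Polygon>
    </SubRegion>
  </SpatialMask>
</SpatialDecomposition>
"""


def read_annotation(xml_text):
    """Return (keywords, points) for one annotated image region."""
    root = ET.fromstring(xml_text)
    keywords = [kw.text.strip() for kw in root.iter("Keyword")]
    numbers = [int(n) for n in root.find(".//Coords").text.split()]
    points = list(zip(numbers[0::2], numbers[1::2]))  # assumed x/y pairs
    return keywords, points


if __name__ == "__main__":
    print(read_annotation(FRAGMENT))
```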
www.yovisto.com
How to Determine the Meaning of Metadata?
• Authoritative Metadata
  • structured data
  • semi-structured data
  • natural language text
• Non-authoritative Metadata
  • (free) user tags and comments
  • restricted vocabularies
• (Media) Analysis Metadata
  • low level features
  • high level features
• etc.
Semantic Analysis (figure labels): reliability, context, pragmatics, location dependency, accuracy, time dependency, level of abstraction
‘Neil Armstrong’ is more than just a character string
[Figure: knowledge graph connecting entities with ontologies. Entities: Neil Armstrong, Yuri Gagarin. Classes: Astronaut, Cosmonaut, Person, Science Occupation, Employment. Relation labels: is a, is NOT a (!), subClassOf, has an, same as.]
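The point that an entity carries more meaning than its character string can be illustrated with a tiny triple store and a transitive subClassOf lookup. The triples below are an assumed reading of the figure labels, chosen for illustration only; the function names are hypothetical.

```python
# Assumed example triples (subject, predicate, object), not taken from a real dataset.
TRIPLES = {
    ("Neil Armstrong", "is a", "Astronaut"),
    ("Astronaut", "subClassOf", "Science Occupation"),
    ("Science Occupation", "subClassOf", "Employment"),
}


def superclasses(cls, triples):
    """All classes reachable from cls by following subClassOf edges."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                stack.append(obj)
    return found


def types_of(entity, triples):
    """Directly asserted classes of an entity plus all inferred superclasses."""
    direct = {o for s, p, o in triples if s == entity and p == "is a"}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


if __name__ == "__main__":
    print(sorted(types_of("Neil Armstrong", TRIPLES)))
```

A search for "Science Occupation" could thus return material about Neil Armstrong, although that string appears nowhere in its metadata.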
Where does the knowledge come from...?
Web of Data
[Figure: graph with the nodes Neil Armstrong, Astronaut, Person, Science Occupation and Employment, connected by edges labeled "is a" and "has a".]
Web of Data = Linked Open Data
But what if there is no trivial unique identification?
Armstrong (user tag)
Semantic Web Technologies , Dr. Harald Sack, Hasso Plattner Institute, University of Potsdam
Armstrong
Armstrong + Moon
Web of Data = Linked Open Data
Understanding requires Context
Armstrong
Moon
Eagle
Space
Semantic Analysis
Semantics is determined by Context
SEMEX Multimedia Context Model
Context Item: “Armstrong landed the Eagle on the Moon.” (Text)
Context Dimensions: Temporal Context, Spatial Context, Provenance Context
Contextual Description
Further figure labels: Relevance (determines), Ambiguity (influences), Accuracy (influences), Class Diversity, Level of Structure, Source Reliability, Source Diversity
N. Steinmetz, H. Sack: Semantic Multimedia Information Retrieval Based on Contextual Descriptions, 2013
Semantic Analysis: Named Entity Mapping
“Armstrong landed the Eagle on the Moon.”
Consider all entities within the same context
Candidate entities (figure labels: 95 entities, 448 entities, 156 entities):
Armstrong: George Armstrong Custer; Neil Armstrong; The Armstrong Twins; Armstrong, Florida; Armstrong, Ontario; Armstrong Automobile; Joe Armstrong; Armstrong County, Texas; Armstrong Gun; Craig Armstrong; Armstrong (Moon Crater); Louis Armstrong; Armstrong Tunnel; Louis Armstrong International Airport; Armstrong’s Theorem; Sir Thomas Armstrong; Ian Armstrong; Armstrong (British Columbia); Karen Armstrong; Curtis Armstrong; Gillian Armstrong; Hilary Armstrong; William L. Armstrong
Eagle: Eagle (Bird); Eagle (heraldry); USCGC Eagle; The Eagle (2011 film); Eagle (song); John H. Eagle; Eagle (typeface); Eagle Falls (Washington); Eagle (Moon Crater); Eagle (comic); Eagle (lunar module); Eagle TV; The Eagle (Pub); War Eagle; The Eagle (newspaper); Eagle (racehorse); Angela Eagle; Linda Eagle; James Philip Eagle
Moon: Man on the Moon (film); Moon (song); Moon Son-Ri; C Moon; The Moon (Tarot card); Edgar Moon; Moon OS; Moon (Band); Moon; Moon 44; Man on the Moon (soundtrack); William Moon; Lottie Moon; Mr. Moon (song); Man on the Moon (musical); Darvin Moon; Moon 83; Francis Moon; Gary Moon; Robert Charles Moon; Black Moon; Allan Moon; Ban Ki-moon; Fly me to the Moon (song)
Semantic Analysis: Named Entity Recognition
Entity Selection Process
Select matching entities from all possible candidate entities:
• Popularity-based strategies
• Linguistic strategies
• Statistical strategies
• Semantic-based strategies
General Approach
1. Make an assumption
2. Check whether the strategies support or contradict your assumption
3. Make a decision according to logical and probabilistic rules/constraints
Resources:
• reference text corpus (Wikipedia)
• link graph (Wikipedia)
• semantic graph (DBpedia)
N. Ludwig, H. Sack: Named Entity Recognition for User-Generated Tags, TIR 2011
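A minimal sketch of how candidate selection could combine a popularity-based and a link-graph-based strategy: among all combinations of candidates, prefer the one whose entities are most strongly connected to each other, and use popularity only as a tie-breaker. The candidate lists are shortened, the popularity scores and links are invented for illustration, and the scoring is a deliberately simple stand-in for the logical and probabilistic rules mentioned above.

```python
from itertools import combinations, product

# Shortened candidate lists for the terms of "Armstrong landed the Eagle on the Moon."
CANDIDATES = {
    "Armstrong": ["Louis Armstrong", "Neil Armstrong"],
    "Eagle": ["Eagle (Bird)", "Eagle (lunar module)"],
    "Moon": ["Moon", "Moon (song)"],
}

# Hypothetical popularity prior (e.g. derived from page views); values are made up.
POPULARITY = {
    "Louis Armstrong": 0.9, "Neil Armstrong": 0.8,
    "Eagle (Bird)": 0.9, "Eagle (lunar module)": 0.3,
    "Moon": 0.9, "Moon (song)": 0.1,
}

# Hypothetical undirected links between entities, as a link graph might provide them.
LINKS = {
    frozenset(pair) for pair in [
        ("Neil Armstrong", "Eagle (lunar module)"),
        ("Neil Armstrong", "Moon"),
        ("Eagle (lunar module)", "Moon"),
    ]
}


def coherence(entities, links):
    """Number of entity pairs in the combination that are linked to each other."""
    return sum(1 for a, b in combinations(entities, 2) if frozenset((a, b)) in links)


def disambiguate(candidates, links, popularity):
    """Pick one entity per term, maximizing coherence, then summed popularity."""
    terms = list(candidates)
    best = max(
        product(*(candidates[term] for term in terms)),
        key=lambda combo: (coherence(combo, links),
                           sum(popularity.get(entity, 0.0) for entity in combo)),
    )
    return dict(zip(terms, best))


if __name__ == "__main__":
    print(disambiguate(CANDIDATES, LINKS, POPULARITY))
```

Enumerating all combinations is only feasible for short contexts; with hundreds of candidates per term, the candidate lists would have to be pruned first.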
Semantic Analysis: Named Entity Recognition
“Armstrong landed the Eagle on the Moon.”
Entity Selection Process: (Semantic) Graph Analysis
[Figure: same candidate entities as on the Named Entity Mapping slide; highlighted candidates: Neil Armstrong, Eagle (lunar module), Moon, Louis Armstrong, Fly me to the Moon (song).]
N. Steinmetz, H. Sack: Semantic Multimedia Information Retrieval Based on Contextual Descriptions, 2013
Semantically Annotated Multimedia
Video Analysis / Metadata Extraction
[Figure: metadata items attached along the video time axis, e.g., person xy, location yz, event abc]
Entity Recognition / Mapping
[Figure: metadata items mapped to entities from, e.g., bibliographical data, geographical data, encyclopedic data, ...]
N. Ludwig, H. Sack: Named Entity Recognition for User-Generated Tags. In Proc. of the 8th Int. Workshop on Text-based Information Retrieval, IEEE CS Press, 2011
Entity Based Search
• linguistic ambiguities of traditional keyword-based search can be avoided
• enables high precision and high recall retrieval
• Query string refinement / extension
• entity auto-suggestion
• interpretation of natural language queries
http://www.yovisto.com/labs/autosuggestion/
J. Osterhoff, J. Waitelonis, H. Sack: Widen the Peepholes! Entity-Based Auto-Suggestion as a rich and yet immediate Starting Point for Exploratory Search, IVDW 2012
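Entity auto-suggestion can be sketched as a prefix lookup over entity labels that returns typed entities instead of plain strings, so that the user picks an unambiguous entity before the search runs. The entity list, the types, and the popularity scores are illustrative assumptions and do not describe the yovisto implementation.

```python
from collections import namedtuple

Entity = namedtuple("Entity", "label type popularity")

# Hypothetical entity index; values are made up for illustration.
ENTITIES = [
    Entity("Neil Armstrong", "Person", 0.8),
    Entity("Louis Armstrong", "Person", 0.9),
    Entity("Armstrong (Moon Crater)", "Place", 0.2),
    Entity("Apollo 11", "Mission", 0.7),
]


def suggest(prefix, entities, limit=5):
    """Entities with a label token starting with prefix, most popular first."""
    needle = prefix.casefold()
    hits = [
        entity for entity in entities
        if any(token.strip("(),").startswith(needle)
               for token in entity.label.casefold().split())
    ]
    hits.sort(key=lambda entity: (-entity.popularity, entity.label))
    return hits[:limit]


if __name__ == "__main__":
    for entity in suggest("arm", ENTITIES):
        print(f"{entity.label} [{entity.type}]")
```

Showing the entity type next to each label is what lets a user tell the astronaut from the musician or the crater at a glance.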
http://mediaglobe.yovisto.com:8080/mggui-dev2/
search facets
C. Hentschel, H. Sack, et al., Open up cultural heritage in video archives with mediaglobe, I2CS 2012
Explorative Search
[Figure: DBpedia graph around dbpedia:Neil_Armstrong. Resources: dbpedia:Neil_Armstrong, dbpedia:Buzz_Aldrin, dbpedia:Michael_Collins, dbpedia:Apollo_11, dbpedia:Apollo_13, dbpedia:Space_Shuttle_Challenger, category:Apollo_program, yago:Space_accidents_and_incidents. Properties: dbpedia-owl:mission (3 edges), dcterms:subject (2 edges), rdf:type (2 edges).]
http://mediaglobe.yovisto.com:8080/
J. Waitelonis, H. Sack: Towards exploratory video search using linked data, MTAP Volume 59, Number 2 (2012), 645-672
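A minimal sketch of how related resources could be found for exploratory search: two resources are considered related if they share the same property and object. The triples are an assumed reading of the figure (which edge connects which nodes is not recoverable from the labels alone), and the function name is hypothetical.

```python
from collections import defaultdict

# Assumed edges, reconstructed for illustration from the labels in the figure.
TRIPLES = [
    ("dbpedia:Neil_Armstrong", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Buzz_Aldrin", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Michael_Collins", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Apollo_11", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "rdf:type", "yago:Space_accidents_and_incidents"),
    ("dbpedia:Space_Shuttle_Challenger", "rdf:type", "yago:Space_accidents_and_incidents"),
]


def related(resource, triples):
    """Map each related resource to the (property, object) pairs it shares."""
    subjects_by_po = defaultdict(set)
    for subject, prop, obj in triples:
        subjects_by_po[(prop, obj)].add(subject)
    shared = defaultdict(list)
    for subject, prop, obj in triples:
        if subject == resource:
            for other in subjects_by_po[(prop, obj)] - {resource}:
                shared[other].append((prop, obj))
    return dict(shared)


if __name__ == "__main__":
    for start in ("dbpedia:Neil_Armstrong", "dbpedia:Apollo_11", "dbpedia:Apollo_13"):
        print(start, "->", sorted(related(start, TRIPLES)))
```

Following such shared properties step by step leads from Apollo 11 to Apollo 13 and on to the Challenger accident, which is the kind of path an exploratory interface can offer as recommendations.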
Exploratory Search and Serendipity
• Find something that you were not looking for on purpose ...
dbpedia:Buzz_Aldrin
dbpedia:Cookie_Monster
dbpedia:Strictly_Come_Dancing
dbpedia:Transformers
Explorative Search & Intelligent Recommendation with yovisto
http://mediaglobe.yovisto.com:8080/
Contact:
Dr. Harald Sack
Hasso-Plattner-Institut für Softwaresystemtechnik
Universität Potsdam
Prof.-Dr.-Helmert-Str. 2-3
D-14482 Potsdam
Homepage: http://www.hpi.uni-potsdam.de/meinel/team/sack.html
Blog: http://yovisto.blogspot.com/
E-Mail: [email protected]
Twitter: lysander07 / biblionomicon / yovisto
Slides can be found at http://slideshare.com/lysander07/
Thank you very much
for your attention!