graph-based multimodal clustering for social event detection in large collections of images
DESCRIPTION
Presentation by my colleague Giorgos Petkos of our paper at the Multimedia Modeling conference (MMM2014) in Dublin.
TRANSCRIPT
MMM 2014
Graph-based multimodal clustering for social event detection in large collections of images
Georgios Petkos, Symeon Papadopoulos, Emmanouil Schinas, Yiannis Kompatsiaris
Information Technologies Institute (ITI)
Centre for Research & Technology Hellas (CERTH)
MMM 2014 Georgios Petkos et al.
Overview
• The problem of social event detection
• Existing approaches
• Proposed approach
• Evaluation
• Summary & future work
the problem
Social events?
Attended by people and represented by multimedia content shared online
• personal: wedding / birthday / drinks
• entertainment: concert / play / sports
• news: demonstration / riot / speech
[Figure: Pope Benedict vs. Pope Francis, with timeline: 2007 iPhone release, 2008 Android release, 2010 iPad release. Source: http://petapixel.com/2013/03/14/a-starry-sea-of-cameras-at-the-unveiling-of-pope-francis/]
Social event detection
Social event detection involves the automatic organization of a multimedia collection C into groups of items, each of which corresponds to a distinct event. It can be treated as a multimodal clustering problem.
[Figure: COLLECTION → EVENT DETECTION → EVENT SET (E1, E2, ..., EN)]
existing approaches
Supervised event detection
• Rationale: use a large number of “known” event assignments to “learn” how to identify “same event” / “same cluster” relationships
Two variants:
• Item-to-item: learn whether two items belong to the same event cluster or not.
– Model input: the set of per-modality distances between two images.
• Item-to-cluster: learn whether a new item belongs to a given event cluster or not.
– Model input: the set of per-modality distances between an image and a prototype representation of the event.
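The item-to-item model input described above can be sketched as a small function that turns a pair of photos into a vector of per-modality distances. This is a minimal illustration under stated assumptions, not the authors' feature extraction: the `Photo` record, the `pair_distances` name, the use of Jaccard distance on tags as a stand-in for the textual modality, and the extra "has location" flag are all hypothetical, and visual features are left out to keep the example self-contained.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)


@dataclass(frozen=True)
class Photo:
    """Hypothetical photo record; field names are assumptions."""
    uploader: str
    taken_at: float                                  # Unix timestamp, seconds
    tags: FrozenSet[str]
    location: Optional[Tuple[float, float]] = None   # (lat, lon) in degrees


def haversine_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def jaccard_distance(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """1 - |a ∩ b| / |a ∪ b|; defined as 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def pair_distances(p: Photo, q: Photo) -> List[float]:
    """Per-modality distance vector for one pair of photos.

    Order: [different_uploader, hours_apart, tag_distance, geo_km, has_geo].
    """
    different_uploader = 0.0 if p.uploader == q.uploader else 1.0
    hours_apart = abs(p.taken_at - q.taken_at) / 3600.0
    tag_distance = jaccard_distance(p.tags, q.tags)
    if p.location is not None and q.location is not None:
        geo_km, has_geo = haversine_km(p.location, q.location), 1.0
    else:
        # Location is optional, so flag its absence instead of guessing.
        geo_km, has_geo = 0.0, 0.0
    return [different_uploader, hours_apart, tag_distance, geo_km, has_geo]
```

A vector like this, labelled "same event" or "different event" from known event assignments, is what a binary classifier would be trained on in the item-to-item variant.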
Utilizing the “same event” model for clustering
• Item-to-item:
– (Incremental) For each incoming image, average the item-to-item same-event (SE) scores over the items of each cluster. Assign the image to the best-matching cluster if the average is above a threshold; otherwise create a new cluster (Becker et al.).
– (Batch) Compute the item-to-item SE scores between each image and all other images to form an indicator vector per image, then cluster the indicator vectors (Petkos et al.).
• Item-to-cluster:
– (Incremental) Maintain a multimodal prototype representation for each cluster. Compute the SE score between each incoming item and the existing prototype event representations. Assign the item to the best-matching cluster if the score is above a threshold; otherwise create a new cluster (Becker et al.). Alternatively, use a second model to decide whether a new cluster should be added (Reuter et al.).
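The incremental item-to-item scheme above can be sketched in a few lines. This is an illustrative reading of the slide, not the implementation of Becker et al.: the function name `incremental_item_to_item` and the `se_score` callback (any function returning a same-event score for two items) are assumptions.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def incremental_item_to_item(
    items: Sequence[T],
    se_score: Callable[[T, T], float],
    threshold: float,
) -> List[List[T]]:
    """Assign each incoming item to the cluster with the highest average
    same-event score, or open a new cluster if no average exceeds the
    threshold."""
    clusters: List[List[T]] = []
    for item in items:
        best_idx, best_avg = -1, float("-inf")
        for idx, members in enumerate(clusters):
            # Average the item-to-item scores over all members of the cluster.
            avg = sum(se_score(item, m) for m in members) / len(members)
            if avg > best_avg:
                best_idx, best_avg = idx, avg
        if best_idx >= 0 and best_avg > threshold:
            clusters[best_idx].append(item)
        else:
            clusters.append([item])
    return clusters
```

For example, with timestamps as items and a toy score `1 / (1 + |a - b|)`, the stream `[0, 1, 10, 11, 0.5]` with threshold 0.3 yields the clusters `[[0, 1, 0.5], [10, 11]]`. Note that every incoming item is scored against every item already clustered, which is the scalability cost that candidate neighbour selection is meant to reduce.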
proposed approach
Overview of proposed approach
• Item-to-item SE model utilized.
• Candidate neighbours selection step (first appears in Reuter et al.) using a set of per-modality indexes.
• Graph representation.
• Community detection on the graph. Two variants of the algorithm:
– Batch: SCAN
– Incremental: QCA
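The first three steps of this pipeline can be sketched as follows: fetch candidate neighbours from per-modality indexes, apply the same-event model only to those candidates, and store positive predictions as graph edges. This is a simplified sketch under assumptions, not the authors' code: only two indexes are shown (a sorted time index and an inverted tag index), photos are plain `(timestamp, tags)` tuples, and `same_event` stands in for the trained classifier.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

PhotoT = Tuple[float, Set[str]]  # (timestamp, tags); hypothetical layout


def build_same_event_graph(
    photos: List[PhotoT],
    same_event: Callable[[PhotoT, PhotoT], bool],
    time_window: float,
) -> Dict[int, Set[int]]:
    """Return an undirected adjacency dict over photo indices."""
    # Time index: photo ids sorted by timestamp, for range queries.
    by_time = sorted(range(len(photos)), key=lambda i: photos[i][0])
    times = [photos[i][0] for i in by_time]
    # Text index: inverted index from tag to photo ids.
    by_tag: Dict[str, Set[int]] = defaultdict(set)
    for i, (_, tags) in enumerate(photos):
        for tag in tags:
            by_tag[tag].add(i)

    graph: Dict[int, Set[int]] = {i: set() for i in range(len(photos))}
    for i, (t, tags) in enumerate(photos):
        # Candidates: close in time, or sharing at least one tag.
        lo = bisect_left(times, t - time_window)
        hi = bisect_right(times, t + time_window)
        candidates = set(by_time[lo:hi])
        for tag in tags:
            candidates |= by_tag[tag]
        # Both indexes are symmetric, so each pair is tested once (j > i).
        for j in candidates:
            if j > i and same_event(photos[i], photos[j]):
                graph[i].add(j)
                graph[j].add(i)
    return graph
```

The resulting adjacency dict is the input a community detection algorithm would then partition into events. The design point is that the classifier runs only on candidate pairs rather than on all pairs of photos.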
Proposed approach: advantages
• Item-to-cluster methods may suffer from incorrect prototype representations (due to averaging).
• The candidate neighbours selection step makes the method much more scalable.
• Graph representation: introduces a scalable item-to-item approach without averaging.
evaluation
Evaluation setup
• Used the dataset of the 2012 SED task of MediaEval
• Ground truth: 7,779 photos clustered around 149 events (18 technical, 79 soccer, 52 Indignados)
• Assess the following aspects:
– accuracy of same-event classification
– clustering quality of item-to-cluster versus the two versions of item-to-item (batch & incremental)
– contributions of different features
– generalization abilities of the same-event model
Evaluation setup
Features:
• Uploader identity.
• Actual image content:
– GIST
– SURF, aggregated using the VLAD scheme
• Textual features: title, description and tags. Either a TF-IDF or a BM25 weighting scheme is utilized.
• Time of media creation.
• Location, when available (geodesic distance).
Appropriate indices are utilized in order to rapidly fetch the candidate neighbours for each modality.
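As an illustration of the textual modality, the sketch below computes TF-IDF weighted vectors and a cosine distance between them. It is a minimal example under assumptions: the slides do not give the exact TF-IDF variant, so raw term counts and `idf = log(N / df)` are used here, and the function names `tfidf_vectors` and `cosine_distance` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Sparse TF-IDF vectors: raw term count times log(N / document freq)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: c * math.log(n / df[t]) for t, c in tf.items()})
    return vectors


def cosine_distance(u: Dict[str, float], v: Dict[str, float]) -> float:
    """1 - cosine similarity; 1.0 if either vector has zero norm."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


# Toy usage: tokens from title, description and tags of three photos.
docs = [
    ["u2", "concert", "dublin"],
    ["u2", "concert", "live"],
    ["protest", "madrid"],
]
vecs = tfidf_vectors(docs)
near = cosine_distance(vecs[0], vecs[1])   # shares "u2" and "concert"
far = cosine_distance(vecs[0], vecs[2])    # no shared terms
```

With this weighting, a term that occurs in every document gets weight zero, so very common words contribute nothing to the distance.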
Evaluation: SE accuracy & clustering quality
• Same-event classification accuracy: 98.58% (SVM)
– 10K pos/neg training, 10K pos/neg testing (random)
• Clustering quality (NMI): 30/119 training/testing events [10 random splits]
– Incremental same or better than batch
– Item-to-item better than item-to-cluster (significant at 0.95 confidence)

       BATCH    INCREMENTAL    ITEM-TO-CLUSTER
AVG    0.924    0.934          0.898
STD    0.019    0.021          0.027

• When non-event photos enter the dataset, NMI degrades quickly

NON-EVENT    BATCH     INCREMENTAL    ITEM-TO-CLUSTER
5%           0.4824    0.5164         0.3954
10%          0.3421    0.3683         0.2899

* In the second table, results were obtained using SED 2011 for training and SED 2012 for testing.
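For reference, normalized mutual information (NMI) between a ground-truth event assignment and a predicted clustering can be computed as in the sketch below. The slides do not state which normalization was used; this example assumes the common form NMI = 2·I(U;V) / (H(U) + H(V)), and the function name `nmi` is hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def nmi(labels_true: Sequence[Hashable], labels_pred: Sequence[Hashable]) -> float:
    """Normalized mutual information, 2*I(U;V) / (H(U) + H(V))."""
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("label sequences must be non-empty and equally long")
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))

    # Mutual information from the contingency counts.
    mi = sum(
        (c / n) * math.log(c * n / (count_true[a] * count_pred[b]))
        for (a, b), c in joint.items()
    )
    h_true = -sum((c / n) * math.log(c / n) for c in count_true.values())
    h_pred = -sum((c / n) * math.log(c / n) for c in count_pred.values())
    if h_true == 0.0 and h_pred == 0.0:
        return 1.0  # both partitions are a single cluster: identical
    return 2.0 * mi / (h_true + h_pred)
```

NMI is invariant to how clusters are named: `nmi([0, 0, 1, 1], [1, 1, 0, 0])` is 1 up to floating-point rounding, while two independent partitions such as `[0, 0, 1, 1]` and `[0, 1, 0, 1]` score 0.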
Evaluation: contribution of features
• Same experiments using limited sets of features

FEATURES        BATCH              INCREMENTAL
VISUAL          0.8020 ± 0.0193    0.8179 ± 0.0151
TEXTUAL         0.7925 ± 0.0255    0.7792 ± 0.0310
VISUAL+TIME     0.9244 ± 0.0195    0.9360 ± 0.0183
TEXTUAL+TIME    0.9016 ± 0.0173    0.9049 ± 0.0209

• Repeating the same experiments without the use of blocking led to significantly worse results
– e.g. 0.030 for visual, 0.7148 for textual
• Time is an extremely important feature
Evaluation: generalizing same event model
• Train using one event type, then test on a different one
• In most cases, negative impact
• In a few cases, performance is very high!

BATCH
              soccer    technical    Indignados
soccer        -         0.8658       0.8494
technical     0.7967    -            0.8977
Indignados    0.9645    0.8456       -

INCREMENTAL
              soccer    technical    Indignados
soccer        -         0.8892       0.8667
technical     0.7661    -            0.7735
Indignados    0.9845    0.8482       -
summary & future work
Summary
• Scalable item-to-item multimodal clustering approach for SED
• Key characteristics:
– Item-to-item “same event” model
– Candidate neighbour selection
– Organization of “same event” relationships into a graph
– Efficient graph clustering algorithms: SCAN (batch) / QCA (incremental)
• In general though, item-to-item approaches are less scalable than item-to-cluster approaches
Future work
• Extend method so that non-event images are properly handled
• Multiple sources of multimedia
• The MediaEval datasets are somewhat limited. Investigate the effect of crawling / image collection on the quality of results
thank you!
questions?
Acknowledgements
online clustering of same-event graph
• QCA maintains community structure incrementally following graph change operations: node and edge addition (removal operations are not applicable in the same-event graph). It is based on the concept of community attraction forces.
[Figure: nodes A, B, C, D, a new node X and a new edge; communities Cu, Cw, Cz; force from Cu to Cz and force from Cz to Cu]
• Depending on a test (computed based on local graph structure), community structure could remain the same, X assigned to Cu or A to Cz.
• If A is assigned to Cu, all its neighbours will be checked for potential reassignment.
graph clustering :: SCAN
[Figure: example graph illustrating an outlier, a hub, a (μ,ε)-core and structural similarity]
• resilient to spurious links (e.g. visual links that connect unrelated images)
• very fast (scales linearly with the number of edges)
• leaves under- and over-connected items out of the clustering
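The terms on this slide (structural similarity, (μ,ε)-core, hub, outlier) can be made concrete with a compact sketch of SCAN as described by Xu et al. It is an illustrative implementation, not the code used in the paper: the function names and the adjacency-dict input are assumptions, similarities are recomputed rather than cached, and the graph is assumed undirected with every node present as a key.

```python
import math
from collections import deque
from typing import Dict, Hashable, Iterable, Set, Tuple

Node = Hashable


def from_edges(edges: Iterable[Tuple[Node, Node]]) -> Dict[Node, Set[Node]]:
    """Build an undirected adjacency dict from an edge list."""
    adj: Dict[Node, Set[Node]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def scan(adj: Dict[Node, Set[Node]], eps: float, mu: int):
    """Return (labels, hubs, outliers) for an undirected graph."""
    # Closed neighbourhoods: each node counts as its own neighbour.
    gamma = {v: set(nbrs) | {v} for v, nbrs in adj.items()}

    def sigma(u: Node, v: Node) -> float:
        """Structural similarity of two nodes."""
        return len(gamma[u] & gamma[v]) / math.sqrt(len(gamma[u]) * len(gamma[v]))

    def eps_neighbourhood(v: Node) -> Set[Node]:
        return {w for w in gamma[v] if sigma(v, w) >= eps}

    labels: Dict[Node, int] = {}
    next_cluster = 0
    for v in adj:
        if v in labels or len(eps_neighbourhood(v)) < mu:
            continue  # already clustered, or not a (mu, eps)-core
        # Grow a new cluster from core v by breadth-first expansion.
        labels[v] = next_cluster
        queue = deque([v])
        while queue:
            u = queue.popleft()
            reachable = eps_neighbourhood(u)
            if len(reachable) < mu:
                continue  # border node: joins the cluster, does not expand it
            for w in reachable:
                if w not in labels:
                    labels[w] = next_cluster
                    queue.append(w)
        next_cluster += 1

    # Unclustered nodes: hubs bridge two or more clusters, the rest are outliers.
    hubs: Set[Node] = set()
    outliers: Set[Node] = set()
    for v in adj:
        if v not in labels:
            touched = {labels[w] for w in adj[v] if w in labels}
            (hubs if len(touched) >= 2 else outliers).add(v)
    return labels, hubs, outliers
```

On a toy graph with two 4-cliques `{0,1,2,3}` and `{4,5,6,7}`, a node 8 linked to 0 and 4, and a node 9 linked only to 0, calling `scan(adj, eps=0.7, mu=3)` puts each clique in its own cluster, marks 8 as a hub and 9 as an outlier. This is the behaviour that makes the method resilient to a single spurious link between otherwise unrelated groups of images.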
References
• Reuter, T., & Cimiano, P. (2012). Event-based classification of social media streams. In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval (p. 22). ACM.
• Petkos, G., Papadopoulos, S., & Kompatsiaris, Y. (2012). Social event detection using multimodal clustering and integrating supervisory signals. In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval (p. 23). ACM.
• Becker, H., Naaman, M., & Gravano, L. (2010). Learning similarity metrics for event identification in social media. In Proceedings of the Third ACM International Conference on Web Search and Data Mining, WSDM '10 (pp. 291–300). New York.
• Nguyen, N., Dinh, T., Xuan, Y., & Thai, M. (2011). Adaptive algorithms for detecting community structure in dynamic social networks. In INFOCOM 2011, 30th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, 10-15 April 2011, Shanghai, China (pp. 2282–2290). IEEE.
• Xu, X., Yuruk, N., Feng, Z., & Schweiger, T. (2007). SCAN: a structural clustering algorithm for networks. In Proceedings of the 13th ACM SIGKDD, KDD '07 (pp. 824–833). NY, USA: ACM.