Image4Act: Online Social Media Image Processing for Disaster Response
Firoj Alam, Muhammad Imran, Ferda Ofli
Qatar Computing Research Institute
Hamad Bin Khalifa University, Qatar
Time-Critical Events and Information Gaps
Disaster event (earthquake, flood) Destruction, Damage
Information gathering
Humanitarian organizations and local administration
Need information to help and launch response
Information gathering, especially in real-time, is the most challenging part
Relief operations
Disaster
2013 Pakistan Earthquake: September 28 at 07:34 UTC
2010 Haiti Earthquake: January 12 at 21:53 UTC
Social Media Data and Opportunities
Social Media Platforms
Availability of Immense Data:
Around 16 thousand tweets per minute were posted during Hurricane Sandy in the US.
Opportunities:
- Early warning and event detection
- Situational awareness
- Actionable information
- Rapid crisis response
- Post-disaster analysis
Disease outbreaks
Social Media Images During Disasters
Damage Severity Assessment from Images
Social Media is Noisy (Irrelevant & Duplicate Content)
Examples of irrelevant images showing cartoons, banners, advertisements, celebrities, etc., posted during the 2015 Nepal earthquake
Examples of near-duplicate images posted during the 2015 Nepal Earthquake
Automatic Image Processing Pipeline
Detailed Architecture
[Figure: pipeline architecture diagram]
Components: Tweet Collector, Image Collector, Image Filtering (relevancy filtering model, de-duplication model), Crowd Task Manager, Image Classifier(s), Persister, Web
Data flow labels: Tweets, Image URLs, Image downloading, Images, Classified images, Classified image paths, Crowd tasks & answers, Image labels
Checks: Is URL duplicate? Is relevant? Is duplicate?
Storage: In-memory DB, Image Hash DB, Postgres DB, filesystem (classified images)
Classifier outputs: Damage images, Injured people, Rescue efforts
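The order of the checks named in the architecture diagram can be sketched roughly as follows. This is a minimal sketch, not the system's actual code: the `Pipeline` class, its method names, and the toy predicates are all hypothetical, and the ordering (URL check, then relevancy, then de-duplication, then classification) is an assumption read off the diagram labels.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Set


@dataclass
class Pipeline:
    # Hypothetical stand-ins for the models named in the diagram.
    is_relevant: Callable[[str], bool]
    is_duplicate: Callable[[str], bool]
    classify: Callable[[str], str]
    # Plays the role of the in-memory store used for the URL check.
    seen_urls: Set[str] = field(default_factory=set)

    def process(self, image_url: str) -> Optional[str]:
        """Return a class label, or None if the image is filtered out."""
        if image_url in self.seen_urls:      # "Is URL duplicate?"
            return None
        self.seen_urls.add(image_url)
        image = image_url                    # a real system would download here
        if not self.is_relevant(image):      # "Is relevant?"
            return None
        if self.is_duplicate(image):         # "Is duplicate?"
            return None
        return self.classify(image)


# Toy predicates, purely for illustration.
pipeline = Pipeline(
    is_relevant=lambda img: "cartoon" not in img,
    is_duplicate=lambda img: img.endswith("_copy.jpg"),
    classify=lambda img: "damage",
)
urls = [
    "http://example.com/a.jpg",
    "http://example.com/a.jpg",        # same URL again
    "http://example.com/cartoon.jpg",  # irrelevant
    "http://example.com/b_copy.jpg",   # near-duplicate
]
results = [pipeline.process(u) for u in urls]
```

Only the first URL reaches the classifier; the other three are dropped by the URL check, the relevancy check, and the duplicate check respectively.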
Labeled Datasets
NE: Nepal earthquake; EE: Ecuador earthquake; TR: Typhoon Ruby; HM: Hurricane Matthew
Relevancy Filtering
Examples of irrelevant images showing cartoons, banners, advertisements, celebrities, etc.
Performance of the relevancy filtering
Task: Build a binary classifier to identify irrelevant images
Approach: Transfer learning (fine-tune a pre-trained convolutional neural network, e.g., VGG16)
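The transfer-learning idea can be illustrated with a toy, dependency-free sketch: reuse a fixed feature extractor and train only a new binary output head. This does not use VGG16 or any real CNN. The hand-written `frozen_features` function is a stand-in for pre-trained layers, the "images" are short lists of made-up pixel intensities, and full fine-tuning as named on the slide would also update the pre-trained layers rather than keep them frozen.

```python
import math


def frozen_features(pixels):
    """Stand-in for pre-trained layers: fixed, never updated during training."""
    mean = sum(pixels) / len(pixels)
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))
    return [mean, std]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=500, lr=0.5):
    """Train only a new logistic-regression head on top of the frozen features."""
    feats = [frozen_features(s) for s in samples]
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            p = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
            g = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def predict_irrelevant(pixels, w, b):
    x = frozen_features(pixels)
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5


# Made-up training data: flat, banner-like images are labeled irrelevant (1),
# high-contrast images are labeled relevant (0).
flat = [[0.4] * 16, [0.5] * 16, [0.6] * 16]
textured = [[0.0, 1.0] * 8, [0.1, 0.9] * 8, [0.2, 0.8] * 8]
w, b = train_head(flat + textured, [1, 1, 1, 0, 0, 0])

held_out_flat = [0.55] * 16
held_out_textured = [0.05, 0.95] * 8
```

With a real pre-trained network, `frozen_features` would be replaced by the network's convolutional layers and the head by a new final layer with two outputs.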
Duplicate Filtering
Examples of near-duplicate images
Task: Compute similarity between a pair of images
Approach: Perceptual Hash + Hamming Distance (w/ threshold)
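A minimal sketch of the hash-and-threshold approach, under stated assumptions: the slide does not say which perceptual hash is used, so a simple average hash is assumed here, images are represented as small grayscale grids (already downscaled), and the threshold of 2 bits is a made-up value for illustration.

```python
def average_hash(gray):
    """Average hash: one bit per pixel, set when the pixel is brighter than the mean."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(h1, h2):
    """Number of bit positions in which two hashes differ."""
    return bin(h1 ^ h2).count("1")


def is_near_duplicate(gray_a, gray_b, threshold=2):
    """Treat two images as near-duplicates if their hashes differ in few bits."""
    return hamming_distance(average_hash(gray_a), average_hash(gray_b)) <= threshold


original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
# Same picture, uniformly brightened, with one pixel overwritten (e.g. a watermark).
edited = [[p + 20 for p in row] for row in original]
edited[0][0] = 220
# A different picture: the bright and dark blocks are swapped.
different = [row[::-1] for row in original]
```

Because the hash compares each pixel to the image's own mean, a uniform brightness change leaves the bits unchanged, and only the overwritten pixel flips one bit.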
Before/After Image Filtering
Number of images that remain in our dataset after each image filtering operation
[Chart annotations: ~2%, ~2%, ~50%, ~58%, ~50%, ~30%]
Assuming that tagging an image costs $1, we could have gotten the same job done for $17k less, saving almost two-thirds of the budget.
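The cost argument above is simple arithmetic, sketched here as a small helper. The function name and the image counts in the example are hypothetical; the transcript gives only the $1 per image assumption and the resulting savings, not the underlying counts.

```python
def labeling_savings(total_images, images_after_filtering, cost_per_image=1.0):
    """Return (amount saved, fraction of the unfiltered budget saved)."""
    full_cost = total_images * cost_per_image
    filtered_cost = images_after_filtering * cost_per_image
    saved = full_cost - filtered_cost
    return saved, saved / full_cost


# Made-up counts: 1,000 collected images, 400 left after filtering.
saved, fraction = labeling_savings(1000, 400)
```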
Infrastructure Damage Assessment
• Three-class classification
– Categories: severe, mild & little-to-none
• Distinction between categories is ambiguous.
• Agreement among human annotators is low.
– in particular for mild category
• Fine-tuning a pre-trained CNN (e.g., VGG16)
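The slides do not say how annotator agreement was measured. As one common option, the sketch below computes Cohen's kappa for two annotators over the three categories; the choice of metric and the example labels are assumptions made for illustration only.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators, corrected for chance."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same class independently.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Made-up labels from two hypothetical annotators for six images.
annotator_1 = ["severe", "severe", "mild", "mild", "little-to-none", "little-to-none"]
annotator_2 = ["severe", "mild", "mild", "little-to-none", "little-to-none", "little-to-none"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy example both disagreements involve the mild category, which is the kind of ambiguity the slide points to.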
Deployment and Evaluation during Cyclone Debbie Event
Randomly selected 500 images
Manually labeled irrelevant images
Relevancy Filtering - Precision: 0.67
Duplicate Images - Precision: 0.92
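Precision here is the share of flagged images that were flagged correctly. A minimal sketch, with a hypothetical function name and made-up image ids rather than the evaluation data from the deployment:

```python
def precision(flagged, truly_positive):
    """Fraction of flagged image ids that are true positives."""
    if not flagged:
        return 0.0
    true_positives = len(flagged & truly_positive)
    return true_positives / len(flagged)


# Made-up ids: the filter flags images 1, 2 and 3; manual labels say 1, 2 and 5.
flagged_by_filter = {1, 2, 3}
manually_labeled = {1, 2, 5}
p = precision(flagged_by_filter, manually_labeled)
```

Image 5 is missed by the filter, but that affects recall, not precision.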
Thanks - Q & A
Follow this project: @aidr_qcri
We are looking for a PostDoc (computer vision, natural language processing, system development)
Contact us: [email protected]