![Page 1: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/1.jpg)
On the Quest for Changing Knowledge
Marco Brambilla, Stefano Ceri, Florian Daniel, Emanuele Della Valle
@marcobrambi
![Page 2: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/2.jpg)
Data-driven innovation
and
Innovation-driven data
![Page 3: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/3.jpg)
Innovation requires
Precise
To the point
Up-to-date
Domain-specific
information
![Page 4: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/4.jpg)
There are more things in heaven and earth, Horatio,
Than are dreamt of in your philosophy.
Shakespeare (Hamlet, Act 1, Scene 5)
![Page 5: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/5.jpg)
From Data to Wisdom
![Page 6: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/6.jpg)
Formalizing new knowledge is hard
Only high frequency emerges
The long tail challenge
![Page 7: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/7.jpg)
Knowledge Extraction
Text mining
Semantic Web
Search and recommendation systems
No specific care for emerging knowledge
![Page 8: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/8.jpg)
Heaven and Earth
How to peer through an effective window
on the real world?
Social media, our blessing and curse
Domain experts matter
![Page 9: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/9.jpg)
Can we use social networks to discover emerging knowledge?
![Page 10: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/10.jpg)
Beware the streetlamp effect
The bias of the source
The bias of the observer
![Page 11: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/11.jpg)
Famous Emerging
![Page 12: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/12.jpg)
Evolving Knowledge
![Page 13: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/13.jpg)
Overview
![Page 14: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/14.jpg)
Knowledge Enrichment Setting
![Page 15: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/15.jpg)
Emerging Knowledge Harvesting
![Page 16: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/16.jpg)
Domain Types
Types selected by the experts
Relevant for the domain
![Page 17: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/17.jpg)
Seed characterization
Selected by the expert
Belonging to an expert type
Thoroughly described
\# @ a w
![Page 18: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/18.jpg)
Social Media Sourcing
Content coming from the seeds’ accounts
![Page 19: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/19.jpg)
Candidate Selection
Potentially any entity extracted from the social streams
Resulting in huge sets of candidates
\# @ a w ♥
![Page 20: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/20.jpg)
Candidate Typing
![Page 21: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/21.jpg)
Candidate Pruning
Initial pruning of candidates based on TF-DF (*)
TF-DF := df * tf / (N - df + 1)
(*) A variant of TF-IDF that does not discount document frequency, because frequent appearance is welcome here
(we are not looking for information entropy!)
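
The TF-DF score can be illustrated with a minimal sketch. The slide does not define its symbols, so the usual TF-IDF readings are assumed here: `tf` is the total number of occurrences of a candidate across the collected posts, `df` is the number of posts that mention it, and `N` is the total number of posts. The function name `tf_df_scores` and the sample posts are hypothetical.

```python
from collections import Counter


def tf_df_scores(documents):
    """Score each term with TF-DF := df * tf / (N - df + 1).

    Assumption: tf is the total count of the term over all documents,
    df is the number of documents containing it, N is the document count.
    """
    n_docs = len(documents)
    tf = Counter()  # total occurrences of each term
    df = Counter()  # number of documents containing each term
    for tokens in documents:
        tf.update(tokens)
        df.update(set(tokens))
    return {t: df[t] * tf[t] / (n_docs - df[t] + 1) for t in tf}


# Hypothetical posts, already tokenized into candidate handles and hashtags.
posts = [
    ["@alice", "#fashion", "#milan"],
    ["@alice", "#fashion"],
    ["@bob", "#fashion", "@alice", "@alice"],
]
scores = tf_df_scores(posts)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Unlike TF-IDF, a candidate that appears in every post is not penalized: the denominator shrinks to 1 and the score grows with both `tf` and `df`, so rare candidates are the ones pruned first.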
![Page 22: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/22.jpg)
Candidate Ranking
![Page 23: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/23.jpg)
Candidate Vector Space
Purely syntactic
Semantic:
Based on entity extraction / DBpedia
Based on deep learning on images / ClarifAI
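
One way to picture ranking in a candidate vector space is to order candidates by their distance from the seeds. The slides do not say which distances are used or how seeds and candidates are compared, so the sketch below assumes cosine distance to the seed centroid; the feature vectors and the names `rank_candidates`, `cand_a`, `cand_b`, `cand_c` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rank_candidates(seed_vectors, candidate_vectors):
    """Sort candidate names by increasing distance from the seed centroid."""
    center = centroid(seed_vectors)
    return sorted(
        candidate_vectors,
        key=lambda name: cosine_distance(candidate_vectors[name], center),
    )


# Hypothetical 3-dimensional feature vectors.
seeds = [[1.0, 0.0, 1.0], [1.0, 0.2, 0.8]]
candidates = {
    "cand_a": [0.9, 0.1, 1.0],
    "cand_b": [0.0, 1.0, 0.0],
    "cand_c": [0.5, 0.5, 0.5],
}
ranking = rank_candidates(seeds, candidates)
```

Candidates whose features resemble those of the seeds land at the top of the ranking; swapping `cosine_distance` for another metric changes only the `key` function.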
![Page 24: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/24.jpg)
![Page 25: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/25.jpg)
Example Analysis
![Page 26: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/26.jpg)
Experiments
Fashion brands
Writers
Painters
Exhibitions
![Page 27: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/27.jpg)
4,400 strategies evaluated
44 alternative feature vectors (12 basic features and 32 aggregations)
9 different weighting values for aggregations
5 levels of recall for entity extraction
3 different distances
![Page 28: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/28.jpg)
Pruning Phase
From 4,400 down to 10 strategies
Eliminating the less relevant parameters
![Page 29: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/29.jpg)
Italian Fashion Brands
Precision @5 = 0.2
Increasing # seeds reduces precision
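
Precision @5 is read here in its standard sense: the fraction of the five top-ranked candidates that are judged relevant. The slides do not describe how relevance was assessed, so the relevance set below is a hypothetical placeholder, as are the candidate names.

```python
def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top-k ranked items that belong to the relevant set."""
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


# Hypothetical ranking and expert judgment: one relevant item in the top five.
ranked_candidates = ["c1", "c2", "c3", "c4", "c5", "c6"]
relevant_candidates = {"c3", "c6"}
p_at_5 = precision_at_k(ranked_candidates, relevant_candidates, k=5)
```

With one relevant candidate among the first five, `p_at_5` is 0.2; four relevant candidates out of five would give 0.8.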
![Page 30: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/30.jpg)
Australian Writers – 22 seeds
Precision @5 = 0.8
![Page 31: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/31.jpg)
Innovative Painters – 21 seeds
Precision @5 = 0.6
![Page 32: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/32.jpg)
Twitter vs. Instagram
P@5 = 1.0 vs. P@5 = 0.8
![Page 33: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/33.jpg)
Fashion: Twitter + Instagram
![Page 34: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/34.jpg)
Writers: Twitter + Instagram
Prec. = 1
![Page 35: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/35.jpg)
Conclusion
It’s about time to build innovation based on data
and build knowledge based on innovation
Harvesting can be iterative
![Page 36: On the Quest for Changing Knowledge. Capturing emerging entities from social media. WebScience 2016 DDI](https://reader035.vdocument.in/reader035/viewer/2022070511/58aeb3591a28ab00708b563b/html5/thumbnails/36.jpg)
On the Quest for Changing Knowledge
Contact us
Marco Brambilla, @marcobrambi, [email protected]
http://datascience.deib.polimi.it