From data to numbers to knowledge: semantic embeddings, by Alvaro Barbero
![Page 1: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/1.jpg)
www.iic.uam.es
![Page 2: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/2.jpg)
7 December 2016
Data Numbers Knowledge
Álvaro Barbero – Chief Data Scientist
![Page 3: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/3.jpg)
How do we see the world?
![Page 4: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/4.jpg)
Understanding correlations
The more, the better
Linear correlations
![Page 5: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/5.jpg)
Understanding correlations
“Moral virtue is a mean” - Aristotle
Convex correlations
![Page 6: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/6.jpg)
Understanding correlations
Few is bad, but slightly more is very good, unless you skip the sweet spot and then you are doing terribly, but if you keep going you get better until you do worse again and after that nothing changes and you do so-so
Nonlinear correlations
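A minimal sketch of why the linear case is the easy one, assuming NumPy and two made-up relations (a straight line and a "sweet spot in the middle" curve, neither taken from the slides). The Pearson coefficient captures the first and reports nothing for the second, even though both relations are perfectly deterministic.

```python
import numpy as np

# Hypothetical data: 201 evenly spaced inputs, symmetric around zero.
x = np.linspace(-1.0, 1.0, 201)

# "The more, the better": a purely linear relation.
linear = 2.0 * x

# "Virtue is a mean": the best outcome sits in the middle of the range.
sweet_spot = -(x ** 2)

def pearson(a, b):
    """Pearson linear correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

r_linear = pearson(x, linear)
r_sweet_spot = pearson(x, sweet_spot)

print(f"linear relation:     r = {r_linear:.3f}")
print(f"sweet-spot relation: r = {r_sweet_spot:.3f}")
```

The second coefficient is numerically zero: a linear measure sees no relation at all where there is an exact one.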
![Page 7: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/7.jpg)
BIG DATA TIMES
VERY VERY BIG DATA SCIENCE - Since 1802
THE REAL WORLD IS NON-LINEAR
Shocking revelations
Big data scientists rampage through the cities after the newly discovered uselessness of their linear models.
![Page 8: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/8.jpg)
The modelling dilemma
![Page 9: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/9.jpg)
The modelling dilemma
Linear models: easy to understand and explain, but may not faithfully represent reality
Non-linear models: accurate representations, but very difficult interpretation
![Page 10: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/10.jpg)
The brain trick
![Page 11: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/11.jpg)
Brain power
Low-level observations
Multilayered non-linear processing
Abstract yet linear concepts
![Page 12: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/12.jpg)
Abstraction examples
<<
Art quality
![Page 13: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/13.jpg)
Embeddings: abstract non-linearity away
Low-level observations
High-level embedding
Artificial multilayered non-linear processing
Deep Network
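A toy sketch of what "abstracting non-linearity away" can mean, assuming NumPy and synthetic data. The `embed` function is a hand-written stand-in for the mapping a deep network would have to learn; it is not the author's method. Two concentric rings cannot be separated by a straight line in the raw coordinates, but after the non-linear map a single threshold is enough.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical low-level observations: 2-D points on two concentric rings.
n = 200
angles = rng.uniform(0.0, 2.0 * np.pi, n)
radii = np.where(np.arange(n) < n // 2, 1.0, 3.0)
labels = (radii > 2.0).astype(int)          # 0 = inner ring, 1 = outer ring
points = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

def embed(p):
    """Hand-written non-linear map (assumption): squared distance to the origin."""
    return np.sum(p ** 2, axis=1)

# In the embedded space the classes sit near 1 and 9, so a simple
# threshold (a linear rule in one dimension) separates them.
z = embed(points)
predictions = (z > 4.0).astype(int)
accuracy = float((predictions == labels).mean())
print(f"accuracy with a single threshold on the embedding: {accuracy:.2f}")
```

In practice the map is not written by hand; the network learns it from data, which is the point of the slide.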
![Page 14: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/14.jpg)
![Page 15: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/15.jpg)
word2vec
cat chills on a mat
cat chills mushroom a mat
Socher et al (2013)
![Page 16: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/16.jpg)
Exploiting the linearity: semantic algebra
King - man + woman = queen
Obama - USA + Russia = Putin
human - animal = ethics
paella - Spain + Italy = risotto
Cristiano - Madrid + Barcelona = Messi
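The semantic algebra on this slide can be sketched as plain vector arithmetic followed by a nearest-neighbour search. The example below assumes NumPy and uses hand-made 4-dimensional toy vectors (hypothetical, not trained word2vec embeddings); `analogy` and `cosine` are illustrative helper names.

```python
import numpy as np

# Hypothetical toy embeddings, hand-made for illustration only.
# Assumed axes: royalty, maleness, femaleness, food.
vectors = {
    "king":   np.array([1.0, 1.0, 0.0, 0.0]),
    "queen":  np.array([1.0, 0.0, 1.0, 0.0]),
    "man":    np.array([0.0, 1.0, 0.0, 0.0]),
    "woman":  np.array([0.0, 0.0, 1.0, 0.0]),
    "paella": np.array([0.0, 0.0, 0.0, 1.0]),
}

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = vectors[a] - vectors[b] + vectors[c]
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

result = analogy("king", "man", "woman", vectors)
print(result)
```

With real embeddings the match is only approximate, which is why the result is taken as the nearest neighbour rather than an exact equality.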
![Page 17: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/17.jpg)
Bilingual word2vec
Figure: English words and Mandarin words projected by W_en and W_zh into a shared word space
Socher et al (2013)
![Page 18: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/18.jpg)
Text embedding models through Recurrent Neural Networks
"The lazy brown fox" → high dimensional representation of a sequence: [0.1, 0.5, 1.0, 0.0, 2.4]
Sutskever et al - Sequence to Sequence Learning with Neural Networks
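A minimal sketch of how a recurrent network turns a sentence into one fixed-size vector, assuming NumPy. It uses a plain tanh recurrence with random, untrained weights; this is a simplification and not the LSTM architecture of the cited paper. The vocabulary, dimensions, and the `encode` name are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical vocabulary and sizes, chosen only for illustration.
vocab = {"the": 0, "lazy": 1, "brown": 2, "fox": 3}
embed_dim, hidden_dim = 4, 5

# Random (untrained) parameters; a real model would learn these.
E = rng.normal(size=(len(vocab), embed_dim))            # word embeddings
W_xh = rng.normal(size=(embed_dim, hidden_dim)) * 0.5   # input-to-hidden
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.5  # hidden-to-hidden
b_h = np.zeros(hidden_dim)

def encode(sentence):
    """Read the sentence word by word; the final hidden state is its embedding."""
    h = np.zeros(hidden_dim)
    for token in sentence.lower().split():
        x = E[vocab[token]]
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)
    return h

vec = encode("The lazy brown fox")
print(vec.shape)
```

Whatever the sentence length, the output has the same size, so sentences can be compared like any other vectors.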
![Page 19: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/19.jpg)
Example: book embeddings
Cilibrasi, Vitányi - Normalized Web Distance and Word Similarity
Jonathan Swift
Oscar Wilde
William Shakespeare
![Page 20: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/20.jpg)
Embedding artistic styles
Gatys et al - A Neural Algorithm of Artistic Style
Bottou et al - Optimization Methods for Large-Scale Machine Learning
Low-level observations
Style embedding
![Page 21: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/21.jpg)
Artistic styles as linear relations
Google Research - https://research.googleblog.com/2016/10/supercharging-style-transfer.html
![Page 22: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/22.jpg)
An application: furniture embedding
Bell and Bala - Learning visual similarity for product design with convolutional neural networks
![Page 23: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/23.jpg)
Take home message
![Page 24: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/24.jpg)
Take home message
Reality is highly non-linear
Deep Learning can abstract complexity away
Complex relations become easy comparisons!
![Page 25: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/25.jpg)
![Page 26: From data to numbers to knowledge: semantic embeddings By Alvaro Barbero](https://reader031.vdocument.in/reader031/viewer/2022030305/5874147a1a28abcb5b8b5015/html5/thumbnails/26.jpg)
Álvaro Barbero Jiménez
Chief Data Scientist at Instituto de Ingeniería del Conocimiento (IIC)
Supporting graphic elements obtained from:
@albarjip
Alvaro Barbero