janos szendi-varga - bdu 3.0bdu.hu/2017/ppts/2017/szendi-varga_janos.pdf · 2017-12-12 · they are...
TRANSCRIPT
![Page 1: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/1.jpg)
GraphAware®
The power of polyglot searching
Janos Szendi-Varga
graphaware.com
@graph_aware
![Page 2: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/2.jpg)
Most frequently used UI element
Search Go
![Page 3: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/3.jpg)
Evolution of Internet Search
https://moz.com/blog/the-evolution-of-search
![Page 4: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/4.jpg)
Slide from BDU 2016
![Page 5: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/5.jpg)
We started to be Polyglot
Big data architecture is not a vision
We hired Data Scientists
We started to index things (Lucene)
We started to use Solr, ElasticSearch, etc.
It became part of our Big Data architecture
We introduced Search Infrastructure
Evolution in corporate search
![Page 6: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/6.jpg)
The fundamentals of search infrastructure
?
![Page 7: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/7.jpg)
They are aggregate-oriented databases; they have limitations when it comes to connected data
Typical setup: Two users searching for the same thing will get the same results
They are in the search 3.0-4.0 phase
They are superstars of Full text search
We need to extend this with Graph-aided search
We have to boost some search hits (c'mon, it is a recommender system)
We have to filter out some hits or degrade their score
We need Things, not Strings!!444!!!four!!!
Challenges
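The boosting, filtering and degrading described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not GraphAware's implementation: the hit list, the `graph_affinity` scores, the `blocked` set and the `boost_weight` parameter are hypothetical stand-ins for what a full-text engine and a graph query would return.

```python
from typing import Dict, List, Set, Tuple


def rerank(
    hits: List[Tuple[str, float]],
    graph_affinity: Dict[str, float],
    blocked: Set[str],
    boost_weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Re-rank full-text hits using per-user scores derived from a graph.

    hits: (document id, text relevance score) pairs from the search engine.
    graph_affinity: hypothetical per-user score for each document, e.g.
        the result of a recommendation query over the graph.
    blocked: documents to filter out for this user.
    """
    reranked = []
    for doc_id, text_score in hits:
        if doc_id in blocked:
            continue  # filter the hit out entirely
        # Positive affinity boosts the hit, negative affinity degrades it.
        final = text_score + boost_weight * graph_affinity.get(doc_id, 0.0)
        reranked.append((doc_id, final))
    # Highest combined score first.
    return sorted(reranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical hits for one query, shared by both users.
hits = [("tent", 2.0), ("stove", 1.5), ("boots", 1.0)]
alice = rerank(hits, {"boots": 2.0}, blocked={"stove"})
bob = rerank(hits, {"tent": -1.5}, blocked=set())
print(alice)
print(bob)
```

The same query now returns a different order for each user, which is the difference from the typical setup where everyone gets the same results.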
![Page 8: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/8.jpg)
Example of graph-based search
![Page 9: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/9.jpg)
“A knowledge graph is a multi-relational graph composed of entities as nodes and relationships as edges with different types that describe facts in the world.”
Knowledge graph
It is about “understanding the world as you and I do”.
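To make the definition concrete, here is a minimal sketch of a multi-relational graph stored as typed triples. The facts, the relationship names and the `facts_about` helper are chosen for illustration only and do not come from the talk.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

# (subject entity, relationship type, object entity)
Triple = Tuple[str, str, str]

# Illustrative facts; each edge has its own relationship type.
triples: List[Triple] = [
    ("Budapest", "CAPITAL_OF", "Hungary"),
    ("Hungary", "MEMBER_OF", "European Union"),
    ("Danube", "FLOWS_THROUGH", "Budapest"),
]

# Index outgoing edges by node so facts about an entity are easy to look up.
outgoing: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)
for subject, relation, obj in triples:
    outgoing[subject].append((relation, obj))


def facts_about(entity: str) -> List[str]:
    """Return readable facts whose subject is the given entity."""
    return [f"{entity} -{rel}-> {target}" for rel, target in outgoing.get(entity, [])]


print(facts_about("Budapest"))
```

Because the entities are nodes rather than strings, "Budapest" in the first and third facts is the same thing, which is what lets a search follow relationships instead of matching text.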
![Page 10: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/10.jpg)
![Page 11: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/11.jpg)
![Page 12: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/12.jpg)
Search infrastructure should be easily integrated into existing architecture
New data sources should be easily added
Should support the strategic goals
e.g. Search driven e-commerce
Scalable
Should provide personalised results
Simple interface
Requirements of searching and KG
![Page 13: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/13.jpg)
Take a graph database (Neo4j, Cayley, OntoText GraphDB, etc.)
Graph construction:
Knowledge extraction
from the internet
open data
grabbing
from text (NLP)
from current databases (Master Data)
from logs
Knowledge Graph Construction
Have a good graph model
Connect the things together
Steps to build KG
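The construction steps above can be sketched as follows. This is a toy in-memory version, assuming two hypothetical sources (a master-data table and a view log) and a simple product/category/user model; a real build would write to the chosen graph database instead of Python dictionaries.

```python
from typing import Dict, Set, Tuple

nodes: Dict[str, Dict[str, str]] = {}
edges: Set[Tuple[str, str, str]] = set()


def merge_node(key: str, **props: str) -> None:
    """Create the node if it is missing, otherwise update its properties."""
    nodes.setdefault(key, {}).update(props)


def connect(source: str, relation: str, target: str) -> None:
    """Add a typed edge; using a set keeps repeated facts from duplicating."""
    merge_node(source)
    merge_node(target)
    edges.add((source, relation, target))


# Hypothetical master data: products and their categories.
master_data = [
    {"sku": "p1", "name": "Tent", "category": "Camping"},
    {"sku": "p2", "name": "Stove", "category": "Camping"},
]
# Hypothetical log lines in the form "<user> viewed <sku>".
log_lines = ["alice viewed p1", "alice viewed p1", "bob viewed p2"]

# Knowledge extraction from current databases.
for row in master_data:
    merge_node(row["sku"], name=row["name"])
    connect(row["sku"], "IN_CATEGORY", row["category"])

# Knowledge extraction from logs; this connects users to existing products.
for line in log_lines:
    user, _, sku = line.split()
    connect(user, "VIEWED", sku)

print(len(nodes), len(edges))
```

Merging on a shared key (here the SKU) is what connects things from different sources into one graph instead of two disconnected ones.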
![Page 14: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/14.jpg)
![Page 15: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/15.jpg)
![Page 16: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/16.jpg)
Apache Kafka for streaming pipelines
Product topic
Search topic
Feedback topic
Spark on the processing side
Neo4j on the consuming side
CQRS (Command Query Responsibility Segregation) pattern
Push to ElasticSearch with GraphAware plugin
Neo4j Transaction Handler (afterCommit)
You can define mappings to ES
Parts of the architecture
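The CQRS split and the after-commit push to the search engine can be illustrated with a toy in-memory model. This is a sketch only: `GraphStore`, `SearchIndex`, `on_after_commit` and the `mapping` list are hypothetical names, and none of this is the Neo4j or GraphAware plugin API.

```python
from typing import Callable, Dict, List

Node = Dict[str, str]
Handler = Callable[[Dict[str, Node]], None]


class GraphStore:
    """Toy write model standing in for the graph database."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self._after_commit: List[Handler] = []

    def on_after_commit(self, handler: Handler) -> None:
        self._after_commit.append(handler)

    def commit(self, changes: Dict[str, Node]) -> None:
        # Apply the whole batch first, then notify handlers, so the read
        # side only ever sees changes that were committed.
        self.nodes.update(changes)
        for handler in self._after_commit:
            handler(changes)


class SearchIndex:
    """Toy read model standing in for the search engine."""

    def __init__(self, mapping: List[str]) -> None:
        self.mapping = mapping  # which node properties get indexed
        self.docs: Dict[str, Node] = {}

    def project(self, changes: Dict[str, Node]) -> None:
        for node_id, props in changes.items():
            self.docs[node_id] = {k: v for k, v in props.items() if k in self.mapping}

    def search(self, term: str) -> List[str]:
        term = term.lower()
        return sorted(
            node_id
            for node_id, doc in self.docs.items()
            if any(term in value.lower() for value in doc.values())
        )


graph = GraphStore()
index = SearchIndex(mapping=["name"])
graph.on_after_commit(index.project)

# Writes go to the graph; queries go to the index.
graph.commit({"p1": {"name": "Trekking tent", "supplier": "Acme"}})
print(index.search("tent"))
print(index.search("acme"))
```

The second search finds nothing because `supplier` is not in the mapping, which is the role the slide gives to the mappings: they decide what the search side gets to see.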
![Page 17: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/17.jpg)
Success story 1.
• Sharing Tribal Knowledge inside the company
• >20 offices
• >3000 employees
• Data sources:
• Tableau dashboards (4000)
• Knowledge posts (>1000)
• Superset charts and dashboards (>6000)
• Experiments and metrics (>5000)
https://www.slideshare.net/ChristopherWilliams24/20170108scaling-tribalknowledge
![Page 18: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/18.jpg)
Success story 2.
• Half a century of collective NASA engineering knowledge
• It is called the Lessons Learned database
• They use it in the Mars mission project
Impact: “Neo4j saved well over two years of work and one million dollars of taxpayers funds.”
“When we had the [Apollo 1] fire, we took a step back and said okay, what lessons have we learned from this horrible tragedy? Now let’s be doubly sure that we are going to do it right the next time. And I think that fact right there is what allowed us to get Apollo done in the ‘60s.” —Dr. Christopher C. Kraft, Jr., Director of Flight Operations
![Page 19: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/19.jpg)
Neo4j
ElasticSearch
GraphAware modules:
Neo4j to ElasticSearch
ElasticSearch Plugin
NLP plugin
Github: github.com/graphaware
Open data
Resources
![Page 20: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/20.jpg)
It is not rocket science!
Anonymous NASA scientist
![Page 21: Janos Szendi-Varga - BDU 3.0bdu.hu/2017/ppts/2017/Szendi-Varga_Janos.pdf · 2017-12-12 · They are aggregate oriented databases, they have limitations when it comes to connected](https://reader033.vdocument.in/reader033/viewer/2022041513/5e293622deff296abb4ba467/html5/thumbnails/21.jpg)
www.graphaware.com
@graph_aware
GraphAware
world’s #1 Neo4j consultancy