Kick-start Graph Visualization Projects
Post on 11-Aug-2014
DESCRIPTION
Create and use graph visualizations efficiently in your projects.
TRANSCRIPT
- Kick-start Graph Visualization Projects, by Sébastien Heymann (seb@linkurio.us).
- A few words about me: I democratise graph thinking (with pink titles)... with software. Co-founder of the Gephi project (2008). Co-founder of the Linkurious startup (2013). PhD in computer science, UPMC LIP6 (2013). "makes graphs handy"
- A few words about me / Gephi: Open source project started in 2008. Built to solve large graph visualization problems. Latest version downloaded ~400,000 times. http://gephi.org "makes graphs handy"
- A few words about me / Linkurious: Started in 2012 as a collaboration with Stanford (Mapping the Republic of Letters) and DensityDesign. Now a French startup of 3 people. Linkurious helps companies make sense of data with user-friendly visualization software. We help business analysts, R&D teams, developers and scientists.
- Beautiful but unreadable pictures? Let's make graph visualization useful.
- Agenda: How to create and use graph visualization successfully? 0. Why? 1. Key takeaways: a. The 5 questions b. User stories c. Design visualization + interaction 2. Fraud detection use case 3. Q&A (two parts are marked PRACTICE)
- 0. Why graph visualization? Huh...
- What is a graph? This is a graph. (Figure: three nodes linked by the relationships "Father Of", "Father Of" and "Siblings".)
- What is a graph? / Nodes & relationships: A graph is a set of nodes linked by relationships. (Figure: the same graph, with the callouts "This is a node" and "This is a relationship".)
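The definition above can be sketched in a few lines of Python. This is only an illustration: the names Adam, Bob and Carol are hypothetical (the slide shows only the relationship labels), and a plain list of triples stands in for a real graph database.

```python
# Hypothetical people; only the labels "Father Of" and "Siblings" come from the slide.
nodes = {"Adam", "Bob", "Carol"}

# Each relationship is a (source, label, target) triple.
relationships = [
    ("Adam", "FATHER_OF", "Bob"),
    ("Adam", "FATHER_OF", "Carol"),
    ("Bob", "SIBLINGS", "Carol"),
]


def neighbors(node):
    """Return every node linked to `node`, ignoring relationship direction."""
    linked = set()
    for source, _label, target in relationships:
        if source == node:
            linked.add(target)
        elif target == node:
            linked.add(source)
    return linked


print(sorted(neighbors("Bob")))
```

Storing relationships as labelled triples keeps the sketch close to the slide's vocabulary: a node is a thing, a relationship is a named link between two things.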
- Different domains where graphs are important: Graphs can be used to model many domains. Social networks: people, objects, movies, restaurants, music... Communications: antennas, servers, phones, people... Supply chains: suppliers, roads, warehouses, products...
- Graph visualization can help you in many ways. Do you have a graph project?
- Why? "The greatest value of a picture is when it forces us to notice what we never expected to see." John Tukey (1962)
- How to create and use graph visualization successfully? 1. Key takeaways to kick-start your projects. a. Ask 5 questions. b. Write user stories. c. Design visualization and interaction.
- Ask 5 questions / Q1: Data, tadaa? You need data: sourcing, cleaning, update.
- Ask 5 questions / Q1: Data, tadaa? Can you model data as graphs? Sensemaking, scale, complexity. (image: Martin Grandjean)
- Ask 5 questions / Q2: Why use graph visualization in your project? Set up your goal. Administrate: data modelling, database administration. Understand: hypothesis discovery, evidence finding. Monitor: impact analysis, reporting. (images: XKCD & the web)
- Ask 5 questions / Q3: Who will use it? Define personas: data scientist, business analyst, developer, public audience. (images: PhdComics & Despicable Me)
- Ask 5 questions / Q4: What are the constraints? Acknowledge human limits. Short-term memory: max 7 items, otherwise the ability to make decisions drops. Vision: more than 10,000 nodes is generally useless.
- Ask 5 questions / Q4: What are the constraints? Acknowledge technical limits. Graph size (from 50 nodes to 1B nodes). Machine performance. Server-side vs. client-side rendering. Interactive vs. print.
- Ask 5 questions / Q5: How is it used? Define scope: individual use vs. collaborative work; artwork vs. integrated into an application.
- Ask 5 questions / Summary: The 5 questions. 1. What are the data? 2. What is your goal? 3. Who is your end-user? 4. What are the constraints? 5. How is it used?
- Ask 5 questions / Your turn! PRACTICE: Answer the 5 questions for your project.
- How to create and use graph visualization successfully? 1. Key takeaways to kick-start your projects. a. Ask 5 questions. b. Write user stories. c. Design visualization and interaction.
- Write user story / The developer story: I am creating a Neo4j graph database for my application. I define a data model. I generate a significant graph sample. I create a business query with Cypher. I visualize the query result. I iterate on the data model until it is satisfying.
- Write user story / Your turn! PRACTICE: Write your own user story.
- How to create and use graph visualization successfully? 1. Key takeaways to kick-start your projects. a. Ask 5 questions. b. Write user stories. c. Design visualization and interaction.
- Graph visualization in practice
- Design visualization How to represent graphs?
- Design visualization / Common graph representations: Matrices. (a) Nodes are ordered as rows and columns; connections are indicated as filled cells. (b) A matrix representation of a typical biological pathway. In (Gehlenborg 2012).
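To make the matrix representation concrete, here is a minimal Python sketch that turns an edge list into an adjacency matrix and prints it with filled cells. The four-node graph is made up for illustration and is not taken from (Gehlenborg 2012); the graph is assumed to be undirected.

```python
def adjacency_matrix(nodes, edges):
    """Build a 0/1 matrix: rows and columns follow the order of `nodes`."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
        matrix[index[target]][index[source]] = 1  # undirected: mirror the cell
    return matrix


# Hypothetical example graph.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("C", "D")]
matrix = adjacency_matrix(nodes, edges)

# Print one row per node; "#" is a filled cell (a connection), "." an empty one.
print("  " + " ".join(nodes))
for name, row in zip(nodes, matrix):
    print(name, " ".join("#" if cell else "." for cell in row))
```

A matrix never suffers from edge crossings, which is why it can stay readable on dense graphs where a node-link diagram turns into a hairball; the price is that paths are harder to follow.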
- Design visualization / Common graph representations: Node-link diagrams. (a) A directed graph typical of a biological pathway. (b) An undirected graph with nodes arranged in a circle. (c) A spring-embedded layout of data from b. In (Gehlenborg 2012).
- Design visualization: Let's choose node-link diagrams because they are more common.
- Design visualization: Map data to visual variables: proximity, hierarchy, group.
- Design interaction: Add interactivity: expand, search, details on demand, filter.
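Three of these interactions (search, expand, filter) can be sketched as plain functions over an edge list. This is a toy model under stated assumptions: the function names and the five-node graph are hypothetical, and a real tool would run these operations against a database or a rendering library rather than Python lists.

```python
# Hypothetical graph: node labels and undirected edges.
labels = {"A": "Alice", "B": "Bob", "C": "Carol", "D": "Dave", "E": "Eve"}
edges = [("A", "B"), ("A", "C"), ("C", "D"), ("D", "E")]


def search(text):
    """Search: return ids of nodes whose label contains `text` (case-insensitive)."""
    return {node for node, label in labels.items() if text.lower() in label.lower()}


def expand(visible, node):
    """Expand: add the direct neighbors of `node` to the visible set."""
    added = {t for s, t in edges if s == node} | {s for s, t in edges if t == node}
    return visible | added


def filter_edges(visible):
    """Filter: keep only the edges whose two endpoints are both visible."""
    return [(s, t) for s, t in edges if s in visible and t in visible]


visible = search("alice")          # start from a search hit
visible = expand(visible, "A")     # reveal its neighborhood on demand
print(sorted(visible), filter_edges(visible))
```

Starting from a search result and expanding step by step keeps the number of nodes on screen small, in line with the human limits discussed under Q4.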
- Design visualization and interaction / Graph Viz 101 Learn more at http://linkurio.us/graph-viz-101
- How to create and use graph visualization successfully? 1. Key takeaways to kick-start your projects. a. Ask 5 questions. b. Write user stories. c. Design visualization and interaction.
- 2. Use case: bank loan fraud detection.
- Use case / The cost of fraud: $28.6B. AITE Group estimates that first-party fraud will cost $28.6 billion in credit card losses a year by 2016. http://news.alaric.com/industry-news/fraud/a-new-approach-to-first-party-fraud-reducing-bad-debt/ http://bankinganalyticsblog.fico.com/2013/02/first-party-fraud-it-was-me.html
- Use case / A common fraud scenario: A look at a common fraud scenario banks face. 1. Create a fake identity: a criminal or a group of criminals mixes pieces of information (addresses, phone numbers, social security numbers) to create a synthetic identity. 2. Go to the bank, ask for a loan: the criminal uses the fake identity to register a bank account. He acts like a normal customer and tries to secure a loan. 3. Disappear with the money: once the criminal feels he cannot get access to more money, he carefully prepares his exit: in a short amount of time he empties all of his accounts and disappears.
- Use case / How do we set up a graph-based fraud detection system? Let's ask our 5 questions. 1. What are the data? 2. What is your goal? 3. Who is your end-user? 4. What are the constraints? 5. How is it used?
- Use case / Q1: What are the data? We model customer data as a graph. (Figure: a graph showing a legitimate customer and the information she is linked to. Customer name: J. Smith. ID: J. Smith. Home address: 58, Eisenhower Square. Phone number: +33 5 68 98 25 74. Credit card: 1 234$. Loan: $25k.)
- Use case / Q1: What are the data? In a fraud ring people share the same information. (Figure: a graph of four customers, P. Martin, J. Smith, E. Selmati and P. Smith, linked to loans ($7k, $12,5k, $20k, $45k), SSNs (17873897893, 1787576553, 1787579953, 1267576553), two addresses (58, Eisenhower Square; 14, Roses Street), two phone numbers (+33 6 75 89 22 14; +331 42 58 66 00) and the values 31195855 and 31184274.)
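The idea that members of a fraud ring share pieces of information can be sketched in Python as follows. Everything here is an assumption made for illustration: the customer ids and identifier values are placeholders (the slide does not show which customer is linked to which address or phone), and grouping by connected components is one simple technique, not necessarily the approach used in the talk.

```python
from collections import defaultdict

# Hypothetical records: placeholder ids and values, not the data from the slide.
customers = {
    "customer_1": {"address": "addr_X", "phone": "phone_1", "ssn": "ssn_1"},
    "customer_2": {"address": "addr_X", "phone": "phone_2", "ssn": "ssn_2"},
    "customer_3": {"address": "addr_Y", "phone": "phone_2", "ssn": "ssn_3"},
    "customer_4": {"address": "addr_Z", "phone": "phone_4", "ssn": "ssn_4"},
}


def shared_identifiers(customers):
    """Map each (kind, value) identifier to its holders, keeping only shared ones."""
    holders = defaultdict(set)
    for customer, info in customers.items():
        for kind, value in info.items():
            holders[(kind, value)].add(customer)
    return {key: names for key, names in holders.items() if len(names) > 1}


def fraud_ring_candidates(customers):
    """Group customers that are connected, directly or not, by shared identifiers."""
    linked = defaultdict(set)
    for names in shared_identifiers(customers).values():
        for name in names:
            linked[name] |= names - {name}
    seen, groups = set(), []
    for start in sorted(linked):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:  # depth-first walk over one connected component
            current = stack.pop()
            if current not in group:
                group.add(current)
                stack.extend(linked[current] - group)
        seen |= group
        groups.append(group)
    return groups


for group in fraud_ring_candidates(customers):
    print(sorted(group))
```

A shared address or phone number is not proof of fraud (family members share both), so the groups returned here are candidates for an analyst to inspect visually, not verdicts.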
- Use case / Q2: What is your goal? We w