neo4j gokuldaspillai-121018170144-phpapp01

Commercial Graph at Intuit Gokuldas Pillai Engineer, Data Services, Intuit @gokool

Upload: gokuldas-pillai

Post on 16-Jul-2015


TRANSCRIPT

Page 1: Neo4j gokuldaspillai-121018170144-phpapp01

Commercial Graph at Intuit

Gokuldas Pillai

Engineer, Data Services, Intuit

@gokool

Page 2: Neo4j gokuldaspillai-121018170144-phpapp01

Improving the lives of 60M people

Page 3: Neo4j gokuldaspillai-121018170144-phpapp01

…creates a unique and compelling set of data

1 in 3 Tax Returns

1 in 12 Americans

Pay

$2.6T in Transactions

25 Million Questions Answered

From 1 to 50 Apps

7 Million Mobile Customers

45M Customers Using Connected Services

Page 4: Neo4j gokuldaspillai-121018170144-phpapp01

Is it time to hire?

Small Business Hiring Trends

My revenue increased 5%...is that good?

Revenue Comparisons

Am I spending more than my friends?

Spending Profiles

Auto $750

Rent $1,200

Groceries $400

Page 5: Neo4j gokuldaspillai-121018170144-phpapp01

Intuit Payment Graph

• Discover the latent network from multiple product data-stores

– Uniquely identify entities and their connections

– Connections scored by volume of trade

• Empower Business Unit (BU) teams to leverage the Intuit Payment Graph to build applications.

– Graph to be available for real-time access
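As a rough illustration of "connections scored by volume of trade", individual payment records can be folded into one weighted edge per payer/payee pair. This is only a sketch: the record layout, the `build_edges` helper, the sample data, and the choice to keep both a transaction count and a summed amount are assumptions, not the method described in the talk.

```python
from collections import defaultdict


def build_edges(transactions):
    """Collapse (payer, payee, amount) records into one scored edge per pair."""
    edges = defaultdict(lambda: {"txn_count": 0, "volume": 0.0})
    for payer, payee, amount in transactions:
        edge = edges[(payer, payee)]
        edge["txn_count"] += 1
        edge["volume"] += amount
    return dict(edges)


# Hypothetical sample records, not real data.
transactions = [
    ("consumer-1", "business-a", 200.00),
    ("consumer-1", "business-a", 450.25),
    ("consumer-1", "business-b", 25.95),
]

edges = build_edges(transactions)
for (payer, payee), score in sorted(edges.items()):
    print(payer, "->", payee, score)
```

Either number could serve as the edge score; which one fits depends on whether frequent small payments or occasional large ones should count as the stronger connection.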

Page 6: Neo4j gokuldaspillai-121018170144-phpapp01

The Graph Server provides rich profiles

Consumer Profile Facets

• Identity: Name, Address, Phone, Email, Mint Id, etc.

• Social: Facebook, Yelp, Twitter, etc.

• Demographics: Age, Gender, etc.

Business Profile Facets

• Identity: Name, Address, Phone, Email, QBO Id, etc.

• Social: Facebook, Yelp, Twitter, etc.

• Firmographics: Category, Revenue, Employees, etc.

Page 7: Neo4j gokuldaspillai-121018170144-phpapp01

And the buyer-seller relationships

May 2011: 3 purchases, $650.25

May 2011: 1 purchase, $25.95

Consumer

Business Business

Page 8: Neo4j gokuldaspillai-121018170144-phpapp01

Design

Page 9: Neo4j gokuldaspillai-121018170144-phpapp01

Fuzzy matching & de-duplicating entities

Company ABC

name: The Windsor Press, Inc.
address: PO Box 465 6 North Third Street
city: Hamburg
state: PA
zip: 19526
phone: (610) 562-2267

Company PQR

name: The Windsor Press
address: P.O. Box 465 6 North 3rd St.
city: Hamburg
state: PA
zip: 19526-0465
phone: (610) 562-2267

Both of the above vendor records map to external reference data:

Dun & Bradstreet

ID: 002114902
Name: The Windsor-Press Inc
Street: 6 N 3rd St
City: Hamburg
State: PA
Zip: 19526-1502
Phone: (610)-562-2267
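The three Windsor Press records differ only in punctuation, abbreviations, and zip suffix, so a simple normalize-then-compare pass is enough to link them. The sketch below is one possible approach, not the matcher used at Intuit: the abbreviation table, the `same_business` rule (phone and 5-digit zip must agree, names must be similar), and the 0.8 threshold are all assumptions.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a production list would be far larger.
ABBREVIATIONS = {"north": "n", "third": "3rd", "street": "st", "incorporated": "inc"}


def normalize(text):
    """Lowercase, drop punctuation, and map long forms to abbreviations."""
    text = text.lower().replace(".", "")
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return " ".join(ABBREVIATIONS.get(word, word) for word in text.split())


def digits(text):
    """Keep only the digits, e.g. for phone numbers and zip codes."""
    return re.sub(r"\D", "", text)


def same_business(a, b, threshold=0.8):
    """Assumed rule: phone and 5-digit zip agree, and names are similar."""
    if digits(a["phone"]) != digits(b["phone"]):
        return False
    if digits(a["zip"])[:5] != digits(b["zip"])[:5]:
        return False
    ratio = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
    return ratio >= threshold


abc = {"name": "The Windsor Press, Inc.", "zip": "19526", "phone": "(610) 562-2267"}
pqr = {"name": "The Windsor Press", "zip": "19526-0465", "phone": "(610) 562-2267"}
dnb = {"name": "The Windsor-Press Inc", "zip": "19526-1502", "phone": "(610)-562-2267"}

print(same_business(abc, dnb), same_business(pqr, dnb))
print(normalize("PO Box 465 6 North Third Street"))
```

Requiring an exact phone match keeps the fuzzy name comparison from merging unrelated businesses that happen to share a common name.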

Page 10: Neo4j gokuldaspillai-121018170144-phpapp01

Commercial Graph Architecture

Business names, address, phone, industry code

Real-time Applications

Request

Response

De-duped Nodes

Transactions

Invoices, bills, payments, vendors, customers

Categorization

Matching/De-duping

Offline analytics

Page 11: Neo4j gokuldaspillai-121018170144-phpapp01

Data Model

Company: Name: Acme Inc, Zip: 95134, …

Company: Name: Veva LLC, Zip: 94040, …

Company: Name: Beta LLC, Location: 94043, …

Product: Name: Quickbooks, …

Product: Name: Payroll, …

Relationship: CUSTOMER, Txn Count: 125, No. of years: 1

Relationship: CUSTOMER, Txn Count: 467, No. of years: 3

Relationship: LICENSED, No. of years: 8
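The slide describes a property graph: nodes (Company, Product) and typed relationships (CUSTOMER, LICENSED), each carrying its own properties. A minimal in-memory sketch of that shape follows. The `Node` and `Relationship` classes and the `neighbors` helper are hypothetical, and because the transcript does not preserve which nodes each relationship connects, the endpoints chosen here are guesses for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    end: Node
    rel_type: str
    properties: dict = field(default_factory=dict)


acme = Node("Company", {"Name": "Acme Inc", "Zip": "95134"})
veva = Node("Company", {"Name": "Veva LLC", "Zip": "94040"})
quickbooks = Node("Product", {"Name": "Quickbooks"})

# Assumed endpoints: the source only lists the relationship properties.
relationships = [
    Relationship(acme, veva, "CUSTOMER", {"Txn Count": 125, "No. of years": 1}),
    Relationship(acme, quickbooks, "LICENSED", {"No. of years": 8}),
]


def neighbors(node, rel_type):
    """Nodes reached from `node` over outgoing relationships of one type."""
    return [r.end for r in relationships if r.start is node and r.rel_type == rel_type]


print([n.properties["Name"] for n in neighbors(acme, "CUSTOMER")])
print([n.properties["Name"] for n in neighbors(acme, "LICENSED")])
```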

Page 12: Neo4j gokuldaspillai-121018170144-phpapp01

Data-model Demo

Page 13: Neo4j gokuldaspillai-121018170144-phpapp01

Scale

• Size of the graph

– 29 Mn Unique Nodes

– 315 Mn Properties

– 48 Mn Relationships

Page 14: Neo4j gokuldaspillai-121018170144-phpapp01

Referrals & recommendations

Connecting consumers with small businesses

Small business micro-communities

Page 15: Neo4j gokuldaspillai-121018170144-phpapp01

Big Data

for the Little Guy

Page 16: Neo4j gokuldaspillai-121018170144-phpapp01

Use case: Vendor Recommendation

START n=node(23539)
MATCH n-[:PAYS]-v-[:PAYS]-vov
WHERE has(vov.IC4_DESC) AND vov.IC4_DESC =~ 'Legal.*' AND not (ID(vov) = ID(v))
RETURN ID(vov), vov.ENTITY_TYPE, vov.CITY?, vov.IC4_DESC?
ORDER BY vov.loyalty;
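The query reads as a two-hop traversal: from one business, follow PAYS links to its vendors (v), then on to the businesses linked to those vendors (vov), keeping the ones whose industry description matches 'Legal.*' and ordering them by loyalty. The same logic can be sketched in plain Python over a toy graph. The node ids, property values, and the `neighbours` and `recommend` helpers are all hypothetical, and excluding the start node is an assumption that stands in for Cypher not reusing the same relationship within one path.

```python
import re

# Hypothetical data: node id -> properties, plus undirected PAYS links.
nodes = {
    1: {"IC4_DESC": "Restaurants", "loyalty": 0.5},
    2: {"IC4_DESC": "Office Supplies", "loyalty": 0.7},
    3: {"IC4_DESC": "Legal Services", "loyalty": 0.9},
    4: {"IC4_DESC": "Legal Aid", "loyalty": 0.2},
    5: {"loyalty": 0.4},  # no IC4_DESC, so the has(...) check drops it
}
pays = [(1, 2), (2, 3), (2, 4), (2, 5)]


def neighbours(node_id):
    """Nodes linked to node_id by PAYS, ignoring direction as the pattern does."""
    return {b if a == node_id else a for a, b in pays if node_id in (a, b)}


def recommend(start, pattern=r"Legal.*"):
    """Vendors-of-vendors whose IC4_DESC fully matches the pattern."""
    found = set()
    for v in neighbours(start):
        for vov in neighbours(v):
            desc = nodes[vov].get("IC4_DESC")
            if vov in (start, v) or desc is None:
                continue
            if re.fullmatch(pattern, desc):
                found.add(vov)
    # ORDER BY sorts ascending by default, so the lowest loyalty comes first.
    return sorted(found, key=lambda node_id: nodes[node_id]["loyalty"])


print(recommend(1))
```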

Page 17: Neo4j gokuldaspillai-121018170144-phpapp01

Why Neo4j

• Java – matched in-house skills

• Flexible/Supports quick exploration

• Easy admin functionality – set-up, adding data

• Built-in access points over HTTP (REST/JSON)

• SQL-like query language (Cypher is awesome!)

• Active mailing list

• Good documentation

• Vendor support
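To make the "access points over HTTP (REST/JSON)" bullet concrete, here is a sketch of how a client might package a Cypher query as a JSON POST. It assumes the legacy `/db/data/cypher` endpoint that Neo4j servers of that era exposed, a server on `localhost:7474`, and the old `{id}` parameter syntax; the `build_cypher_request` helper is hypothetical, and the query is a simplified version of the vendor lookup above. The request is only built, not sent.

```python
import json
import urllib.request


def build_cypher_request(query, params=None, base_url="http://localhost:7474"):
    """Build (but do not send) a JSON POST for the assumed legacy Cypher endpoint."""
    body = json.dumps({"query": query, "params": params or {}}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/db/data/cypher",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_cypher_request(
    "START n=node({id}) MATCH n-[:PAYS]-v RETURN ID(v)", {"id": 23539}
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Sending it would need a running server: urllib.request.urlopen(request)
```

Passing the node id as a parameter rather than splicing it into the query string keeps the query text constant across calls.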

Page 18: Neo4j gokuldaspillai-121018170144-phpapp01

Neo4j for real-time graph applications

Cypher Query Language

START biz = node(100)
MATCH biz-[:TRANSACTS]-x
RETURN x

Great for… Opportunity Areas…

Real time

Cypher

Built-in Algos

Lucene search

Horizontal scaling

Access control

Indexing

Page 19: Neo4j gokuldaspillai-121018170144-phpapp01

Experiment. Measure. Pivot.

Persevere.

Privacy matters…a lot.

Build the right team.

Page 20: Neo4j gokuldaspillai-121018170144-phpapp01

Team

• 2 Engineers (100%)

• 2 Data Scientists (50%)

• 1 Product Manager

• We are hiring Data Engineers!

– http://careers.intuit.com/professional

Page 21: Neo4j gokuldaspillai-121018170144-phpapp01

Thank you.