Location Agnostic Data

Arup Nanda

Proligence

It’s 11 PM. Do you know where your data is?

[Diagram, built up over several slides]

• Schema A (OLTP): SALES (sales in last 3 months) and SALES_HISTORY (sales in last 24 months)

• Schema B (Datawarehouse): SALES_HISTORY

• HBase (Hadoop): SALES_HISTORY

• Cloud

Sharding

• By location: North America, South America, EMEA, APAC

• By customer importance: Very Important, Important, Somewhat Important, Rest

Different Location, Different Technology, Different Geography

Challenges

• Need to know:

– The technology

• Oracle Database or Hadoop?

• SQL or MapReduce code?

– The location

– The data availability

• Is the right data in the same database, or does it have to come from somewhere else?


Solutions

• Put the business logic in the database

• But, which database?

• Which technology?

Solutions, contd.

• Put the logic in the app layer

• Independent of technology

• But is it really technology independent?

• Definitely inconsistent

• Need to reinvent the wheel

• Still need to know the location, currency, etc.


Data As a Service

Step 1

• Gather a list of all high level data

– Not tables, columns; but data elements such as reservations, sales, etc.

Step 2

• Map tables, columns, collections, etc. to these data elements

• Define and document business logic for each datastore

Logic                                    Data store
System of record, less than 3 months     OLTPDB.SchemaA.TableA.ColumnA
Less than 2 years                        DWDB.SchemaA.ColumnA
More than 2 years                        HBase

Define all other elements, such as how often the datastores are synced.
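The mapping above can be captured as data that a service consults at request time. The following is a minimal sketch, not the presenter's implementation: the catalog structure, the function name `datastore_for`, and the cut-offs of 90 and 730 days (standing in for "3 months" and "2 years") are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical catalog for the "sales" data element, following the table above.
# Each entry is (maximum age in days, datastore label); None means no upper bound.
# 90 and 730 days are assumed approximations of "3 months" and "2 years".
SALES_CATALOG = [
    (90, "OLTPDB.SchemaA.TableA.ColumnA"),  # system of record, less than 3 months
    (730, "DWDB.SchemaA.ColumnA"),          # less than 2 years
    (None, "HBase"),                        # more than 2 years
]

def datastore_for(record_date: date, today: date) -> str:
    """Return the datastore label that should hold a record of the given date."""
    age_days = (today - record_date).days
    for max_age, store in SALES_CATALOG:
        if max_age is None or age_days < max_age:
            return store
    raise LookupError("catalog has no catch-all entry")

today = date(2024, 1, 1)
print(datastore_for(date(2023, 12, 1), today))  # about one month old
print(datastore_for(date(2023, 1, 1), today))   # one year old
print(datastore_for(date(2020, 1, 1), today))   # four years old
```

Keeping the rules in one table means a change in retention policy touches the catalog only, not every caller.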

Step 3

• Identify datastores that can’t tolerate latency

Example

• Customer Records: unpredictable division across geographical boundaries. Strategy: Replicate

• Sales Records: generally called within ranges; out-of-range calls can be slow. Strategy: Shard

Sharded Data Strategies

[Diagram: North America, EMEA, and APAC databases, each holding Customer and Sales data]

• Use DB Links

create database link newyork
  connect to gwuser identified by gwuser
  using 'newyork';

Sharded Data, contd.

• Create view SALES as

create view sales as
select * from sales
union all
select * from sales@newyork
union all
select * from sales@london;
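The UNION ALL view can be tried without an Oracle installation. The sketch below is only an analogue: it uses SQLite through Python's standard `sqlite3` module, with attached in-memory databases standing in for the remote sites behind the DB links. The view name `all_sales`, the column names, and the sample rows are assumptions for illustration.

```python
import sqlite3

# A local database plus two attached in-memory databases that stand in for the
# remote sites the slides reach through sales@newyork and sales@london.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS newyork")
conn.execute("ATTACH DATABASE ':memory:' AS london")

for schema in ("main", "newyork", "london"):
    conn.execute(f"CREATE TABLE {schema}.sales (sale_id INTEGER, amount REAL)")

conn.execute("INSERT INTO main.sales VALUES (1, 10.0)")
conn.execute("INSERT INTO newyork.sales VALUES (2, 20.0)")
conn.execute("INSERT INTO london.sales VALUES (3, 30.0)")

# SQLite only lets TEMP views span attached databases, hence CREATE TEMP VIEW.
conn.execute("""
    CREATE TEMP VIEW all_sales AS
    SELECT * FROM main.sales
    UNION ALL
    SELECT * FROM newyork.sales
    UNION ALL
    SELECT * FROM london.sales
""")

rows = conn.execute(
    "SELECT sale_id, amount FROM all_sales ORDER BY sale_id"
).fetchall()
print(rows)
```

Callers query `all_sales` and never name a site, which is the location transparency the view is meant to provide.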

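Sharded Data, contd.

• Create INSTEAD OF triggers

create trigger sales_ins_trg  -- trigger name is illustrative; the slide omits it
instead of insert on sales
for each row
begin
  if :new.country_code in ('US','CA',…) then
    insert into sales@newyork values (…);  -- column values elided on the slide
  end if;
end;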
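The routing idea behind the INSTEAD OF trigger can be exercised end to end with SQLite, which also supports INSTEAD OF triggers on views. This is a sketch under stated assumptions rather than the Oracle setup from the slides: two local tables (`sales_newyork`, `sales_london`) stand in for the remote shards, SQLite's `WHEN` clause replaces the PL/SQL `IF`, and the London country codes (`'GB'`, `'FR'`) and column names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two tables stand in for the remote shards (sales@newyork, sales@london).
    CREATE TABLE sales_newyork (sale_id INTEGER, country_code TEXT, amount REAL);
    CREATE TABLE sales_london  (sale_id INTEGER, country_code TEXT, amount REAL);

    CREATE VIEW sales AS
        SELECT * FROM sales_newyork
        UNION ALL
        SELECT * FROM sales_london;

    -- One INSTEAD OF trigger per shard; the WHEN clause does the routing.
    CREATE TRIGGER sales_ins_newyork INSTEAD OF INSERT ON sales
    WHEN NEW.country_code IN ('US', 'CA')
    BEGIN
        INSERT INTO sales_newyork VALUES (NEW.sale_id, NEW.country_code, NEW.amount);
    END;

    CREATE TRIGGER sales_ins_london INSTEAD OF INSERT ON sales
    WHEN NEW.country_code IN ('GB', 'FR')
    BEGIN
        INSERT INTO sales_london VALUES (NEW.sale_id, NEW.country_code, NEW.amount);
    END;
""")

# The application inserts into the view and never names a shard.
conn.execute("INSERT INTO sales (sale_id, country_code, amount) VALUES (1, 'US', 10.0)")
conn.execute("INSERT INTO sales (sale_id, country_code, amount) VALUES (2, 'GB', 20.0)")

newyork_rows = conn.execute("SELECT * FROM sales_newyork").fetchall()
london_rows = conn.execute("SELECT * FROM sales_london").fetchall()
print(newyork_rows, london_rows)
```

In this sketch a row whose country code matches neither trigger is silently dropped, so a real design would want a catch-all shard or an explicit error.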


Step 4

• Write APIs to serve data elements

• Could be:

– Java web services

– Oracle Packages

– RESTful APIs

Services

[Diagram: a Sales Service and a Customer Service sit in front of the North America, EMEA, and APAC databases, each of which holds Customer and Sales data]

Service

• Has a fixed “catalog” of requests it can address

• Has knowledge about location, data dispersion, technology

• Example

– Accept StoreID

– Lookup country of the store

– Is data replicated?

– If so, get local data

– If not, locate the DB for that country

– Get the data from that DB
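The lookup steps in the example above can be sketched as a small service class. Everything here is hypothetical: the class name `SalesService`, its fields, the region codes, and the in-memory dictionaries that stand in for real databases are assumptions chosen only to make the control flow concrete.

```python
from dataclasses import dataclass

@dataclass
class SalesService:
    """Hypothetical service that hides where a store's sales rows live."""
    local_region: str
    store_country: dict        # store_id -> country code
    country_region: dict       # country code -> region that owns the shard
    replicated_countries: set  # countries whose data is also copied locally
    databases: dict            # region -> {store_id: [sales rows]}

    def get_sales(self, store_id: int) -> list:
        country = self.store_country[store_id]           # look up the store's country
        if country in self.replicated_countries:         # is the data replicated?
            region = self.local_region                   # if so, read the local copy
        else:
            region = self.country_region[country]        # if not, locate the owning DB
        return self.databases[region].get(store_id, [])  # get the data from that DB

service = SalesService(
    local_region="NA",
    store_country={101: "US", 202: "GB", 303: "JP"},
    country_region={"US": "NA", "GB": "EMEA", "JP": "APAC"},
    replicated_countries={"US", "JP"},
    databases={
        "NA": {101: ["na-sale-1"], 303: ["jp-sale-copy"]},
        "EMEA": {202: ["emea-sale-1"]},
        "APAC": {303: ["jp-sale-1"]},
    },
)
print(service.get_sales(202))  # not replicated: served from the EMEA database
print(service.get_sales(303))  # replicated: served from the local copy
```

The caller passes only a store ID; knowledge of location and data dispersion stays inside the service.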

Services

[Diagram, built up over several slides: a single Sales Service fronts the North America, EMEA, and APAC databases; the regions then appear as North America, Europe, APAC, and Africa; finally a Hadoop store joins them behind the same service]

Suitability

1. Reference data: country details, store details

– Static or almost static

2. Low variability: customer preferences

3. Latency tolerated: European customers generally call the European call centers

4. Read-type access

Bridging Technologies

Gateways

[Diagram: Service A, a Helper Service, and a Gateway]
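One way to picture a gateway that bridges technologies is a common interface with one adapter per backend. The sketch below is an illustration only, not the architecture from the slides: the names `SalesStore`, `SqlSalesStore`, `KeyValueSalesStore`, and `Gateway` are invented, SQLite stands in for a relational database, and a plain dictionary stands in for a key-value store such as HBase.

```python
import sqlite3
from typing import Protocol

class SalesStore(Protocol):
    """Common interface that every backend adapter implements."""
    def sales_for_store(self, store_id: int) -> list: ...

class SqlSalesStore:
    """Relational backend, queried with SQL."""
    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE sales (store_id INTEGER, amount REAL)")
        self.conn.execute("INSERT INTO sales VALUES (1, 10.0), (1, 5.0)")

    def sales_for_store(self, store_id: int) -> list:
        cur = self.conn.execute(
            "SELECT amount FROM sales WHERE store_id = ? ORDER BY rowid",
            (store_id,),
        )
        return [row[0] for row in cur]

class KeyValueSalesStore:
    """Key-value backend; a plain dict stands in for a store such as HBase."""
    def __init__(self) -> None:
        self.rows = {"store:1": [7.5]}

    def sales_for_store(self, store_id: int) -> list:
        return list(self.rows.get(f"store:{store_id}", []))

class Gateway:
    """Fans one request out to every registered backend and merges the results."""
    def __init__(self, stores: list) -> None:
        self.stores = stores

    def sales_for_store(self, store_id: int) -> list:
        merged = []
        for store in self.stores:
            merged.extend(store.sales_for_store(store_id))
        return merged

gateway = Gateway([SqlSalesStore(), KeyValueSalesStore()])
print(gateway.sales_for_store(1))
```

The calling service sees one method and one result list; whether a row came from SQL or from the key-value store is invisible to it.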

Summary

• Data will continue to exist in many places

• Single source of truth is next to impossible to attain

• Architect to live with diverse data, sometimes competing versions

• Decide which data is to be manually replicated and sharded

• Use gateways or services to connect from one database to another, allowing location and technology transparency

• In Oracle databases, DB links, views, and INSTEAD OF triggers can provide location transparency

• Gateways can bridge different technologies

Thank You!

Blog: arup.blogspot.com

Twitter: @ArupNanda

Facebook.com/ArupKNanda

Google Plus: +ArupNanda