NoSQL and Data Modeling for Data Modelers


Post on 21-Apr-2017


Page 1: NoSQL and Data Modeling for Data Modelers

Big Data, NoSQL & Data Modeling

10 Tips for Data Modeling Success on Modern Data Projects

Karen Lopez, InfoAdvisors

www.datamodel.com

Page 2: NoSQL and Data Modeling for Data Modelers

©InfoAdvisors - infoadvisors.com

Data Models – Traditional Process

Conceptual (Data) Model

Logical Data Model

Physical Data Model(s): several OLTP databases and data marts (MART)

Aug 2014

Page 3: NoSQL and Data Modeling for Data Modelers

Relational

Data models started with relational modeling, so they look like relational database structures.

Page 4: NoSQL and Data Modeling for Data Modelers

But…

That doesn’t mean they can’t be used to model data that goes into a non-relational format.

All that formatting happens at build OR consumption time, not requirements time.

Page 5: NoSQL and Data Modeling for Data Modelers

The Big Data Story

Lots of data

Coming at us fast

Lots of variety in format & quality

We want all the data

Highly available

“It’s web scale”

Page 6: NoSQL and Data Modeling for Data Modelers

What do we really mean by scale?

Bringing computing to the data

Massively parallel processing

Cheap, commodity hardware, but lots of it

Optimized for Query/Reads/Questions/Telling stories

Page 7: NoSQL and Data Modeling for Data Modelers

We’ve been down this road before…

A spectrum from highly normalized to highly denormalized:

Traditional transactional applications

Reporting-optimized tables/structures

Data Warehouse / Dimensional Modeling
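
The two ends of that spectrum can be sketched in a few lines of Python. The customer and order records below are hypothetical, invented purely for illustration. The normalized form stores each fact once and joins at read time; the denormalized form repeats customer facts on every order so that a reporting query needs no join.

```python
# Hypothetical data: two normalized "tables" keyed by id.
customers = {
    1: {"name": "Ada", "state": "TX"},
    2: {"name": "Grace", "state": "NY"},
}
orders = [
    {"order_id": 100, "customer_id": 1, "total": 25.0},
    {"order_id": 101, "customer_id": 1, "total": 40.0},
    {"order_id": 102, "customer_id": 2, "total": 15.0},
]


def denormalize(orders, customers):
    """Join each order to its customer and copy the customer facts onto the row."""
    rows = []
    for order in orders:
        customer = customers[order["customer_id"]]
        rows.append({
            **order,
            "customer_name": customer["name"],
            "customer_state": customer["state"],
        })
    return rows


def total_by_state(rows):
    """Reporting query over the denormalized rows: no join needed at read time."""
    totals = {}
    for row in rows:
        state = row["customer_state"]
        totals[state] = totals.get(state, 0.0) + row["total"]
    return totals


wide_rows = denormalize(orders, customers)
print(total_by_state(wide_rows))
```

The trade-off is the familiar one: the wide rows are cheaper to read but must be rebuilt or updated in many places when a customer fact changes.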

Page 8: NoSQL and Data Modeling for Data Modelers

Classic DW Architecture

Diagram components (On Premises): OLTP DB (two), External Data, ETL, Staging/ETL DB, EDW, Data Mart (two)

Page 9: NoSQL and Data Modeling for Data Modelers

Modern DW Architecture

Diagram components (Cloud And/Or On Prem): OLTP DB (three), External Data, ETL, Hadoop, Distributed Processing (MapReduce), Distributed Storage (HDFS), Distributed Storage (Blob Storage), EDW, staging, Analytics Mart, Data Mart
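
The "Distributed Processing (MapReduce)" component can be illustrated with a single-machine sketch of the map, shuffle, and reduce phases. This is plain Python, not Hadoop's API; the input lines and function names are hypothetical, and a real cluster would run the map and reduce steps in parallel on the nodes that hold the data.

```python
from collections import defaultdict

# Hypothetical input: "state,amount" lines, as they might sit in distributed storage.
lines = ["TX,10", "NY,5", "TX,7", "CA,3", "NY,1"]


def map_phase(line):
    """Emit one (key, value) pair per input line."""
    state, amount = line.split(",")
    return state, int(amount)


def shuffle(pairs):
    """Group values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)


mapped = [map_phase(line) for line in lines]
grouped = shuffle(mapped)
totals = dict(reduce_phase(key, values) for key, values in grouped.items())
print(totals)
```

Because each map call sees one line and each reduce call sees one key, both phases can be spread across many machines without coordination beyond the shuffle.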

Page 10: NoSQL and Data Modeling for Data Modelers

NoSQL, Not Only SQL

Relational

Graph

Columnar/Column Family

Key Value

Document Databases

Others
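
As a rough illustration of how one data fact can take different shapes across these families, the sketch below uses plain Python structures as stand-ins. The person, city, and field names are hypothetical, and none of these structures is the actual API of any particular product.

```python
import json

# Key-value: an opaque value looked up by a single key.
key_value = {"person:42": json.dumps({"name": "Ada", "city": "Austin"})}

# Document: a nested, self-describing record.
document = {"_id": 42, "name": "Ada", "address": {"city": "Austin", "state": "TX"}}

# Column family: row key -> column family -> column -> value.
column_family = {"42": {"profile": {"name": "Ada"}, "location": {"city": "Austin"}}}

# Graph: nodes plus typed edges between them.
nodes = {
    "p42": {"label": "Person", "name": "Ada"},
    "c1": {"label": "City", "name": "Austin"},
}
edges = [("p42", "LIVES_IN", "c1")]


def city_of(person_id):
    """Follow the LIVES_IN edge from a person node to the city name."""
    for source, relation, target in edges:
        if source == person_id and relation == "LIVES_IN":
            return nodes[target]["name"]
    return None


# The same fact ("Ada lives in Austin") read back from each shape.
answers = [
    json.loads(key_value["person:42"])["city"],
    document["address"]["city"],
    column_family["42"]["location"]["city"],
    city_of("p42"),
]
print(answers)
```

The fact and its meaning stay the same in every case; only the physical shape and the access path differ.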

Page 11: NoSQL and Data Modeling for Data Modelers

Sample Hive Statement

-- Column names and types are applied to the comma-delimited text files when they are read.
CREATE EXTERNAL TABLE TaxRebateUsage (
    state string,
    zipcode string,
    agi_class int,
    n1 int,
    mars2 int,
    prep int,
    n2 int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

Page 12: NoSQL and Data Modeling for Data Modelers


Sample JSON/MongoDB Notation


Page 13: NoSQL and Data Modeling for Data Modelers


Sample FoundationDB Statement


Page 14: NoSQL and Data Modeling for Data Modelers


Sample Cassandra Statement


Page 15: NoSQL and Data Modeling for Data Modelers


Sample Vertica Statement


Page 16: NoSQL and Data Modeling for Data Modelers


Sample Neo4j Statement


Page 17: NoSQL and Data Modeling for Data Modelers

Those weren’t SCHEMALESS…

They had data facts, which had meanings, and sometimes expected formats, precisions, and types.

In the NoSQL world, we don’t necessarily apply those at write time, but at read time.

SCHEMALESS really is MULTIPLE SCHEMAs (Polyschematic) or VARYING SCHEMAs.
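
A minimal sketch of schema on read, assuming comma-delimited text like the Hive sample: the raw lines are stored untouched, and names and types are applied only when a reader asks for them. The column subset, the sample lines, and the two schemas are hypothetical. Two readers applying different schemas to the same stored text is one way to picture "polyschematic".

```python
# Raw lines are written as-is: no types or column names are enforced at write time.
raw_lines = ["TX,78701,3,120", "NY,10001,5,75"]

# Two hypothetical read-time schemas over the same stored text.
schema_typed = [("state", str), ("zipcode", str), ("agi_class", int), ("n1", int)]
schema_geo_only = [("state", str), ("zipcode", str)]


def read_with_schema(lines, schema):
    """Apply column names and type conversions while reading."""
    records = []
    for line in lines:
        fields = line.split(",")
        # zip stops at the shorter input, so a narrower schema ignores extra fields.
        records.append(
            {name: cast(value) for (name, cast), value in zip(schema, fields)}
        )
    return records


typed = read_with_schema(raw_lines, schema_typed)
geo = read_with_schema(raw_lines, schema_geo_only)
print(typed[0])
print(geo[1])
```

Note that the zip code is read as a string in both schemas: the expected format is still a modeling decision, it is just applied by the reader rather than by the store.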

Page 18: NoSQL and Data Modeling for Data Modelers

The Big Data Big Lies

Schemaless
• Schema on Read, not Schema on Write
• Polyschematic

Big
• New data stories
• New technologies
• Not just volume

Page 19: NoSQL and Data Modeling for Data Modelers

10 Tips for Modeling in a Hybrid World

1. Models require a modeler
2. Data modeling tools are essential
3. There are many types of data models: know which ones you need
4. Modeling does not have to happen at the same time in every project. It should happen at the right time
5. Modeling is not just schema design. Think outside the boxes and lines

Page 20: NoSQL and Data Modeling for Data Modelers

10 Tips for Modeling in a Hybrid World

6. A data model is much more than a diagram
7. You will need training
8. Team members may not understand modeling. They will need training
9. NoSQL is not one thing. Learn many patterns
10. Modern data architectures are likely hybrid solutions. You can’t just support one part

Page 21: NoSQL and Data Modeling for Data Modelers

What does this mean for data modelers?

There will be jobs for traditional, ERD, relational modelers… just like there are still jobs for RPG and COBOL programmers.

All data has a data story. Many data stories.

A good modeler is an architect at heart – finding the right solution for the data story.

Page 22: NoSQL and Data Modeling for Data Modelers

Business Intelligence Journal

Look for the September 2014 issue article on Modern Data Architectures

Page 23: NoSQL and Data Modeling for Data Modelers

Thank You!

www.infoadvisors.com

www.datamodel.com

www.dataversity.net

community.embarcadero.com

#TEAMDATA

Follow me!