Big Data Madison: Architecting for Big Data

#CONTEXT Jordan @jordanbarrette Josh @jherritz Copyright 2014 MIOsoft Corporation. All rights reserved.

Upload: miosoft

Post on 14-Jan-2017

Category: Data & Analytics


TRANSCRIPT

Page 1: Big Data Madison: Architecting for Big Data

Copyright 2014 MIOsoft Corporation. All rights reserved.

#CONTEXT

Jordan @jordanbarrette

Josh @jherritz

Page 2: Big Data Madison: Architecting for Big Data

Page 3: Big Data Madison: Architecting for Big Data

Page 4: Big Data Madison: Architecting for Big Data

Page 5: Big Data Madison: Architecting for Big Data

DATA

Page 6: Big Data Madison: Architecting for Big Data

DATA IS EVERYWHERE

© Kheel Center, Cornell University “Crowds fill the street during the New York Dressmakers Strike of 1933”

Page 7: Big Data Madison: Architecting for Big Data

HADOOPZILLA

© Kheel Center, Cornell University “Crowds fill the street during the New York Dressmakers Strike of 1933”

Page 8: Big Data Madison: Architecting for Big Data

THERE’S GOLD IN THEM HILLS

© Tasmanian Archive and Heritage Office NS3245/1/249

Page 9: Big Data Madison: Architecting for Big Data

© Kheel Center, Cornell University “Crowds fill the street during the New York Dressmakers Strike of 1933”

Page 10: Big Data Madison: Architecting for Big Data

BE PROACTIVE. MOVE FASTER, BETTER, LEANER. REDUCE EMPLOYEE CHURN. REDUCE CUSTOMER CHURN.

INCREASE SALES. ENABLE UPSELL. REDUCE COSTS.

© Jim Pennucci “Email Under The Light”

Page 11: Big Data Madison: Architecting for Big Data

“Context is one of the most important concepts to application architecture”

Page 12: Big Data Madison: Architecting for Big Data

“Without contextual systems we will never get to the nirvana of what we envision around cognitive engagement”

Page 13: Big Data Madison: Architecting for Big Data

FRAUD PREVENTION

RISK MITIGATION

INTELLIGENCE

SERVICE PERSONALIZATION

SECURITY

COMPLIANCE

CUSTOMER 360-DEGREE VIEW

INVENTORY / SUPPLY CHAIN MANAGEMENT

Page 14: Big Data Madison: Architecting for Big Data

© NASA Goddard Space Flight Center “Barred Spiral Galaxy”

Page 15: Big Data Madison: Architecting for Big Data

WE WANT TO MODEL CONTEXT, SERVE CONTEXT TO APPS AND LEVERAGE CONTEXT TO UNDERSTAND PATTERNS ACROSS OUR DATA.

Page 16: Big Data Madison: Architecting for Big Data

Page 17: Big Data Madison: Architecting for Big Data

Page 18: Big Data Madison: Architecting for Big Data

WE WANT TO MODEL CONTEXT, SERVE CONTEXT TO APPS AND LEVERAGE CONTEXT TO UNDERSTAND PATTERNS ACROSS OUR DATA.

Page 19: Big Data Madison: Architecting for Big Data

WE NEED TO EFFICIENTLY STORE AND QUICKLY RETRIEVE CONTEXTUALLY RELATED DATA.

Page 20: Big Data Madison: Architecting for Big Data

Computer Storage Hierarchy

Registers

Cache

RAM

Flash

Hard Disk

Cost Per Byte

Capacity

Page 21: Big Data Madison: Architecting for Big Data

WE NEED TO EFFICIENTLY STORE AND QUICKLY RETRIEVE CONTEXTUALLY RELATED DATA.

Page 22: Big Data Madison: Architecting for Big Data

COMMODITY HARDWARE $ << SPECIALIZED HARDWARE $

© Bill Wetzel “PDP-10”

Page 23: Big Data Madison: Architecting for Big Data

1 DELL R720XD = 48 TB

Page 24: Big Data Madison: Architecting for Big Data

WE NEED TO EFFICIENTLY STORE PBs+
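
The jump from 48 TB per server to petabytes can be made concrete with a little arithmetic. The sketch below estimates how many such servers a given volume would need. The decimal units (1 PB = 1000 TB) and the replication factor are assumptions for illustration; the slides state only the 48 TB figure.

```python
import math

TB_PER_SERVER = 48  # from the slide: 1 Dell R720xd = 48 TB


def servers_needed(petabytes: float, replication: int = 1) -> int:
    """Servers required to hold `petabytes` of data with `replication` copies."""
    total_tb = petabytes * 1000 * replication  # assumes decimal units
    return math.ceil(total_tb / TB_PER_SERVER)


print(servers_needed(1))     # 21
print(servers_needed(1, 3))  # 63
```

Even a single petabyte without replication spans about twenty machines, which is why the storage layer has to be designed as a distributed system.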

Page 25: Big Data Madison: Architecting for Big Data

THINK DISTRIBUTED

© Candid Business “google-datacenter-tech-13”

Page 26: Big Data Madison: Architecting for Big Data

CONTEXT: A UNIT OF DISTRIBUTION
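
One way to read "a unit of distribution" is that every record belonging to a context is placed on the same node. The sketch below shows that idea using simple hash partitioning. The function name, the `customer:42` style identifiers, and the modulo scheme are all hypothetical; the talk does not say how placement is actually done.

```python
import hashlib


def node_for_context(context_id: str, num_nodes: int) -> int:
    """Map a context to a node by hashing its identifier (hypothetical scheme)."""
    digest = hashlib.sha256(context_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


# Records that share a context id are routed to the same node.
records = [
    ("customer:42", "order #1"),
    ("customer:7", "order #2"),
    ("customer:42", "support call"),
]
placement = {}
for context_id, payload in records:
    node = node_for_context(context_id, 4)
    placement.setdefault(node, []).append((context_id, payload))
```

Plain modulo hashing moves most contexts when the node count changes; consistent hashing is a common way to reduce that movement.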

Page 27: Big Data Madison: Architecting for Big Data

WE NEED TO EFFICIENTLY STORE AND QUICKLY RETRIEVE CONTEXTUALLY RELATED DATA.

Page 28: Big Data Madison: Architecting for Big Data

Computer Storage Hierarchy

Registers

Cache

RAM

Flash

Hard Disk

Speed

Cost Per Byte

Capacity

Page 29: Big Data Madison: Architecting for Big Data

© Dario Trimarchi “Computer Hard Disk Stock Image”

Page 30: Big Data Madison: Architecting for Big Data

1 MILLION SEEKS = 83 MINUTES

1 BILLION SEEKS = 58 DAYS
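
Both figures are consistent with a random seek costing about 5 ms. That value is inferred from the numbers themselves and is not stated on the slide. A short check of the arithmetic under that assumption:

```python
SEEK_SECONDS = 0.005  # assumed: about 5 ms per random disk seek (inferred, not from the slide)


def seek_time_seconds(num_seeks: int, seek_seconds: float = SEEK_SECONDS) -> float:
    """Total time spent seeking if every access costs one random seek."""
    return num_seeks * seek_seconds


million = seek_time_seconds(1_000_000)
billion = seek_time_seconds(1_000_000_000)
print(f"1 million seeks: {million / 60:.0f} minutes")   # 83 minutes
print(f"1 billion seeks: {billion / 86_400:.0f} days")  # 58 days
```

The cost scales linearly with the number of seeks, so reducing seeks per logical operation is the lever that matters.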

Page 31: Big Data Madison: Architecting for Big Data

CONTEXT: A PHYSICAL AND LOGICAL CLUSTER OF RELATED DATA. READ/WRITE WITH ONE SEEK TO DISK.
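
A minimal sketch of the one-seek idea: serialize everything in a context into one contiguous blob, remember its offset, and read it back with a single seek followed by a single read. The `ContextStore` class, the JSON encoding, and the in-memory index are assumptions made for illustration and do not describe MIOsoft's storage engine.

```python
import json
import os
import tempfile


class ContextStore:
    """Toy append-only store: each context is one contiguous blob on disk."""

    def __init__(self, path: str):
        self.path = path
        self.index = {}  # context_id -> (offset, length), kept in memory
        open(path, "ab").close()  # create the file if it is missing

    def write(self, context_id: str, context: dict) -> None:
        blob = json.dumps(context).encode("utf-8")
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # new blobs always go at the end
            f.write(blob)
        self.index[context_id] = (offset, len(blob))

    def read(self, context_id: str) -> dict:
        offset, length = self.index[context_id]
        with open(self.path, "rb") as f:
            f.seek(offset)  # the single seek
            return json.loads(f.read(length))


path = os.path.join(tempfile.mkdtemp(), "contexts.dat")
store = ContextStore(path)
store.write("customer:42", {"name": "Ada", "orders": [1, 2], "calls": ["2014-03-01"]})
store.write("customer:7", {"name": "Bob", "orders": []})
customer = store.read("customer:42")
```

Rewriting a context appends a new blob and leaves the old one as garbage, so a real store would also need compaction. By contrast, a normalized layout that keeps orders and calls in separate tables would typically need one seek per related row.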

Page 32: Big Data Madison: Architecting for Big Data

WE WANT TO MODEL CONTEXT, SERVE CONTEXT TO APPS AND LEVERAGE CONTEXT TO UNDERSTAND PATTERNS ACROSS DATA.

Page 33: Big Data Madison: Architecting for Big Data

© Dario Trimarchi “Computer Hard Disk Stock Image”

Page 34: Big Data Madison: Architecting for Big Data

OPERATIONAL vs. ANALYTICAL

© University of Chicago, Adam Lisberg “Athletics”

Page 35: Big Data Madison: Architecting for Big Data

© Mariano Garcia-Gaspar “Locked”

Page 36: Big Data Madison: Architecting for Big Data

© University of Chicago, Capes Photo “World War II”

Page 37: Big Data Madison: Architecting for Big Data

GOODBYE LOCKS

© Brandon Doran “Once the dust settles”

Page 38: Big Data Madison: Architecting for Big Data

© Province of British Columbia “002 Calgary Skyline”

Page 39: Big Data Madison: Architecting for Big Data

WAITING…

© University of Chicago, Capes Photo “World War II”

Page 40: Big Data Madison: Architecting for Big Data

COLLAPSE MULTIPLE SERIAL TRANSACTIONS INTO ONE TRANSACTION. STILL 1 SEEK.
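
One possible reading of this slide is that updates waiting on the same context are grouped and applied together in a single read-modify-write, instead of each one paying for its own disk access. The sketch below shows that batching. The function names are hypothetical and a plain dict stands in for the on-disk store.

```python
from collections import defaultdict


def collapse(updates):
    """Group queued updates by context id, preserving arrival order."""
    batches = defaultdict(list)
    for context_id, change in updates:
        batches[context_id].append(change)
    return batches


def apply_batches(store: dict, updates) -> int:
    """Apply each context's batch in one read-modify-write; return the write count."""
    writes = 0
    for context_id, changes in collapse(updates).items():
        context = dict(store.get(context_id, {}))  # one read
        for change in changes:                     # applied serially, in memory
            context.update(change)
        store[context_id] = context                # one write
        writes += 1
    return writes


store = {}
queued = [
    ("customer:42", {"email": "a@example.com"}),
    ("customer:7", {"phone": "555-0100"}),
    ("customer:42", {"email": "b@example.com"}),
    ("customer:42", {"city": "Madison"}),
]
writes = apply_batches(store, queued)
print(writes)  # 2
```

Four queued updates become two writes, one per context, and the updates within a context are still applied in arrival order.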

Page 41: Big Data Madison: Architecting for Big Data

NOW THE HARD PART

Page 42: Big Data Madison: Architecting for Big Data

Page 43: Big Data Madison: Architecting for Big Data

RELATIONSHIP DISCOVERY

Page 44: Big Data Madison: Architecting for Big Data

NEAR vs. FAR RELATIONSHIPS

© NASA, Visible Earth

Page 45: Big Data Madison: Architecting for Big Data

“NEAR” = DATA IN THE SAME CONTEXT

Page 46: Big Data Madison: Architecting for Big Data

Page 47: Big Data Madison: Architecting for Big Data

DISTANCE-BASED CLUSTERING

Page 48: Big Data Madison: Architecting for Big Data

PHONETIC

NUMERIC

SEMANTIC

TEMPORAL

GEOSPATIAL

ETC
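
To illustrate how two of these distance types might combine, the sketch below treats two records as "near" when they are close both geospatially and temporally. The thresholds, the record fields, and the `is_near` rule are hypothetical; only the haversine formula is standard.

```python
import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def is_near(a, b, max_km=0.5, max_minutes=30):
    """Hypothetical rule: near in space AND near in time."""
    km = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"])
    minutes = abs((a["time"] - b["time"]).total_seconds()) / 60
    return km <= max_km and minutes <= max_minutes


r1 = {"lat": 43.0731, "lon": -89.4012, "time": datetime(2014, 5, 1, 8, 0)}
r2 = {"lat": 43.0735, "lon": -89.4015, "time": datetime(2014, 5, 1, 8, 10)}
r3 = {"lat": 43.0389, "lon": -87.9065, "time": datetime(2014, 5, 1, 8, 0)}

print(is_near(r1, r2))  # True: a few dozen meters and ten minutes apart
print(is_near(r1, r3))  # False: more than 100 km apart
```

Phonetic, numeric, and semantic distances would slot into the same shape: each contributes a distance, and a rule decides whether the combination is close enough to cluster.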

Page 49: Big Data Madison: Architecting for Big Data

http://accidents.apps.miosoft.com

Page 50: Big Data Madison: Architecting for Big Data

TRANSITIVITY
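
The slide gives only the word, so the following is an interpretation: if A is near B and B is near C, all three belong in the same cluster even though A and C were never matched directly. A union-find structure is one common way to compute such groups. The `cluster` function and the sample data are illustrative assumptions.

```python
def cluster(items, matches):
    """Group items so that any chain of pairwise matches ends up in one cluster."""
    parent = {item: item for item in items}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in matches:
        parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for item in items:
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


items = ["A", "B", "C", "D"]
matches = [("A", "B"), ("B", "C")]  # A was never matched with C directly
clusters = cluster(items, matches)
# A, B and C share one cluster; D stays alone.
```

Transitive merging can also chain together records that are individually quite far apart, which is one reason the matching rules and thresholds need care.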

Page 51: Big Data Madison: Architecting for Big Data

http://accidents.apps.miosoft.com

Page 52: Big Data Madison: Architecting for Big Data

PROVENANCE

Page 53: Big Data Madison: Architecting for Big Data

ORIGIN

© Steve Corey “Creation of Adam, Sistine Chapel”

Page 54: Big Data Madison: Architecting for Big Data

PRESERVATION

PROVENANCE – IT MIGHT NEED TO BE UPDATED

© Stephen Mitchell “Ligurian Bee’s”

Page 55: Big Data Madison: Architecting for Big Data

Page 56: Big Data Madison: Architecting for Big Data

“FAR” = RELATIONSHIP WITH ANOTHER CONTEXT

© Granger Meador www.meador.org

Page 57: Big Data Madison: Architecting for Big Data

CONTEXT FRAGMENTS CAN IMPLY RELATIONSHIPS WITH UNIDENTIFIED CONTEXTS

© Pablo OE, “V”

Page 58: Big Data Madison: Architecting for Big Data

Page 59: Big Data Madison: Architecting for Big Data

GRAPPLE

© John Piekos “Fishing Plug Boken”

Page 60: Big Data Madison: Architecting for Big Data

Page 61: Big Data Madison: Architecting for Big Data

Page 62: Big Data Madison: Architecting for Big Data

Page 63: Big Data Madison: Architecting for Big Data

ELASTICITY

COMMUNICATION

CONTAINERIZATION

CONFLICT RESOLUTION

RELATIONSHIP SUMMARY DATA

ADVANCED INDEXING

PARALLEL AGGREGATION

DATA CURATION TOOLS

OPERATIONS AND MANAGEMENT TOOLS

…

Page 64: Big Data Madison: Architecting for Big Data

15+ YEARS IN THE MAKING

Page 65: Big Data Madison: Architecting for Big Data

www.miosoft.com