access2011 van garderen-suhonos-part2

Upload: peter-van-garderen

Post on 27-Jun-2015

Page 1: Access2011 van garderen-suhonos-part2

BIG DATA

Page 3: Access2011 van garderen-suhonos-part2

"datasets that grow so large that they become difficult to work with using relational databases and within a tolerable elapsed time"

Page 4: Access2011 van garderen-suhonos-part2

BIG DATA IS BIG

Page 5: Access2011 van garderen-suhonos-part2

LIKE, REALLY BIG

Page 6: Access2011 van garderen-suhonos-part2

FACEBOOK: 140 BILLION PHOTOS

HUMAN GENOME: 3 BILLION BASE PAIRS

GOOGLE: 50 BILLION WEB PAGES

WORLDCAT: 1.5 BILLION ITEM RECORDS

Page 7: Access2011 van garderen-suhonos-part2

NOT REALLY

Page 8: Access2011 van garderen-suhonos-part2

EUROPEANA: 20 MILLION (715K / COUNTRY)

LIBRARY OF CONGRESS: 1.9 MILLION

CANADIANA: 1 MILLION

LIBRARY AND ARCHIVES CANADA: 3.5 MILLION (ARCHIVAL DESCRIPTIONS)

Page 9: Access2011 van garderen-suhonos-part2

BIG DATA IS COMPLICATED

Page 10: Access2011 van garderen-suhonos-part2

1966

Page 11: Access2011 van garderen-suhonos-part2

1976

Page 14: Access2011 van garderen-suhonos-part2

NOT REALLY

Page 15: Access2011 van garderen-suhonos-part2

ಠ_ಠ

Page 18: Access2011 van garderen-suhonos-part2

SCALABILITY

Page 19: Access2011 van garderen-suhonos-part2

● ICA-AtoM (LAMP)

● BENCHMARK 3.5M RECORDS

● 100% OPEN SOURCE SOFTWARE

● COMMODITY HARDWARE

Page 21: Access2011 van garderen-suhonos-part2

CAN WE DO IT?

Page 22: Access2011 van garderen-suhonos-part2

WRITE SPEED

Page 23: Access2011 van garderen-suhonos-part2

READ SPEED

Page 24: Access2011 van garderen-suhonos-part2

WRITE MEMORY

Page 25: Access2011 van garderen-suhonos-part2

READ MEMORY

Page 26: Access2011 van garderen-suhonos-part2

NOSQL vs. SQL (a.k.a. ODM vs. ORM)

● 4x - 10x FASTER

● 50% - 90% LESS MEMORY

Page 29: Access2011 van garderen-suhonos-part2

RELATIONAL DATABASES SCALE WELL IF YOUR DATA IS NOT HIERARCHICAL

SOLR SCALES WELL IF YOU HAVE INFINITE RAM

BEWARE THE DOGMA OF SQL

NOSQL IS A VIABLE OPTION

THINK SIDEWAYS, SCALE OUT →
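
One way to read the point about hierarchical data is sketched below, assuming an archival description tree (fonds > series > file) stored two ways. Plain Python dicts stand in for a relational table and a NoSQL document store, so this is a toy illustration, not a benchmark. The names `ROWS`, `DOC`, `ancestors_relational`, and `ancestors_document` are hypothetical and are not taken from ICA-AtoM or the slides.

```python
# Hypothetical archival hierarchy: fonds > series > file.

# Relational style: one row per record, each pointing at its parent.
ROWS = {
    1: {"title": "Fonds A", "parent_id": None},
    2: {"title": "Series B", "parent_id": 1},
    3: {"title": "File C", "parent_id": 2},
}

# Document style: each record carries its ancestor titles with it.
DOC = {
    3: {"title": "File C", "ancestors": ["Fonds A", "Series B"]},
}


def ancestors_relational(rows, record_id):
    """Walk parent pointers upward; return (titles, number of lookups)."""
    titles = []
    lookups = 1  # fetching the record itself
    parent_id = rows[record_id]["parent_id"]
    while parent_id is not None:
        parent = rows[parent_id]  # one more lookup per level of depth
        lookups += 1
        titles.append(parent["title"])
        parent_id = parent["parent_id"]
    titles.reverse()  # report top-down: fonds first
    return titles, lookups


def ancestors_document(docs, record_id):
    """Read the denormalized ancestor list; always a single lookup."""
    return docs[record_id]["ancestors"], 1


if __name__ == "__main__":
    print(ancestors_relational(ROWS, 3))
    print(ancestors_document(DOC, 3))
```

In this sketch the relational walk costs one extra lookup per level of depth, while the document read stays at one. The trade-off is that the denormalized ancestor list has to be rewritten whenever a parent is renamed or moved.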

Page 31: Access2011 van garderen-suhonos-part2

THE CLOUD IS A LIE

Page 32: Access2011 van garderen-suhonos-part2

“big data is less about size, and more about freedom”

open source tools + distributed design = new opportunities