In Memory Data Grids, Demystified!

Uri Cohen, Head of Product @ GigaSpaces, @uri1803, github.com/uric

Upload: uri-cohen

Post on 15-Jan-2015


Category:

Technology



DESCRIPTION

The principles and foundations of in memory data grids

TRANSCRIPT

Page 1: In Memory Data Grids, Demystified!

Uri Cohen, Head of Product @ GigaSpaces, @uri1803, github.com/uric

In-Memory Data Grids, Demystified

Page 2: In Memory Data Grids, Demystified!

Agenda

• Why IMDG?
• Brief History
• How It Works
  – Data model & placement
  – HA and fault tolerance
  – Consistency
  – Internals

Page 3: In Memory Data Grids, Demystified!

Why IMDG?

Today, more than ever, there are many choices when it comes to storing your data

Page 4: In Memory Data Grids, Demystified!

® Copyright 2011 Gigaspaces Ltd. All Rights Reserved


But There Are Many Solutions

Page 5: In Memory Data Grids, Demystified!

Just A Few Years Back

Page 6: In Memory Data Grids, Demystified!

So Why Indeed??

Page 7: In Memory Data Grids, Demystified!

The Need for Speed, In Real Time…

Page 8: In Memory Data Grids, Demystified!

Some Facts

Page 9: In Memory Data Grids, Demystified!

Memory will always be faster than disk (usually by orders of magnitude)

Page 10: In Memory Data Grids, Demystified!

Recent Survey

Page 11: In Memory Data Grids, Demystified!

67%

The share of IT managers who think that real-time analysis is the biggest challenge for big data implementations

Page 12: In Memory Data Grids, Demystified!

40%

• Plan to use in-memory technologies for big data projects
• Only 32% mentioned Hadoop

Page 13: In Memory Data Grids, Demystified!

Stream Processing

Page 14: In Memory Data Grids, Demystified!

Hell, Even Gartner Thinks So

“In memory computing (IMC) … provides transformational opportunities. The execution of certain-types of hours-long batch processes can be squeezed into minutes or even seconds … Millions of events can be scanned in a matter of a few tens of millisecond to detect correlations and patterns pointing at emerging opportunities and threats ‘as things happen.’”

Page 15: In Memory Data Grids, Demystified!

And nowadays

HW and SW just make it a whole lot cheaper

Page 16: In Memory Data Grids, Demystified!

Some Common Use Cases

Page 17: In Memory Data Grids, Demystified!

Fast, Transactional Data Access

• Inventory management
• Financial reference data
• Real time transactional data

Page 18: In Memory Data Grids, Demystified!

Real Time Stream Processing

• Fraud Detection
• Click Stream Analysis
• Real time analytics
• Continuous calculation

Page 19: In Memory Data Grids, Demystified!

Heavyweight Offline Calculations

• Trade Reconciliation
• Pattern analysis and detection
• Number crunching

Page 20: In Memory Data Grids, Demystified!

Caching

• Database offloading
• Content heavy websites

Page 21: In Memory Data Grids, Demystified!

The Evolution of Data Grids

Page 22: In Memory Data Grids, Demystified!

First There Were Local Caches

Timeline: Cache (in-process caching of a Key->Value data structure), Distributed Cache (partitioned cache nodes), IMDG (partitioned system of record), IMDG.next()

• Good for repetitive-data reads
• Limited in capacity
• Doesn’t handle write-heavy scenarios
• Reads are only part of the latency path

Page 23: In Memory Data Grids, Demystified!

Then Came Distributed Caches

Timeline: Cache (in-process caching of a Key->Value data structure), Distributed Cache (partitioned cache nodes), IMDG (partitioned system of record), IMDG.next()

• Increased capacity
• Still no support for write-heavy scenarios
• Limited to ID-based reads
• Reads are only part of the latency path

Page 24: In Memory Data Grids, Demystified!

In Memory Data Grids

Timeline: Cache (in-process caching of a Key->Value data structure), Distributed Cache (partitioned cache nodes), IMDG (partitioned system of record), IMDG.next()

• Increased capacity
• Write scalability
• Can serve as system of record with querying & transaction semantics
• Still limited in capacity
• Latency can come from other parts of your app

Page 25: In Memory Data Grids, Demystified!

How It Works

Page 26: In Memory Data Grids, Demystified!

Data Models

Page 27: In Memory Data Grids, Demystified!


Data Placement – Fixed Hashing

hash(key) % #nodes
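
The formula above can be sketched in a few lines of Python. This is only an illustration, not any product's placement code: the function name `partition_for` and the choice of CRC32 as the hash are assumptions made for the example. It also shows the known weakness of fixed hashing: when the node count changes, most keys map to a different node (with a uniform hash, roughly 4 out of 5 keys move when going from 4 to 5 nodes).

```python
import zlib


def partition_for(key: str, num_nodes: int) -> int:
    """Fixed hashing: hash(key) % #nodes (hypothetical helper)."""
    # CRC32 is an assumption; it is used here because Python's built-in
    # hash() for strings is salted per process and so is not stable.
    return zlib.crc32(key.encode("utf-8")) % num_nodes


# Place 1000 sample keys on a 4-node grid, then on a 5-node grid.
keys = [f"user:{i}" for i in range(1000)]
before = {k: partition_for(k, 4) for k in keys}
after = {k: partition_for(k, 5) for k in keys}

# Count the keys whose owning node changed after adding a node.
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys change nodes when growing from 4 to 5")
```

Because so much data moves on a topology change, a grid that uses this scheme usually wants the number of partitions to stay fixed while the partitions themselves are relocated between machines.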

Page 28: In Memory Data Grids, Demystified!


Fixed Hashing - HA

hash(key) % #nodes
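
One way to picture high availability on top of fixed hashing is to give every key a primary node plus one or more backups. The sketch below is an assumption-laden illustration: the slide's diagram is not in the transcript, so the rule "backups live on the next nodes in index order" and the names `placement` and `owner_after_failure` are invented for the example.

```python
import zlib
from typing import List, Optional, Set


def placement(key: str, num_nodes: int, num_backups: int = 1) -> List[int]:
    """Return [primary, backup, ...] node indexes for a key (hypothetical)."""
    if num_backups >= num_nodes:
        raise ValueError("need more nodes than backups")
    primary = zlib.crc32(key.encode("utf-8")) % num_nodes
    # Assumption: each backup sits on the next node index, wrapping around,
    # so a primary and its backups never share a node.
    return [(primary + i) % num_nodes for i in range(num_backups + 1)]


def owner_after_failure(key: str, num_nodes: int,
                        failed: Set[int]) -> Optional[int]:
    """Return the first surviving replica for the key, or None if all are down."""
    for node in placement(key, num_nodes):
        if node not in failed:
            return node
    return None


replicas = placement("order:42", 4)
print("primary and backup:", replicas)
print("owner after primary fails:",
      owner_after_failure("order:42", 4, {replicas[0]}))
```

With one backup per key, the data survives the loss of any single node; losing a primary and its backup at the same time loses the key.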

Page 35: In Memory Data Grids, Demystified!

Data Consistency

Since we’re dealing with distributed data, consistency cannot be taken for granted

• Read after write
• Read after read
• Write-write consistency

Page 36: In Memory Data Grids, Demystified!

Solution 1: Single Master
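
A minimal sketch of the single-master idea, under stated assumptions: every read and write for a partition goes through one primary copy, and the primary replicates each write to its backups before acknowledging it. The class names `Replica` and `SingleMasterPartition` are hypothetical, and synchronous replication is an assumption of this example, not something the slide states.

```python
class Replica:
    """One in-memory copy of a partition's data (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class SingleMasterPartition:
    """All reads and writes go through a single primary replica."""

    def __init__(self, primary, backups):
        self.primary = primary
        self.backups = list(backups)

    def write(self, key, value):
        # Apply on the primary first, then copy to every backup before
        # returning (synchronous replication is an assumption here).
        self.primary.data[key] = value
        for backup in self.backups:
            backup.data[key] = value

    def read(self, key):
        # Reading only from the primary gives read-after-write consistency.
        return self.primary.data.get(key)

    def fail_over(self):
        # Promote the first backup when the primary is lost.
        if not self.backups:
            raise RuntimeError("no backup available to promote")
        self.primary = self.backups.pop(0)


partition = SingleMasterPartition(Replica("node-a"), [Replica("node-b")])
partition.write("sku-1", 10)
partition.fail_over()
print(partition.primary.name, partition.read("sku-1"))
```

Because there is exactly one writer per key, write-write conflicts cannot occur; the cost is that the primary is a bottleneck and must be replaced on failure.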

Page 37: In Memory Data Grids, Demystified!

Solution 2: Read/Write Quorums
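
The usual quorum rule is that with N replicas, a write must reach W of them and a read must consult R of them, with R + W > N, so every read set overlaps the latest write set. The sketch below illustrates that rule only; `QuorumStore` is a hypothetical class, and the single global version counter is a simplifying assumption (real systems typically use timestamps or vector clocks).

```python
import random


class QuorumStore:
    """Toy N-replica store with read/write quorums (hypothetical)."""

    def __init__(self, n, w, r, seed=0):
        if r + w <= n:
            raise ValueError("quorums must overlap: need R + W > N")
        self.n, self.w, self.r = n, w, r
        self.replicas = [dict() for _ in range(n)]
        self.version = 0  # assumption: one global, monotonically rising version
        self.rng = random.Random(seed)

    def write(self, key, value):
        # Store the new version on W randomly chosen replicas.
        self.version += 1
        for i in self.rng.sample(range(self.n), self.w):
            self.replicas[i][key] = (self.version, value)

    def read(self, key):
        # Ask R randomly chosen replicas and keep the highest version seen.
        chosen = self.rng.sample(range(self.n), self.r)
        answers = [self.replicas[i][key] for i in chosen
                   if key in self.replicas[i]]
        if not answers:
            return None
        return max(answers, key=lambda versioned: versioned[0])[1]


store = QuorumStore(n=5, w=3, r=3)
store.write("balance", 100)
store.write("balance", 80)
print(store.read("balance"))
```

Since any 3 of 5 replicas must share at least one member with any other 3, the read always sees the most recent completed write, even though no single replica is guaranteed to hold it.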

Page 38: In Memory Data Grids, Demystified!

Some More Concerns

• Transactions
• Querying
• Failure detection
• Leader election
• Persistency
• Interoperability

Page 39: In Memory Data Grids, Demystified!

IMDG.next()

Using IMDG for messaging and business logic (BL)

Page 40: In Memory Data Grids, Demystified!

IMDG.next()

SSD FTW!

Page 41: In Memory Data Grids, Demystified!

Thank You!

docs.gigaspaces.com