Pivotal's effort on Apache Geode

Apache Geode (incubating), and Pivotal's leadership role in open sourcing Gemfire. Nitin Lamba

Upload: apache-apex

Post on 18-Jan-2017


TRANSCRIPT

Page 1: Pivotal's effort on Apache Geode

Apache Geode (incubating), and Pivotal's leadership role in open sourcing Gemfire

Nitin Lamba

Page 2: Pivotal's effort on Apache Geode

Agenda

• Pivotal's Open Source strategy
• What is Apache Geode?
• History
• Differentiators
• Basic Concepts
• Resources
• Q & A

Page 3: Pivotal's effort on Apache Geode


Page 4: Pivotal's effort on Apache Geode

In 2015, Pivotal granted the components of its Big Data Suite to open source:

• 6 million lines of code
• 4 new open source communities

Page 5: Pivotal's effort on Apache Geode

[Timeline: May 2015, Sept 2015, Sept 2015, Oct 2015]

Page 6: Pivotal's effort on Apache Geode

From GEMFIRE to GEODE…


Page 7: Pivotal's effort on Apache Geode

What is GEODE?

A distributed, memory-based data management platform for data-oriented apps that need:
• high performance, scalability, resiliency and continuous availability
• fast access to critical data sets
• location-aware distributed data processing
• event-driven data architecture

Page 8: Pivotal's effort on Apache Geode

Incubating but ROCK solid…

• 1000+ systems in production (real customers)
• Cutting edge use cases

Timeline (<2000, 2004, 2008, 2012, 2016):

Early drivers
• Data volumes
• Margins/transactions
• IT maintenance costs
• Elasticity needs

Real-time needs
• Real-time response
• Time to market needs
• Flexible data models
• Persistent + in-memory

Global Data
• Visibility across DC
• Fast ingest
• Device to enterprise
• Uptime (always on)

Open Source!
• Apache incubation
• Gemfire > Geode
• Geode M1 release
• 1st Geode Summit

Customers shown on the timeline: Financial Services, US DoD, Trade Clearing, Travel Portal, Online Gambling, Telcos, Manufacturing, Auto Insurance, Payroll Processing, Rail Systems

Page 9: Pivotal's effort on Apache Geode

…with both SCALE and SPEED, …

• 40K transactions per second
• 3 TB of data in-memory
• 17B records in-memory
• 120K concurrent users

Page 10: Pivotal's effort on Apache Geode

… and impacting a LOT of people!

China Railway Corporation and Indian Railways

17% + 19% = 36% of the world population

Page 11: Pivotal's effort on Apache Geode

High-level Architecture

A Peer-2-Peer in-memory Distributed System

[Diagram: a Locator and four Servers, with REST access]

Powerful app development kit
• APIs: Java & REST
• Adapters: Redis, Lucene*, Spark*, …

Multiple persistence options
• Filesystem, RDBMS or HDFS*
• Sync: read-through, write-through
• Async: write-behind

Durable <K,V> cache/store
• Data replicated or partitioned
• Redundant storage in-memory/disk
• Flexible data retention policies

* Experimental and awaiting community feedback
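To make the sync and async persistence options concrete, here is a minimal single-process sketch in plain Java. It is not Geode's API: `BackedCache`, its methods and the in-memory `backingStore` map are hypothetical names chosen for illustration, and the explicit `flush()` stands in for work a real system would do on a background thread. In Geode itself these roles belong to `CacheLoader`, `CacheWriter` and `AsyncEventListener`, which this sketch does not use.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model of read-through, write-through and write-behind; not Geode's API. */
final class BackedCache {
    private final Map<String, String> memory = new HashMap<>();
    private final Map<String, String> backingStore;             // stands in for a filesystem or RDBMS
    private final Queue<String> dirtyKeys = new ArrayDeque<>();  // keys awaiting an async flush

    BackedCache(Map<String, String> backingStore) {
        this.backingStore = backingStore;
    }

    /** Read-through: on a miss, load from the backing store and keep the value in memory. */
    String get(String key) {
        return memory.computeIfAbsent(key, backingStore::get);
    }

    /** Write-through: the backing store is updated before the call returns. */
    void putWriteThrough(String key, String value) {
        backingStore.put(key, value);
        memory.put(key, value);
    }

    /** Write-behind: only memory is updated now; the store catches up on flush(). */
    void putWriteBehind(String key, String value) {
        memory.put(key, value);
        dirtyKeys.add(key);
    }

    /** Drains the queue of pending writes and returns how many were written. */
    int flush() {
        int written = 0;
        for (String key; (key = dirtyKeys.poll()) != null; written++) {
            backingStore.put(key, memory.get(key));
        }
        return written;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("k1", "v1");
        BackedCache cache = new BackedCache(store);
        System.out.println(cache.get("k1"));           // loaded from the store on first access
        cache.putWriteBehind("k2", "v2");
        System.out.println(store.containsKey("k2"));   // the store has not seen k2 yet
        cache.flush();
        System.out.println(store.get("k2"));
    }
}
```

The trade-off is latency against durability: write-through pays the store's latency on every put, while write-behind returns immediately and risks losing queued writes if the process dies before the flush.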

Page 12: Pivotal's effort on Apache Geode

What makes it go FAST?

• Minimize copying
• Minimize contention points
• Run user code in-process
• Partitioning & parallelism
• Avoid disk seeks
• Automated benchmarks

Page 13: Pivotal's effort on Apache Geode

Let’s talk about a few BASIC CONCEPTS…

• Cache
• Region
• Member
• Client Cache
• Persistence
• Functions

Page 14: Pivotal's effort on Apache Geode

What is a CACHE?

• In-memory storage and management for your data
• Configurable through XML, Java API or CLI
• A collection of Regions

Page 15: Pivotal's effort on Apache Geode

What is a REGION?

• Distributed java.util.Map on steroids (Key/Value)
• Consistent API regardless of where or how data is stored
• Observable (reactive)
• Highly available, with redundant copies on cache Member(s)
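The "observable map" idea can be sketched in a few lines of plain Java. This is a single-process toy, not a Geode region: `ObservableRegion` and its listener type are hypothetical, and nothing here is distributed or redundant. Geode exposes the same idea through its `CacheListener` callbacks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

/** Toy single-process stand-in for a region: a map that notifies listeners on every put. */
final class ObservableRegion<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final List<BiConsumer<K, V>> listeners = new CopyOnWriteArrayList<>();

    /** Registers a callback that receives the key and new value of each put. */
    void addListener(BiConsumer<K, V> listener) {
        listeners.add(listener);
    }

    V get(K key) {
        return entries.get(key);
    }

    /** Stores the value, notifies every listener, and returns the previous value (or null). */
    V put(K key, V value) {
        V previous = entries.put(key, value);
        for (BiConsumer<K, V> listener : listeners) {
            listener.accept(key, value);
        }
        return previous;
    }

    public static void main(String[] args) {
        ObservableRegion<String, Integer> region = new ObservableRegion<>();
        List<String> seen = new ArrayList<>();
        region.addListener((k, v) -> seen.add(k + "=" + v));
        region.put("k1", 1);
        region.put("k1", 2);
        System.out.println(seen);
    }
}
```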

Page 16: Pivotal's effort on Apache Geode

Region: Types & Options

• Local, Replicated or Partitioned
• In-memory or persistent
• Redundant
• LRU
• Overflow

Region shortcuts:

LOCAL, LOCAL_HEAP_LRU, LOCAL_OVERFLOW, LOCAL_PERSISTENT, LOCAL_PERSISTENT_OVERFLOW

PARTITION, PARTITION_HEAP_LRU, PARTITION_OVERFLOW, PARTITION_PERSISTENT, PARTITION_PERSISTENT_OVERFLOW, PARTITION_PROXY, PARTITION_PROXY_REDUNDANT, PARTITION_REDUNDANT, PARTITION_REDUNDANT_HEAP_LRU, PARTITION_REDUNDANT_OVERFLOW, PARTITION_REDUNDANT_PERSISTENT, PARTITION_REDUNDANT_PERSISTENT_OVERFLOW

REPLICATE, REPLICATE_HEAP_LRU, REPLICATE_OVERFLOW, REPLICATE_PERSISTENT, REPLICATE_PERSISTENT_OVERFLOW, REPLICATE_PROXY
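The difference between plain LRU eviction and overflow can be sketched with a `LinkedHashMap`. This is a toy under loud assumptions: `LruOverflowRegion` is a hypothetical class, the "disk" is just a second map, and eviction is triggered by an entry count for simplicity, whereas Geode's HEAP_LRU and OVERFLOW shortcuts react to JVM heap usage.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy LRU region that overflows evicted values to a second map standing in for disk. */
final class LruOverflowRegion<K, V> {
    private final Map<K, V> disk = new HashMap<>();
    private final LinkedHashMap<K, V> heap;

    LruOverflowRegion(int maxHeapEntries) {
        // accessOrder=true keeps iteration order least-recently-used first.
        this.heap = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() <= maxHeapEntries) {
                    return false;
                }
                disk.put(eldest.getKey(), eldest.getValue()); // overflow instead of dropping
                return true;
            }
        };
    }

    void put(K key, V value) {
        disk.remove(key);       // the fresh value supersedes any overflowed copy
        heap.put(key, value);
    }

    V get(K key) {
        V value = heap.get(key);
        if (value == null && disk.containsKey(key)) {
            value = disk.remove(key);   // fault the entry back into memory
            heap.put(key, value);
        }
        return value;
    }

    boolean inMemory(K key) {
        return heap.containsKey(key);
    }

    public static void main(String[] args) {
        LruOverflowRegion<String, Integer> region = new LruOverflowRegion<>(2);
        region.put("a", 1);
        region.put("b", 2);
        region.get("a");                 // "a" becomes the most recently used entry
        region.put("c", 3);              // "b" is now the eldest and overflows
        System.out.println(region.inMemory("b"));
        System.out.println(region.get("b"));
    }
}
```

A pure LRU region would drop the eldest entry inside `removeEldestEntry`; the overflow variant keeps it reachable at the cost of a slower read.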

Page 17: Pivotal's effort on Apache Geode

Persistent Regions

• Durability
• Write-ahead log (WAL) for efficient writing
• Consistent recovery
• Compaction

[Diagram: Server 1 through Server N]

Page 18: Pivotal's effort on Apache Geode

What is a MEMBER?

• A process that has a connection to the system
• A process that has created a cache
• Embeddable within your application

[Diagram: Client, Locator, Server]

Page 19: Pivotal's effort on Apache Geode

What is a CLIENT CACHE?

• A process connected to the Geode server(s)
• Can have a local copy of the data
• Can run OQL queries on local data
• Can be notified about events on the servers

Page 20: Pivotal's effort on Apache Geode

Persistence - Shared Nothing

[Diagram: Server 1, Server 2, Server 3]

Page 21: Pivotal's effort on Apache Geode

Persistence - Shared Nothing

[Diagram: buckets B1, B2 and B3 spread across Server 1, Server 2 and Server 3, each bucket with a Primary and a Secondary copy]

Page 22: Pivotal's effort on Apache Geode

Persistence - Shared Nothing

[Diagram: buckets B1, B2 and B3 spread across Server 1, Server 2 and Server 3, each bucket with a Primary and a Secondary copy]

Page 23: Pivotal's effort on Apache Geode

Persistence - Shared Nothing

[Diagram: buckets B1, B2 and B3 spread across Server 1, Server 2 and Server 3, each bucket with a Primary and a Secondary copy]

Page 24: Pivotal's effort on Apache Geode

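Persistence - Shared Nothing

[Diagram: buckets B1, B2 and B3 spread across Server 1, Server 2 and Server 3, each bucket with a Primary and a Secondary copy; additional copies of B3 and B2 are shown]

Server 1 waits for others when it starts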

Page 25: Pivotal's effort on Apache Geode

Persistence - Shared Nothing

[Diagram: buckets B1, B2 and B3 spread across Server 1, Server 2 and Server 3, each bucket with a Primary and a Secondary copy]

Fetches missed operations on restart
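The bucket layout in these diagrams can be approximated with a small sketch. Everything here is an assumption made for illustration: `BucketPlacement`, the hash-modulo bucket rule and the round-robin choice of primary and secondary are hypothetical and are not Geode's actual placement algorithm. The one property it does show is that a bucket's two copies live on different servers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy placement of partitioned-region buckets with one redundant copy; not Geode's algorithm. */
final class BucketPlacement {
    private final int bucketCount;
    private final List<String> servers;

    BucketPlacement(int bucketCount, List<String> servers) {
        if (bucketCount < 1 || servers.size() < 2) {
            throw new IllegalArgumentException("need at least 1 bucket and 2 servers");
        }
        this.bucketCount = bucketCount;
        this.servers = new ArrayList<>(servers);
    }

    /** Maps a key to a bucket by hashing; floorMod keeps the result non-negative. */
    int bucketFor(Object key) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    /** Hypothetical rule: primaries are assigned round-robin across servers. */
    String primary(int bucket) {
        return servers.get(bucket % servers.size());
    }

    /** The secondary is the next server, so the two copies never share a host. */
    String secondary(int bucket) {
        return servers.get((bucket + 1) % servers.size());
    }

    public static void main(String[] args) {
        BucketPlacement placement =
                new BucketPlacement(3, Arrays.asList("Server 1", "Server 2", "Server 3"));
        for (int bucket = 0; bucket < 3; bucket++) {
            System.out.println("B" + (bucket + 1) + ": primary on " + placement.primary(bucket)
                    + ", secondary on " + placement.secondary(bucket));
        }
    }
}
```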

Page 26: Pivotal's effort on Apache Geode

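Persistence - Operational Logs

[Diagram: Member 1 receives Put k6->v6 and appends it to the operation log; files shown: Oplog1.crf, Oplog2.crf]

Log records, in order:
• Create k1->v1
• Create k2->v2
• Modify k1->v3
• Create k4->v4
• Modify k1->v5
• Create k6->v6

Append to operation log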

Page 27: Pivotal's effort on Apache Geode

Persistence - Operational Logs: Compaction

[Diagram: Member 1 receives Put k6->v6 and appends it to the operation log; files shown: Oplog1.crf, Oplog2.crf]

Log records, in order:
• Create k1->v1
• Create k2->v2
• Modify k1->v3
• Create k4->v4
• Modify k1->v5
• Create k6->v6

Append to operation log

Copy live data forward
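The append-then-compact cycle can be sketched with the same six records shown on the slide. This is a toy model: `OplogSketch` and `Op` are hypothetical names, the log is an in-memory list rather than a `.crf` file, and the real on-disk format and compaction triggers are not modeled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy append-only operation log with compaction; real oplog files are not modeled. */
final class OplogSketch {
    /** One logged operation: the key and the value written. */
    static final class Op {
        final String key;
        final String value;

        Op(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<Op> log = new ArrayList<>();

    /** Creates and modifications are both appended; nothing is updated in place. */
    void append(String key, String value) {
        log.add(new Op(key, value));
    }

    int size() {
        return log.size();
    }

    /** Replays the log in order; the last write for each key wins. */
    Map<String, String> recover() {
        Map<String, String> live = new LinkedHashMap<>();
        for (Op op : log) {
            live.put(op.key, op.value);
        }
        return live;
    }

    /** Copies only live entries forward into a fresh log, dropping superseded records. */
    OplogSketch compact() {
        OplogSketch next = new OplogSketch();
        recover().forEach(next::append);
        return next;
    }

    public static void main(String[] args) {
        OplogSketch oplog = new OplogSketch();
        oplog.append("k1", "v1");
        oplog.append("k2", "v2");
        oplog.append("k1", "v3");
        oplog.append("k4", "v4");
        oplog.append("k1", "v5");
        oplog.append("k6", "v6");
        OplogSketch compacted = oplog.compact();
        System.out.println(oplog.size() + " records before, " + compacted.size() + " after");
        System.out.println(compacted.recover());
    }
}
```

Appending keeps writes sequential, which fits the "avoid disk seeks" point made earlier in the deck; compaction is the price paid later to reclaim the space held by superseded records such as k1->v1 and k1->v3.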

Page 28: Pivotal's effort on Apache Geode

Functions

• Used for distributed concurrent processing (Map/Reduce, stored procedure)
• Highly available
• Data oriented
• Member oriented
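A data-oriented function runs where the data lives and returns partial results that the caller merges, which is the Map/Reduce shape mentioned above. The sketch below imitates that in one JVM with a parallel stream. `FunctionSketch` and its methods are hypothetical, and each `Map` merely stands in for the slice of a partitioned region held by one member. In Geode the corresponding entry points are `FunctionService.onRegion` (data oriented) and `FunctionService.onMembers` (member oriented), which this sketch does not call.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Toy data-oriented function execution: apply one function per partition, then merge. */
final class FunctionSketch {
    /** "Map" step: runs the function against every partition, possibly in parallel. */
    static <R> List<R> executeOnPartitions(
            List<Map<String, Integer>> partitions, Function<Map<String, Integer>, R> function) {
        return partitions.parallelStream().map(function).collect(Collectors.toList());
    }

    /** Example function: sums the values held by one partition. */
    static Integer sumValues(Map<String, Integer> partition) {
        int total = 0;
        for (int value : partition.values()) {
            total += value;
        }
        return total;
    }

    /** "Reduce" step: merges the partial results returned by each partition. */
    static int merge(List<Integer> partials) {
        return partials.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        Map<String, Integer> onServer1 = new HashMap<>();
        onServer1.put("a", 1);
        onServer1.put("b", 2);
        Map<String, Integer> onServer2 = new HashMap<>();
        onServer2.put("c", 3);

        List<Integer> partials =
                executeOnPartitions(Arrays.asList(onServer1, onServer2), FunctionSketch::sumValues);
        System.out.println(merge(partials));
    }
}
```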

Page 29: Pivotal's effort on Apache Geode

Functions


Page 30: Pivotal's effort on Apache Geode

Join the Community!

• Check out: http://geode.incubator.apache.org
• Subscribe: [email protected]
• Download: http://geode.incubator.apache.org/releases/

Page 31: Pivotal's effort on Apache Geode


Thank you!

Page 32: Pivotal's effort on Apache Geode

Additional Slides


Page 33: Pivotal's effort on Apache Geode

Built for PERFORMANCE…

[Chart: YCSB Workloads, Cassandra vs. Geode. Y-axis: operations per second (0 to 1,000,000). X-axis: A Reads, A Updates, B Reads, B Updates, C Reads, D Inserts, D Reads, F Reads, F Updates.]

Page 34: Pivotal's effort on Apache Geode

…and horizontal, consistent SCALABILITY!

Horizontal scaling for reads, consistent latency and CPU

[Chart: speedup, latency (ms) and CPU% versus number of server hosts (2, 4, 6, 8, 10)]

• Scaled from 256 clients and 2 servers to 1280 clients and 10 servers
• Partitioned region with redundancy and 1K data size

Page 35: Pivotal's effort on Apache Geode

High Availability
