IMC Summit 2016 Breakout - Nikita Ivanov - Apache Ignite 2.0: Towards a Converged Data Platform


Upload: in-memory-computing-summit

Post on 16-Apr-2017


TRANSCRIPT

Page 1: IMC Summit 2016 Breakout - Nikita Ivanov - Apache Ignite 2.0 Towards a Converged Data Platform

Apache®, Apache Ignite, Ignite®, and the Apache Ignite logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

NIKITA IVANOV
GridGain Founder & CTO
Apache Ignite PMC

Apache 2.0 - Towards Converged Data Platform
Fast Data Meets Open Source

http://ignite.apache.org @apacheignite

See all the presentations from the In-Memory Computing Summit at http://imcsummit.org

Page 2: Agenda

• Fast Data vs Big Data
  – In-Memory Databases
  – In-Memory Data Grids
  – Hadoop & Spark
• Converged Data Platform
  • Big Data + Fast Data
• What is Apache Ignite
  – Big Bank Use Case
  – In-Memory Data Fabric
  – Shared Memory Layer
    • Share Spark RDDs
    • In-Memory File System
• Q & A

Page 3: Apache Ignite: Join Us!

• Very Active Community
• Great Way to Learn Distributed Computing
• How To Contribute:
  – https://ignite.apache.org/community/contribute.html#contribute
  – https://cwiki.apache.org/confluence/display/IGNITE/How+to+Contribute

Page 4: Fast Data vs Big Data

• Big Data
  – OLAP mostly
  – Larger Historical Data Set
  – Read-Mostly
  – Throughput Not Important
  – Low Query Latencies
  – Good-enough for interactive analytics
• Fast Data
  – OLTP mostly
  – Smaller Operational Data Set
  – High Throughput (ops/sec)
  – Low Latencies
  – Consistent and Transactional

Page 5: Fast Data vs Big Data

• Big Data
  – Hadoop
    • MapReduce
    • HDFS
    • HBase
  – Spark
    • Machine Learning
    • Graph Processing
    • SQL
  – Warehouse/DB Vendors
• Fast Data
  – Streaming
    • Flink
    • Kafka
    • Apex
  – In-Memory Data Grid
    • Ignite
    • Geode (incubating)
  – In-Memory Database
    • MemSQL
    • VoltDB
  – NoSQL
    • MongoDB
    • Cassandra

Page 6: Fast Data: In-Memory Databases

• In-Memory Databases
  – MemSQL
    • Closed Source
    • Free Limited Community Edition
  – VoltDB
    • Open Source Community Edition (AGPL)
    • Closed Source Enterprise Edition
• Main Features
  – High-Throughput
  – Low Latencies
  – Full SQL Support
    • However, SQL is the only API
  – Disk Persistence
    • Disk is just a copy of memory
  – Complete replacement of existing databases

Page 7: Fast Data: In-Memory Data Grids

• In-Memory Data Grids
  – Apache Ignite – In-Memory Data Fabric
  – Apache Geode (incubating)
  – Hazelcast
• Main Features
  – High throughput
  – Low latencies
  – Key-value store
  – Transactions
  – Extensive data querying capability
  – Disk persistence
    • Read & write-through to databases
    • Keep your existing database
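
The "read & write-through to databases" point is the main way a data grid differs from an in-memory database: the existing database stays the system of record and the grid sits in front of it. The sketch below illustrates that idea in plain Python. It is not Ignite code; the class name, the dict standing in for the database, and the account keys are all assumptions made for illustration.

```python
class ReadWriteThroughCache:
    """Toy key-value cache in front of an existing store (not an Ignite API)."""

    def __init__(self, backing_store):
        self._memory = {}             # in-memory copy of entries seen so far
        self._store = backing_store   # stand-in for the existing database
        self.store_reads = 0          # how many times a read reached the store

    def get(self, key):
        # Read-through: on a miss, load from the store and keep a copy.
        if key not in self._memory:
            self.store_reads += 1
            value = self._store.get(key)
            if value is None:
                return None
            self._memory[key] = value
        return self._memory[key]

    def put(self, key, value):
        # Write-through: update the store first, then the in-memory copy.
        self._store[key] = value
        self._memory[key] = value


database = {"acct-1": 100}            # hypothetical pre-existing data
cache = ReadWriteThroughCache(database)

first = cache.get("acct-1")           # miss: loaded from the database
second = cache.get("acct-1")          # hit: served from memory
cache.put("acct-2", 250)              # lands in memory and in the database
```

In this toy version only the first read of a key reaches the database, and every write is visible to anything else that still reads the database directly.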

Page 8: Big Data Ecosystem

• Apache Hadoop & Apache Spark
  – Big Family of Products
  – Batch Processing
  – In-Memory Processing (Spark)
• Main Features
  – Disk-based storage
  – Interactive Analytics
  – No Transactions
  – Read-Only Data Sets
  – Strong Querying Capabilities
  – Relatively Low Latencies
    • Good enough for human eye

Page 9: How To Bridge The Gap?

• Big Data
  + Add Shared Memory Store
  + Add Transactions
• Fast Data
  + Add Disk-First Data Sets
  + Add Disk-First Processing

Page 10: What is a Converged Data Platform?

• Fast Data + Big Data
• Distributed and Scalable
• Real Time Data-To-Action
• Hybrid Transactional and Analytical Processing (HTAP)
  – Fast Data in Memory
  – Big Data on Disk
  – Combine RAM, NAND, HDD
  – No ETL
  – Query historical and analytical data
  – Transactions on historical and analytical data

Page 11: Apache Ignite™ In-Memory Data Fabric: Strategic Approach to IMC

• Supports Applications of various types and languages
• Open Source – Apache 2.0
• Simple Java APIs
• 1 JAR Dependency
• High Performance & Scale
• Automatic Fault Tolerance
• Management/Monitoring
• Runs on Commodity Hardware
• Supports existing & new data sources
• No need to rip & replace

Page 12: Apache Ignite In-Memory Data Fabric

Page 13: Use Case: Largest bank in Russia and Eastern Europe, and the third largest in Europe

© 2014 GridGain Systems, Inc.

• Sberbank Requirements
  – Migrate to data grid architecture
  – Minimize dependency on Oracle
  – Move to open source
• Why Apache Ignite
  – More than a Data Grid
  – Best performance
    • 10+ competitors evaluated
  – Demonstrated best
    • Fault tolerance & scalability
    • ANSI-99 SQL Support
    • Transactional consistency
• Jointly Developing
  • Disk-Only Data Sets
  • Query Disk & Memory Together

[Diagram: 130 Million Customers; Deposit, Withdrawal, Statement; Disk Store (x3); 1000+ Servers; GridGain Security]

Page 14: Apache Ignite Data Grid

• Based on JCache (JSR 107)
  – In-Memory Key-Value Store
  – Basic Cache Operations
  – ConcurrentMap APIs
  – Collocated Processing (EntryProcessor)
  – Events and Metrics
  – Pluggable Persistence
• Ignite Data Grid
  – ACID Transactions
  – SQL Queries (ANSI 99)
  – In-Memory Indexes
  – On-Heap & Off-Heap Memory
  – Automatic RDBMS Integration
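
"Collocated Processing (EntryProcessor)" refers to the JCache pattern of sending a small piece of logic to the node that holds an entry, instead of fetching the value, changing it on the client, and writing it back. In JCache this is done through `Cache.invoke` with an `EntryProcessor`. The Python sketch below only mimics the shape of that pattern. `GridNode`, `deposit`, and the account keys are hypothetical names, and a single node stands in for the grid.

```python
class GridNode:
    """Stand-in for one server that owns a set of cache entries."""

    def __init__(self):
        self.entries = {}

    def invoke(self, key, processor, *args):
        # The processor runs next to the data: it sees the current value
        # (None if the key is absent), and only its result is returned.
        new_value, result = processor(self.entries.get(key), *args)
        self.entries[key] = new_value
        return result


def deposit(balance, amount):
    """Hypothetical processor: add to a balance, return the new balance."""
    updated = (balance or 0) + amount
    return updated, updated


node = GridNode()
node.entries["acct-1"] = 100
after_first = node.invoke("acct-1", deposit, 50)
after_second = node.invoke("acct-2", deposit, 20)   # absent key starts from 0
```

The caller passes only the amount and receives only the result; the stored value never leaves the node that owns it.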

Page 15: Data Grid: Distributed Caching

[Diagram panels: Partitioned Cache, Replicated Cache]
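
The two cache modes named on this slide differ in where each entry is stored. A rough way to picture it, assuming three nodes modeled as plain dicts and a CRC32-based placement rule chosen only for this illustration (Ignite's own placement function and its backup copies are not modeled here):

```python
import zlib


def owner_index(key, node_count):
    # Hypothetical placement rule: a stable hash of the key picks one node.
    return zlib.crc32(key.encode("utf-8")) % node_count


def put_partitioned(nodes, key, value):
    # Partitioned: each entry is stored on exactly one owning node.
    nodes[owner_index(key, len(nodes))][key] = value


def put_replicated(nodes, key, value):
    # Replicated: every node stores a full copy of every entry.
    for node in nodes:
        node[key] = value


partitioned = [{}, {}, {}]
replicated = [{}, {}, {}]
for i in range(9):
    put_partitioned(partitioned, f"k{i}", i)
    put_replicated(replicated, f"k{i}", i)

partitioned_copies = sum(len(node) for node in partitioned)   # one copy per key
replicated_copies = sum(len(node) for node in replicated)     # one copy per key per node
```

With nine keys and three nodes, the partitioned layout stores nine entries in total, so capacity grows as nodes are added. The replicated layout stores twenty-seven, so any node can serve any read locally at the cost of more memory and wider writes.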

Page 16: Data Grid: Ad-Hoc SQL (ANSI 99)

• ANSI-99 SQL
• Always Consistent
• Fault Tolerant
• In-Memory Indexes (On-Heap and Off-Heap)
• Automatic Group By, Aggregations, Sorting
• Cross-Cache Joins, Unions, etc.
• Ad-Hoc SQL Support

Page 17: SQL Cross-Cache GROUP BY Example
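
The code shown on this slide did not survive in the transcript. As a stand-in, the sketch below runs a join with GROUP BY over two tables using Python's built-in `sqlite3` module, with each table playing the role of one cache. The table names, columns, and rows are invented for illustration, and the query is generic SQL rather than the slide's original Ignite query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organization (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person (id INTEGER PRIMARY KEY, org_id INTEGER, salary REAL);
""")
conn.executemany(
    "INSERT INTO organization VALUES (?, ?)",
    [(1, "Retail"), (2, "Treasury")],
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, 1, 1000.0), (2, 1, 3000.0), (3, 2, 5000.0)],
)

# Join the two "caches" and aggregate per organization.
rows = conn.execute("""
    SELECT o.name, COUNT(*) AS headcount, AVG(p.salary) AS avg_salary
    FROM person AS p
    JOIN organization AS o ON p.org_id = o.id
    GROUP BY o.name
    ORDER BY o.name
""").fetchall()
conn.close()
```

Here `rows` holds one tuple per organization with its head count and average salary. In a grid, the same kind of query would be answered from data spread across nodes rather than from one local database.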

Page 18: Share RDDs Across Spark Jobs

• IgniteRDD Deployment Modes
  – Share RDD across tasks on the host
  – Share RDD across tasks in the application
  – Share RDD globally
  – Embedded vs External Deployments
• Faster SQL
  – In-Memory Indexes
  – SQL on top of Shared RDD

Page 19: Ignite In-Memory File System

• Ignite In-Memory File System (IGFS)
  – Hadoop-compliant
  – Easy to Install
  – On-Heap and Off-Heap
  – Caching Layer for HDFS
  – Write-through and Read-through HDFS
  – Performance Boost

Page 20: Ignite In-Memory Map Reduce

• In-Memory Native Performance
• Zero Code Change
• Use existing MR code
• Use existing Hive queries
• No Name Node
• No Network Noise
• In-Process Data Colocation
• Eager Push Scheduling

Page 21: Proposed Apache Ignite 2.0 Roadmap: Converged Data Platform

• More SQL
  – Non-collocated Joins
  – 100% Data Modification Language (DML)
  – 100% Data Definition Language (DDL)
• More Disk
  – ATMM - Advanced Tiered-Memory Model:
    • Disk-first data sets
    • Any DRAM/NAND/HDD mix
  – Seamless querying across ATMM
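
The slide names the tiered-memory model without describing how it works, so the following is only a guess at the general idea behind "disk-first data sets" and "seamless querying": the full data set lives in a slower tier, a bounded faster tier holds recently used entries, and queries see everything regardless of tier. `TieredStore`, its LRU eviction, and the dict standing in for disk are all assumptions, not the proposed Ignite design.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier store: complete 'disk' tier plus a bounded memory tier."""

    def __init__(self, memory_capacity):
        self._disk = {}                  # stand-in for the NAND/HDD tier
        self._memory = OrderedDict()     # bounded DRAM tier in LRU order
        self._capacity = memory_capacity

    def put(self, key, value):
        self._disk[key] = value          # disk-first: disk holds everything
        self._promote(key, value)

    def get(self, key):
        if key in self._memory:
            self._memory.move_to_end(key)
            return self._memory[key]
        value = self._disk[key]          # raises KeyError if the key is unknown
        self._promote(key, value)
        return value

    def _promote(self, key, value):
        # Keep the entry in memory, evicting the least recently used if full.
        self._memory[key] = value
        self._memory.move_to_end(key)
        while len(self._memory) > self._capacity:
            self._memory.popitem(last=False)

    def scan(self, predicate):
        # A query covers every entry, whichever tier currently holds it.
        return sorted(k for k, v in self._disk.items() if predicate(v))

    def in_memory(self, key):
        return key in self._memory


store = TieredStore(memory_capacity=2)
for key, value in [("a", 1), ("b", 2), ("c", 3), ("d", 4)]:
    store.put(key, value)

cold_before = store.in_memory("a")           # "a" was evicted to disk only
even_keys = store.scan(lambda v: v % 2 == 0)
value_a = store.get("a")                     # read pulls "a" back into memory
```

The memory tier here holds two entries while the scan still returns matches from all four, which is the behavior the "seamless querying across ATMM" bullet appears to describe.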

Page 22: ANY QUESTIONS?

Thank you for joining us. Follow the conversation.

http://www.ignite.apache.org

@apacheignite