Java Day: Big Data Analysis in Java World
![Page 1: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/1.jpg)
Big Data Analysis in Java World
by Serhiy Masyutin
![Page 2: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/2.jpg)
Agenda
- The Big Data Problem
- Map-Reduce
- MPP Analytical Database
- In-Memory Data Fabric
- Lambda Architecture
- Q&A
![Page 3: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/3.jpg)
The Big Data Problem
- Doug Laney
Big Data
![Page 4: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/4.jpg)
The Big Data Problem
| | Map-Reduce | MPP AD | IMDF |
| --- | --- | --- | --- |
| When do I need it? | In an hour | In a minute | Now |
| What do I need to do with it? | Exploratory analytics | Structured analytics | Singular event processing (some analytics), Transactions |
| How will I query and search? | Unstructured | Ad hoc SQL | Structured |
| How do I need to store it? | I do, but not required to | I must and I am required to | Temporarily |
| Where is it coming from? | File/ETL | File/ETL | Event/Stream/File/ETL |

http://blog.pivotal.io/pivotal/products/exploring-big-data-solutions-when-to-use-hadoop-vs-in-memory-vs-mpp
![Page 5: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/5.jpg)
The Big Data Problem
Map-Reduce, MPP AD, IMDF

Data types shown: Transactions, Customer records, Geo-spatial, Sensors, Social Media, XML, JSON, Raw Logs, Text, Image, Video

Axis label: more processing

http://blog.pivotal.io/big-data-pivotal/products/exploratory-data-science-when-to-use-an-mpp-database-sql-on-hadoop-or-map-reduce
![Page 6: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/6.jpg)
The Big Data Problem
Data is not Information
- Clifford Stoll
![Page 7: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/7.jpg)
Map-Reduce
http://jeremykun.files.wordpress.com/2014/10/mapreduceimage.gif?w=1800
CPUs aren’t getting faster
![Page 8: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/8.jpg)
Map-Reduce
https://anonymousbi.files.wordpress.com/2012/11/hadoopdiagram.png
![Page 9: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/9.jpg)
Map-Reduce
http://hadoop.apache.org/docs/r1.2.1/images/hdfsarchitecture.gif
![Page 10: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/10.jpg)
CAP Theorem
http://hadoop.apache.org/docs/r1.2.1/images/hdfsarchitecture.gif
Availability
Partition Tolerance
![Page 11: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/11.jpg)
Map-Reduce
http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v1.0
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
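The driver above references `TokenizerMapper` and `IntSumReducer` without showing them. As a rough illustration of what the map, shuffle, and reduce phases do for word count, here is a plain-Java sketch with no Hadoop dependency. The class `MapReduceSketch` and its method names are hypothetical, everything runs in one JVM, and this is not how Hadoop itself distributes the work.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process illustration of the Map-Reduce phases.
public class MapReduceSketch {

    // Map phase: emit a (word, 1) pair for every token in one input line.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(new SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key.
    public static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    public static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs the three phases in sequence over all input lines.
    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        shuffle(pairs).forEach((word, ones) -> counts.put(word, reduce(ones)));
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("big data in java", "data is not information")));
    }
}
```

In a real cluster the map calls run in parallel on the nodes that hold the input blocks, and the shuffle moves data over the network, which is where most of the cost tends to be.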
![Page 12: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/12.jpg)
Map-Reduce
| Volume | Variety | Velocity |
| --- | --- | --- |
| Medium-Large | Unstructured data | Batch processing |

Availability, Partition Tolerance
![Page 13: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/13.jpg)
MPP Analytical Database
http://www.ndm.net/datawarehouse/images/stories/greenplum/gp-dia-3-0.png
![Page 14: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/14.jpg)
MPP Analytical Database
http://my.vertica.com/docs/7.1.x/HTML/Content/Resources/Images/K-SafetyServerDiagram.png
![Page 15: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/15.jpg)
MPP Analytical Database
http://my.vertica.com/docs/7.1.x/HTML/Content/Resources/Images/K-SafetyServerDiagramOneNodeDown.png
![Page 16: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/16.jpg)
MPP Analytical Database
http://my.vertica.com/docs/7.1.x/HTML/Content/Resources/Images/K-SafetyServerDiagramTwoNodesDown.png
![Page 17: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/17.jpg)
MPP Analytical Database
http://my.vertica.com/docs/7.1.x/HTML/Content/Resources/Images/DataK-Safety-K2Nodes2And3Failed.png
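The K-safety diagrams on the preceding slides can be approximated with a small model. The sketch below assumes a simplified placement rule (each data segment is copied to its home node and the next K nodes in a ring) and a majority rule (more than half of the nodes must be up). Both rules, along with the class and method names, are assumptions made for illustration, not Vertica's actual implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of K-safety in a ring of nodes numbered 0..nodeCount-1.
public class KSafetySketch {

    // Returns true if a majority of nodes is up and every segment
    // still has at least one copy on a live node.
    public static boolean isAvailable(int nodeCount, int k, Set<Integer> failedNodes) {
        int liveNodes = nodeCount - failedNodes.size();
        if (liveNodes * 2 <= nodeCount) {
            return false; // assumed majority rule: more than half must be up
        }
        for (int segment = 0; segment < nodeCount; segment++) {
            boolean hasLiveCopy = false;
            // Assumed placement: copies on nodes segment, segment+1, ..., segment+k (mod nodeCount).
            for (int offset = 0; offset <= k; offset++) {
                int holder = (segment + offset) % nodeCount;
                if (!failedNodes.contains(holder)) {
                    hasLiveCopy = true;
                    break;
                }
            }
            if (!hasLiveCopy) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<Integer> adjacentFailures = new HashSet<>(Arrays.asList(1, 2));
        System.out.println("K=1: " + isAvailable(5, 1, adjacentFailures));
        System.out.println("K=2: " + isAvailable(5, 2, adjacentFailures));
    }
}
```

Under this model a five-node cluster with K=1 survives any single failure but not two adjacent failures, while K=2 tolerates two adjacent failures at the cost of storing three copies of each segment.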
![Page 18: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/18.jpg)
MPP Analytical Database

    public static void main(String[] args) throws ClassNotFoundException {
        Class.forName("com.vertica.jdbc.Driver");
        ...
        String connectionUrl = "jdbc:vertica://VerticaHost:5433/ExampleDB";
        String sql = "SELECT id, username FROM users WHERE id = ?";
        // try-with-resources closes the connection and the statement automatically
        try (Connection con = DriverManager.getConnection(connectionUrl);
             PreparedStatement ps = con.prepareStatement(sql)) {
            ...
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    ...
                }
            }
        } catch (SQLException e) {
            ...
        }
    }
![Page 19: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/19.jpg)
MPP Analytical Database
| Volume | Variety | Velocity |
| --- | --- | --- |
| Small-Medium-Large | Structured data | Interactive |

ASTER DATABASE
Matrix

Availability, Partition Tolerance
![Page 20: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/20.jpg)
In-Memory Data Fabric
https://ignite.incubator.apache.org/images/in_memory_data.png
![Page 21: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/21.jpg)
In-Memory Data Fabric
https://ignite.incubator.apache.org/images/in_memory_data.png
![Page 22: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/22.jpg)
In-Memory Data Fabric
https://ignite.incubator.apache.org/images/in_memory_compute.png
![Page 23: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/23.jpg)
In-Memory Data Fabric
    public static void main(String[] args) {
        HazelcastInstance instance = Hazelcast.newHazelcastInstance();
        Map<String, User> loggedOnUsers = instance.getMap("Users");
        ...
        loggedOnUsers.put(username, user);
        ...
        if (loggedOnUsers.containsKey(username)) ...
        ...
        loggedOnUsers.remove(username);
        ...
        for (User u : loggedOnUsers.values()) ...
    }
![Page 24: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/24.jpg)
In-Memory Data Fabric
| Volume | Variety | Velocity |
| --- | --- | --- |
| Small-Medium | Structured data | (Near) Real-Time |

Availability, Partition Tolerance
![Page 25: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/25.jpg)
Lambda Architecture
http://lambda-architecture.net
![Page 26: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/26.jpg)
Lambda Architecture
http://lambda-architecture.net
SQL
Streaming
ElephantDB
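As a rough sketch of how the layers in the diagram fit together, the following plain-Java example keeps an append-only master dataset, recomputes a batch view from it, maintains a realtime view for events that arrived since the last batch run, and answers queries by merging the two. The class `LambdaSketch`, its methods, and the event-count use case are hypothetical. A real deployment would put a batch framework, a stream processor, and a serving store behind these roles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-process illustration of the Lambda Architecture layers.
public class LambdaSketch {
    // Master dataset: every raw event is appended and never updated in place.
    private final List<String> masterDataset = new ArrayList<>();
    // Batch view: counts precomputed from the whole master dataset.
    private Map<String, Long> batchView = new HashMap<>();
    // Realtime view: counts for events seen since the last batch run.
    private final Map<String, Long> realtimeView = new HashMap<>();

    // New events go to both the master dataset and the speed layer.
    public void onEvent(String key) {
        masterDataset.add(key);
        realtimeView.merge(key, 1L, Long::sum);
    }

    // Batch layer: recompute the view from scratch, then discard the
    // realtime increments that the new batch view already covers.
    public void runBatch() {
        Map<String, Long> fresh = new HashMap<>();
        for (String key : masterDataset) {
            fresh.merge(key, 1L, Long::sum);
        }
        batchView = fresh;
        realtimeView.clear();
    }

    // Query: merge the batch view with the realtime view.
    public long query(String key) {
        return batchView.getOrDefault(key, 0L) + realtimeView.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        LambdaSketch views = new LambdaSketch();
        views.onEvent("home");
        views.runBatch();
        views.onEvent("home");
        System.out.println(views.query("home"));
    }
}
```

Recomputing the batch view from the full dataset is slow but simple to reason about, and the realtime view only has to cover the gap until the next batch run completes.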
![Page 27: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/27.jpg)
Lambda Architecture
| Volume | Variety | Velocity |
| --- | --- | --- |
| Small-Medium-Large | Unstructured-Structured data | (Near) Real-Time |
![Page 28: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/28.jpg)
Q&A
![Page 29: Java day Big Data Analysis In Java World](https://reader030.vdocument.in/reader030/viewer/2022032620/55ce7b15bb61eb24498b4667/html5/thumbnails/29.jpg)
Thanks folks!