
Hadoop 2 cluster with Oracle Solaris Zones, ZFS and unified archives

Orgad Kimchi - Principal Software Engineer

September 29, 2014

Copyright © 2014, Oracle and/or its affiliates. All rights reserved.


Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Agenda

1. Introduction to Hadoop

2. The benefits of using Oracle Solaris technologies for a Hadoop cluster


Introduction to Hadoop


What is Hadoop?

• Grew out of designs Google published starting in 2003
  – Generation of search indexes and web scores
• Top-level Apache project, consisting of two key services:
  1. Hadoop Distributed File System (HDFS): highly scalable, fault-tolerant, distributed
  2. MapReduce API (Java), which can also be scripted in other languages
• Hadoop brings the ability to cheaply process large amounts of data, regardless of its structure.
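To make the MapReduce programming model concrete, here is a minimal word-count sketch in plain Python. It does not use Hadoop itself. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, and the whole job runs in a single process, whereas a real cluster distributes the map and reduce tasks across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group emitted values by key (done by the framework in a real job)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data on Solaris", "big data on ZFS"]))
```

The input lines are unstructured text, and the only structure the job imposes is the key/value pairs it emits, which is what lets the model process data regardless of its structure.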


Hadoop Cluster: Best on Oracle Solaris

• Oracle Solaris Zones provide fast provisioning of new cluster members
• Oracle Solaris ZFS is optimized for secure, fast big data
• Oracle Solaris Unified Archives enable “cloud in a box” deployments
• Oracle Solaris SMF and DTrace provide the best runtime management


Oracle Solaris Zones Benefits

• Fast provisioning of new cluster members using the Solaris Zones cloning feature
• Very high network throughput between the zones for data node replication


The benefits of using Oracle Solaris ZFS for a Hadoop cluster

• Immense data capacity: a 128-bit file system, well suited to big data sets
• Optimized disk I/O utilization for better I/O performance with ZFS built-in compression
• Secure data at rest using ZFS encryption
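The I/O argument for compression can be illustrated with a small sketch. ZFS compresses transparently inside the file system. The Python below only uses the standard `zlib` module (DEFLATE) as a stand-in to show how far repetitive, log-like data can shrink before it reaches disk. The sample log line and the helper name `compressed_ratio` are assumptions made for illustration, and real ratios depend on the data and on the compression algorithm configured for the dataset.

```python
import zlib


def compressed_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, level)) / len(data)


# Hypothetical, highly repetitive log data of the kind a cluster often stores.
log_data = b"2014-09-29 12:00:00 INFO datanode heartbeat ok\n" * 1000

if __name__ == "__main__":
    ratio = compressed_ratio(log_data)
    print(f"original bytes:   {len(log_data)}")
    print(f"compressed ratio: {ratio:.4f}")
```

Fewer bytes written and read means fewer disk operations for the same logical data, at the cost of some CPU time.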


The benefits of using Oracle Solaris Unified Archives

• You can use Unified Archives to create a "cloud in a box" and deploy a bare-metal system in minutes vs. days.


Solaris Benefits

• Multithread awareness - Oracle Solaris understands the correlation between cores and threads, and it provides a fast and efficient thread implementation.

• DTrace - a comprehensive, advanced tracing tool for troubleshooting systemic problems in real time.

• SMF - lets you define dependencies between Hadoop services (e.g. starting the MapReduce daemons after the HDFS daemons).
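The idea of dependency-ordered startup can be sketched independently of SMF. The Python below is not SMF configuration and does not call any Solaris tool. It only models the ordering problem with the standard-library `graphlib` module (Python 3.9+). The service names are hypothetical placeholders, not real SMF service identifiers.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical service names; each key maps to the services it depends on.
DEPENDENCIES = {
    "hdfs-namenode": set(),
    "hdfs-datanode": {"hdfs-namenode"},
    "mapreduce-daemons": {"hdfs-namenode", "hdfs-datanode"},
}


def start_order(dependencies):
    """Return an order in which every service starts after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for name in start_order(DEPENDENCIES):
        print("start", name)
```

A service manager that knows these relationships can start the HDFS daemons first and hold the MapReduce daemons back until their dependencies are online, rather than relying on a hand-ordered startup script.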


Lab Overview

• This hands-on lab presents exercises that demonstrate how to set up an Apache Hadoop cluster using Oracle Solaris 11 technologies such as Oracle Solaris Zones, ZFS, network virtualization, and Unified Archives.
• Key topics include the Hadoop Distributed File System (HDFS) and the Hadoop MapReduce programming model.


Come and See Us

• Set Up a Hadoop 2 Cluster with Oracle Solaris Zones, Oracle Solaris ZFS, and Unified Archive [HOL2086]

• Wednesday, Oct 1, 1:15 PM - 3:15 PM - Hotel Nikko - Mendocino I/II