doing big data for real with docker · 2017-12-14 · the mesosphere dcos is a distributed...

DOING BIG DATA FOR REAL WITH DOCKER MESOSPHERE DCOS Elizabeth Lingg [email protected]


Post on 22-May-2020


TRANSCRIPT

Page 1: DOING BIG DATA FOR REAL WITH DOCKER · 2017-12-14 · The Mesosphere DCOS is a distributed operating system built around Apache Mesos. This utility provides tools for easy management

DOING BIG DATA FOR REAL WITH DOCKER

MESOSPHERE DCOS

Elizabeth Lingg

[email protected]


AGENDA
1. Intro
2. Mesosphere, Docker, and DCOS Overview
3. Big Data Container Orchestration using DCOS and Docker
4. Demo
5. Q & A


INTRO
Engineering Manager @ Mesosphere
M.S. Computer Science with a Specialization in Artificial Intelligence from Stanford
B.S. Computer Science with a Minor in Math, B.S. Policy and Management from Carnegie Mellon
Experience in AI, Big Data, and Systems
Enjoys applying Distributed Systems to Manage and Reason Over Large Amounts of Data


MESOS
Provides primitives to author datacenter-native apps.

PRIMITIVES

Resources (cpu, mem, disk, ports)
Asset fetching
Task state tracking
API for the datacenter


STATUS QUO IS STATIC PARTITIONING AND USE OF VIRTUAL MACHINES


MESOS LETS US TREAT A CLUSTER OF NODES...


AS ONE BIG COMPUTER


Not as individual machines

Not as VMs


BUT AS COMPUTATIONAL RESOURCES LIKE CORES, MEMORY, DISKS, ETC.
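
The idea of treating a cluster as a pool of cores and memory rather than as individual machines can be illustrated with a small sketch. This is not how the Mesos allocator works internally; it is a simple first-fit placement, and the node names, task names, and sizes are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpus: float   # free cores on this node
    mem: int      # free memory, assumed to be in MB


@dataclass
class Task:
    name: str
    cpus: float
    mem: int


def place(tasks, nodes):
    """First-fit placement: give each task to the first node that still
    has enough free cpu and memory, and deduct what the task uses."""
    placement = {}
    for task in tasks:
        for node in nodes:
            if node.cpus >= task.cpus and node.mem >= task.mem:
                node.cpus -= task.cpus
                node.mem -= task.mem
                placement[task.name] = node.name
                break
        else:
            # No node had room for this task.
            placement[task.name] = None
    return placement


# Hypothetical cluster and workload.
nodes = [Node("node-1", 4.0, 8192), Node("node-2", 2.0, 4096)]
tasks = [Task("cassandra", 3.0, 6144), Task("web", 1.0, 512), Task("spark", 2.0, 4096)]
result = place(tasks, nodes)
```

The caller only states what each task needs; which machine it lands on is decided by the placement step, which is the contrast with static partitioning drawn on the earlier slides.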


WE LOVE CONTAINERS


MOST MODERN APPLICATIONS ARE A WEB OF CONTAINERS


A CONTAINER ORCHESTRATION PLATFORM


Containerization in Mesos, a brief history


MESOSPHERE DCOS
Software to provide a complete OS: init, cron, apt-get, discovery, routing
Beautiful web UI and CLI
Support
Ecosystem of DCOS Services
Mesos Master and Mesos Workers Running in Docker Containers


DCOS UI


DCOS CLI
$ dcos

Command line utility for the Mesosphere Datacenter Operating
System (DCOS). The Mesosphere DCOS is a distributed operating
system built around Apache Mesos. This utility provides tools
for easy management of a DCOS installation.

Available DCOS commands:

    config      Get and set DCOS CLI configuration properties
    help        Display command line usage information
    marathon    Deploy and manage applications on the DCOS
    node        Manage DCOS nodes
    package     Install and manage DCOS software packages
    service     Manage DCOS services
    task        Manage DCOS tasks


BIG DATA DISTRIBUTED APPLICATIONS ON DCOS

Mesos Master and Mesos Workers Running in Docker Containers
Distributed Applications Running in Containers on the Mesos Workers
Container Orchestration done by Apache Mesos
Resource Allocation and Scaling Managed by Apache Mesos


BIG DATA DISTRIBUTED APPLICATIONS ON DCOS

Popular Distributed Apps easily deployed on a single DCOS Cluster
Kafka, Cassandra, HDFS, Spark, and other Big Data Services
Health checks and failure recovery are automated


APPLICATION NETWORKING
Use the CLI or REST APIs to interact with the services
Mesos DNS Resolution
Docker Networking is mainly done through host mode networking, which works seamlessly


DATA SECURITY
Services storing secure data run on private worker nodes in the cluster
Private nodes can only be accessed through VPN
As needed, services choose what is exposed through a proxy running on a public node
Distributed Applications can authenticate with the Master using Framework Authentication (Kerberos Option)


EXAMPLE: SIMPLE DOCKER APP ON DCOS

{
  "id": "/mesosphere/cd-demo-app",
  "instances": 1,
  "cpus": 1,
  "mem": 512,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "mesosphere/cd-demo-app:$tag",
      "network": "BRIDGE",
      "portMappings": [
        {
          "servicePort": 28080,
          "containerPort": 80,
          "hostPort": 0,
          "protocol": "tcp"
        }
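
The app definition on this slide is cut off and contains a `$tag` placeholder in the image name. As a rough sketch of how such a template might be filled in before being submitted, the snippet below substitutes the tag and parses the result. The closing brackets after the port mapping are an assumption (the slide ends mid-definition and may have had further fields), and `v1` is a made-up tag.

```python
import json
from string import Template

# Fields are copied from the slide; everything after the port mapping
# (the closing brackets) is assumed, since the slide is truncated.
APP_TEMPLATE = Template("""{
  "id": "/mesosphere/cd-demo-app",
  "instances": 1,
  "cpus": 1,
  "mem": 512,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "mesosphere/cd-demo-app:$tag",
      "network": "BRIDGE",
      "portMappings": [
        {"servicePort": 28080, "containerPort": 80, "hostPort": 0, "protocol": "tcp"}
      ]
    }
  }
}""")


def render_app(tag):
    """Fill in the image tag and return the definition as a dict.
    json.loads also confirms the rendered text is valid JSON."""
    return json.loads(APP_TEMPLATE.substitute(tag=tag))


app = render_app("v1")  # "v1" is a hypothetical image tag
print(app["container"]["docker"]["image"])
```

Parsing the rendered text catches a malformed definition before anything is sent to the cluster.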


EXAMPLE: CASSANDRA DCOS SERVICE

FEATURES

Managed node configuration
Health Monitoring
REST API
DNS Names for nodes
Multiple Rings in one cluster


INSTALL
$ dcos package install cassandra

CUSTOMIZABLE INSTALL OPTIONS
{
  "cassandra": {
    "cluster-name": "dev",
    "resources": {
      "cpus": 3.0,
      "mem": 6144,
      "disk": 30720
    }
  }
}

$ dcos package install cassandra --options=options.json
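
One way to produce the `options.json` file passed to `--options` is to build it from a dictionary, which keeps the values easy to vary per environment. The sketch below uses only the keys shown on the slide; writing to a temporary directory is just for the example, and the reading of `mem` and `disk` as megabytes is an assumption.

```python
import json
import os
import tempfile

# Keys and values are taken from the slide. mem and disk are assumed
# to be in MB (6144 MB = 6 GB, 30720 MB = 30 GB).
options = {
    "cassandra": {
        "cluster-name": "dev",
        "resources": {"cpus": 3.0, "mem": 6144, "disk": 30720},
    }
}


def write_options(path, opts):
    """Write the install options as indented JSON and return the path."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(opts, fh, indent=2)
    return path


# Write into a throwaway directory, then read the file back to check it.
with tempfile.TemporaryDirectory() as tmp:
    options_path = write_options(os.path.join(tmp, "options.json"), options)
    with open(options_path, encoding="utf-8") as fh:
        loaded = json.load(fh)
```

The resulting file would then be passed to `dcos package install cassandra --options=options.json` as on the slide.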


INSTALLING


HEALTHY


REST API

GET /node/all

GET /health/cluster/report

POST /node/{node}/replace

POST /cluster/repair/start

POST /scale/nodes?nodeCount={count}
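
The endpoints above could be wrapped in small helpers like the following sketch. The slide does not say where the API is served, so `BASE_URL` is a placeholder, and nothing is known here about request bodies or response formats; the helpers only build the requests and do not send them.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Placeholder address: the slide does not state the API's host or port.
BASE_URL = "http://cassandra-service.example:8080"


def list_nodes():
    return Request(f"{BASE_URL}/node/all", method="GET")


def health_report():
    return Request(f"{BASE_URL}/health/cluster/report", method="GET")


def replace_node(node):
    # quote() keeps unusual characters in the node name URL-safe.
    return Request(f"{BASE_URL}/node/{quote(node, safe='')}/replace", method="POST")


def start_repair():
    return Request(f"{BASE_URL}/cluster/repair/start", method="POST")


def scale_nodes(count):
    query = urlencode({"nodeCount": count})
    return Request(f"{BASE_URL}/scale/nodes?{query}", method="POST")


req = scale_nodes(5)
print(req.get_method(), req.full_url)
```

Passing a request to `urllib.request.urlopen` would send it; that step is left out because the real address and any authentication are not given.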


DEMO!


Q & A


THANKS! LET'S CHAT! WE'RE HIRING!

DCOS: mesosphere.com
Join: mesosphere.com/careers/