Michael Hausenblas: Scalable Time Series and Stream Processing for IoT Applications
Post on 17-Jan-2017
© 2016 Mesosphere, Inc. All Rights Reserved.
SCALABLE TIME SERIES AND STREAM PROCESSING FOR IOT APPLICATIONS
Michael Hausenblas, Developer & Cloud Advocate | 2016-01-16
MOTIVATION
AIRLINES
LOGISTICS
HEALTH CARE
TRADERS
FARMERS
CITIES
© 2014, Wired magazine
YOU
THE TOOLBOX
LET'S TALK ABOUT WORKLOADS* …
*) kudos to Timothy St. Clair, @timothysc
Workload types: batch (e.g. MapReduce), streaming, PaaS
MESSAGE QUEUES & ROUTERS
• Apache Kafka
• ØMQ, RabbitMQ, Disque (Redis-based), etc.
• fluentd, Logstash, Flume
• Akka streams
• cloud-only: AWS SQS, Google Cloud Pub/Sub
• see also queues.io
APACHE KAFKA
• High-throughput, distributed, persistent publish-subscribe messaging system
• Originates from LinkedIn
• Typically used as a buffer/decoupling layer in online stream processing
kafka.apache.org
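
The buffer/decoupling role can be illustrated with a minimal in-memory sketch of an append-only log with per-consumer offsets. This is not Kafka and does not use any Kafka client API; the class `TopicLog`, its methods, and the sensor readings are hypothetical names and values chosen for illustration.

```python
from collections import defaultdict


class TopicLog:
    """Toy append-only log; NOT Kafka's API, all names are hypothetical."""

    def __init__(self):
        self._messages = []               # retained messages, in arrival order
        self._offsets = defaultdict(int)  # next offset to read, per consumer

    def publish(self, message):
        """Append a message and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def poll(self, consumer, max_messages=10):
        """Return the next batch for this consumer and advance its offset."""
        start = self._offsets[consumer]
        batch = self._messages[start:start + max_messages]
        self._offsets[consumer] = start + len(batch)
        return batch


log = TopicLog()
for reading in (21.5, 21.7, 22.0):
    log.publish({"sensor": "s1", "temp": reading})

# Two independent consumers read the same retained messages at their own pace.
alerts = log.poll("alerting", max_messages=2)
archive = log.poll("archiver")
```

Because messages are retained rather than removed on read, a slow consumer does not hold up a fast one, and the producer never needs to know who is reading.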
FLUENTD
www.fluentd.org
STREAM PROCESSING PLATFORMS
• Apache Storm
• Apache Spark
• Apache Samza
• Apache Flink
• Concord
• cloud-only: AWS Kinesis, Google Cloud Dataflow
• see also my webinar on stream processing
APACHE STORM
• Distributed, fault-tolerant stream-processing platform
• Guaranteed message processing (replaying messages on failure)
• Concepts: tuples, streams, spouts, bolts, topologies
storm.apache.org
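
How spouts, bolts, and replay on failure fit together can be sketched in plain Python. This is a toy model, not Storm's API: the classes `ListSpout` and `FlakyDoubleBolt` and the function `run_topology` are hypothetical, and the "failure" is simulated.

```python
from collections import deque


class ListSpout:
    """Emits tuples from a list and re-queues any tuple that is failed."""

    def __init__(self, items):
        self._pending = deque(enumerate(items))  # (message id, tuple) pairs
        self._in_flight = {}                     # emitted but not yet acked

    def next_tuple(self):
        if not self._pending:
            return None
        msg_id, tup = self._pending.popleft()
        self._in_flight[msg_id] = tup
        return msg_id, tup

    def ack(self, msg_id):
        self._in_flight.pop(msg_id)

    def fail(self, msg_id):
        # Replay: put the failed tuple back at the end of the queue.
        self._pending.append((msg_id, self._in_flight.pop(msg_id)))


class FlakyDoubleBolt:
    """Doubles each value; fails the first time it sees the value 2."""

    def __init__(self):
        self._failed_once = False

    def execute(self, tup):
        if tup == 2 and not self._failed_once:
            self._failed_once = True
            raise RuntimeError("simulated worker failure")
        return tup * 2


def run_topology(spout, bolt):
    """Drive the spout -> bolt pipeline until every tuple has been acked."""
    results = []
    while (emitted := spout.next_tuple()) is not None:
        msg_id, tup = emitted
        try:
            results.append(bolt.execute(tup))
            spout.ack(msg_id)
        except RuntimeError:
            spout.fail(msg_id)
    return results


results = run_topology(ListSpout([1, 2, 3]), FlakyDoubleBolt())
```

Every input is eventually processed, but the replayed tuple arrives out of order, which is the usual trade-off of replay-based guarantees.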
APACHE SPARK
spark.apache.org
[Diagram: the Spark stack. Spark SQL, Spark Streaming, MLlib (machine learning), and GraphX (graph processing) sit on top of Spark core (RDD), which runs on Mesos, YARN, or standalone, and reads from a filesystem (local, HDFS, S3) or data store (HBase, Cassandra, Elasticsearch, etc.)]
TIME SERIES DATASTORES
• InfluxDB
• OpenTSDB
• KairosDB
• Prometheus
• see also iot-a.info
OPENTSDB
• Distributed time series database on top of HBase
• Store, index, query & plot metrics
• Extremely scalable
• Low-level monitoring
opentsdb.net
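
As a rough illustration of what storing a metric looks like, the sketch below builds a data point in OpenTSDB's telnet-style `put` format (`put <metric> <timestamp> <value> <tagk=tagv> ...`). The helper `format_put` and the metric and tag names are hypothetical; the code only formats the line and does not talk to a server.

```python
def format_put(metric, timestamp, value, tags):
    """Build a 'put' line: put <metric> <timestamp> <value> <tagk=tagv> ..."""
    if not tags:
        # OpenTSDB expects at least one tag pair per data point.
        raise ValueError("at least one tag is required")
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {int(timestamp)} {value} {tag_str}"


# Hypothetical IoT reading: metric name, Unix timestamp in seconds, value, tags.
line = format_put("sensor.temperature", 1452902400, 21.5,
                  {"device": "s1", "site": "berlin"})
```

Tags such as `device` and `site` are what later allow a query to slice one metric by sensor or location.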
INFLUXDB
• Time series database written in Go, with no external dependencies
• SQL-like query language (incl. regex, fan out)
• Runs as a single node or in a Raft-based distributed mode
influxdb.com
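
A minimal sketch of the write and query side, assuming InfluxDB's line protocol (`measurement,tag=value field=value timestamp`) and InfluxQL. The helper `to_line_protocol`, the measurement `temperature`, and the tag `device` are hypothetical; the code only builds strings, handles float fields only, and does no escaping.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as line protocol.

    Assumes names and values contain no spaces or commas (no escaping is
    done) and that all field values are floats.
    """
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


point = to_line_protocol("temperature", {"device": "s1"},
                         {"value": 21.5}, 1452902400000000000)

# SQL-like queries over the stored points; the second one selects
# measurements by regular expression instead of by exact name.
QUERY = ('SELECT mean("value") FROM "temperature" '
         'WHERE time > now() - 1h GROUP BY time(10m)')
REGEX_QUERY = ('SELECT mean("value") FROM /^temp.*/ '
               'WHERE time > now() - 1h GROUP BY time(10m)')
```

Query syntax has changed between InfluxDB releases, so the two query strings should be checked against the documentation for the version in use.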
CHALLENGES
• Setup and operation of components
• Elasticity: static vs. dynamic partitioning
• Efficient usage of resources (TCO)
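
The difference between static and dynamic partitioning can be made concrete with a small calculation. The demand figures, cluster sizes, and function names below are invented for illustration and are not measurements from the talk.

```python
def static_utilization(demands, slice_size):
    """Each workload owns `slice_size` nodes and cannot borrow from others."""
    used = sum(min(d, slice_size) for d in demands)
    return used / (slice_size * len(demands))


def dynamic_utilization(demands, total_nodes):
    """All workloads draw from one shared pool of `total_nodes` nodes."""
    return min(sum(demands), total_nodes) / total_nodes


# Hypothetical node demand of three workloads at one point in time.
demands = [2, 9, 1]
static = static_utilization(demands, slice_size=4)      # three 4-node clusters
dynamic = dynamic_utilization(demands, total_nodes=12)  # one 12-node cluster
```

With the same 12 nodes, the statically partitioned setup leaves capacity idle in two slices while the third is starved; the shared pool can serve all of the demand.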
MEET THE DATACENTER OPERATING SYSTEM (DCOS)
LOCAL OS VS. DISTRIBUTED OS
http://bitly.com/os-vs-dcos
DCOS IS A DISTRIBUTED OPERATING SYSTEM
• local OS per node (+container enabled)
• scheduling (long-lived, batch)
• networking
• service discovery
• stateful services
• security
• monitoring, logging, debugging
BENEFITS
DCOS
• Run stateless services such as web servers or app servers and Big Data services like Kafka, Spark, or Cassandra together on one cluster
• Dynamic partitioning of your cluster, depending on your business requirements
• Increased utilization (10% → 80%++)
AN EXAMPLE
https://mesosphere.com/blog/2015/11/18/dcos-time-series-demo
https://github.com/mesosphere/time-series-demo
Q & A
• @mhausenblas
• mhausenblas.info
• @mesosphere
• mesosphere.io/product
• mesosphere.com/infinity