Next-Generation Scala Architectures Ryan Knight Enterprise Solution Engineer Twitter: @knight_cloud

Post on 16-May-2020

TRANSCRIPT

Page 1: Enterprise Solution Engineer Twitter: @knight cloud · Akka Clustering • Peer-to-peer based cluster membership service ... • Router to Balance Work • Parents supervise children

Next-Generation Scala Architectures

Ryan Knight

Enterprise Solution Engineer

Twitter: @knight_cloud

Page 2:

My Experience

• Sun Microsystems

• Oracle

• Family Search / LDS Church

• Riot Games

• Adobe / T-Mobile

• Deloitte / State of Louisiana

• Typesafe

• Tomax / Demandware

• DataStax - Enterprise Technical Sales

Page 3:

Which is Faster?

• Fighter Jet

• Mantis Shrimp

• Bullet

• Mushroom Spores

Page 4:

Sphagnum Moss!

• Launches spores at 89 MPH in less than a thousandth of a second

• Spores travel over 80 times the height of the launching capsule

Page 5:

"You never change things by fighting the existing reality. To change something, build a new model that makes the existing model obsolete."

Page 6:

Agenda

• Architecting the Application Tier

• Architecting the Data Tier

• Fundamental Architectural Principles

Page 7:

Architecting the Application Tier

with Scala

Page 8:

Evaluation Criteria

Professor Zapinsky proved that the squid is more intelligent than the domestic cat when challenged under similar conditions.

Page 9:

Flaw of Performance Benchmarking

• Unrealistic Load Scenario

• Unrealistic Application Scenario

• Performance is only one criterion

• Framework optimized for benchmark

Page 10:

Four Traits of Reactive Architectures

Page 11:

Page 12:

Why Scala?

• Type Inference

• Uniform Access Principle - members are accessed the same way whether they are defined as methods or as fields

• Traits

• Value Classes

• Package-level methods & fields

• Default and Named Parameters

• Higher-Kinded Types

• Functions as First Class Citizens

• Currying / Methods with multiple parameter lists

• Qualified Imports

• Scoped access modifiers

• Case Classes

• Singleton Objects

• Default Methods - apply / unapply / set

• Implicit Conversion and Views

• Macros

• Parser Combinators

• Multi-Line Strings

• String Interpolation

• Default Public Access

• Type Classes

• Extractor Patterns
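
A few of the features listed above can be seen together in one small sketch. The names `User`, `Describable`, `Anonymous` and `ageGroup` are invented for illustration and do not come from the talk.

```scala
object WhyScalaDemo {
  // Case class: immutable fields plus generated equals, toString and extractor.
  // `greeting` has a default parameter value.
  final case class User(name: String, age: Int, greeting: String = "Hello")

  // Trait with one abstract member and one concrete method.
  trait Describable {
    def name: String
    def describe: String = s"<$name>" // string interpolation
  }

  // Singleton object mixing in the trait. `name` is implemented as a val,
  // and callers cannot tell it apart from a method (uniform access).
  object Anonymous extends Describable {
    val name: String = "anonymous"
  }

  // Pattern matching on a case class goes through its generated extractor.
  def ageGroup(user: User): String = user match {
    case User(_, age, _) if age < 18 => "minor"
    case _                           => "adult"
  }

  def main(args: Array[String]): Unit = {
    val ada = User(age = 36, name = "Ada") // named parameters, in any order
    println(s"${ada.greeting}, ${ada.name}: ${ageGroup(ada)}")
    println(Anonymous.describe)
  }
}
```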

Page 13:

Functional

XKCD

Page 14:

Why Functional Rocks!

• Immutability

• Higher-Level of Abstraction

• Define the What not the How

• Eliminating side effects

• Inherent Parallelism

Page 15:

Functional in Reactive Programming

• Easy to create callbacks

• Easy to handle Events and Async Results

Page 16:

Statements vs. Expressions

def errMsg(errorCode: Int): String = {
  // Statement style: a mutable variable assigned in each branch.
  var result: String = null
  errorCode match {
    case 1 => result = "Network Failure"
    case 2 => result = "I/O Failure"
    case _ => result = "Unknown Error"
  }
  return result
}

Page 17:

Statements vs. Expressions

def errMsg(errorCode: Int): String = errorCode match {
  case 1 => "Network Failure"
  case 2 => "I/O Failure"
  case _ => "Unknown Error"
}

Page 18:

No Imperative Code!

• Imperative programming - Describes computation in terms of statements that change a program state.

Page 19:

No Imperative Code!

def findPeopleIn(city: String, people: Seq[People]): Set[People] = {
  // Imperative style: nested loops that mutate a collection.
  val found = new mutable.HashSet[People]
  for (person <- people) {
    for (address <- person.addresses) {
      if (address.city == city) found += person
    }
  }
  return found.toSet
}

Page 20:

No Imperative Code!

def findPeopleIn(city: String, people: Seq[People]): Set[People] =
  for {
    person  <- people.toSet[People]
    address <- person.addresses
    if address.city == city
  } yield person

Page 21:

Down with Null Pointers!

def authenticateSession(
    session: HttpSession,
    username: Option[String],
    password: Option[Array[Char]]) =
  for {
    u <- username
    p <- password
    if canAuthenticate(u, p)
    privileges <- privilegesFor.get(u)
  } injectPrivs(session, privileges)
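
The slide's snippet depends on helpers that are not shown. A self-contained sketch of the same idea follows; the `passwords` and `privileges` maps and the `privilegesOf` function are hypothetical stand-ins, and the password is a plain `String` to keep the example short.

```scala
object OptionDemo {
  // Hypothetical in-memory stand-ins for the helpers the slide assumes.
  private val passwords  = Map("alice" -> "s3cret", "bob" -> "hunter2")
  private val privileges = Map("alice" -> Set("read", "write"))

  def canAuthenticate(user: String, password: String): Boolean =
    passwords.get(user).contains(password)

  // Each generator short-circuits to None as soon as a value is missing,
  // so no null checks are needed anywhere.
  def privilegesOf(username: Option[String], password: Option[String]): Option[Set[String]] =
    for {
      u     <- username
      p     <- password
      if canAuthenticate(u, p)
      privs <- privileges.get(u)
    } yield privs

  def main(args: Array[String]): Unit = {
    println(privilegesOf(Some("alice"), Some("s3cret")))
    println(privilegesOf(Some("alice"), None))
  }
}
```

A missing username, a missing password, a wrong password and a user without privileges all end in the same `None`, which the caller handles once.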

Page 22:

NO BLOCKING!

Page 23:

Scala Futures

Page 24:

Future API

import scala.concurrent._
import ExecutionContext.Implicits.global

def calcInt(x: Int) = {
  Future(x * 5)
}

calcInt(10).map { rslt => println(rslt) } // prints 50

Page 25:

Traditional Request/Response

Client --(blocking)--> Server --(blocking)--> Service

Problems?

Page 26:

Reactive Request/Response

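def getTweets = Action.async {
  // The WS call returns a Future; map it to a Result without blocking.
  WS.url("http://twitter.com/").get().map(response => Ok(response.body))
}

Client --(non-blocking)--> Server --(non-blocking)--> Service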

Page 27:

Reactive Composition

Async & Non-Blocking

def foo = Action.async {
  val futureTS = WS.url("http://www.typesafe.com").get
  val futureTwitter = WS.url("http://www.twitter.com").get
  for {
    ts <- futureTS
    twitter <- futureTwitter
  } yield Ok(ts.body + twitter.body)
}

• Futures Treated as Collections

• For Expression used to represent a “callback”
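
The same composition can be tried without Play, using only the standard library. In this sketch `fetch` is a hypothetical stand-in for a remote call, and the sleep only simulates latency.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ComposeDemo {
  // Hypothetical stand-in for a remote call such as WS.url(...).get.
  def fetch(name: String, delayMs: Long): Future[String] = Future {
    Thread.sleep(delayMs) // simulated latency; a real client would use non-blocking IO
    s"<$name>"
  }

  def combined(): Future[String] = {
    // Both futures are created, and therefore started, before the for expression,
    // so the two calls run concurrently rather than one after the other.
    val futureA = fetch("a", 200)
    val futureB = fetch("b", 200)
    for {
      a <- futureA
      b <- futureB
    } yield a + b
  }

  def main(args: Array[String]): Unit =
    // Await appears only at the edge of the demo so the result can be printed.
    println(Await.result(combined(), 2.seconds))
}
```

Creating the futures inside the `for` expression instead would start the second call only after the first one completes.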

Page 28:

Page 29:

Akka

• Actor Based Toolkit

• Simple Concurrency & Distribution

• Error Handling and Self-Healing

• Elastic and Decentralized

• Adaptive Load Balancing

Page 30:

What is an Actor?

• Isolated lightweight processes

• Message Based / Event Driven

• Non-Request Based Lifecycle

• Share nothing

• Isolated Failure Handling

• Same Semantics for Local and Remote

Page 31:

Akka Clustering

• Peer-to-peer based cluster membership service

• No single point of failure and no single bottleneck

• Automatic node failure detector

• Cluster Events / Cluster-Aware Routers

• Cluster Routing

• Cluster Sharding

Page 32:

Programming Actors

import akka.actor.{Actor, ActorLogging, ActorSystem, Props}

// Messages are plain immutable case classes.
case class Greeting(who: String)
case class Departure(who: String)

class GreetingActor extends Actor with ActorLogging {
  def receive = {
    case Greeting(who)  => log.info(s"Hello ${who}")
    case Departure(who) => log.info(s"Goodbye ${who}")
  }
}

val system = ActorSystem("MySystem")
val greeter = system.actorOf(Props[GreetingActor], name = "greeter")
greeter ! Greeting("Charlie Parker") // asynchronous, fire-and-forget send

Location Transparency!

Page 33:

Akka Supervisor Hierarchies

• Parents send work to Children

• Router to Balance Work

• Parents supervise children actors

• Children delegate failure to parent

• Error-prone tasks delegated to children - “Error Kernel Pattern”

[Diagram: supervisor hierarchy tree with actors A, B, C, D, E, F, G]

Page 34:

Failure Recovery

• Supervisor hierarchies with “let-it-crash” semantics

• Lifecycle Monitoring

• Parent can resume, restart or terminate Child

• Error-prone tasks are delegated to child Actors - “Error Kernel Pattern”
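
The resume / restart / terminate choice can be modelled in a few lines of plain Scala. This is only a sketch of the decision logic and deliberately does not use Akka's API; `Directive`, `decide`, `Child` and `run` are names invented here, and the mapping from exception type to directive is an arbitrary assumption.

```scala
import scala.util.{Failure, Success, Try}

object SupervisionSketch {
  sealed trait Directive
  case object Resume  extends Directive // keep the child's state, drop the failed message
  case object Restart extends Directive // replace the child's state with a fresh one
  case object Stop    extends Directive // terminate the child

  // The parent's policy: which failure leads to which directive (assumed mapping).
  def decide(error: Throwable): Directive = error match {
    case _: ArithmeticException   => Resume
    case _: IllegalStateException => Restart
    case _                        => Stop
  }

  // The error-prone child: keeps a running total and fails on some inputs.
  final case class Child(total: Int) {
    def handle(msg: Int): Child =
      if (msg == 0) throw new ArithmeticException("zero")
      else if (msg < 0) throw new IllegalStateException("negative")
      else if (msg > 1000) throw new RuntimeException("too large")
      else Child(total + msg)
  }

  // The parent feeds messages to the child and applies its policy on failure.
  // The result is None once the child has been stopped.
  def run(messages: List[Int]): Option[Child] =
    messages.foldLeft(Option(Child(0))) {
      case (None, _) => None
      case (Some(child), msg) =>
        Try(child.handle(msg)) match {
          case Success(next) => Some(next)
          case Failure(e) =>
            decide(e) match {
              case Resume  => Some(child)
              case Restart => Some(Child(0))
              case Stop    => None
            }
        }
    }
}
```

The child contains no error handling of its own; all recovery decisions live in the parent, which is the point of the error kernel pattern.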

Page 35:

Reference Architecture

[Diagram: Web Tier and Work Tier. Labels: two Reactive Servers, each with four UserActors; Akka Router; Data Service; Tweet Service; Geo Location]

Page 36:

Architecting the Data Tier

Page 37:

It’s all Trade-offs

Page 38:

Intelligent Data

• Not just about Big Data or NoSQL

• Batch processing (à la Hadoop) is dead!

• Real-time data processing!

• Fluent API

• Integrated Batch, Iterative and Streaming Analysis!

Page 39:

The Event Log

• Append-Only Logging

• Database of Facts

• Disks are Cheap

• Why Delete Data any more?

• Replay Events

Page 40:

Akka Persistence Webinar

Domain Events

• Things that have completed, facts

• Immutable

• Verbs in past tense

  • CustomerRelocated

  • CargoShipped

  • InvoiceSent

• State Transitions
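
A minimal sketch of an event log with replay, in plain Scala. `CustomerRelocated` is taken from the slide; `CustomerRegistered`, `Customer`, `applyEvent` and `replay` are assumptions made for the example, and the log is just an in-memory sequence rather than a real journal.

```scala
object EventLogDemo {
  // Domain events: immutable facts, named in the past tense.
  sealed trait CustomerEvent
  final case class CustomerRegistered(name: String, city: String) extends CustomerEvent
  final case class CustomerRelocated(newCity: String) extends CustomerEvent

  final case class Customer(name: String, city: String)

  // A state transition: current state plus one event gives the next state.
  def applyEvent(state: Option[Customer], event: CustomerEvent): Option[Customer] =
    (state, event) match {
      case (None, CustomerRegistered(name, city)) => Some(Customer(name, city))
      case (Some(c), CustomerRelocated(city))     => Some(c.copy(city = city))
      case (other, _)                             => other // event does not apply; state unchanged
    }

  // Replaying the append-only log from the beginning rebuilds the current state.
  def replay(log: Seq[CustomerEvent]): Option[Customer] =
    log.foldLeft(Option.empty[Customer])(applyEvent)

  def main(args: Array[String]): Unit = {
    val log = Seq(CustomerRegistered("Ann", "Oslo"), CustomerRelocated("Lima"))
    println(replay(log))         // state after all events
    println(replay(log.take(1))) // state as of the first event
  }
}
```

Because nothing is overwritten, replaying a prefix of the log gives the state at any earlier point in time.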

Page 41:

“In general, application developers simply do not implement large scalable applications assuming distributed transactions.”

- Pat Helland, “Life beyond Distributed Transactions: an Apostate’s Opinion”

Page 42:

What is Cassandra?

Distributed Database

✓ Individual DBs (nodes)

✓ Working in a cluster

✓ Nothing is shared

C *

Page 43:

Client

Page 44:

Why Cassandra?

It’s Hugely Scalable (High Throughput)

Page 45:

Spark

• Clustered In-Memory Data Analytics

• Fault Tolerant Distributed Datasets

• Batch, iterative and streaming analysis

• In Memory Storage and Disk

• 2-5× less code

• 10x faster on disk, 100x faster in memory than Hadoop MR

Page 46:

Spark Cassandra Connector

• Loads data from Cassandra to Spark

• Writes data from Spark to Cassandra

• Implicit Type Conversions and Object Mapping

• Implemented in Scala (offers a Java API)

• Open Source

• Exposes Cassandra Tables as Spark RDDs + Spark DStreams (Soon)

Page 47:

Spark Cassandra Connector

• Data locality-aware (speed)

• Server-Side filters (where clauses)

• Cross-table operations (JOIN, UNION, etc.)

• Data transformation, aggregation, etc.

• Natural Time Series Integration

Page 48:

Page 49:

Intelligent Data Architecture

Page 50:

Initialization

val conf = new SparkConf(loadDefaults = true)
  .set("spark.cassandra.connection.host", "127.0.0.1")
  .setMaster("spark://127.0.0.1:7077")

val sc = new SparkContext(conf)

val table: CassandraRDD[CassandraRow] = sc.cassandraTable("keyspace", "tweets")

val ssc = new StreamingContext(sc, Seconds(30))
val stream = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
  ssc, kafka.kafkaParams, Map(topic -> 1), StorageLevel.MEMORY_ONLY)
stream.map(_._2).countByValue().saveToCassandra("demo", "wordcount")
ssc.start()
ssc.awaitTermination()

Page 51:

val sc = new SparkContext("local", "Inverted Index")
sc.textFile("data/crawl")
  .map { line =>
    val array = line.split("\t", 2)
    (array(0), array(1))
  }
  .flatMap { case (path, text) =>
    text.split("""\W+""") map { word => (word, path) }
  }
  .map { case (w, p) => ((w, p), 1) }
  .reduceByKey { (n1, n2) => n1 + n2 }
  // Re-key by word so that the groupBy pattern below matches.
  .map { case ((w, p), n) => (w, (p, n)) }
  .groupBy { case (w, (p, n)) => w }
  .map { case (w, seq) =>
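
The slide's code is cut off in the transcript. As a rough illustration of what such a pipeline computes, here is a sketch on ordinary Scala collections instead of Spark RDDs. It assumes the goal is a map from each word to the documents containing it, with a per-document count; `invertedIndex` and its input shape are choices made for this example.

```scala
object InvertedIndexSketch {
  // Input: (path, text) pairs, standing in for the parsed lines of the crawl file.
  def invertedIndex(docs: Seq[(String, String)]): Map[String, Seq[(String, Int)]] = {
    // One (word, path) pair per word occurrence.
    val pairs: Seq[(String, String)] = docs.flatMap { case (path, text) =>
      text.split("""\W+""").filter(_.nonEmpty).map(word => (word, path))
    }

    // Count occurrences per (word, path), the local analogue of reduceByKey.
    val counts: Seq[((String, String), Int)] =
      pairs.groupBy(identity).toSeq.map { case (key, hits) => (key, hits.size) }

    // Group by word and keep the (path, count) entries, sorted by path.
    counts
      .groupBy { case ((word, _), _) => word }
      .map { case (word, entries) =>
        word -> entries.map { case ((_, path), n) => (path, n) }.sortBy(_._1)
      }
  }

  def main(args: Array[String]): Unit =
    invertedIndex(Seq("a.txt" -> "hello world hello", "b.txt" -> "hello"))
      .foreach(println)
}
```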

Page 52:

Architectural Principles

Page 53:

How to Fail

Page 54:

Shared Mutable State +

Locks / Thread Libraries

AVOID AT ALL COSTS!

Page 55:

Traditional Request/Response

Client --(blocking)--> Server --(blocking)--> Service

Problems?

Page 56:

Failure Recovery in Java/C/C# etc.

• SINGLE thread of control

• If the thread blows up, you are screwed!

• Explicit error handling WITHIN this single thread

• Errors do not propagate between threads, so there is NO WAY OF EVEN FINDING OUT that something has failed
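
The contrast can be shown with the standard library alone. In this sketch `risky`, `viaThread`, `viaFuture` and `viaFutureWithFallback` are illustrative names; the uncaught-exception handler is installed only to keep the demo's output quiet.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object FailurePropagationDemo {
  def risky(): Int = throw new IllegalStateException("boom")

  // Raw thread: the exception ends that thread, join() returns normally,
  // and the caller gets no signal that the work failed.
  def viaThread(): Boolean = {
    val t = new Thread(new Runnable { def run(): Unit = risky() })
    // Swallow the stack trace the default handler would otherwise print.
    t.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler {
      def uncaughtException(thread: Thread, e: Throwable): Unit = ()
    })
    t.start()
    t.join()
    true // reached as if everything had worked
  }

  // Future: the failure travels back to the caller as a value.
  def viaFuture(): Try[Int] = {
    val f = Future(risky())
    Await.ready(f, 2.seconds).value.get
  }

  // The caller can also turn the failure into a default result.
  def viaFutureWithFallback(): Int =
    Await.result(Future(risky()).recover { case _: IllegalStateException => -1 }, 2.seconds)
}
```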

Page 57:

Never block

• ...unless you really have to

• Blocking kills scalability (and performance)

• Never sit on resources you don’t use

• Use non-blocking IO

Page 58:

Go Async

Page 59:

Use Bulkheads

• Isolate the failure

• Compartmentalize

• Manage failure locally

• Avoid cascading failures
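
One common way to build a bulkhead on the JVM is to give each downstream dependency its own small thread pool, so that a slow dependency can exhaust only its own threads. The sketch below assumes two made-up services; the pool sizes and delays are arbitrary.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BulkheadDemo {
  // One dedicated pool per dependency (hypothetical services).
  val slowPool = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(2))
  val fastPool = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(2))

  // Simulated slow dependency; the sleep stands in for a blocking call.
  def callSlowService(): Future[String] =
    Future { Thread.sleep(200); "slow" }(slowPool)

  def callFastService(): Future[String] =
    Future("fast")(fastPool)

  def main(args: Array[String]): Unit = {
    // Saturate the slow pool: two calls run, two wait in its queue.
    (1 to 4).foreach(_ => callSlowService())
    // The fast service still answers, because it does not share those threads.
    println(Await.result(callFastService(), 1.second))
    slowPool.shutdown()
    fastPool.shutdown()
  }
}
```

Had both services shared one pool of two threads, the fast call would have queued behind the slow ones.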

Page 60:

Backpressure

• http://ferd.ca/queues-don-t-fix-overload.html

Page 62:

Handling Backpressure

• Fail Fast


• Circuit Breaker with default responses

• Load Shedding - Bounded Mailboxes

• Worker Pull Pattern vs. Push to Overload

• Throttling
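
Load shedding with a bounded mailbox and worker pull can be sketched with a standard bounded queue. This is not Akka's mailbox implementation; `BoundedMailbox`, `submit`, `pull` and the `Outcome` type are names invented for the example.

```scala
import java.util.concurrent.ArrayBlockingQueue

object LoadSheddingDemo {
  sealed trait Outcome
  case object Accepted extends Outcome
  case object Rejected extends Outcome

  // A bounded "mailbox": offer() returns false at once when the queue is full,
  // instead of blocking the producer or letting the queue grow without limit.
  final class BoundedMailbox[A](capacity: Int) {
    private val queue = new ArrayBlockingQueue[A](capacity)

    // Fail fast: the producer learns immediately that the system is overloaded.
    def submit(msg: A): Outcome =
      if (queue.offer(msg)) Accepted else Rejected

    // Worker pull: a worker asks for the next message only when it has capacity.
    def pull(): Option[A] = Option(queue.poll())
  }

  def main(args: Array[String]): Unit = {
    val mailbox = new BoundedMailbox[String](2)
    println(Seq("a", "b", "c").map(mailbox.submit)) // the third message is shed
    println(mailbox.pull())
  }
}
```

A rejected producer can retry later, fall back to a default response, or pass the rejection upstream; in each case the overload is visible rather than hidden in an ever-growing queue.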

Page 63:

Questions?

Page 64:

©DataStax 2015 – All Rights Reserved