l20 scalability


Upload: olafur-andri-ragnarsson

Post on 22-Jan-2017






Page 1: L20 Scalability


Page 2: L20 Scalability

Agenda
▪ Evolution - where are we today?
▪ Requirements of 21st century web applications
▪ Session State
▪ Distribution Strategies
▪ Scale Cube
▪ Eventual Consistency
– CAP Theorem
▪ Real World Example

Page 4: L20 Scalability


[Timeline figure: 60s, 70s, 80s, 90s, 00s]
▪ IBM; limited layering or abstraction
▪ IBM, DEC; minicomputers; Unix, VAX; “dumb” terminals
▪ PC, Intel, DOS, Mac, Unix, Windows; client/server; RMDB
▪ Windows, Internet, HTTP; web browsers
▪ Browsers, services, domain applications, RMDB
▪ iOS, Android, HTML5; browsers, apps, API; cloud, NoSQL

Page 5: L20 Scalability

Motivation
▪ Requirements of 21st century web systems
– High availability
– Millions of simultaneous users
– Peak load of thousands of tx/sec
▪ Example
– What if we need to handle a load of 20,000 tx/sec?
– That’s 1.2 million tx per minute

Page 6: L20 Scalability

Session State

Page 7: L20 Scalability

Business Transactions
▪ Transactions that span more than one request
– User is working with data before they are committed to the database
• Example: User logs in, puts products in a shopping cart, buys, and logs out
– Where do we keep the state between transactions?

[Figure: Login → Catalog search → List of results → Put into cart]


Page 8: L20 Scalability

State
▪ Server with state vs. stateless server
– Stateful server must keep the state between requests
▪ Problem with stateful servers
– Need more resources, limit scalability

[Figure: Clients 1–3 connected to a stateful server that holds data for each client, and Clients 1–3 connected to a stateless server]

Page 9: L20 Scalability

Stateless Servers
▪ Stateless servers scale much better
▪ Use fewer resources
▪ Example:
– View book information
– Each request is separate
▪ REST was designed to be stateless

Page 10: L20 Scalability

Stateful Servers
▪ Stateful servers are the norm
▪ Not easy to get rid of them
▪ Problem: they take resources and cause server affinity
▪ Example:
– 100 users each make a request every 10 seconds, and each request takes 1 second
– One stateful object per user
– Objects are idle 90% of the time

Page 11: L20 Scalability

Session State
▪ State that is relevant to a session
– State used in business transactions that belongs to a specific client
– Data structure belonging to a client
– May not be consistent until it is persisted
▪ Session state is distinct from record data
– Record data is long-term persistent data in a database
– Session state might end up as record data

Page 12: L20 Scalability

Question: Where do you store the session?


Page 13: L20 Scalability

Ways to Store Session State
▪ We have three players
– The client using a web browser or app
– The server running the web application and domain
– The database storing all the data

[Figure: Client → Server → Database]

Page 14: L20 Scalability

Ways to Store Session State
▪ Three basic choices
– Client Session State
– Server Session State
– Database Session State

[Figure: Client → Server → Database]

Page 15: L20 Scalability

Client Session State
Store session state on the client

▪ How It Works
– Desktop applications can store the state in memory
– Web solutions can store state in cookies, hide it in the web page, or use the URL
– Data Transfer Object can be used
– Session ID is the minimum client state
– Works well with REST - Representational State Transfer

Page 16: L20 Scalability

Client Session State
▪ When to Use It
– Works well if server is stateless
– Maximal clustering and failover resiliency
▪ Drawbacks
– Does not work well for large amounts of data
– Data gets lost if client crashes
– Security issues
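One way to soften the "security issues" drawback of client session state is to sign whatever the server hands to the client. The following is a minimal sketch, not a definitive implementation: the names `SECRET_KEY`, `encode_session` and `decode_session` are hypothetical, and it only detects tampering (the state is still readable by the client).

```python
import base64
import hashlib
import hmac
import json

# Hypothetical server-side secret; in practice it would come from configuration.
SECRET_KEY = b"example-secret"


def encode_session(state: dict) -> str:
    """Serialise session state and append an HMAC so tampering is detectable."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def decode_session(cookie: str) -> dict:
    """Verify the signature, then rebuild the state sent back by the client."""
    payload, _, signature = cookie.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("session cookie was modified")
    return json.loads(base64.urlsafe_b64decode(payload.encode()))


# The server stays stateless: everything it needs comes back with the request.
cookie = encode_session({"user": "anna", "cart": ["book-42"]})
restored = decode_session(cookie)
```

Because the whole state travels with every request, this only suits small amounts of data, which matches the drawback listed above.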

Page 17: L20 Scalability

Server Session State
Store session state on a server in a serialised form

▪ How It Works
– Session Objects – data structures on the server keyed to session ID
▪ Format of data
– Can be binary, objects or XML
▪ Where to store session
– Memory, application server, file, or local or in-memory database

Page 18: L20 Scalability

Server Session State
▪ Specific Implementations
– HttpSession
– Stateful Session Beans – EJB
▪ When to Use It
– Simplicity, it is easy to store and retrieve data
▪ Drawbacks
– Data can get lost if server goes down
– Clustering and session migration become difficult
– Space complexity (memory of server)
– Inactive sessions need to be cleaned up
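To make "session objects keyed to session ID" and "inactive sessions need to be cleaned up" concrete, here is a small in-memory sketch. `SessionStore` is a hypothetical helper written for illustration; it is not how HttpSession or EJB are implemented. The `now` parameter exists only so the timeout can be exercised without waiting.

```python
import time
import uuid


class SessionStore:
    """In-memory session objects keyed by session ID (hypothetical helper)."""

    def __init__(self, timeout_seconds: float):
        self.timeout_seconds = timeout_seconds
        self._sessions = {}  # session_id -> (last_access_time, data)

    def create(self, now=None) -> str:
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = (time.time() if now is None else now, {})
        return session_id

    def get(self, session_id: str, now=None):
        """Return the session data, or None if unknown or timed out."""
        now = time.time() if now is None else now
        entry = self._sessions.get(session_id)
        if entry is None or now - entry[0] > self.timeout_seconds:
            self._sessions.pop(session_id, None)
            return None
        self._sessions[session_id] = (now, entry[1])  # refresh last access
        return entry[1]

    def remove_inactive(self, now=None) -> int:
        """Drop sessions idle longer than the timeout; return how many were removed."""
        now = time.time() if now is None else now
        dead = [sid for sid, (seen, _) in self._sessions.items()
                if now - seen > self.timeout_seconds]
        for sid in dead:
            del self._sessions[sid]
        return len(dead)


store = SessionStore(timeout_seconds=1800)
sid = store.create(now=0)
store.get(sid, now=10)["cart"] = ["book-42"]
```

Everything lives in one process's memory, so the drawbacks above apply directly: the data is gone if the server goes down, and another server cannot see it.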

Page 19: L20 Scalability

Database Session State
Store session data as committed data in the database

▪ How It Works
– Session state is stored in the database
– Can be stored as temporary data to distinguish it from committed record data
▪ Pending session data
– Pending session data might violate integrity rules
– Use of pending field or pending tables
• When pending session data becomes record data it is saved in the real tables

Page 20: L20 Scalability

Database Session State
▪ When to Use It
– Improved scalability – easy to add servers
– Works well in clusters
– Data is persisted, even if data centre goes down
▪ Drawbacks
– Database becomes a bottleneck
– Needs a clean-up procedure for pending data that did not become record data – user just left
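The "pending field" idea can be sketched with an in-memory SQLite database. The table name `order_lines`, the column names and the three helper functions are assumptions made for this illustration; a real schema would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a "pending" flag separates session data from record data.
conn.execute(
    "CREATE TABLE order_lines ("
    " session_id TEXT, product TEXT, pending INTEGER NOT NULL DEFAULT 1)"
)


def add_to_cart(session_id: str, product: str) -> None:
    """Any server can write session state; it is stored as pending rows."""
    with conn:
        conn.execute(
            "INSERT INTO order_lines (session_id, product) VALUES (?, ?)",
            (session_id, product),
        )


def checkout(session_id: str) -> None:
    """Pending session data becomes record data."""
    with conn:
        conn.execute(
            "UPDATE order_lines SET pending = 0 WHERE session_id = ?", (session_id,)
        )


def clean_up(abandoned_session_ids) -> int:
    """Retention routine: delete pending rows of sessions where the user just left."""
    removed = 0
    with conn:
        for session_id in abandoned_session_ids:
            cur = conn.execute(
                "DELETE FROM order_lines WHERE session_id = ? AND pending = 1",
                (session_id,),
            )
            removed += cur.rowcount
    return removed


add_to_cart("s1", "book")
add_to_cart("s2", "lamp")
checkout("s1")                      # s1 bought, s2 walked away
removed = clean_up(["s1", "s2"])    # only s2's pending row is deleted
remaining = conn.execute(
    "SELECT session_id, product, pending FROM order_lines"
).fetchall()
```

Since no server holds the state, any node can serve the next request, at the price of a database round trip per request.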

Page 21: L20 Scalability

What about dead sessions?
▪ Client session
– Not our problem
▪ Server session
– Web servers will send an inactive message upon timeout
▪ Database session
– Needs to be cleaned up
– Retention routines

Page 22: L20 Scalability

Caching
▪ Caching is temporary data that is kept in memory between requests for performance reasons
– Not session data
– Can be thrown away and retrieved any time
▪ Saves the round-trip to the database
▪ Can become stale or old and out-dated
– Distributed caching (message driven cache) is one way to solve that
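A cache that "can be thrown away and retrieved any time" can be sketched as follows. Note that this sketch limits staleness with a simple time-to-live, which is a different and cruder technique than the message-driven distributed cache mentioned above. `TTLCache`, `load_book` and the injected clock are hypothetical names used only for illustration.

```python
import time


class TTLCache:
    """Keep loaded values in memory and reload them once they are too old."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader        # function that does the real (slow) lookup
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be demonstrated
        self._entries = {}           # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh: no round-trip to the database
        value = self._loader(key)    # missing or stale: fetch again
        self._entries[key] = (now + self._ttl, value)
        return value


db_reads = []


def load_book(book_id):
    """Stand-in for a database query; records every call."""
    db_reads.append(book_id)
    return {"id": book_id, "title": f"Book {book_id}"}


fake_now = [0.0]
cache = TTLCache(load_book, ttl_seconds=60, clock=lambda: fake_now[0])
first = cache.get(1)      # loads from the "database"
second = cache.get(1)     # served from memory
fake_now[0] = 61.0
third = cache.get(1)      # entry expired, loaded again
```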

Page 23: L20 Scalability

Practical Example
▪ Client session
– For preferences, user selections
▪ Server session
– Used for browsing and caching
– Logged in customer
▪ Database
– “Legal” session
– Stored, trackable, need to survive between sessions

Page 24: L20 Scalability


A) Client Session State
B) Server Session State
C) Database Session State
D) No state required


Page 25: L20 Scalability

Distribution Strategies

Page 26: L20 Scalability

Distributed Architecture
▪ Distribute processing by placing objects on different nodes





Page 27: L20 Scalability

Distributed Architecture
▪ Distribute processing by placing objects on different nodes
▪ Benefits
– Load is distributed between different nodes giving overall better performance
– It is easy to add new nodes
– Middleware products make calls between nodes transparent

But is this true?

Page 28: L20 Scalability

Distributed Architecture
▪ Distribute processing by placing objects on different nodes

“This design sucks like an inverted hurricane” – Fowler

Fowler’s First Law of Distributed Object Design: Don't distribute your objects!

Page 29: L20 Scalability

Remote and Local Interfaces
▪ Local calls
– Calls between components on the same node are local
▪ Remote calls
– Calls between components on different machines are remote
▪ Object-oriented programming
– Promotes fine-grained objects

Page 30: L20 Scalability

Remote and Local Interfaces
▪ A local call within a process is very, very fast
▪ A remote call between two processes is orders of magnitude slower
– Marshalling and un-marshalling of objects
– Data transfer over the network
▪ With fine-grained object-oriented design, remote components can kill performance
▪ Example
– Address object has get and set methods for each member: city, street, and so on
– Will result in many remote calls

Page 31: L20 Scalability

Remote and Local Interfaces
▪ With distributed architectures, interfaces must be coarse-grained
– Minimising remote function calls
▪ Service Architecture has to have coarse-grained APIs and combine several objects
– Avoid fine-grained interfaces
▪ Example
– Instead of having getters and setters for each field, bulk accessors are used
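The difference between per-field setters and a bulk accessor can be illustrated by counting round trips. This is a sketch only: `RemoteAddress` is a local stand-in that pretends every method call crosses the network, and `AddressDTO` and its fields are assumed names for the Address example.

```python
from dataclasses import dataclass


@dataclass
class AddressDTO:
    """Data Transfer Object: carries all address fields in one message."""
    street: str
    city: str
    zip_code: str


class RemoteAddress:
    """Stand-in for a remote object; every method call counts as one round trip."""

    def __init__(self):
        self.remote_calls = 0
        self._data = AddressDTO("", "", "")

    # Fine-grained interface: one remote call per field.
    def set_street(self, street):
        self.remote_calls += 1
        self._data.street = street

    def set_city(self, city):
        self.remote_calls += 1
        self._data.city = city

    def set_zip_code(self, zip_code):
        self.remote_calls += 1
        self._data.zip_code = zip_code

    # Coarse-grained interface: a bulk accessor moves all fields in one call.
    def update(self, dto: AddressDTO):
        self.remote_calls += 1
        self._data = dto


fine = RemoteAddress()
fine.set_street("Main Street 1")
fine.set_city("Reykjavik")
fine.set_zip_code("101")

coarse = RemoteAddress()
coarse.update(AddressDTO("Main Street 1", "Reykjavik", "101"))
```

Both objects end up with the same address, but the fine-grained version paid for three round trips and the coarse-grained one for a single call; the gap grows with every field added.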

Page 32: L20 Scalability

Distributed Architecture
▪ Better distribution model (X scaling)
– Load balancing or clustering the application involves putting several copies of the same application on different nodes





Page 33: L20 Scalability

Where You Have to Distribute
▪ As architect, try to eliminate as many remote calls as possible
– If this cannot be achieved, choose carefully where the distribution boundaries lie
▪ Distribution Boundaries
– Client/Server
– Server/Database
– Web Server/Application Server
– Separation due to vendor differences
– There might be some genuine reason

Page 34: L20 Scalability

Optimizing Remote Calls
▪ We know remote calls are expensive
▪ How can we minimize the cost of remote calls?
▪ The overhead is
– Marshaling or serializing data
– Network transfer
▪ Put enough data into each call
– Coarse-grained calls
– Use binary protocols – avoid XML

Page 35: L20 Scalability

How to Model Services

Page 36: L20 Scalability

How big is a service?
▪ The term microservices is sometimes used, but is misleading
– Has nothing to do with lines of code
▪ Balance between integration points and size
▪ Example definition:
– Time: Can be rewritten in one iteration (2 weeks)
– Features: All things that belong together

Page 37: L20 Scalability

Loose Coupling
When services are loosely coupled, a change in one service should not require a change in another

A loosely coupled service knows as little as possible about the services with which it collaborates

Source: Building Microservices

Page 38: L20 Scalability

High Cohesion
We want related behaviour to sit together, and unrelated behaviour to sit elsewhere

Group together stuff that belongs together, as in SRP

If you want to change something, it should change in one place, as in DRY

Source: Building Microservices

Page 39: L20 Scalability

Bounded Context
Concept that comes from Domain-driven Design (DDD)

Any given domain contains multiple bounded contexts, and within each are “models” or “things” (or “objects”)
– that do not need to be communicated outside
– that are shared with other bounded contexts

The shared objects define the explicit interface to the bounded context

Source: Building Microservices

Page 40: L20 Scalability

Bounded Context

Source: Martin Fowler, BoundedContext


Page 41: L20 Scalability

The Right Balance
▪ In Service Architecture, we want to split by functionality (Y scaling)
– Boundaries must be well designed – objects that work together are grouped together
– APIs must be sufficiently coarse-grained

Page 42: L20 Scalability

The Scale Cube

Page 43: L20 Scalability

Scaling the application
▪ Today’s web sites must handle multiple simultaneous users
▪ Examples:
– All web based apps must handle several users
– mbl.is handles >200,000 users/day
– Betware must handle up to 100,000 simultaneous users and 1.2 million tx/min for terminal system peak load

Page 44: L20 Scalability
Page 45: L20 Scalability

The World we Live in
▪ Average number of tweets per day: 500 million
▪ Total number of minutes spent on Facebook each month: 700 billion
▪ SnapChat has 100 million daily active users who send 1 billion snaps each day
▪ Instagram has over 200 million users on the platform who send 60 million photos per day
▪ Number of messages sent by WhatsApp: 30 billion

Page 46: L20 Scalability

Scalability
▪ Scalability is the ability of a system, network, or process to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth
▪ With more load, how does the load of the system vary?

Page 47: L20 Scalability

Scalability
▪ Scalability is the measure of how adding resources (usually hardware) affects the performance
– Vertical scalability (up) – increase server power
– Horizontal scalability (out) – increase the number of servers
▪ Session migration
– Move the session from one server to another
▪ Server affinity
– Keep the session on one server and make the client always use the same server

Page 48: L20 Scalability

Scalability
▪ What is the system’s growth pattern – what is the formula?

Page 49: L20 Scalability

Scaling Applications
In the Internet world you want to build web sites that get lots of users and a massive number of hits per second

But how can you cope with such load?

[Figure: Browser → HTTP Server → Application → Database]

Page 50: L20 Scalability

The Scaling Problem
▪ We need to handle a large number of requests to our system
▪ There are two ways to scale:
– Vertically or scale up: Add more capacity to your hardware, more memory for example
– Horizontally or scale out: Add more machines

Page 51: L20 Scalability

Scaling Up
▪ This is the traditional approach for many monolithic systems
▪ Use a big powerful system
▪ Pros:
– Easy to do, easy to understand
– One memory space and one database
▪ Cons:
– Has very hard limits
– Does not work for the 21st century requirements

Page 52: L20 Scalability

Scaling Out (X scaling)
▪ This can work for monolithic systems if the database requirements are not high
▪ Use many machines and distribute the load
– Have one big powerful database
▪ Pros:
– Scales well – handles much more load
– Shared database
▪ Cons:
– Session management is a challenge
– Database is a bottleneck

Page 53: L20 Scalability

Scale Cube

[Figure: the scale cube, with one kind of scaling per axis]
▪ X scaling: duplicate the system
▪ Y scaling: P…
▪ Z scaling: partition the data
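Z scaling (partitioning the data) can be sketched as a routing function that decides which partition owns a given key. This is only an illustration under stated assumptions: the shard names in `SHARDS` and the function `shard_for` are hypothetical, and plain hash-modulo routing is chosen for brevity even though it forces most keys to move when the number of shards changes.

```python
import hashlib

SHARDS = ["db-0", "db-1", "db-2"]  # hypothetical database partitions


def shard_for(user_id: str) -> str:
    """Z scaling: pick the partition that owns this user's data."""
    # A stable hash is used so every application server routes the same way.
    digest = hashlib.sha256(user_id.encode()).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


# The same user always lands on the same partition.
home = shard_for("anna")
```

Each partition then holds only part of the data, so no single database has to carry the whole load.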
Page 54: L20 Scalability

Load Distribution
▪ Use a number of machines to handle requests
▪ Load balancer directs each request to a particular server
– All requests in one session go to the same server
– Server affinity
▪ Benefits
– Load can be increased
– Easy to add new pairs
– Uptime is increased
▪ Drawbacks
– Database is a bottleneck
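Server affinity can be sketched as a load balancer that remembers which server each session was first sent to. `StickyLoadBalancer` and the server names are hypothetical; real load balancers typically implement this with cookies or source-address hashing rather than an in-memory table.

```python
import itertools


class StickyLoadBalancer:
    """Round-robin for new sessions, then always the same server (affinity)."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)
        self._affinity = {}  # session_id -> server

    def route(self, session_id: str) -> str:
        if session_id not in self._affinity:
            # First request of the session: pick the next server in turn.
            self._affinity[session_id] = next(self._next_server)
        return self._affinity[session_id]


balancer = StickyLoadBalancer(["app-1", "app-2"])
routes = [balancer.route(s) for s in ["s1", "s2", "s3", "s1", "s2"]]
```

Sessions `s1` and `s2` return to the server that holds their state, which is exactly what makes a stateful server hard to remove from the pool.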

Page 55: L20 Scalability

Clustering
▪ With clustering, servers are connected together as if they were a single computer
– Requests can be handled by any server
– Sessions are stored on multiple servers
– Servers can be added and removed any time
▪ Problem is with state
– State in application servers reduces scalability
– Clients become dependent on particular nodes

Page 56: L20 Scalability

Clustering State
▪ Application functionality
– Handle it yourself, but this is complicated, not worth the effort
▪ Shared resources
– Well-known pattern (Database Session State)
– Problem with bottlenecks limits scalability
▪ Clustering Middleware
– Several solutions, for example JBoss, Terracotta
▪ Clustering JVM or network
– Low level, transparent to applications

Page 57: L20 Scalability

Scalability Example

Page 58: L20 Scalability

Scalability Example

Page 59: L20 Scalability

Amdahl’s Law

Page 60: L20 Scalability

Amdahl’s Law
▪ This law is used to find the maximum expected improvement to an overall system when only part of the system is improved
▪ In parallel computing, it states that a small portion of the program which cannot be parallelized will limit the overall speed-up available from parallelization

Page 61: L20 Scalability

Amdahl’s Law
▪ Amdahl’s law for overall speedup

$$\text{Overall speedup} = \frac{1}{(1 - F) + \frac{F}{S}}$$

F = The fraction enhanced
S = The speedup of the enhanced fraction

If we make 20% of the program 10x faster, then F = 0.2 and S = 10:

$$\text{Overall speedup} = \frac{1}{(1 - 0.2) + \frac{0.2}{10}} = \frac{1}{0.82} \approx 1.22$$

If S = 1000, overall speedup is 1.25
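The figures on this slide and the next can be checked with a few lines of code. The function name `overall_speedup` and its parameter names are chosen for this sketch; the formula is the standard form of Amdahl's law, 1 / ((1 - F) + F / S).

```python
def overall_speedup(fraction_enhanced: float, speedup_of_fraction: float) -> float:
    """Amdahl's law: F is the fraction enhanced, S the speedup of that fraction."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_of_fraction)


ten_times = overall_speedup(0.2, 10)         # 20% of the program made 10x faster
thousand_times = overall_speedup(0.2, 1000)  # same 20% made 1000x faster
ceiling = 1.0 / (1.0 - 0.2)                  # limit as S grows without bound
```

However large S becomes, the result stays below `ceiling`, because the untouched 80% of the program still takes the same time.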

Page 62: L20 Scalability

Amdahl’s Corollary
▪ Make the common case fast
– Common case being defined as “most time consuming”

40% 10x faster => 1.5625

20% 100x faster => 1.2469

Page 63: L20 Scalability

The Optimization Process
▪ There is only one way to test scalability: Measure
– Find the bottleneck (the common case)
– Hypothesize about improvement
– Make optimization – change only one thing at a time
– Measure again and repeat

Page 64: L20 Scalability

Eventual Consistency

Page 65: L20 Scalability

Transactions
▪ A transaction is a bounded sequence of work
– Both start and finish are well defined
– Transaction must complete on an all-or-nothing basis
▪ All resources are in a consistent state before and after the transaction
▪ Example: Database transaction
– Withdraw money from account
– Buy the product
– Update stock information
▪ Transactions must have ACID properties

Page 66: L20 Scalability

ACID properties
▪ Atomicity
– All steps are completed successfully – or rolled back
▪ Consistency
– Data is consistent at the start and the end of the transaction
▪ Isolation
– Transaction is not visible to any other until that transaction commits successfully
▪ Durability
– Any results of a committed transaction must be made permanent
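Atomicity for the withdraw / buy / update-stock example can be sketched with an in-memory SQLite database. The tables `accounts` and `stock`, the `CHECK` constraints and the function `buy` are assumptions made for this illustration; the point is only that a failure in the second step undoes the first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (owner TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.execute(
    "CREATE TABLE stock (product TEXT PRIMARY KEY,"
    " quantity INTEGER CHECK (quantity >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('anna', 100)")
conn.execute("INSERT INTO stock VALUES ('book', 1)")
conn.commit()


def buy(owner: str, product: str, price: int) -> bool:
    """Withdraw, then update stock; either both steps commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE owner = ?",
                (price, owner),
            )
            conn.execute(
                "UPDATE stock SET quantity = quantity - 1 WHERE product = ?",
                (product,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = buy("anna", "book", 30)    # succeeds: balance 70, stock 0
second = buy("anna", "book", 30)   # stock would go negative, so it is rolled back
balance = conn.execute(
    "SELECT balance FROM accounts WHERE owner = 'anna'"
).fetchone()[0]
quantity = conn.execute(
    "SELECT quantity FROM stock WHERE product = 'book'"
).fetchone()[0]
```

In the second call the withdrawal has already been applied when the stock update fails, yet the balance is unchanged afterwards because the whole transaction is rolled back.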

Page 67: L20 Scalability

Transactional Resources
▪ Anything that is transactional
– Use transactions to control concurrency
– Databases, printers, message queues
▪ Transactions must be as short as possible
– Provides greatest throughput
– Should not span multiple requests
– Long transactions span multiple requests

Page 68: L20 Scalability

Transaction Isolation and Liveness
▪ Transactions lock tables (or resources)
– Need to provide isolation to guarantee correctness
– Liveness suffers
– We need to control isolation
▪ Serializable Transactions
– Full isolation
– Transactions are executed serially, one after the other
– Benefits: Guarantees correctness
– Drawbacks: Can seriously damage liveness and performance

Page 69: L20 Scalability

Isolation Level
▪ Problems can be controlled by setting the isolation level
– We don’t want to lock tables since it reduces performance
– Solution is to use as low isolation as possible while keeping


Page 70: L20 Scalability

Problem
▪ Serialization creates scalability bottlenecks
▪ Applications that rely on fully serialized transactions in an RMDB have a hard time scaling
▪ Can we sacrifice something?
– Can we relax these requirements?

Page 71: L20 Scalability

CAP Theorem
▪ States that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
– Consistency: all nodes see the same data at the same time
– Availability: a guarantee that every request receives a response about whether it was successful or failed
– Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system

Page 72: L20 Scalability
Page 73: L20 Scalability

ACID vs. BASE
▪ BASE: Basically Available, Soft state, Eventual consistency
▪ Basically Available: Guarantees availability of the database
▪ Soft state: The state of the system can change over time - even without input
▪ Eventual consistency: The system will eventually become consistent over time given no new input

Page 74: L20 Scalability

ACID vs. BASE
▪ The difference has more to do with synchronous and asynchronous messaging
▪ For large scale systems, asynchronous messaging allows the fastest and least restricted workflow

Page 75: L20 Scalability

Asynchronous
▪ Eventual Consistency example

[Figure: Web Layer → Requests → Approve → RMDB]
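An asynchronous flow of this kind can be sketched with a queue between the web layer and the database. Everything here is an assumption made for illustration: `pending_requests`, `web_layer_submit` and `approval_worker` are hypothetical names, a dictionary stands in for the RMDB, and a thread stands in for a separate approval process.

```python
import queue
import threading

pending_requests = queue.Queue()  # buffer between the web layer and the database
database = {}                     # stand-in for the RMDB


def web_layer_submit(order_id: str) -> str:
    """Accept the request immediately; the database write happens later."""
    pending_requests.put(order_id)
    return "accepted"


def approval_worker() -> None:
    """Drain the queue and apply approved requests to the database."""
    while True:
        order_id = pending_requests.get()
        if order_id is None:      # sentinel: stop the worker
            pending_requests.task_done()
            break
        database[order_id] = "approved"
        pending_requests.task_done()


responses = [web_layer_submit(f"order-{i}") for i in range(3)]
# At this point the database does not yet reflect the accepted requests.
worker = threading.Thread(target=approval_worker)
worker.start()
pending_requests.put(None)
pending_requests.join()           # given no new input, the system becomes consistent
worker.join()
```

The web layer answers without waiting for the database, so a reader may briefly see stale data; once the queue is drained, the database has caught up with every accepted request.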



Page 76: L20 Scalability

Measuring Scalability
▪ The only meaningful way to know about a system’s performance is to measure it
▪ Performance tools can help this process
– Give indication of scalability
– Identify bottlenecks

Page 77: L20 Scalability

Example tool: LoadRunner

Page 78: L20 Scalability

Example tool: JMeter

Page 79: L20 Scalability

Summary
▪ Requirements of 21st century web applications
– Availability, Eventual consistency
▪ Session State
– Client, Server, Database
▪ Distribution Strategies
– Don’t distribute fine-grained objects – identify boundaries
▪ The Scale Cube
▪ Eventual Consistency
– CAP Theorem
▪ Real World Example