
Lecture 2: Basic Enterprise Architecture Modules and Patterns
Gustavo Alonso
Systems Group
Computer Science Department
Swiss Federal Institute of Technology (ETHZ)
[email protected]
www.iks.inf.ethz.ch/

©Gustavo Alonso, ETH Zürich.

Contents
• Communication
  • Synchronous (blocking) interaction
  • Asynchronous (non-blocking) interaction
  • Batch transfer
• Additional modules
  • Name and directory services
  • Persistence
  • Security
  • Transactions
  • Routing and Filtering
• Examples of use of patterns
  • Hardware fault tolerance patterns
  • Software fault tolerance patterns
  • Performance patterns

Synchronous and Asynchronous Interaction


Synchronous interaction
• Blocking communication (the caller waits until the called party responds)
• Communication modeled as:
  • Request
  • Response
• Good match for programming language modularity:
  • Request is a method call
  • Response is the return of a method call
  • Programming model does not change
  • Matches the semantics of programming languages (parameters, variables, methods)
[Figure: program control flow and inter-process communication (local, LAN, WAN)]
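A minimal sketch of the blocking request/response model, using Python threads and queues as a stand-in for inter-process communication. The names `serve` and `call` and the "double the input" service are hypothetical, chosen only for illustration.

```python
import queue
import threading

def serve(requests: queue.Queue) -> None:
    """Hypothetical server loop: answer each request on its reply queue."""
    while True:
        payload, reply = requests.get()
        if payload is None:          # sentinel: shut down
            break
        reply.put(payload * 2)       # the "service": double the input

def call(requests: queue.Queue, payload: int) -> int:
    """Blocking call: send the request, then wait for the response."""
    reply: queue.Queue = queue.Queue(maxsize=1)
    requests.put((payload, reply))
    return reply.get()               # the caller blocks here until the answer arrives

requests: queue.Queue = queue.Queue()
server = threading.Thread(target=serve, args=(requests,))
server.start()
result = call(requests, 21)          # reads like an ordinary function call
requests.put((None, None))           # stop the server
server.join()
```

From the caller's point of view `call` behaves like a local method: parameters go in, a return value comes out, and the control flow does not change.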


Properties of the synchronous pattern
• Advantages
  • Tightly coupled interaction
    • Speed
    • Simplicity
  • Simple to understand for developing and debugging
  • Close match to programming languages (RPC, RMI)
  • Easy to define interfaces between the interacting parties
• Disadvantages
  • Tightly coupled interaction
    • Reduced fault tolerance
    • Introduces distributed dependencies
    • Makes maintenance and upgrading more complex
  • Too simple to allow realistic interactions (must be extended with other patterns)


Fault tolerance in tight coupling
• Quick review of basics:
  • Probability of event A (per unit of time) = P(A)
  • Mean time to event A = 1/P(A) (memoryless, small P(A))
  • For several independent events, the mean time to the occurrence of any of those events = 1/(P(A) + P(B) + P(C))
• With tight coupling (client = caller; server = called):
  • MTTF client = C
  • MTTF server = S
  • MTTF system = 1/(1/C + 1/S)
  • If C = S, then MTTF system = C/2
• With tight coupling, the MTTF of the system is halved (assuming equal failure probabilities for each component).
• With N such components, MTTF = C/N


Simple example
• Probability that the client fails: 0.01 in a day
• Probability that the server fails: 0.001 in a day
• Mean Time To Failure:
  • Client = 100 days
  • Server = 1000 days
• Mean Time To Failure of the client-server system = 1/(0.01 + 0.001) ≈ 90.9 days
• In a client-server system, the overall availability will be less than the availabilities of the client and the server
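The arithmetic of the tight-coupling formula can be checked with a few lines of Python. This is only a sketch under the stated assumptions (independent, memoryless failures); the helper name `series_mttf` is made up for the example.

```python
def series_mttf(*mttfs: float) -> float:
    """MTTF of a system that fails as soon as any one component fails.

    Assumes independent, memoryless failures, so failure rates (1/MTTF) add.
    """
    return 1.0 / sum(1.0 / m for m in mttfs)

client_mttf = 1 / 0.01     # 100 days
server_mttf = 1 / 0.001    # 1000 days
system_mttf = series_mttf(client_mttf, server_mttf)
print(f"{system_mttf:.1f} days")   # 90.9 days
```

Calling `series_mttf(10, 10)` gives 5, which is the C/2 case for two equal components.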


Asynchronous interaction
• Non-blocking communication (neither caller nor called party must wait)
• Communication modeled as:
  • Messages
  • Events
• Not a good match for existing programming languages:
  • Streams rather than calls
  • Asynchronous control flow
  • Impedance mismatch
[Figure: program control flow and inter-process communication (local, LAN, WAN)]
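A minimal sketch of non-blocking, message-based interaction. A `queue.Queue` plays the role of the messaging system here; the names `mailbox` and `receiver` and the message texts are hypothetical.

```python
import queue
import threading

mailbox: queue.Queue = queue.Queue()   # stands in for the messaging system
processed = []

def receiver() -> None:
    """Consume messages whenever they arrive; the sender is not waiting."""
    while True:
        message = mailbox.get()
        if message is None:            # sentinel: no more messages
            break
        processed.append(message.upper())

# The sender enqueues messages and continues immediately (explicit send).
for text in ("order placed", "order paid"):
    mailbox.put(text)
mailbox.put(None)

# The receiver can start later: the queue decouples the two parties in time.
worker = threading.Thread(target=receiver)
worker.start()
worker.join()
print(processed)   # ['ORDER PLACED', 'ORDER PAID']
```

Unlike a method call, the communication is explicit (`put`, `get`) and the sender never sees a return value, which is the impedance mismatch with ordinary program control flow.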


Properties of the asynchronous pattern
• Advantages
  • Loosely coupled
    • Makes interacting parties independent
    • Messages can be processed in flight
  • Easier to implement reliable delivery
  • Additional functionality can be implemented in the messaging system rather than on the communication ends
• Disadvantages
  • Loosely coupled
    • Requires messaging system
    • Overhead
    • Impedance mismatch
  • Communication is explicit (send, receive, forward)
  • Interaction is more complex and involves more elements: more difficult to trace, monitor, and debug


Architectural possibilities of the asynchronous pattern
• One to many
• Many to one
• Workflow
• Persistent messaging
• Message forwarding


Batch transfer
• Batch transfer is a form of asynchronous communication used for:
  • Large amounts of data
  • File based exchanges (ftp)
  • Data collections
  • Batch update jobs
  • Data uploads
• We will not mention it too often during the course, but keep in mind that for certain tasks batch transfer is the best solution and that it complements the other two:
  • Synchronous = parameters
  • Asynchronous = messages
  • Batch = files, collections, …

Additional architectural modules


Name and directory service
• Most basic extension to the synchronous interaction pattern:
  • Avoid having to name the destination
  • Ask where the destination is
  • Then bind to the destination
• Advantages:
  • Development is independent of deployment properties (e.g., network address)
  • More flexibility:
    • Change of address
  • Can be combined with:
    • Load balancing
    • Monitoring
    • Routing
    • Advanced service search

Name and directory service
[Figure: interaction steps]
1. register
2. lookup
3. address
4. request
5. response
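The five steps can be sketched with an in-process directory. Everything here is an assumption made for illustration: the `Directory` class, the service name `"greeter"`, and the use of Python callables in place of network addresses.

```python
from typing import Callable, Dict

class Directory:
    """Hypothetical in-process name service mapping names to callables."""

    def __init__(self) -> None:
        self._entries: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, endpoint: Callable[[str], str]) -> None:
        self._entries[name] = endpoint          # step 1: register

    def lookup(self, name: str) -> Callable[[str], str]:
        return self._entries[name]              # steps 2-3: lookup, address

def greet_v1(who: str) -> str:
    return f"hello {who}"

def greet_v2(who: str) -> str:
    return f"welcome {who}"

directory = Directory()
directory.register("greeter", greet_v1)
service = directory.lookup("greeter")           # the client binds by name only
first = service("alice")                        # steps 4-5: request, response

directory.register("greeter", greet_v2)         # "change of address"
second = directory.lookup("greeter")("alice")   # client code is unchanged
```

The client never hard-codes the destination, so redeploying the service only requires a new `register` call.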


Persistence
• Persistence is used in all patterns to ensure reliability and recoverability
• Persistence keeps a record on stable storage of the relevant state changes of a system
• Can be implemented:
  • On the file system
  • On databases
• Persistence does not change the interaction or the nature of the architecture but it does confer properties that are important for fault tolerance
• Persistence is typically expensive but often unavoidable and necessary
[Figure: persistent messaging, persistent objects, logging]
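One way to picture "a record on stable storage of the relevant state changes" is an append-only log that is replayed on restart. The sketch below is a simplification under assumed names (`append_record`, `recover`, a JSON-lines file); real systems add checksums, log truncation, and checkpoints.

```python
import json
import os
import tempfile

def append_record(path: str, record: dict) -> None:
    """Append one state change to the log and force it to stable storage."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()
        os.fsync(log.fileno())       # forcing the write to disk is the expensive part

def recover(path: str) -> dict:
    """Rebuild the in-memory state by replaying the log from the start."""
    state: dict = {}
    with open(path, encoding="utf-8") as log:
        for line in log:
            record = json.loads(line)
            state[record["key"]] = record["value"]
    return state

log_path = os.path.join(tempfile.mkdtemp(), "state.log")
append_record(log_path, {"key": "balance", "value": 100})
append_record(log_path, {"key": "balance", "value": 70})
recovered = recover(log_path)        # as if after a crash and restart
```

The interaction between caller and called party is unchanged; only the recoverability property is added.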


Security
• Security has many aspects:
  • Authentication
  • Authorization
  • Confidentiality
  • …
• Sometimes it involves patterns:
  • Authorization (credentials, log in, certificates)
• Other times it is part of the infrastructure:
  • Cells
  • Domains
  • Controls in the message layer
• In the enterprise, security is very important but does not figure prominently in the architecture, as it is assumed to be built into the interactions (this leads to several problems …)


Transactions
• Transactions establish guarantees on interactions:
  • Atomicity: all or nothing
  • Recoverability: ability to recall what happened and reconstruct a previous state of the system
• Implemented through an additional module:
  • Keeps track of transactions
  • Runs transactional protocols
[Figure: transaction manager, with the following steps]
1. Begin
2. Transactional context
3. request
4. request
5. request
6. commit
7. 2 phase commit (shown twice in the figure)
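The final step, two-phase commit, can be sketched as follows. This is a bare-bones illustration, not a full protocol: the `Participant` class and its fields are hypothetical, and logging, timeouts, and coordinator recovery are all left out.

```python
class Participant:
    """Hypothetical resource taking part in a transaction."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self) -> bool:          # phase 1: vote
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:           # phase 2: apply the decision
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"

def two_phase_commit(participants: list) -> bool:
    """Commit everywhere only if every participant votes yes (atomicity)."""
    votes = [p.prepare() for p in participants]     # phase 1: collect all votes
    decision = all(votes)
    for p in participants:                          # phase 2: broadcast the decision
        if decision:
            p.commit()
        else:
            p.abort()
    return decision

ok = two_phase_commit([Participant("db1"), Participant("db2")])
failed_group = [Participant("db1"), Participant("db2", can_commit=False)]
not_ok = two_phase_commit(failed_group)
```

A single "no" vote aborts every participant, which is the all-or-nothing guarantee.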


Routing and filtering
• Routing directs calls to the most appropriate service. It works for both synchronous and asynchronous patterns.
• Routing can be based on:
  • Performance (load balancing)
  • Availability (what works)
  • Contents (e.g., price value)
  • …
• Filtering is similar to routing but may also involve:
  • Eliminating messages or calls (incorrect data)
  • Modifying messages or calls (to extend the data or adapt it to a new interface)
  • Sorting and prioritizing
[Figure: router]
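A small sketch of a filter followed by a router. The message fields (`price`, `currency`), the threshold of 1000, the default `"CHF"`, and the service names are all invented for the example.

```python
from typing import Optional

def filter_message(message: dict) -> Optional[dict]:
    """Drop messages with incorrect data; extend the valid ones."""
    if message.get("price", -1) < 0:
        return None                                  # eliminate incorrect data
    return {**message, "currency": message.get("currency", "CHF")}  # extend the data

def route(message: dict, load: dict) -> str:
    """Content-based routing first, then load balancing among the rest."""
    if message["price"] >= 1000:                     # contents (e.g., price value)
        return "premium"
    candidates = [name for name in load if name != "premium"]
    return min(candidates, key=lambda name: load[name])   # least-loaded service

load = {"premium": 0, "standard-a": 7, "standard-b": 3}   # pending requests per service
```

For instance, `route({"price": 10}, load)` picks `"standard-b"` because it has the fewest pending requests, while a message priced at 5000 goes to `"premium"` regardless of load.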

Organizing the architectural modules


What is common to all of them?
• All these additional modules have one aspect in common:
  • They involve introducing an additional module layer where the new functionality is available
• Why as a module or additional layer:
  • Optional use
  • Can be added to already existing systems without changing them much
• When all these modules are taken together, homogenized, and included in a single platform, the result is an enterprise middleware tool.


Module proliferation
• Starting from the simplest pattern (synchronous interaction), adding any new functionality implies additional modules:
  • Name and directory service
  • Transactions
  • Security
  • …
• Historically these modules have been added in an increasingly structured manner:
  • Ad-hoc, code level compatibility (e.g., RPC DCE)
  • Model specific, specification level compatibility (e.g., CORBA)
  • Model independent, specification level compatibility (e.g., Web Services)
• The transition from 2 Tier Architectures to 3 Tier Architectures also happened as a result of attempts to organize the additional modules a 2 Tier Architecture needed anyway.


A historical tour of architectural modules

Middleware platforms were traditionally built around one or two key design decisions (transactions = TP Monitors, transactional OO design = Object Monitors, persistence).

Different platforms and products were conceptually similar but incompatible at all levels

Because conceptually they were all very similar, some systems were used because of the overlapping functionality, not because of the key aspects of a system (e.g., CORBA reinvented the wheel in many areas)

[Figure: platform modules: RPC, name services, persistence, security, transactions, runtime engine]


EAI in the 80’s - 90’s
• The proliferation of:
  • Products
  • Functionality
  • Systems
  • Services
• … led (leads) to a wildly heterogeneous mix of platforms, models, interfaces, and technologies
• With the transition from 2-Tier to 3-Tier, the advent of faster networks and eventually the Internet, the need to make it all work together increased significantly
• Hence the need for:
  • Standardization
  • Enterprise Architecture

[Figure: the same module stack (RPC, name services, persistence, security, transactions, runtime engine) repeated once per platform]

Examples of the use of patterns
• Hardware fault tolerance


Hardware fault tolerance
• Enterprise systems require a high degree of reliability and fault tolerance
• This can be achieved through:
  • Hardware (high end machines, RAID systems)
  • Software (architectural patterns)
• We start with hardware patterns to illustrate the basic principles and to show why certain hardware is always needed to guarantee certain levels of fault tolerance


Key concepts
• Modularity
  • Separates functionality in black boxes
  • Modules can be made redundant
• Failfast
  • Clean failure semantics
  • Detects failures and stops (failfast), or
  • Forwards only results from working modules (failvote)
• Recovery
  • Repairing a faulty module after the failure
  • Mean Time To Repair
• We assume memory-less systems, independent failures, and small probabilities.


Failfast patterns
• Pairing or duplexing
  • Two modules
  • Compare outputs
  • If they disagree, stop (failure detected)
• Can be generalized to N modules
  • Works as long as a majority of modules work
  • Output is the output of the majority of modules
  • No majority = failure (stop)
• Triple Module Redundancy (TMR)
  • Using 3 modules
• Recursive failfast patterns
  • Used to detect comparator failures
  • Reduced MTTF
[Figure: modules and comparators]
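The comparator logic of pairing and of N-module voting can be sketched in software. The function names `duplex` and `vote` and the `ModuleFailure` exception are hypothetical; in hardware the comparator is a circuit, not a function.

```python
from collections import Counter

class ModuleFailure(Exception):
    """Raised when the redundant group stops instead of giving an output."""

def duplex(outputs: tuple) -> object:
    """Pairing: compare two outputs and stop if they disagree."""
    first, second = outputs
    if first != second:
        raise ModuleFailure("pair disagrees")
    return first

def vote(outputs: list) -> object:
    """N modules: return the majority output, stop if there is no majority."""
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):        # strict majority required
        raise ModuleFailure("no majority")
    return value
```

With three modules, `vote([4, 4, 9])` returns 4 and masks the faulty module, while `vote([1, 2, 3])` raises `ModuleFailure`, so the group stops rather than emitting a wrong result.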



Simple analysis (failvote) I
• Simple pair
  • MTTF module = 10 years
  • MTTF pair = 10/2 = 5 years
  • Stops as soon as there is no majority working (the important thing is that it stops)
  • For triplex: MTTF = 10/3 + 10/2 ≈ 8.3 years
• Redundant pair
  • With failvote, MTTF does not improve because the system fails as soon as one of the two pairs fails
  • MTTF module = 10 years
  • MTTF pair = 5 years
  • MTTF redundant pair = 10/4 = 2.5 years
• The redundant pair tolerates failures in the connectors and comparators
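These figures follow from adding up the mean waiting time for each successive failure until the majority is lost. A short sketch, assuming independent, memoryless failures and no repair; `failvote_mttf` is a name made up for this example.

```python
def failvote_mttf(n: int, module_mttf: float) -> float:
    """MTTF of n modules that stop once a majority no longer works.

    With k modules still working, the next failure arrives after
    module_mttf / k on average (independent, memoryless failures).
    """
    majority = n // 2 + 1
    return sum(module_mttf / k for k in range(majority, n + 1))

pair = failvote_mttf(2, 10.0)        # 5.0 years
triplex = failvote_mttf(3, 10.0)     # about 8.3 years
redundant_pair = 10.0 / 4            # any of the 4 modules failing stops it: 2.5 years
```

The triplex survives one failure (10/3 years on average) and then behaves like a pair (another 10/2 years), which is where the 8.3 years come from.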




Simple analysis (failvote) II
• Why do this, then?
  • MTTF decreases
  • Significant redundancy needed
• The reasons are:
  • Failfast is important as it provides clean semantics
  • Differences between transient and permanent failures
• These patterns can mask transient failures by simply signaling an invalid output
  • Ratio 1:100 hard/soft errors
  • MTTF simple pair = 500 years
  • MTTF redundant pair = 250 years
  • MTTF pair and spare = 375 years
• Failfast (instead of failvote) will increase the MTTF as the number of modules increases
• Pair and spare
  • MTTF module = 10 years
  • MTTF system = 7.5 years (take the mean time until any of the 4 modules of the redundant pair fails, 2.5 years, then the mean time until any of the remaining two modules fails, 5 years; total 2.5 + 5)
[Figure: pair and spare with OR elements]
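The 7.5-year figure for pair and spare is the sum of two waiting times, which can be written out as a quick check. The helper name `first_failure_mttf` is an assumption of this sketch, and failures are again taken to be independent and memoryless.

```python
def first_failure_mttf(modules: int, module_mttf: float) -> float:
    """Mean time until the first of `modules` independent modules fails."""
    return module_mttf / modules

module_mttf = 10.0                                                # years
until_first_pair_stops = first_failure_mttf(4, module_mttf)       # 2.5 years
until_spare_pair_stops = first_failure_mttf(2, module_mttf)       # 5.0 years
pair_and_spare = until_first_pair_stops + until_spare_pair_stops  # 7.5 years
```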


High availability through recovery
• High availability is achieved when the failfast patterns are combined with recovery of failed modules
• Mean Time To Repair (MTTR) = time between a failure and the module working again
• Probability that one module is unavailable = MTTR/(MTTR+MTTF)
• If MTTF >> MTTR, then MTTR/(MTTR+MTTF) -> MTTR/MTTF
• Probability of failure of a redundant system with n modules: (n/MTTF)*(MTTR/MTTF)^(n-1)
• MTTF for such a system is then (MTTF/n)*(MTTF/MTTR)^(n-1)
• Assume modules with MTTF = 1 year, MTTR = 4 hours:
  • MTTF simple pair = 1095 years
  • MTTF triplex ≈ 1’600’000 years
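Plugging the numbers into the formula reproduces both results. A sketch, assuming a year of 365 days and the approximation MTTF >> MTTR; `redundant_mttf` is a name chosen for the example.

```python
HOURS_PER_YEAR = 365 * 24

def redundant_mttf(n: int, mttf_hours: float, mttr_hours: float) -> float:
    """(MTTF/n) * (MTTF/MTTR)^(n-1), valid when MTTF >> MTTR."""
    return (mttf_hours / n) * (mttf_hours / mttr_hours) ** (n - 1)

mttf, mttr = 1 * HOURS_PER_YEAR, 4                                # 1 year, 4 hours
pair_years = redundant_mttf(2, mttf, mttr) / HOURS_PER_YEAR       # 1095 years
triplex_years = redundant_mttf(3, mttf, mttr) / HOURS_PER_YEAR    # 1,598,700 years
```

The ratio MTTF/MTTR is 8760/4 = 2190, so each additional repairable module multiplies the system MTTF by roughly three orders of magnitude.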

Examples of the use of patterns
• Software fault tolerance


Notation (IBM patterns)
• We will mostly use a notation proposed by IBM to describe patterns.
• Types of patterns:
  • Business patterns: describe the interaction at a high level
  • Integration patterns: describe the way systems can be connected
  • Composite patterns: combination of business and integration patterns
  • Application patterns: logical components that make up a solution and the way they interact
  • Runtime patterns: refinement of the application pattern mapping logical components to physical run-time nodes
  • Product mappings: map the runtime and application patterns into concrete products implementing the necessary functionality
From IBM Patterns for e-business


Making a simple system highly available
• Assume a simple interaction:
  • User outside firewall (e.g., browser over internet)
  • Presentation and Application are “local”
  • Synchronous interaction
  • IBM = stand alone single channel application pattern
• The question is how to make it highly available
  • We will do this by progressively introducing patterns and layers, each one conferring a new property on the system

Figures from “Patterns for the edge of network”. Voegeli & Braswell - IBM Redbook, Nov. 2002


Rules for high availability
• Rules are similar to the ones described for hardware fault tolerance
• Redundancy
  • There must be a replacement for every module that can fail
  • This implies modularity (as in hardware)
• Monitoring for failure
  • Detecting that a failure has occurred
  • This implies some sort of comparator (as in hardware)
  • Failures are also software (exceptions, error codes)
• Suppressing failed entities
  • Once a module is determined to be faulty, it should be removed
  • It implies an awareness of all member modules and their status
  • Unlike in hardware, membership can be dynamic


Basic auxiliary modules
• There are several options to group modules so that they provide redundancy:
  • High availability pair
    • Primary-backup: one module does the work, the other is in stand-by in case of failures
    • Peer pairs: both modules work in parallel and monitor each other
  • Cluster: several modules running on a set of parallel entities (processes, machines), typically no cross monitoring and not aware of each other
  • Pool: a special type of cluster where the modules are threads residing in a single machine
  • Load balancer: module that is aware of all modules in a cluster or pool and is in charge of:
    • Monitoring
    • Distributing jobs
    • Suppressing failed modules
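The three duties of a load balancer (monitoring, distributing jobs, suppressing failed modules) can be sketched with a round-robin dispatcher. The `LoadBalancer` class, the server names, and the use of `ConnectionError` as the failure signal are assumptions made for this illustration.

```python
from collections import deque

class LoadBalancer:
    """Hypothetical round-robin balancer over callables standing in for servers."""

    def __init__(self, servers: dict) -> None:
        self.servers = dict(servers)       # membership: name -> handler
        self.order = deque(self.servers)   # round-robin order

    def dispatch(self, request: str) -> str:
        while self.order:
            name = self.order[0]
            self.order.rotate(-1)          # the next request goes to the next server
            try:
                return self.servers[name](request)   # distributing jobs
            except ConnectionError:                  # monitoring for failure
                self.order.remove(name)              # suppressing the failed module
                del self.servers[name]
        raise RuntimeError("no working server left")

def healthy(request: str) -> str:
    return f"done: {request}"

def broken(request: str) -> str:
    raise ConnectionError("server down")

balancer = LoadBalancer({"app1": broken, "app2": healthy})
answers = [balancer.dispatch(f"job{i}") for i in range(3)]
```

After the first request, `app1` has been removed from the membership and all later jobs go straight to `app2`.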


Basic pattern (no high availability)
• Outside world = Internet
• DMZ = “demilitarized zone”: internal to the company but not trusted (no confidential material reachable from the outside)


Option 1: single load balancer
• The single load balancer distributes requests to two application servers
• The application servers implement the presentation and application layers for the application
• Provides redundancy for the application server
• Scalability is achieved by adding more servers to the cluster
• No redundancy for the load balancer


Option 2: hot standby load balancer
• To improve over option 1, one can introduce a hot standby backup for the load balancer
• This is a primary/backup pair where the second load balancer is not active but ready to take over in case of failure (failure detection by heartbeat exchanged between the load balancers)
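Heartbeat-based failure detection can be sketched as a timeout check on the standby side. The `StandbyMonitor` class and the 3-second timeout are hypothetical, and time is passed in explicitly so the example stays deterministic.

```python
class StandbyMonitor:
    """Hypothetical backup load balancer watching the primary's heartbeat."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout            # seconds of silence tolerated
        self.last_heartbeat = 0.0
        self.active = False               # the standby does not serve requests

    def heartbeat(self, now: float) -> None:
        """Record that the primary was alive at time `now`."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Take over once the primary has been silent for longer than timeout."""
        if not self.active and now - self.last_heartbeat > self.timeout:
            self.active = True
        return self.active

standby = StandbyMonitor(timeout=3.0)
standby.heartbeat(now=10.0)
before = standby.check(now=12.0)   # 2 s of silence: primary still considered alive
after = standby.check(now=14.5)    # 4.5 s of silence: the standby takes over
```

The timeout trades detection speed against false takeovers: a short timeout reacts faster but may declare a slow primary dead.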


Option 3: mutual high availability
• Two load balancers monitoring each other (heartbeat)
• Each one with its own cluster
• Take over (aliasing of IP address) if one load balancer fails
• The system is now highly available and scalable but it is also more complex


Option 4: wide area load balancing
• Like Option 2 but with load balancers being able to forward load to remote load balancers
• It balances entire sites rather than modules
• Provides high availability for site failures

Examples of the use of patterns
• Performance patterns


Performance
• For our purposes here, the performance of a system can be improved by:
  • Adding resources: add more modules so that more requests can be processed in parallel (redundancy)
  • Lowering the workload: organize the architecture so that certain operations take less work to complete (caching, specialization of modules)
  • Splitting the workload and parallelizing: divide the tasks into sub-tasks and organize the architecture so that some of these sub-tasks can be executed in parallel
• The important aspect of these patterns is what they make possible in terms of scalability, as migration from one to the other is not easy (or not possible at all without redesigning everything). Once the architecture is fixed, so are many of the properties of the system.


Starting pattern
• Simple web server application
• Two application servers, initially tying together web server and application server functionality
• What are the architectural variations that will increase performance?


Specialization
• Separate the application server from the web server:
  • Web server redirector: static HTML content, request forwarding
  • Application server: application specific functionality
  • Calls to static content take less time, create less work
• Security is increased by moving the application servers behind the domain firewall
• Makes it easier to add resources where the bottleneck is (static content, dynamic content, processing, etc.)


More specialization: Separation
• Separate the presentation layer from the application server layer (in essence, turn the application server into a 2-Tier system)
  • Presentation servers take care of tasks such as page formatting, composing frames, generating HTML, etc.
  • Application servers run the business logic
• The advantages at the application server level are the same as the advantages of the 2-Tier model
• Requires that the presentation layer can be separated from the application layer


Lower workload: caching proxy
• Add a pure caching proxy: a module that only caches data and does not do any processing or load balancing. It stores complete pages and matches them to the request. If there is a match, the page is returned without any further processing.
• Reduced response time for content that can be cached (dynamic or static)
• Eliminates workload from the rest of the system
• Caches need to be maintained to avoid stale data
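A minimal sketch of a pure caching proxy that stores complete pages and expires them after a fixed time to limit stale data. The `CachingProxy` class, the 60-second time-to-live, and the lambda backend are assumptions for illustration; time is passed in explicitly to keep the example deterministic.

```python
class CachingProxy:
    """Hypothetical page cache in front of a backend callable."""

    def __init__(self, backend, ttl: float) -> None:
        self.backend = backend
        self.ttl = ttl                      # seconds before a page counts as stale
        self.pages: dict = {}               # url -> (stored_at, page)
        self.backend_calls = 0

    def get(self, url: str, now: float) -> str:
        entry = self.pages.get(url)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]                 # match: no further processing
        self.backend_calls += 1
        page = self.backend(url)            # miss or stale: ask the real system
        self.pages[url] = (now, page)
        return page

proxy = CachingProxy(backend=lambda url: f"<html>{url}</html>", ttl=60.0)
first = proxy.get("/home", now=0.0)      # miss
second = proxy.get("/home", now=30.0)    # served from the cache
third = proxy.get("/home", now=90.0)     # stale, fetched again
```

Three requests cause only two backend calls here; the expiry time is the simplest form of the cache maintenance the slide mentions.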


Lessons learned
• Some architectural designs can be captured in the form of patterns
• Understanding these patterns is important to understand the properties of each architecture and to be able to make the right design decisions
• There are many patterns and many possible combinations between them
• Important as well is the cost of transitioning from one pattern to another:
  • Adding proxy caches is relatively easy
  • Splitting an application is difficult
