IBM Labs in Haifa Copyright 2000-2003 IBM Corporation Advanced Web Applications Development Technion CS 236606 Spring 2003, Class 2 Eliezer Dekel March 2003


Advanced Web Applications Development

Technion CS 236606 Spring 2003, Class 2

Eliezer Dekel

March 2003


Copyright 2000-2003 IBM Corporation

Material is based on an original by Dr. Alfred Spector & Dr. Jeffrey Eppinger

Updated by Eliezer Dekel


Table of Contents

Module A-1: Introduction
Module A-2: Multi-tier Architectures
Module A-3: Application Taxonomy
Module A-4: Requirements of Web Applications
Module A-5: Techniques for Scaling
Module A-6: Caching and Replication
Module A-7: An Example of Replication: Weighted Voting
Module A-8: Load Balancing
Module A-9: Failure Detection
Module A-10: Achieving Availability with Malleability


Module A-1: Introduction


Complex heterogeneous infrastructures are a reality!

[Figure: infrastructure diagram. Components shown: Internet firewalls, load balancers, caches, DNS server, Web servers, Web application servers, data servers, directory and security services, existing applications and data, business data, storage area network, BPs and external services.]

• Dozens of systems and applications
• Hundreds of components
• Thousands of tuning parameters


One of the Data Centers (500 servers)

[Figure: Canyon Park Data Center, Microsoft.com Network Diagram. Cisco 7000 routers and Catalyst 5000 and 2926 switches (HSRP pairs, ATM and Fast Ethernet links, two ASX-1000 units) connect groups of IIS web servers for hosts such as WWW.MICROSOFT.COM, SEARCH.MICROSOFT.COM, REGISTER.MICROSOFT.COM, SUPPORT.MICROSOFT.COM, MSDN.MICROSOFT.COM, DOWNLOAD.MICROSOFT.COM and FTP.MICROSOFT.COM, together with live, backup, consolidator and miscellaneous SQL servers, build servers, stagers, PPTP / terminal servers and monitoring servers.]

Microsoft.com Server Count

Server type           Count
FTP                       6
Build Servers            32
IIS                     210
Application               2
Exchange                 24
Network/Monitoring       12
SQL                     120
Search                    2
NetShow                   3
NNTP                     16
SMTP                      6
Stagers                  26
Total                   459

Drawn by: Matt Groshong
Last Updated: April 12, 2000
IP addresses removed by Jim Gray to protect security


Module A-2: Multi-tier Architectures

Where it All Takes Place


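Recall 2-tier vs. n-tier Architecture

[Figure: two diagrams. 2-tier: a client connected directly to a database and its data. N-tier: a client (browser) connected through intermediate logic tiers (Tier 2 logic, Tier 3 logic) to a database and its data.]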


Why 2-tier?

(Often called "Client-Server", which is a bad name because it is too general)
• Simple
• Better for dynamic queries
• Potentially more efficient (probably not in reality)
• Perhaps more processing off-loaded to client (for better or worse)
• Global data modeling is not practical


Examples of Two-, Three-, and Four-Tiered Infrastructures


Why n-tier?

• Modularity via objects, not enterprise-wide data model
• "Thin" clients since "Fat" clients infeasible
• Security
• Replication of business logic easier
• Flexibility
• Performance (Due to flexibility)
• Manageability
• All data not in one data model
• All data not in one database brand
• Etc.


Even with n-tier, Databases Crucial

Databases need to have all functions required in 2-tier and more.
• Data model support
• Concurrency Control
• Security
• Integrity
• Performance
• Manageability
• Support for heterogeneity


Databases in a Heterogeneous World

• There needs to be semantic consistency while using multiple databases
    • Atomicity
    • Consistency
    • Isolation
    • Durability
• Transactions will be covered later
• It is desirable that there be interoperability of applications with multiple databases
    • Same API to access multiple databases
    • And, ability to access multiple databases
    • Hence, motivation for JDBC and ODBC
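The "same API" idea can be illustrated with a minimal sketch. It uses Python's DB-API 2.0 (PEP 249), which plays a role analogous to JDBC and ODBC; this is a swapped-in analogue, not the JDBC or ODBC interface itself. Two in-memory SQLite databases stand in for two different database products, and the `orders` table and the helper names are hypothetical.

```python
import sqlite3

def count_rows(connection, table):
    """Count rows using only calls defined by the Python DB-API 2.0 (PEP 249)."""
    # Table names cannot be bound as parameters, so this sketch only accepts
    # a plain identifier; real code would check against a known list.
    if not table.isidentifier():
        raise ValueError("unexpected table name")
    cursor = connection.cursor()
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    (count,) = cursor.fetchone()
    cursor.close()
    return count

def make_demo_database(rows):
    """Create an independent in-memory database with a hypothetical 'orders' table."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
    connection.executemany("INSERT INTO orders (item) VALUES (?)", [(r,) for r in rows])
    connection.commit()
    return connection

# The application code (count_rows) is identical for both databases.
inventory_db = make_demo_database(["book", "dvd"])
billing_db = make_demo_database(["invoice"])
counts = [count_rows(db, "orders") for db in (inventory_db, billing_db)]
print(counts)
```

With drivers for other database products, `count_rows` would stay the same as long as the driver follows the DB-API; only the connection setup would change. Parameter placeholder styles can differ between drivers, so that part is not fully portable.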


Module A-3: Application Taxonomy

Characterizing Web Applications


Application Taxonomy

• Applications are typically made up of many interactions with a client
• How the application must be built depends on the type of interactions that comprise it
• This seems trivial, but it is where all architecture starts
• All interactions are, to varying degrees, asynchronous or synchronous
• Influencing all interactions are requirements for concurrency, throughput, latency, ...
• Interactions are sometimes called "transactions," though no specific semantic properties are applied to the word transaction when used in this way.


Workload Characteristics

Application Functionality
• Types of Interaction - Inquiry (Static and Dynamic) vs. Transactions
• Volume of Transactions
• Volume of User-Specific Responses (Personalization)
• Amount of Cross-Session Info
• Transaction Complexity
• Data Volatility
• Integration with legacy systems

Usage Patterns
• Number of Unique Items
• Number of Page Views
• Volume of Dynamic Searches
• Transaction Volume Swings

Infrastructure Constraints
• % Secure Pages (privacy)
• Security: Authentication, Integrity, Non-repudiation, Regulations


Types of Web Applications

Publish and Subscribe
• Web Portals such as yahoo.com, excite.com
• Media Sites such as www.nfc.co.il, zdnet.com
• Events such as www.usopen.org, www.wimbeldon.org

Shopping
• Exact Inventory Sites - Victoriassecret.com, Abercrombie.com
• Inexact Inventory Sites - buy.com, dvdexpress.com

Customer Self Service
• Home banking - bankone.com, wingspanbank.com
• Travel Sites - Travelocity
• Insurance - amica.com

Trading
• Online Brokerages - schwab.com, fidelity.com, etrade.com
• Auction Sites - ebay.com, priceline.com
• Games - Interactive group game servers


Workload Characteristics of Web Applications

[Table: rates each system workload characteristic as Low, Medium, or High for four application types: Publish & Subscribe, Shopping, Customer Self Service, and Trading. Characteristics rated:]
• Transaction Volumes
• Dynamic Content
• Dynamic Searches
• User Specific Responses (Personalization)
• Cross-session Information
• Legacy Integration
• Data Volatility
• Transaction Volume Swings
• Number of Content Publishers/Sources
• Number of Unique Items per Page
• Page Content Volatility
• Number of Page Views
• Security, Authentication etc.
• Percentage of Secure Pages
• Transaction Complexity


Application Taxonomy: Read Transactions

Read-only transactions
• Highly static: X-Ray, Corporate Information, Entertainment Video, 1990 Census
• Nearly static: Train Schedule, Catalog without quantities
• Dynamic: Weather Forecast, Catalog with quantities
• Dynamic with high consistency requirements: Account balance, Catalog with quantities
• Dynamic data with high consistency and rapid update: rock concert sales with assigned seating


Application Taxonomy: Update Transactions

• Update w/ modest integrity: Amazon book comment
• Update w/ high integrity: Billing record
• Update w/ asynchronous processing: Stock Trade
• Update w/ loosely coupled processing: Buying a physical product over the net, or ordering/provisioning a new ISDN line


Issues

• Where an application falls along the read-only and update dimensions greatly impacts:
    • How applications are architected
    • What system support is needed
• For each of the previous examples, it is worth considering the implications


Module A-4: Requirements of Web Applications


Requirements - Summary

• Availability
• Scalability
• Security
• Performance
• Integrity
• Manageability
• Malleability/Longevity
• Integration
• Cost


Availability

• Defined as a measurement of uptime as perceived by a user
• There are 86,400 seconds in a day (~100,000) and 31,536,000 seconds in a year (~30 million)
• 99% uptime represents 1% downtime, which is:
    • 864 seconds/day or 14.4 minutes/day
    • 315,360 seconds/year or 5256 minutes/year or 88 hours/year

Percentage Uptime        Downtime
99.99%                   53 minutes/year (or 0.14 minutes/day)
99.999%                  5 minutes/year
99.9999%                 30 seconds/year
99.99999% (7 nines)      3 seconds/year
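The downtime figures above follow from simple arithmetic. A minimal sketch, assuming a 365-day year as the slide does; the function name is illustrative:

```python
SECONDS_PER_DAY = 24 * 60 * 60            # 86,400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # 31,536,000

def downtime_seconds(uptime_percent, period_seconds):
    """Return the downtime allowed in a period at the given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_seconds * (100 - uptime_percent) / 100

# Print the yearly downtime for each availability level on the slide.
for uptime in (99.0, 99.99, 99.999, 99.9999, 99.99999):
    per_year = downtime_seconds(uptime, SECONDS_PER_YEAR)
    print(f"{uptime}% uptime: {per_year:,.1f} s/year ({per_year / 60:,.2f} min/year)")
```

Each additional "nine" divides the allowed downtime by ten, which is why the table drops from minutes to seconds per year so quickly.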


Availability - Discussion

What do you see on the web? Why? What will be required in the future?


In the News

Source: Gartner Group


Downtime Costs (per Hour)

Operation                        Cost per hour
Brokerage operations             $6,450,000
Credit card authorization        $2,600,000
Ebay (1 outage 22 hours)         $225,000
Amazon.com                       $180,000
Package shipping services        $150,000
Home shopping channel            $113,000
Catalog sales center             $90,000
Airline reservation center       $89,000
Cellular service activation      $41,000
On-line network fees             $25,000
ATM service fees                 $14,000

Sources: InternetWeek 4/3/2000 + Fibre Channel: A Comprehensive Introduction, R. Kembel 2000, p.8. "...based on a survey done by Contingency Planning Research."
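To connect these per-hour figures with availability levels, the following sketch estimates a yearly downtime cost. It assumes cost accrues linearly with downtime hours, which is a simplification and not a claim from the sources; the function name is illustrative and only a subset of the rows is used.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760

# Per-hour downtime costs quoted on the slide (a subset).
COST_PER_HOUR = {
    "Brokerage operations": 6_450_000,
    "Credit card authorization": 2_600_000,
    "Amazon.com": 180_000,
    "ATM service fees": 14_000,
}

def yearly_downtime_cost(cost_per_hour, uptime_percent):
    """Estimate yearly cost, assuming cost grows linearly with downtime hours."""
    downtime_hours = HOURS_PER_YEAR * (100 - uptime_percent) / 100
    return cost_per_hour * downtime_hours

for name, cost in COST_PER_HOUR.items():
    print(f"{name}: ${yearly_downtime_cost(cost, 99.0):,.0f}/year at 99% uptime")

# The single 22-hour outage quoted for Ebay, at its per-hour figure.
ebay_outage_cost = 22 * 225_000
print(f"Ebay outage: ${ebay_outage_cost:,}")
```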


September 11, 2001

Only 15% of the companies in the World Trade Center had a working business continuity plan

One law firm did not have a backup outside of the building; it went out of business

One of the trading firms was able to successfully, immediately transition over to a backup site across the river with absolutely no interruption to their customers

An investment bank had only a tape backup. It took them four days to recover


Scalability

The capability of a system to adapt readily to a greater or lesser intensity of use, volume, or demand while still meeting its business objectives (acceptable levels of performance, availability, manageability etc.)

[Figure: resource utilization (vertical axis) plotted against load (horizontal axis), with four curves:]
• Ideal - Gracefully degrade as load increases. Seldom happens
• Good situation - Utilization increases linearly with load
• Typical - Utilization increases faster than the load
• Bad situation - Think it's OK until load increases. Poor design


Security

• Privacy
• Authentication
• Authorization
• Audit
• Non-repudiation


Performance

Top-level metrics
• Latency: How long does it take to get a response to a request from the system?
• Throughput: How many transactions can be completed in a unit of time (Capacity)?

Subsidiary metrics
• CPU
• Network Bandwidth
• I/O of various types
• ...
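The two top-level metrics can be computed from simple timing data. A minimal sketch, assuming requests are issued one at a time from a single client; `measure` and `summarize` are illustrative names, and the handler is a stand-in for a real request.

```python
import time

def measure(handler, requests):
    """Call handler once per request; return (latencies in seconds, total wall time)."""
    latencies = []
    start = time.perf_counter()
    for request in requests:
        t0 = time.perf_counter()
        handler(request)
        latencies.append(time.perf_counter() - t0)
    return latencies, time.perf_counter() - start

def summarize(latencies, wall_time):
    """Return mean latency (s), worst latency (s), and throughput (requests/s)."""
    if not latencies or wall_time <= 0:
        raise ValueError("need at least one sample and a positive wall time")
    return {
        "mean_latency_s": sum(latencies) / len(latencies),
        "max_latency_s": max(latencies),
        "throughput_per_s": len(latencies) / wall_time,
    }

# Usage with a trivial stand-in handler.
latencies, wall_time = measure(lambda request: sum(range(1000)), range(100))
print(summarize(latencies, wall_time))
```

Latency is a per-request figure, while throughput is a property of the whole run; with concurrent clients the two no longer move together, which this single-client sketch does not show.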


Integrity

• Data correctness
• Data permanence
• Disaster recovery
• Data currency


Manageability

Consider the number of elements in a web application
• Consistency
• Security
• Modifications
• Performance
• Configuration
• Training level required of operators


Malleability/Longevity

• Continuous availability (despite update and failure)
• Time period of use of program


Integration

• Note: millions of person-years are spent every year on applications
• This represents a total multi-trillion dollar investment
• Hence, integration is a necessity
• Integration approaches
    • Application to application
    • Data sharing by multiple applications
    • Process (Complex application integration)
• For some applications, integration cost is 7x the cost of the system, yet this is less than recreating existing applications or losing the benefits of integrated systems


Cost

• Initial implementation
• Modification
• Installation
• Management (management cost is greater than development cost, usually at least double)


Total Cost of Ownership

Cost component     Share   Includes
Backup Restore     30%     devices, media, and people time
Downtime           20%
Purchase           20%
Environmental      14%     floor space, power, air conditioning
Administration     13%     all people time
HW management       3%


Cause of System Crashes

Cause                                     1985   1993   2001 (est.)
Hardware failure                          20%    10%    5%
Operating System failure                  50%    18%    5%
System management: actions + N/problem    15%    53%    69%
Other: app, power, network failure        15%    18%    21%

Current State of the Art

• Failures due to people up, hard to measure
• VAX crashes '85, '93 [Murp95]; extrap. to '01
• HW/OS 70% in '85 to 28% in '93. In '01, 10%?
• How get administrator to admit mistake? (Heisenberg?)

(based on the lecture “Recovery Oriented Computing” by Dave Patterson, Berkeley)


Module A-5: Techniques for Scaling

Techniques for achieving the requirements


Motivation

• Defined: Data is stored without overlap across multiple sites and each site processes its data the same way
• This is the architecture of the web (order of magnitude circa 10^12 hits/day)
• Back-of-the-envelope thought exercise:
    • Assume a server can handle an average number of hits ranging from 10^1/sec to 10^4/sec
    • Then, there must be 10^3 to 10^6 web sites to meet load…
• Examples (data partitioning, segmented workload):
    • 1999 data on one site, 1998 on another…
    • a's on one site, b's on another…
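The back-of-the-envelope exercise can be checked in a few lines. This sketch assumes the load is spread evenly over the day and over servers, which real traffic is not; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400

def servers_needed(hits_per_day, hits_per_server_per_sec):
    """Servers required if load were spread evenly over the day and over servers."""
    hits_per_sec = hits_per_day / SECONDS_PER_DAY
    return hits_per_sec / hits_per_server_per_sec

web_hits_per_day = 10**12
slow = servers_needed(web_hits_per_day, 10**1)  # servers handling 10 hits/sec
fast = servers_needed(web_hits_per_day, 10**4)  # servers handling 10,000 hits/sec
print(f"{fast:,.0f} to {slow:,.0f} servers")
```

The result is on the order of 10^3 to 10^6, matching the range on the slide.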


Some typical Web site loads over a 24-hour period


Example Response Time Budget

Component                       Share of response time
Client Request                   5%
Request Network Latency          5%
Server Time                     55%
Response Network Latency        20%
Client Response Processing      15%
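A budget like this can be turned into per-component time targets. A minimal sketch using the percentages above; the 2-second end-to-end target is a hypothetical value chosen for illustration.

```python
BUDGET_PERCENT = {
    "Client Request": 5,
    "Request Network Latency": 5,
    "Server Time": 55,
    "Response Network Latency": 20,
    "Client Response Processing": 15,
}

def allocate_budget(total_ms, budget_percent=BUDGET_PERCENT):
    """Split an end-to-end response-time target across the budget components."""
    if sum(budget_percent.values()) != 100:
        raise ValueError("budget percentages must sum to 100")
    return {name: total_ms * pct / 100 for name, pct in budget_percent.items()}

allocation = allocate_budget(2000)  # hypothetical 2-second end-to-end target
for name, ms in allocation.items():
    print(f"{name}: {ms:.0f} ms")
```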


How Latency Varies Based on Workload Pattern and Tier


Achieving the Requirements

• Faster Machines (Vertical Growth)
• Replicated Machines (Horizontal Growth)
• Specialized Machines
• Segmented Workloads
• Request Batching
• User Data Aggregation
• Connection Management and Caching

It is important to note that a detailed understanding of the application is key to the successful implementation


Faster Machines - Vertical Growth

Scalability can be achieved through the use of faster machines. This technique can include:
• moving to hardware that is bigger than the current environment. For example: moving a web server from a PC-based server running NT to a UNIX-based server
• using machines with more CPUs to leverage the operating system's multitasking and multiprocessing capabilities
• using machines that leverage other computing paradigms such as parallel computing
• using better software that is optimized for the CPU
• using faster hardware components such as memory, cache, disk and I/O devices etc.


Replicated Machines - Clusters

Adding more machines of the same type and load balancing requests across these machines. In order to implement this technique we have to implement additional components in the architecture such as:
• A dispatcher node that can monitor and load balance processing requests across the replicated machines
• A synchronization node that synchronizes the content and data across the machines
• A mechanism for managing sessions across replicated machines
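A minimal sketch of the dispatcher role only, assuming round-robin selection and a health flag set by some external monitor. The class, method, and server names are hypothetical; content synchronization and session management are not shown.

```python
import itertools

class Dispatcher:
    """Round-robin dispatcher that skips servers marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = {server: True for server in self.servers}
        self._cycle = itertools.cycle(self.servers)

    def mark(self, server, is_healthy):
        """Record the result of a (hypothetical) health check."""
        self.healthy[server] = is_healthy

    def next_server(self):
        """Return the next healthy server, or raise if none is available."""
        # Look at each server at most once per call.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if self.healthy[server]:
                return server
        raise RuntimeError("no healthy servers available")

dispatcher = Dispatcher(["web1", "web2", "web3"])
dispatcher.mark("web2", False)  # the monitor reports web2 as down
picks = [dispatcher.next_server() for _ in range(4)]
print(picks)
```

Round-robin is only one possible policy; a dispatcher could instead weight servers by capacity or by current load.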


Specialized Machines

Individual components of the architecture can be scaled by using specialized machines that perform a certain function much faster. This technique is typically used in architectures to facilitate:
• Intelligent routing of traffic and data across replicated machines
• Dynamic caching, used extensively by event sites and other media sites to speed up access to frequently accessed content
• Security and encryption, used by high volume sites to speed up the SSL encryption and decryption


Segmented Workload

This technique is typically used in conjunction with replicated machines. It involves partitioning the workload of an application to achieve optimum performance. There are several ways of implementing it:
  URL references, the simplest form of segmenting the workload: analyze the URL and direct each request to the appropriate server
  Functional partitioning, which looks at the application and builds the partitioning of the workload in through custom programming
  Data partitioning, which places segments of the data on different machines

[Figure: workload segmented across Function 1, Function 2, and Function 3]


Request Batching

Multi-tier communication places a large computational load on both the client tier (requester) and the server tier. It also introduces considerable latency. Furthermore, the overhead cost is about the same for virtually every cross-tier request, so it is much better to make fewer but larger requests.

The goal of this technique is to reduce the number of requests that are sent between requesters and responders (such as between tiers or processes) by allowing the requester to define new requests that combine multiple requests.
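
The effect of batching can be sketched with a toy server that counts round trips. This is a minimal illustration, not part of the original slides: the `Server` class, its `get` and `get_many` methods, and the sample data are hypothetical stand-ins for a remote tier.

```python
class Server:
    """Hypothetical stand-in for a remote tier; counts the round trips it receives."""

    def __init__(self):
        self.round_trips = 0
        self._data = {"name": "Ada", "balance": 120, "status": "gold"}

    def get(self, key):
        # One cross-tier request per field.
        self.round_trips += 1
        return self._data[key]

    def get_many(self, keys):
        # One combined request that carries several lookups.
        self.round_trips += 1
        return {k: self._data[k] for k in keys}


def fetch_unbatched(server, keys):
    return {k: server.get(k) for k in keys}


def fetch_batched(server, keys):
    return server.get_many(keys)
```

Both functions return the same data; the batched version pays the per-request overhead once instead of once per field.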

[Figure: request batching between Client and Server tiers; labels: Client, Server, Command]


User Data Aggregation

This technique aggregates the most commonly accessed data from multiple backend systems to speed up the overall performance of the architecture. It is typically implemented using:
  Custom programming
  Intelligent middleware
  Data replication

[Figure: user data aggregation; labels: Client, Server]


Connection Management

This technique aims to achieve scalability by reducing the most expensive operations within an application's workflows. This includes connections to legacy systems, databases and other servers

[Figure: connection management flow. Labels: Incoming Request, Web Application Server, Servlet/App, Connection Manager, Connection Pool, Resource, Client; arrows numbered 1-7 and A-B, as listed in the steps that follow]

1. WAS passes a user request to a Servlet/App.
2. The Servlet requests a connection from the Connection Manager (CM).
3. The CM gets a connection from the pool and gives it to the Servlet/App.
4. The Servlet uses the connection to access the resource.
5. The resource returns data.
6. The Servlet returns the connection to the CM, and the connection is returned to the pool.
7. The Servlet/App sends the response back.

If a connection is not available:
A. The CM requests a new connection.
B. The CM adds the connection to the pool.
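
The numbered steps above can be sketched as a small pool. This is an illustration under stated assumptions, not the WebSphere connection manager: `Connection`, `ConnectionManager`, `handle_request`, and the `max_size` limit are hypothetical names, and a newly opened connection joins the pool when it is first released rather than at creation time.

```python
import queue
import threading


class Connection:
    """Hypothetical stand-in for an expensive connection to a back-end resource."""

    created = 0  # how many connections were ever opened

    def __init__(self):
        Connection.created += 1
        self.id = Connection.created

    def query(self, text):
        # Steps 4-5: use the connection; the resource returns data.
        return f"{text} (via connection {self.id})"


class ConnectionManager:
    def __init__(self, max_size=4):
        self._idle = queue.LifoQueue()  # the pool of idle connections
        self._lock = threading.Lock()
        self._max_size = max_size       # assumed cap on open connections
        self._opened = 0

    def acquire(self):
        # Step 3: hand out a pooled connection when one is idle.
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            pass
        with self._lock:
            if self._opened < self._max_size:
                # Steps A-B: open a new connection; it enters the pool on release.
                self._opened += 1
                return Connection()
        # Pool exhausted: wait until another request returns a connection.
        return self._idle.get()

    def release(self, conn):
        # Step 6: the connection goes back to the pool.
        self._idle.put(conn)


def handle_request(manager, text):
    conn = manager.acquire()      # step 2
    try:
        return conn.query(text)   # steps 4-5
    finally:
        manager.release(conn)     # step 6
```

Two sequential requests through the same manager reuse one connection, which is where the saving comes from: the expensive open happens once, not once per request.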


Caching

Defined: Storage of and reference to data in a location that can be accessed faster and/or with higher aggregate bandwidth
Done at every level of a system
  Processor/memory
  Computer/disk
  Browser
  Web
Simplest when only one, infrequent writer of the data
Issues:
  Write-through caches
  Cache invalidation


Caching (continued)

More complex when there are multiple writers and/or higher-frequency updates
  There is the distributed cache consistency problem
  This happens in:
    Computer architecture
    Multi-computer architectures
    Distributed systems of all types, including the web
Examples:
  Browser cache
  DNS
  Mirror sites
  Etc.


Techniques Applied to Web Tiers


Dimensions of the Scaling Techniques

Scaling Technique | Increase Power | Improve Efficiency | Shift / Reduce Load

Faster Machine X

Replicate Machines X

Specialized Machines X X

Segmented Workload X X

Request Batching X

User Data Aggregation X

Connection Management X

Caching X X


Caching and Replication Module A-6

The Technology Behind the Techniques


Cache Consistency Techniques

Fuzzy
  Use time-out and hope for the best
  Setting time-out is very tricky and error-prone
Consistent caching
  Use distributed cache consistency algorithms
  There are trade-offs between availability and consistency
  Algorithms are very tricky but can be gotten right
  Typical approach is the concept of token management
    Read token
    Write token
    Usually more tokens required to make things really work
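
The "fuzzy" approach can be sketched as a cache whose entries expire after a fixed time-out. This is a minimal sketch; `TimeoutCache`, its `fetch` callback, and the injectable `clock` are hypothetical names introduced for illustration.

```python
import time


class TimeoutCache:
    """Serve cached values until they are older than ttl_seconds, then refetch."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch      # reads the real backing data
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the behavior can be tested
        self._entries = {}       # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]      # possibly stale: nobody checked the backing data
        value = self._fetch(key)
        self._entries[key] = (value, now + self._ttl)
        return value
```

Between a write to the backing data and the expiry of the entry, readers see the old value. That window is exactly what makes choosing the time-out tricky.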


Replication

Definition: Explicit creation, maintenance, and access of multiple copies of some resource
  Processors
  Bandwidth
  Data
  Etc.
Why replicate?
  Throughput
  Bandwidth
  Availability
  Integrity


Replication vs. Data Partitioning

Replication
  Same or overlapping data stored at multiple locations
Partitioning
  Data non-overlapping
  Typically, only one “home” for any data element


Replication vs. Caching

Difference between caching and replication
  Caching: there is a fundamental difference between a cached copy and the real “backing” data. Loss of the cache is not a failure except from the perspective of performance
  Replication: all replicas are of the same type, albeit not necessarily identical. Loss of a replica is a failure and could result in higher likelihood of lost data


Semantics of Replication

Consistency/fuzzy replication
  Same issue as in caching, as above
What does consistency mean?
  Ticket Sales (OK to not show all the seats)
  Latest Score in basketball game (Can lag by up to n seconds)
  Weather forecast (Variable lag, depending on severity of change)
  Prices for certain goods (Perhaps they need to be exact, as differentials would cause customer dissatisfaction)


Replication Algorithms Abound

Unanimous Update
  Always update all copies
  Read from any copy
    Excellent read throughput
    Excellent read availability
    Very poor write throughput
    Very poor write availability
Unanimous Read
  Always read all copies
  Update any copy
    Excellent write throughput and availability
    Very poor read throughput and availability
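
The availability claims can be made concrete with a small calculation. This sketch assumes each replica is up independently with the same probability `p`, which the slides do not state; `quorum_availability` is a hypothetical helper name.

```python
from math import comb


def quorum_availability(n, k, p):
    """Probability that at least k of n replicas are up, each up independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
```

With three replicas that are each up 90% of the time, Unanimous Update reads need any 1 of 3 (availability 1 - 0.1^3 = 0.999) while writes need all 3 (0.9^3 = 0.729). Unanimous Read swaps the two numbers.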


Additional Replication Algorithms

Primary Copy
  Must update primary copy
  Primary copy ensures all other copies get updated
  Read from any copy
    Excellent read throughput and availability
    Poor write availability
    Significant complexity in ensuring primary copy updates all other replicas
Voting
  Assume n copies
  Read from any r
  Write to n-r+1


Replication Conclusions

All algorithms quite difficult to implement
But, replication has compelling benefits
  Best long term approach for high data availability
    Software update or data reorganization
    Disaster recovery
  Obvious performance benefits as well, at least for data which is either read or written infrequently. (Often, one of these is true.)
Systems support for replication required if implementation is to be feasible
  Systems Support – Atomic Transactions in particular


An Example of Replication: Weighted Voting Module A-7

This algorithm is due to Dr. David Gifford and was published by the ACM in 1979, …


Replication Algorithms Abound

Unanimous Update
Unanimous Read
Primary Copy
Weighted Voting
  Assume n copies
  Read from r (r ≤ n), the “read set”
  Write to n-r+1, the “write set”
  Concept is that there is overlap between read and write sets, ensuring an up-to-date copy is seen.


Weighted Voting in More Detail

Each replica is assigned a “weight”
Each replica stores a {version #, value} pair
Read algorithm
  Read from r copies
  Choose the value associated with the highest version #
Write algorithm
  Read from r copies to obtain version_number_i
  Update n-r+1 copies using 1 + max(version_number_i)
The invariant is that there are always at least n-r+1 copies of the data that carry the same, highest version number.
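
The read and write algorithms can be sketched in a few lines. This is a simplified illustration, not Gifford's full algorithm: every replica has weight 1, there are no failures or concurrent operations, and the caller picks the read and write sets by index. `Replica`, `read`, `write`, and `all_read_sets_agree` are hypothetical names.

```python
from itertools import combinations


class Replica:
    def __init__(self):
        self.version = 0
        self.value = None


def read(replicas, read_set):
    """Read the replicas indexed by read_set; return the (version, value) with the highest version."""
    newest = max((replicas[i] for i in read_set), key=lambda rep: rep.version)
    return newest.version, newest.value


def write(replicas, read_set, write_set, value):
    """Learn the current version from a read set, then install version + 1 on the write set."""
    version, _ = read(replicas, read_set)
    for i in write_set:
        replicas[i].version = version + 1
        replicas[i].value = value


def all_read_sets_agree(replicas, r):
    """Return the single (version, value) seen by every read set of size r, or None if they differ."""
    seen = {read(replicas, rs) for rs in combinations(range(len(replicas)), r)}
    return seen.pop() if len(seen) == 1 else None
```

With n = 5 and r = 2 the write set has n-r+1 = 4 members, so any two replicas include at least one that received the latest write. That overlap is why every read set returns the same newest value.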


Weighted Voting Example

[Figure: Replicas A, B, and C, each shown with Weight, Value, and Version fields]


Weighted Voting Pragmatics

When the read set is small, read availability and throughput are high
When the write set is small, write availability and throughput are high
Or are they?
  Writes require reads in the version #-based algorithm…
  Solution involves monotonically increasing time stamps:
    Time clocks are typically not used by themselves
    Sequence numbers get passed on every message and continually updated


Problems

What happens if a replica is down?
  Answer: self-healing; replica eventually restored
What happens if there are concurrent updates?
What happens if reads occur during an update?
What happens if there are failures during writes?
  The algorithm fails
  The invariant gets violated
  The algorithm produces inconsistent results


The Solution

The atomic transaction
  Distributed updates and reads are done within the scope of a transaction
  ACID Properties automatically maintained by system
    Atomicity
    Consistency
    Isolation
    Durability
These properties make it possible to maintain invariants on distributed objects; e.g., the replicas


The Atomic Transaction

In a few weeks, we will discuss the concept fully
  Usage
  Implementation (major purpose of the book by Bernstein)
It will play an important role in the course because:
  The Web is a distributed structure
  There need to be invariants maintained across data
  Doing this by hand, if one worries about failure, is very tedious


Load Balancing Module A-8


Load Balancing

Definition: Load Balancing refers to a technique that uses a load balancing algorithm (LBA) to choose a replica

Definition: An LBA is an algorithm (typically distributed) that permits a client to select a replica that meets performance & availability goals

Participants in the algorithm include clients and commonly replicas and other intermediaries

May want priority for certain requests


Load Balancing In Use - Examples

Direct a data read or write to:
  An unloaded replica
  A nearby replica
  A replica that will not charge much for its service
  …
Direct a processing request to:
  A replica that will complete the request with minimum latency
  A node that has been used for similar processing, so its cache is primed
  …


Many Approaches to Load Balancing

Maintain a replicated directory service
  Client can consult an instance of it to gain an address of a replica
  Approaches
    Directory can return set of replicas and client can use algorithm to determine proper replica
    Or, Directory service can apply algorithm and return proper replica
Can use a replicated intermediary that is a forwarding service


Algorithms for Directing Load

Randomization
Round-robin
Dynamic: Based on recent replica performance
Locality-based (recent usage)
Content-based
Geography or Topology-based
Negotiation-based (Request for Proposal -- direction to lowest bidder)


Randomization

Simple
Excellent if
  Locality effects are not important
  Reasonable distribution of requests
    Timing
    Duration
  No need for priority-based execution
  Willingness to accept stochastically good performance


Round-Robin

What is meant by Round-Robin
  Intra-client round robin?
  Inter-client round robin?
Simple
Excellent if
  Locality effects are unimportant (or non-existent)
  Requests have similar duration


Add’l Topics for Randomization & RR

Algorithms should take into account:
  Differential capacity of replicas
  Differential capacity of networks
  Ownership of resources
  Security issues
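
Plain round-robin and randomization, plus variants that respect differing replica capacity, can be sketched as follows. The function names and the integer `capacities` mapping are assumptions made for this illustration.

```python
import itertools
import random


def round_robin(replicas):
    """Cycle through the replicas in order, forever."""
    return itertools.cycle(replicas)


def weighted_round_robin(capacities):
    """capacities maps replica name -> integer capacity; larger replicas appear more often per cycle."""
    expanded = [name for name, cap in capacities.items() for _ in range(cap)]
    return itertools.cycle(expanded)


def weighted_random(capacities, rng=random):
    """Pick one replica at random, with probability proportional to its capacity."""
    names = list(capacities)
    weights = [capacities[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

A replica with capacity 2 receives twice the share of a replica with capacity 1 under either weighted variant: exactly under round-robin, on average under randomization.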


Dynamic Load Balancing

Can track in one or more places:
  Actual performance by replica
  Metrics of replica loading
  Results of probes
That information can be used to determine best replica
Complex
Advantages
  Can provide excellent results in situations where randomized or round-robin load balancing does not
  Can be customized to provide priority, etc.


A Strawman LBA

Assumptions below…
Clients 1..n, DataGatherer, and Replicas A & B
DataGatherer
  Probes replicas every 60 seconds (Time = 0, 60, …)
  Chooses least loaded replica and reports it for 60 secs
Clients
  Issue requests to replicas based upon consulting DataGatherer
  Service time for requests is ~10 secs with low variance


What’s the Result?

A meta-stable system: all load oscillates between Replica A and Replica B
Problem: reported load not tracking actual load
Solutions
  More frequent probes: the probe rate should exceed 1/average(service time), i.e., the interval between probes should be shorter than the average service time
  LBA should be less definitive in nature; e.g., somewhat stochastic
In any case, designing good load balancing algorithms is hard without knowing lots of information about the load
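
The oscillation can be reproduced with a small deterministic simulation of the strawman. The arrival rate of one request per second, the 240-second horizon, and the function name `simulate` are assumptions for this sketch; the 60-second probe and 10-second service time come from the slide.

```python
def simulate(probe_interval, duration=240, service_time=10):
    """One request arrives per second; returns (replica chosen at each probe, peak in-flight load)."""
    in_flight = {"A": [], "B": []}  # finish times of requests being served
    chosen, choices, peak = "A", [], 0
    for t in range(duration):
        for name in in_flight:
            # Requests whose finish time has passed leave the replica.
            in_flight[name] = [f for f in in_flight[name] if f > t]
        if t % probe_interval == 0:
            # DataGatherer probes and reports the least-loaded replica (ties go to A).
            chosen = min(("A", "B"), key=lambda n: len(in_flight[n]))
            choices.append(chosen)
        # Every client follows the last report until the next probe.
        in_flight[chosen].append(t + service_time)
        peak = max(peak, len(in_flight[chosen]))
    return choices, peak
```

With a 60-second probe the reported replica alternates A, B, A, B and the chosen replica carries the whole load while the other sits idle. Probing more often than the service time spreads the same load across both replicas and lowers the peak.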


Locality-based

Premise is that a replica that has serviced a certain type of request recently should do so again
Why?
  Efficiency due to already available resources
    E.g., open files or databases
  Efficiency due to security
    E.g., secure communication sessions
Complexity: how to combine with other techniques, as locality may not be enough


Content-based

As in data partitioning, assume certain types of data can best be handled by certain sites
  Site A stores “aa…az” in random access memory
  Site B does the same for “ba…bz”
  Therefore, “a” requests should generally go to Site A.
This is actually an approach for achieving locality
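
A minimal sketch of the routing rule in the example: the request key's first letter selects the site that already holds that range in memory. The `PARTITIONS` table, the `route` function, and the fallback site are hypothetical.

```python
# First letter of the key -> site that keeps that range in random access memory.
PARTITIONS = {"a": "Site A", "b": "Site B"}


def route(key, fallback="Site A"):
    """Pick the site whose partition covers the key; use the fallback when none matches."""
    return PARTITIONS.get(key[:1].lower(), fallback)
```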


Geography or Topology-based

Based on co-location of client and replica
May be an indicator of
  Higher bandwidth
  Shorter latency
  Increased reliability
  Better security
Domain names are now registered with geographical coordinates


Negotiation-based

Virtual capitalism in action:
  Issue RFP
  Evaluate RFPs
  Ship work as appropriate
Cost of load-balancing overhead must be less than benefit
This approach can get very interesting quickly:
  Contractual commitments and compensation if unmet
  A way to do Pareto optimal scheduling
Useful to implement for real load balancing in business-to-business e-commerce


Role of Caching

Cache results of LBA for performance and availability
The usual problem of cache correctness
  How long until cache refresh
    Time-outs too short -> load balancing algorithm places too much load
    Time-outs too long -> data is insufficiently fresh
  What happens when cache sends you to a failed site
    If cached data is faulty, go back and refetch
This leads to the definition of a Hint
  A cached entry which is right with high probability, but can be and always is checked for validity prior to use
The issue of time-out appears again
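
A hint can be sketched as a cached replica address that is validated before every use and refetched when the check fails. `HintedLocator`, `lookup`, and `is_alive` are hypothetical names; in practice the validity check would be a cheap probe or the request itself.

```python
class HintedLocator:
    """Remember the last replica returned by the LBA, but verify it before each use."""

    def __init__(self, lookup, is_alive):
        self._lookup = lookup      # authoritative but expensive LBA or directory query
        self._is_alive = is_alive  # cheap validity check applied to the hint
        self._hint = None

    def locate(self):
        if self._hint is not None and self._is_alive(self._hint):
            return self._hint      # hint was right; no expensive lookup needed
        self._hint = self._lookup()  # hint missing or wrong: go back and refetch
        return self._hint
```

Because the hint is always checked, a stale entry costs one extra lookup rather than a failed request.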


Example: Load Balancing to HTTP Server

User specifies http://www.xxx.com
Request should actually be handled by one of many HTTP servers to provide higher throughput
One approach
  Can do request re-direction (a type of forwarder)
    See HTTP protocol definition as in assigned reading
  The forwarder is a potential bottleneck


Approach 1 – Round Robin DNS

DNS entries allow 32 server addresses per record.
DNS (name) servers will cycle through the entries, therefore providing round-robin load balancing
Advantages
  Cheap
  Easy


Round Robin DNS - Problems

Addresses of unavailable servers will remain until an administrator removes the entries
  It takes hours or days for the DNS database to replicate
  So, system hands out addresses of down servers for a long time
  Addresses of recently added servers take a while to become visible
All servers treated equally
  Perhaps new servers will be faster than the old ones and could handle more load
  Some servers may handle multiple loads and should get fewer requests


Cisco Local and Distributed Director

See: http://www.cisco.com/warp/public/cc/pd/cxsr/400/tech/scale_wp.htm
Session redirection accomplished by rewriting IP header using a mapping table
Intelligent load balancing to servers within a cluster
  Takes into account status of servers
  Uses only a single DNS entry for entire server complex
    Simplifies administration
    Hot standby feasible
Fancier load balancing of this type
  Routes requests based on topological distance
  Routing decisions can be based on hop counts, network usage, & round-trip latency.


IBM Secureway Network Dispatcher

http://www-4.ibm.com/software/network/dispatcher/about/features/keyfeatures.html
Network dispatcher
  Doesn't modify packets (vs. LocalDirector, which does)
  Only inspects inbound requests (LocalDirector looks at both)
    So, responses go back directly to the requester (greater efficiency)
Background processes check servers to ensure that they are up
  "advisors" support HTTP, SSL, FTP, NNTP, POP3, SMTP, Telnet
  This way requests don't go to down servers.
Balances load across servers of different sizes:
  Servers send CPU, Disk, I/O metrics to dispatcher
Supports hot standby for high availability of dispatcher
Uses a "sticky" port option to route client requests to same server to ensure state preserved across requests: recall locality topic


Failure Detection Module A-9


Failure Detection

Explicit: clear indication that failure has occurred
  Timely
  Semantics clean, … as far as they go
  Voting
Implicit: timeout
  Requester does not receive response after waiting a while
  Unclean: Does not necessarily mean remote system failed
Timeout often used in very many places/levels
  Communication
  Naming, …
  And, ultimately, End-to-end
Some have argued only end-to-end timeouts valuable, but this is incorrect


Timeout In More Depth

Problems with timeouts
  Semantics
  Specification of timeout length
    Particularly difficult when requests take variable amounts of time
    And the requester cannot dynamically set the time-out interval
    Long intervals lead to poor customer satisfaction: imagine an ATM that made you wait 10 minutes before failing and giving you your card back
Therefore, timeouts are used at multiple system levels
  Lower levels have more predictable performance so can trigger timely failures better
  Higher levels are required for ultimate correctness


The Role of the Sequence Number

Sequence number in communication protocol
  Failure
  Duplicate detection
  Flow control
Sequence number in replication algorithms
  As discussed previously
Sequence number in site crash detection
  Sites increment a number post failure
  Therefore possible to tell if site has crashed
  This is important to not miss getting work done on a site
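
Crash detection by sequence number can be sketched as an incarnation counter that a site bumps each time it restarts. `Site`, `submit`, `crash_and_restart`, and `work_was_lost` are hypothetical names; a real site would keep the counter in stable storage so it survives the crash.

```python
class Site:
    """Toy site whose in-memory work is lost on a crash."""

    def __init__(self):
        self.incarnation = 0  # assumed to live in stable storage in a real system
        self.pending = []

    def submit(self, work):
        self.pending.append(work)
        return self.incarnation  # the requester remembers which incarnation accepted the work

    def crash_and_restart(self):
        self.pending.clear()     # volatile state is gone
        self.incarnation += 1    # sites increment a number post failure


def work_was_lost(site, incarnation_at_submit):
    """A changed incarnation number means the site crashed after accepting the work."""
    return site.incarnation != incarnation_at_submit
```

A requester that sees a different incarnation number knows it must resubmit, so work is not silently missed.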


Voting

Discussed wrt: Weighted Voting Algorithm
  Used to determine most up-to-date copies
What if used to detect incorrect data
  N-way computation
Structure
  N-inputs: vote on them and determine most typical input
  N-computations on most typical input
  Vote on result
  N-outputs which go into next stage of computation
  Or go to some device which itself votes
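
One stage of the N-way structure can be sketched as two votes around N redundant computations. Interpreting "most typical" as a strict majority is an assumption of this sketch, and `vote` and `n_way_stage` are hypothetical names.

```python
from collections import Counter


def vote(values):
    """Return the value held by a strict majority; raise if no such value exists."""
    winner, count = Counter(values).most_common(1)[0]
    if count * 2 <= len(values):
        raise ValueError("no majority among the votes")
    return winner


def n_way_stage(inputs, computations):
    """Vote on the inputs, run every computation on the agreed input, then vote on the results."""
    agreed_input = vote(inputs)
    return vote([compute(agreed_input) for compute in computations])
```

A single faulty input or a single faulty computation is outvoted, so the stage still produces the correct output for the next stage.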


Yahoo Denial of Service Attack

Mostly unavailable 10:20AM – 12:00PM PST 2/7/00
Reported cause (NYT, 2/8/00)
  50 computers “flooded” Yahoo site
  1 gigabyte/second or 20 mbytes/computer/second
  “Clogging” Yahoo’s site and routers
  Difficult to trace due to use of hijacked computers
Solutions
  Audit, Filter, Legal System
Typical Yahoo availability: 99.3%, according to Keynote Systems
  Corresponds to being down 61 hours/year
  And, Yahoo is a good site
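
The two figures on this slide follow from simple arithmetic, sketched here. The sketch assumes a 365-day year and 1 gigabyte = 1000 mbytes; the function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability):
    """Expected hours of downtime per year for a given availability fraction."""
    return (1.0 - availability) * HOURS_PER_YEAR


def per_source_rate(total_mbytes_per_sec, sources):
    """Average traffic each attacking computer contributes."""
    return total_mbytes_per_sec / sources
```

An availability of 99.3% leaves 0.7% of 8760 hours, about 61 hours per year, and 1000 mbytes/second spread over 50 computers is 20 mbytes/second each.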


Achieving Availability with Malleability Module A-10


The Goal: Malleability

How do you change the system without taking it down?
  The application
  The operating system
  Perhaps, even a change to the hardware
This has proven very hard


An Approach

Ensure a service is replicated
Stop a copy
Augment its interfaces
Restart it
And repetitively do the same to the other copies
Eventually, all replicas will have the new capabilities
Note: it is very hard to reduce the scope of interfaces. Augmentation is much easier.
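
The stop, augment, restart loop can be sketched as a rolling upgrade that refuses to proceed when too few copies would remain running. Representing a replica as a dictionary with `version` and `running` keys, and the `min_running` threshold, are assumptions of this sketch.

```python
def rolling_upgrade(replicas, new_version, min_running=1):
    """Upgrade replicas one at a time; return the lowest number running at any moment."""
    lowest = sum(r["running"] for r in replicas)
    for rep in replicas:
        running_others = sum(r["running"] for r in replicas if r is not rep)
        if running_others < min_running:
            raise RuntimeError("too few replicas left to keep the service available")
        rep["running"] = False        # stop a copy
        lowest = min(lowest, running_others)
        rep["version"] = new_version  # augment its interfaces
        rep["running"] = True         # restart it
    return lowest
```

With three replicas, two are always running during the upgrade. With a single replica the function refuses to start, which is the reason the approach begins with "ensure a service is replicated".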


An Example

Assume you want to modify the function of a replicated directory while it is online
Assume there are:
  Multiple instances of the replicated directory itself, called CtrReplicaGrps
  Multiple instances of the individual replicas, called CtrReplicas
  As in the weighted voting algorithm discussed earlier


Technique (1)

[Part 1 to be discussed at the end]
Part 2: one by one,
  Stop a CtrReplica (hope a failure doesn’t occur simultaneously)
  Start a new version
  Do for all CtrReplicas
  The CtrReplicaGrps should not mind this gradual change. (They don’t use the new methods… yet…)
  Also, they can tolerate the failure of a CtrReplica

Technique (2)

Part 3: Now, one by one:
    Stop a CtrReplicaGrp
    Start the new version
    Do for all CtrReplicaGrps

Now, there is a new function available.

Finally, do Part 1: test what we have so carefully installed, so we haven’t just (methodically) inserted a bug into the entire, supposedly fault-tolerant, system.
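The ordering of the parts can be sketched in a few lines. Everything here is hypothetical: the `ReplicaV1`/`ReplicaV2` and `GrpV1`/`GrpV2` classes stand in for old and new versions of CtrReplica and CtrReplicaGrp, replication is simplified to write-all/read-one rather than weighted voting, and running the Part 1 test before the rollout is an assumption about what the numbering implies.

```python
class ReplicaV1:
    """Old CtrReplica stand-in."""
    def __init__(self, data=None):
        self.data = dict(data or {})
    def put(self, key, value):
        self.data[key] = value
    def get(self, key):
        return self.data.get(key)


class ReplicaV2(ReplicaV1):
    """New version: every V1 method plus one added method."""
    def get_many(self, keys):
        return [self.get(k) for k in keys]


class GrpV1:
    """Old CtrReplicaGrp stand-in; calls only V1 methods."""
    def __init__(self, replicas):
        self.replicas = replicas           # shared list, updated in place
    def store(self, key, value):
        for r in self.replicas:
            r.put(key, value)
    def lookup(self, key):
        return self.replicas[0].get(key)


class GrpV2(GrpV1):
    """New group version; relies on the method added in ReplicaV2."""
    def lookup_many(self, keys):
        return self.replicas[0].get_many(keys)


def passes_tests(replica_cls):
    """Part 1: exercise the new version in isolation."""
    r = replica_cls()
    r.put("k", "v")
    return r.get("k") == "v" and r.get_many(["k", "missing"]) == ["v", None]


def rollout(replicas, groups, still_works):
    if not passes_tests(ReplicaV2):
        raise RuntimeError("new version failed its tests; nothing was changed")
    for i, old in enumerate(replicas):      # Part 2: replicas first
        replicas[i] = ReplicaV2(old.data)   # new version takes over the old state
        if not still_works():
            raise RuntimeError(f"service check failed after replica {i}")
    for i in range(len(groups)):            # Part 3: then the groups
        groups[i] = GrpV2(replicas)


replicas = [ReplicaV1() for _ in range(3)]
groups = [GrpV1(replicas) for _ in range(2)]
groups[0].store("printer", "host7")
rollout(replicas, groups, lambda: groups[0].lookup("printer") == "host7")
print(groups[1].lookup_many(["printer", "scanner"]))
```

The order matters: a `GrpV2` placed in front of `ReplicaV1` instances would call a method that does not exist yet, which is why the replicas are replaced before the groups.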

Issues

Too many steps for a human being to get right
    So, need automation via a console
May not handle a simultaneous failure during upgrading
    So, more replicas may be needed
Cost of availability: the shape of this curve is right, though the calibration is unknown and undoubtedly flattens as experience grows

[Figure: cost-of-availability curve; only axis ticks from 0 to 70 survive, axis labels not recoverable]
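The remark that more replicas may be needed can be made concrete with a little arithmetic. The sketch below assumes one vote per replica and majority quorums, neither of which the slides state, and the function names are hypothetical. During an upgrade one replica is deliberately down, so tolerating further failures needs a quorum among the rest.

```python
def survives_upgrade(n, quorum, extra_failures):
    """Can a quorum still form with one replica stopped for upgrade
    and `extra_failures` others down at the same time?"""
    return n - 1 - extra_failures >= quorum


def min_replicas(extra_failures):
    """Smallest replica count that keeps a majority quorum during an upgrade."""
    n = 1
    while not survives_upgrade(n, n // 2 + 1, extra_failures):
        n += 1
    return n


print(min_replicas(0))   # upgrade only, no simultaneous failure
print(min_replicas(1))   # upgrade plus one simultaneous failure
```

Under these assumptions three replicas are enough for the upgrade alone, and five are needed to also ride out one simultaneous failure.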

Window of Vulnerability

If transactions are used, there is a potential availability problem during the “Window of Vulnerability”

The only solution is that transaction coordinators must be rather reliable and be guaranteed to recover quickly after a crash
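One way to picture the window, assuming the slide refers to the blocking period of a two-phase commit (the slides do not name the protocol): once a participant has voted yes it can neither commit nor abort on its own, so it waits for the coordinator. The classes below are a hypothetical toy model, not a real transaction manager.

```python
class Participant:
    def __init__(self):
        self.state = "working"

    def prepare(self):
        self.state = "prepared"     # voted yes: outcome now depends on the coordinator
        return True

    def finish(self, decision):
        self.state = decision

    def is_blocked(self):
        return self.state == "prepared"


class Coordinator:
    def __init__(self, participants):
        self.participants = participants
        self.log = None             # stands in for the durable decision record

    def collect_votes(self):
        votes = [p.prepare() for p in self.participants]
        self.log = "committed" if all(votes) else "aborted"

    def announce(self):
        for p in self.participants:
            p.finish(self.log)


participants = [Participant(), Participant()]
coordinator = Coordinator(participants)

coordinator.collect_votes()
# Suppose the coordinator crashes here, before announcing the decision.
print([p.is_blocked() for p in participants])   # every participant is stuck

# After recovery the coordinator replays its log and announces the outcome.
coordinator.announce()
print([p.state for p in participants])
```

In this model the participants stay blocked for exactly as long as the coordinator takes to recover, which is why quick, reliable coordinator recovery is the lever the slide points to.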

Availability

So, considerable thought is required to achieve high availability in malleable systems
    Better when not needed
    However, when high availability is required, every level of the system needs to be studied and addressed

The Architecture As We’re Studying It

[Figure: architecture diagram; layout not recoverable. Labels on the slide:]
    Client, Servlet/JSP, EJB, DB, MS
    Reliable Messaging
    Workflow Management
    Reusable Components
    Security/Directory (X.509, LDAP, Kerberos)
    Java Runtime Environment
    Linux, NT, AIX, Solaris, Sys/390
    Integrated Development Environment
    Modeling and Other Software Engineering Tools
    Systems Management