Systems Engineering for the Internet and the Web
Rob Oshana [email protected] 214-415-9690
My Background
• Defense business experience
• Internet/web experience
• Commercial “shrink wrap” experience
• SMU adjunct (CSE, EETS)
• Consulting with telecom
• E-Commerce certificate
So Why Am I Talking About This?
• I learned a lot about system engineering in the DoD environment
• I saw a need for sound system engineering principles in the internet space (currently chaotic)
• I believe there is an opportunity in the educational space
Introduction
• The Internet has increased the scope and complexity of information technology systems, placing even greater importance on system planning and design
• System engineering for rapid, iterative methodologies of the Internet world
System
• A system can be defined as
– an integrated composite of people, products, and processes that provide a capability to satisfy a need or objective [MIL-STD-499B]
– a collection of components organized to accomplish a specific function or set of functions
– an interacting combination of elements, viewed in relation to function [INCOSE 95]
System
• A system may be a product that is hardware only, hardware/software, software only, or a service
– the sum of the products being delivered to the customer(s) or user(s) of the products
– achieve the overall cost, schedule, and performance objectives of the business entity developing the product
Systems engineering process
• Systems engineering process is a comprehensive problem-solving process used to
– transform customer needs and requirements into a life-cycle balanced solution set of system product and process designs
– generate information for decision makers
– provide information for the next product development or acquisition phase
SE-CMM Process Areas
Applicable Process Areas
• Analyze Candidate Solutions
• Derive and Allocate Requirements
• Evolve System Architecture
• Integrate Disciplines
• Integrate System
• Understand customer needs
• Coordinate with suppliers, etc
Analyze Candidate Solutions
• Identifies the characteristics of a process for choosing a solution from several alternatives
– design decisions
– production decisions
– life-cycle cost decisions
– human factors decisions
– risk reduction decisions
Derive and Allocate Requirements
• Typical Work Products
– operational concept
– user interaction sequences
– maintenance operational sequences
– timelines
– simulations
– usability analysis
Understand Customer Needs and Expectations
• Interface control working groups
• Questionnaires, interviews, operational scenarios obtained from users
• Prototypes and models
• Brainstorming
• Market surveys
• Observation of existing systems, environments, and workflow patterns
Coordinate with Suppliers
• Typical Work Products
– make-vs.-buy trade study
– list of system components
– subset of system components for outside organizations to address
– list of potential suppliers
– beginnings of criteria for completion of needed work
System Engineering Applied to Internet Infrastructure
Example - Campus Network
http://www.cisco.com/cpress/cc/td/cpress/ccie/ndcs/01ccie.htm#35145
Determining Requirements
• Understand requirements
• Select capability and reliability options that meet these requirements
• Solutions must reflect the goals, characteristics, and policies of the organizations in which they operate
Determining Requirements
• Two primary goals drive design and implementation
– Application availability
– Cost of ownership
• IS budgets today often run in the millions of dollars as large organizations increasingly rely on electronic data for managing business activities
• A well-designed solution can help to balance these objectives!!
The Design Problem: Optimizing Availability and Cost
• Design problem consists of the following general elements
• Environmental givens
– location of hosts, servers, terminals, and other end nodes
– the projected traffic for the environment
– projected costs for delivering different service levels
Optimizing Availability and Cost
• Performance constraints
– network reliability
– traffic throughput
– host/client computer speeds (for example, network interface cards and hard drive access speeds)
• Internetworking variables
– network topology
– line capacities
– packet flow assignments
Optimizing Availability and Cost
• Goal is to minimize cost based on these elements while delivering service that does not compromise established availability requirements
• Primary concerns are availability and cost
– essentially at odds
– increase in availability must generally be reflected as an increase in cost
General Network Design Process
• Assess needs and cost
• Select topologies and technologies to satisfy needs
• Model network workload
• Simulate behavior under expected load
• Perform sensitivity tests
• Rework design as needed
Assessing User Requirements
• Users primarily want application availability in their networks
– response time
• interactive online services, such as automated tellers and point-of-sale machines
– throughput
• file-transfer activities (low response-time requirements)
• always a tradeoff; think Size/Weight/Power!
Assessing User Requirements
– reliability
• Financial services, securities exchanges, and emergency/police/military operations
• high level of hardware and topological redundancy
• Determining cost of any downtime is essential in determining the relative importance of reliability
Assessing User Requirements
• User community profiles
• Interviews, focus groups, and surveys
• Interviews with key user groups
• Focus groups
• Formal surveys can be used to get a statistically valid reading of user sentiment
• Human factors tests
Assessing Proprietary and Nonproprietary Solutions
• Compatibility, conformance, and interoperability are related to the problem of balancing proprietary functionality and open internetworking flexibility
• Multivendor environment or specific, proprietary capability
– Open routing protocol can potentially result in greater multiple-vendor configuration complexity
Assessing Proprietary and Nonproprietary Solutions
• Gaining a measure of interoperability versus losing functionality
• Previous internetworking (and networking) investments and expectations for future requirements have considerable influence over choice of implementations
Assessing Proprietary and Nonproprietary Solutions
• Must consider
– installed internetworking and networking equipment
– applications running (or to be run) on the network
– traffic patterns
– physical location of sites, hosts, and users
– rate of growth of the user community
– physical and logical network layout
Assessing Costs
• Internetwork is a strategic element in customer’s overall information system design
– cost of internetwork is much more than the sum of your equipment purchase orders
• Must be viewed as a total cost-of-ownership issue
• Must consider the entire life cycle of your internetworking environment
Costs to Consider
• Equipment hardware and software costs
– initial purchase and installation, maintenance, and projected upgrade costs
• Performance tradeoff costs
– cost of going from a five-second response time to a half-second response time
Costs to Consider
• Installation costs
– Installing a site's physical cable plant can be the most expensive element of a large network
• installation labor
• site modification
• fees associated with local code conformance
• costs incurred to ensure compliance with environmental restrictions (such as asbestos removal)
Costs to Consider
• Expansion costs
– cost of ripping out all thick Ethernet, adding additional functionality, or moving to a new location
• Projecting future requirements and accounting for future needs saves time and money
Costs to Consider
• Support costs
– Complicated internetworks cost more to monitor, configure, and maintain
• training
• direct labor (network managers and administrators)
• sparing
• replacement costs
– Also out-of-band management, SNMP management stations, and power
Costs to Consider
• Cost of downtime
– Evaluate the cost for every minute that a user is unable to access a file server or a centralized database
– If the cost is high enough, fully redundant internetworks might be the best option
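The downtime trade-off can be put into numbers with a minimal sketch like the one below. All figures are hypothetical (the availability levels, the cost per minute, and the yearly cost of redundancy are assumptions for illustration), and the function names are not from any library.

```python
def annual_downtime_cost(availability, cost_per_minute):
    """Expected yearly outage cost for a given availability (fraction 0..1)."""
    minutes_per_year = 365 * 24 * 60
    downtime_minutes = (1.0 - availability) * minutes_per_year
    return downtime_minutes * cost_per_minute


def redundancy_pays_off(base_avail, redundant_avail, cost_per_minute,
                        extra_annual_cost):
    """True if the downtime cost avoided exceeds the yearly cost of redundancy."""
    saved = (annual_downtime_cost(base_avail, cost_per_minute)
             - annual_downtime_cost(redundant_avail, cost_per_minute))
    return saved > extra_annual_cost


if __name__ == "__main__":
    # Hypothetical: 99.5% vs 99.95% availability, $200 per minute of outage,
    # $300,000 per year for the redundant design.
    print(annual_downtime_cost(0.995, 200.0))
    print(redundancy_pays_off(0.995, 0.9995, 200.0, 300_000.0))
```

The comparison ignores opportunity costs and sunk costs, which would have to be added for a full cost-of-ownership view.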
Costs to Consider
• Opportunity costs
– Every choice made has an opposing alternative option
• specific hardware platform
• topology solution
• level of redundancy
• system integration alternative
Costs to Consider
• Opportunity Costs
– opportunity costs of not switching to newer technologies and topologies might be lost competitive advantage, lower productivity, and slower overall performance
• Any effort to integrate opportunity costs into your analysis can help to make accurate comparisons at the beginning of the project
Costs to Consider
• Sunk costs
– Investments in existing cable plant, routers, concentrators, switches, hosts, and other equipment and software are sunk costs
– If the sunk cost is high, might need to modify networks so that existing internetwork can continue to be utilized
Estimating Traffic: Work Load Modeling
• Empirical work-load modeling
– instrumenting a working internetwork
– monitoring traffic for a given number of users, applications, and network topology
• Characterize activity throughout a normal work day
– type of traffic passed
– level of traffic
– response time of hosts
– time to execute file transfers
Work Load Modeling
• Extrapolating to the new internetwork's number of users, applications, and topology
• Tools
• Passive monitoring of an existing network
• Measure activity and traffic generated by a known number of users
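A minimal sketch of the extrapolation step, assuming traffic scales in proportion to the number of users. That linearity is an assumption made here for illustration; real workloads often depart from it. The measured values and the function names are hypothetical.

```python
def per_user_load(measured_kbps, measured_users):
    """Average traffic generated per user on the instrumented network."""
    return measured_kbps / measured_users


def extrapolate_load(measured_kbps, measured_users, planned_users,
                     growth_margin=0.0):
    """Scale the measured load to the planned user count.

    growth_margin is an optional safety factor (0.2 means +20%).
    """
    load = per_user_load(measured_kbps, measured_users) * planned_users
    return load * (1.0 + growth_margin)


if __name__ == "__main__":
    # Hypothetical measurement: 60 users generated 1200 kbps at peak.
    print(extrapolate_load(1200.0, 60, 250))
    print(extrapolate_load(1200.0, 60, 250, growth_margin=0.2))
```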
Work Load Modeling
• Problem with modeling workloads on networks is that it is difficult to accurately pinpoint traffic load and network device performance as functions of the number of users, type of application, and geographical location
Work Load Modeling
• Factors that influence the dynamics of the network
– The time-dependent nature of network access
– Differences associated with type of traffic
• Routed and bridged traffic place different demands
– The random (nondeterministic) nature of network traffic
Sensitivity Testing
• Sensitivity testing involves breaking stable links and observing what happens
– how traffic is rerouted
– speed of convergence
– whether any connectivity is lost
– and whether problems arise in handling specific types of traffic
Sensitivity Testing
• This empirical testing is a type of regression testing:
– A series of specific modifications (tests) are repeated on different versions of network configurations
– By monitoring the effects on the design variations, you can characterize the relative resilience of the design
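One part of sensitivity testing, checking whether any connectivity is lost when a link breaks, can be sketched on a model of the topology. The sketch below removes each link in turn and runs a breadth-first search; it says nothing about rerouting or convergence speed, which need a real network or a protocol-level simulator. The four-node topology is hypothetical.

```python
from collections import deque


def is_connected(nodes, links):
    """Breadth-first search over an undirected graph of nodes and links."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {n: set() for n in nodes}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == len(nodes)


def critical_links(nodes, links):
    """Links whose individual failure disconnects the network."""
    return [link for link in links
            if not is_connected(nodes, [l for l in links if l != link])]


if __name__ == "__main__":
    # Hypothetical topology: a triangle A-B-C with a single spur to D.
    nodes = ["A", "B", "C", "D"]
    links = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
    print(critical_links(nodes, links))
```

Running the same test list against each design variation gives the regression-style comparison described above.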
System Engineering Techniques Applied to the Web
Quantitative Analysis
[Figure: flow diagram with steps numbered 1 to 8, containing the boxes: Business model & measurable goals; E-Business site architecture; Measure E-Business site; Characterize customer behavior; Characterize site workload; Develop performance models; Obtain performance parameters; Forecast workload evolution; Predict E-Business site performance]
Customer, Workload, and Resource Models
[Figure: Customer Model, Workload Model, and Resource Model, annotated with "What-if questions regarding impacts of customer behavior", "What-if questions regarding impacts of workload, architecture, and configuration changes", and "Metrics: revenue/sec, response time, throughput"]
What is a performance model?
• A model of a system helps one understand some fundamental characteristics of the system
• “All models are wrong, but some are useful!”
Zipf's Law
• If one ranks the words in a given text by popularity (rank p) and counts how often each occurs (frequency f), then f ~ 1/p
• A few elements score very high and a very large number of elements score very low
• Many phenomena on the web can be modeled by Zipf's law
Zipf's Law
• P = k/r where P is the number of references to a document, r is the rank, k is a positive constant
• Some documents are very popular while most documents receive just a few references
• Can use Zipf's law to understand some asymptotic properties of web caching performance
Zipf's Law
• Results obtained from Zipf's model are useful
– to characterize WWW workloads
– to analyze document dissemination and replication strategies
– to model the behavior of caching and mirroring systems
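As an illustration of the caching use, here is a minimal sketch that applies P = k/r to estimate the hit ratio of an idealized cache holding the most popular documents. The idealization (the cache always holds exactly the top-ranked documents, all documents cacheable) and the function names are assumptions for this example.

```python
def zipf_reference_counts(num_documents, k):
    """P = k / r for ranks r = 1..num_documents."""
    return [k / r for r in range(1, num_documents + 1)]


def ideal_cache_hit_ratio(num_documents, cache_size):
    """Fraction of references served by a cache holding the top-ranked documents."""
    counts = zipf_reference_counts(num_documents, 1.0)  # k cancels in the ratio
    return sum(counts[:cache_size]) / sum(counts)


if __name__ == "__main__":
    # Hypothetical site: 10,000 documents, cache room for 100 of them.
    print(ideal_cache_hit_ratio(10_000, 100))
```

Under these assumptions, caching 1% of the documents already serves roughly half of the references, which is the "few elements score very high" effect in practice.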
Other Types of Models
• CBMG (Customer Behavior Model Graph)
• CSID (Client/Server Interaction Diagram)
• Resource model; represents the structure and the various components of an e-business site
• Performance model; represents the way the system’s resources are used by the workload and captures the main factors determining system performance
Other Types of Models
• Analytic models; specify the interaction between the various components of a system via formulas
• Example; minimum possible HTTP transaction time
Rt_min = RTT + request_min + SiteProcessingTime + reply_min
where RTT = round-trip delay in network communication, and request_min = RequestSize/Bandwidth = minimum time needed to send the request to the site
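A minimal sketch of this analytic model in code. The slide defines request_min only; the sketch assumes reply_min is computed the same way (ReplySize/Bandwidth), and the parameter names and sample values are hypothetical.

```python
def min_http_transaction_time(rtt_s, request_bytes, reply_bytes,
                              bandwidth_bps, site_processing_s):
    """Rt_min = RTT + request_min + SiteProcessingTime + reply_min (seconds)."""
    request_min = request_bytes * 8 / bandwidth_bps
    # Assumption: reply_min mirrors request_min, i.e. ReplySize / Bandwidth.
    reply_min = reply_bytes * 8 / bandwidth_bps
    return rtt_s + request_min + site_processing_s + reply_min


if __name__ == "__main__":
    # Hypothetical: 50 ms RTT, 500-byte request, 12,500-byte reply,
    # 1 Mbit/s link, 20 ms of processing at the site.
    print(min_http_transaction_time(0.05, 500, 12_500, 1_000_000, 0.02))
```

The result is a lower bound: it leaves out queuing, connection setup, and protocol overhead.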
Other Types of Models
• Simulation models; mimic the behavior of the actual system by running a simulation program
– Mimics the transitions among the system states according to the occurrence of events in the simulated system
– Measure performance by counting events
– Expensive to develop and run
– How long do you run it?
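A minimal sketch of a simulation model: a single first-come-first-served server fed by random arrivals. Exponential interarrival and service times are an assumption chosen for this example, and the function name is hypothetical.

```python
import random


def simulate_single_server(arrival_rate, service_rate, num_customers, seed=1):
    """Simulate a FIFO single-server queue customer by customer.

    Returns the average response time (waiting plus service).
    """
    rng = random.Random(seed)
    clock = 0.0           # arrival time of the current customer
    server_free_at = 0.0  # time at which the server finishes its backlog
    total_response = 0.0
    for _ in range(num_customers):
        clock += rng.expovariate(arrival_rate)
        start = max(clock, server_free_at)
        server_free_at = start + rng.expovariate(service_rate)
        total_response += server_free_at - clock
    return total_response / num_customers


if __name__ == "__main__":
    # Hypothetical load: 0.5 requests/sec arriving, 1 request/sec served.
    for n in (100, 10_000, 200_000):
        print(n, simulate_single_server(0.5, 1.0, n))
```

Comparing short and long runs shows why run length matters: short runs give noisy estimates, and the estimate settles only as more events are counted.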
Why do we need models?
• Help us understand the quantitative behavior of complex systems
• Commerce is a transaction based system
• Useful for analyzing document replacement policies in caching proxies
• Useful for analyzing bandwidth capacity of certain network links
• Essential tool for studying resource allocation problems in the context of e-commerce
A Modeling Paradigm
• View from different perspectives;
– Modeling/prediction paradigm
• Modeling the system
– Analytic models
• Validating the model
– Obtain necessary input parameters
– Make proper assumptions
• Using the model to predict future system performance
– Analytic or simulation techniques
Modeling/Prediction Paradigm
[Figure: three phases labeled Analyzing, Modeling, and Predicting, containing the boxes: Actual system; Collect data; Performance measurement; Build a model; Obtain parameters; Solve the model; Validate the model; Change validated model; Solve the model; Performance of projected system]
A Modeling Paradigm
• Accuracy of results
• Response time of e-commerce transaction computed by model should be compared against actual data
• Rules of thumb for acceptable error
– Resource utilization: within 10%
– System throughput: within 10%
– Response time: within 20%
• Errors may exist in modeling phase or in the measurement phase
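The validation check can be sketched as a comparison of measured and predicted values against the rule-of-thumb limits. The metric names, the dictionary layout, and the sample numbers are hypothetical.

```python
# Rule-of-thumb limits on relative error, per metric.
ERROR_LIMITS = {"utilization": 0.10, "throughput": 0.10, "response_time": 0.20}


def relative_error(measured, predicted):
    """Absolute difference as a fraction of the measured value."""
    return abs(predicted - measured) / abs(measured)


def validate_model(measured, predicted, limits=ERROR_LIMITS):
    """Return {metric: True/False} telling whether each prediction is acceptable."""
    return {metric: relative_error(measured[metric], predicted[metric]) <= limit
            for metric, limit in limits.items()}


if __name__ == "__main__":
    measured = {"utilization": 0.50, "throughput": 100.0, "response_time": 1.0}
    predicted = {"utilization": 0.53, "throughput": 95.0, "response_time": 1.3}
    print(validate_model(measured, predicted))
```

A failed check does not say which side is wrong: the model or the measurement may be at fault.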
State and Transitions of a CBMG
[Figure: CBMG state diagram with states Entry, Home, Login, Register, Browse, Search, Select, Add to cart, and Pay; each arrow is labeled with a transition probability between 0.05 and 1.0]
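One quantity a CBMG yields is the average number of visits to each state per session. The sketch below computes it by repeatedly propagating visit counts along the transitions. The five-state graph and its probabilities are hypothetical and much smaller than the one in the figure; the method assumes every session eventually reaches a state with no outgoing transitions.

```python
# Hypothetical CBMG: state -> {next state: transition probability}.
EXAMPLE_CBMG = {
    "Entry": {"Browse": 1.0},
    "Browse": {"Browse": 0.5, "AddToCart": 0.2, "Exit": 0.3},
    "AddToCart": {"Browse": 0.5, "Pay": 0.4, "Exit": 0.1},
    "Pay": {"Exit": 1.0},
}


def average_visits(transitions, entry="Entry", iterations=200):
    """Average visits per session, from V_j = sum_i V_i * p_ij with V_entry = 1."""
    states = set(transitions)
    for targets in transitions.values():
        states.update(targets)
    visits = {s: 0.0 for s in states}
    visits[entry] = 1.0
    for _ in range(iterations):
        updated = {s: 0.0 for s in states}
        updated[entry] = 1.0
        for source, targets in transitions.items():
            for target, probability in targets.items():
                updated[target] += visits[source] * probability
        visits = updated
    return visits


if __name__ == "__main__":
    for state, count in sorted(average_visits(EXAMPLE_CBMG).items()):
        print(state, round(count, 3))
```

With these made-up probabilities, the visit count for Pay is also the fraction of sessions that end in a purchase, which ties the customer model to a business metric.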
Capacity Planning
• Determining future load levels
– Natural evolution of existing workloads
– Deployment of new applications and services
– Changes in customer behavior
• Traffic surges due to new situations
• Changes in customer navigational patterns due to availability of new business functions
• Predictive patterns and not experimentation
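For the natural evolution of an existing workload, one simple predictive technique is a least-squares linear trend fitted to historical load. The sketch assumes the trend really is linear, which is an assumption for illustration only; surges and new applications need separate treatment. The monthly figures are hypothetical.

```python
def fit_linear_trend(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, periods_ahead):
    """Predicted load periods_ahead periods after the last observation."""
    slope, intercept = fit_linear_trend(values)
    return intercept + slope * (len(values) - 1 + periods_ahead)


if __name__ == "__main__":
    # Hypothetical history: peak requests/sec over four consecutive months.
    history = [100.0, 120.0, 140.0, 160.0]
    print(forecast(history, 2))
```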
Definition of Adequate Capacity
[Figure: adequate capacity is determined by three inputs from customers and management]
– Service level agreements, e.g. response time < 2 sec, availability > 99.5%
– Specified technologies and standards, e.g. NT servers, Oracle DBMS, SSL, SET
– Cost constraints, e.g. startup cost < $5.5 million, maintenance cost < $1.6 million/yr
CBMG for Online Auto-Buying Service with Virtual Buying feature
[Figure: CBMG state diagram with states entry, home, select car, select options, hold car, enter data, apply for financing, select svc contr, enter deliv data, select order, view order, and cancel order]
Typical Multi-Tier E-Business Site Architecture
[Figure: Intranet/Internet connected over a T3 link to a router and firewall, followed by LAN 1 and LAN 2 with three server tiers]
– Web server: MS Windows NT Server, MS IIS HTTP server
– App servers: MS Windows NT Server, Site Server Commerce Edition
– DB servers: MS Windows NT Server, MS SQL Server
CSID for the Option Select E-Business Functions
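[Figure: CSID with seven nodes, 1 (C), 2 (WS), 3 (AS), 4 (DB), 5 (AS), 6 (WS), 7 (C); node labels: Launch, ShowOptions, DisplayCarOptions, SearchCarOptions, DisplayCarOptions, SendReply]
Arcs, with networks traversed and arc labels:
– 1 to 2: (Int,LAN1) [1,200]
– 2 to 3: (LAN1,LAN2) [1,320]
– 3 to 4: (LAN2) [1,400]
– 4 to 5: (LAN2) [1,1050]
– 5 to 6: (LAN1,LAN2) [1,2400]
– 6 to 7: (LAN1,Int) [1,2600]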
Performance Laws
• T = observation period
• Bi = busy period of resource i during T (Bo for the system as a whole)
• Ao = number of arrivals of requests
• Co = number of completed requests
• Ci = number of requests completed at resource i
• Can then derive operational quantities
Utilization Law
• Fraction of time the resource is busy
• Utilization, Ui = Bi / T
• Average throughput from queue i = Xi = Ci / T
• Ui = Bi / T = Bi / (Ci/Xi) = (Bi/Ci) × Xi = Si × Xi, where Si = Bi / Ci is the average service time per completion
Forced Flow Law
• Average number of visits, Vi; each completing transaction has to pass Vi times on average by queue i
• Xo transactions complete per unit time
• Vi × Xo transactions visit queue i per unit time
• Xi = Vi × Xo is the Forced Flow Law
Service Demand Law
• Combine the Utilization and Forced Flow Laws
• Service demand Di = Vi × Si = (Xi / Xo) × (Ui / Xi) = Ui / Xo
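The three laws can be checked against each other with a minimal sketch that derives every operational quantity from four raw measurements. The measurement values are hypothetical; C_i denotes completions at resource i and C0 completions for the whole system.

```python
def operational_quantities(T, C0, B_i, C_i):
    """Derive operational quantities for one resource from raw measurements.

    T: observation period (s); C0: transactions completed by the system;
    B_i: busy time of resource i (s); C_i: completions at resource i.
    """
    X0 = C0 / T    # system throughput
    X_i = C_i / T  # throughput of resource i
    U_i = B_i / T  # utilization of resource i
    S_i = B_i / C_i  # average service time per visit
    V_i = C_i / C0   # average visits per transaction
    D_i = U_i / X0   # service demand per transaction
    return {"X0": X0, "X_i": X_i, "U_i": U_i, "S_i": S_i, "V_i": V_i, "D_i": D_i}


if __name__ == "__main__":
    # Hypothetical: 60 s window, 120 transactions, disk busy 36 s, 360 disk I/Os.
    q = operational_quantities(T=60.0, C0=120, B_i=36.0, C_i=360)
    print(q)
    print("Utilization Law holds:", abs(q["U_i"] - q["S_i"] * q["X_i"]) < 1e-9)
    print("Forced Flow Law holds:", abs(q["X_i"] - q["V_i"] * q["X0"]) < 1e-9)
    print("Service Demand Law holds:", abs(q["D_i"] - q["V_i"] * q["S_i"]) < 1e-9)
```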
Little’s Law
• Simple and widely applicable to performance analysis of computing resources
[Figure: a black box holding N customers, with throughput X and residence time R. Customers arrive at the black box, spend R seconds in the black box, and leave.]
Little’s Law
[Figure: plot of n(t), the number of customers in the black box at time t, against t; the vertical axis is marked 0, k, and N, and an interval is labeled r_k]
Little’s Law
• Departure rate through black box is X customers/sec
• N = average number of customers in the black box (at the web site)
• Show that N = X × R
• The average number of customers in the black box over the observation interval can be calculated
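A minimal sketch of the calculation on a small trace. It assumes every customer both arrives and leaves inside the observation interval, so that the area under n(t) equals the sum of the individual residence times. The trace and the function name are hypothetical.

```python
def littles_law_check(arrivals, departures):
    """Return (N, X, R) for a trace of per-customer arrival and departure times."""
    duration = max(departures) - min(arrivals)
    residence = [d - a for a, d in zip(arrivals, departures)]
    X = len(arrivals) / duration       # throughput, customers per unit time
    R = sum(residence) / len(residence)  # average time spent in the box
    N = sum(residence) / duration      # time-average number in the box
    return N, X, R


if __name__ == "__main__":
    # Hypothetical trace: four customers, times in seconds.
    N, X, R = littles_law_check([0.0, 1.0, 2.0, 4.0], [3.0, 2.0, 5.0, 6.0])
    print(N, X * R)
```

The two printed values agree because both reduce to total residence time divided by the interval length, which is the essence of the proof.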
A Performance Modeling Question
[Figure: a multi-tier site (Intranet/Internet, router, firewall, load balancer, web servers, app servers, and DB servers, e.g. mainframes, arranged in a DMZ and Layers 1 to 3) mapped to a model marked with a question mark]
Single Server Model
[Figure: single server with an arrival process, queuing space, service process, and resources]
Single Queue Model
[Figure: a web server and its data storage device modeled as a single queue; requests enter the model and responses leave]
Queuing Network Model
[Figure: the multi-tier site (Intranet/Internet, router, firewall, load balancer, web servers, app servers, in a DMZ and Layers 1 to 3) represented as a queuing network model]
Financial Site: CSID for “Show Portfolio”
[Figure: CSID with nodes numbered 1 to 8 along the path C, WS, AS, DB, AS, WS, C, plus an alternative return to C; arcs labeled [1,m1], [0.05,m2], [0.95,m3], [0.8,m6], [1,m7], [1,m8], [1,m9]]
Open QN of the Financial Site
[Figure: open queuing network with six queues: web server (1 processor, 2 disk), app server (3 processor, 4 disk), database server (5 processor, 6 disk); responses leave the network]
Response Time of Financial Site
[Figure: plot of response time (vertical axis, 0 to 3) against arrival rate (horizontal axis, 1 to 9)]
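A curve of this shape can be produced from an open queuing network model with a minimal sketch such as the one below. It assumes load-independent queues, for which the residence time at each resource is D_i / (1 - λ × D_i). The six service demands are hypothetical and are not the values behind the plotted curve.

```python
def open_qn_response_time(arrival_rate, service_demands):
    """Response time R = sum over resources of D_i / (1 - arrival_rate * D_i)."""
    total = 0.0
    for resource, demand in service_demands.items():
        utilization = arrival_rate * demand
        if utilization >= 1.0:
            raise ValueError(f"{resource} is saturated at this arrival rate")
        total += demand / (1.0 - utilization)
    return total


if __name__ == "__main__":
    # Hypothetical service demands in seconds per request.
    demands = {
        "ws_cpu": 0.02, "ws_disk": 0.03,
        "as_cpu": 0.04, "as_disk": 0.02,
        "db_cpu": 0.05, "db_disk": 0.10,
    }
    for rate in range(1, 10):
        print(rate, round(open_qn_response_time(rate, demands), 3))
```

The response time climbs slowly at first and then sharply as the resource with the largest demand approaches saturation, which is what identifies the bottleneck.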
Contention for Software in E-Business Sites
• WS is multithreaded (m threads)
• AS has n threads
• DS has p threads
• Queue for WS limited (requests may be rejected)
• Requests are sent to AS and/or DS and queued there
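The effect of a limited queue can be sketched with a standard M/M/m/K queue, where m is the number of threads and K is the most requests the server can hold (in service plus waiting). Poisson arrivals and exponential service times are assumptions made for this example; the slides do not state a distribution.

```python
def rejection_probability(arrival_rate, service_rate, threads, capacity):
    """Probability that an arriving request finds the server full (M/M/m/K).

    capacity counts requests in service plus requests waiting.
    """
    # Unnormalized steady-state weights of having n requests in the server.
    weights = [1.0]
    for n in range(1, capacity + 1):
        busy_threads = min(n, threads)
        weights.append(weights[-1] * arrival_rate / (busy_threads * service_rate))
    return weights[-1] / sum(weights)


if __name__ == "__main__":
    # Hypothetical web server: 40 requests/sec, each thread serves 5 requests/sec.
    for threads in (8, 10, 12):
        print(threads, rejection_probability(40.0, 5.0, threads, threads + 20))
```

Adding threads lowers the rejection rate only while the hardware behind them (CPU and disk) can keep up, which is why the software and hardware queues have to be modeled together.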
S/W and H/W Queues
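[Figure: three tiers, each with a CPU and a disk (hardware queues) fronted by a pool of threads (software queue): WS threads, AS threads, and DS threads; requests that cannot enter the WS queue are rejected]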
Example of Zipf’s Law
Traffic Volume to an E-Tailer Site
Historical Data Patterns
So What is Being Done?
Technology Assessment
• Reduces the risk of using obsolete or unproven solutions and identifies available products and services with attractive price-performance profiles
• The thrust is to make full use of standards-based, leading-edge technologies that are commercially available as plug-and-play components.
Prototyping, modeling, and simulation
• Techniques are used to evaluate alternative conceptual designs, predict performance, and conduct trade-off analyses
• Analysis tools to support workload forecasting, performance measurement, capacity management, and cost estimation
– Used to evaluate conceptual designs and select the system alternative that best meets current and future requirements
Acquisition Phase
• Active support role, or assumes full responsibility in acquiring all the necessary products and services competitively to build the target system
• Prepare acquisition specifications, screen potential vendors, elicit proposals, evaluate offers, and select best-value solution
Implementation Phase
• Support clients in managing systems development, installation, and cut over activities to ensure quality performance by vendors
• Monitor work progress, conduct formal reviews at major milestones, identify risk areas, and devise corrective actions to ensure the delivery of reliable, maintainable systems, on schedule and within budget
Job Description
• Participate ..on a project team of engineers involved in development of systems and software for XX products.... Requires a strong background in Systems design…...and system-level documentation, on projects that may include any of the following list of responsibilities: Specify detailed product requirements, participate in the architecture and requirements of system software/hardware for optical products designed for the core of optical networks. Demonstrate a high degree of originality and innovation in defining product and project level architecture. Significantly influences the design of interfaces between products to ensure interoperability. Define new software product features. Champion new, improved design methodologies. Define Reliability, Availability, Serviceability (RAS) goals for products. Strong interpersonal skills ...
Summary
• The internet is here to stay (and becoming critical)
• Complexity of modern solutions requires a good systems engineering approach
• SMU is in a hotbed for this technology
• Educational opportunities