©Ian Sommerville 2006 MSc module: Advanced Software Engineering Slide 1
Service dependability
Objectives
To explain what is required to ensure that services are dependable
To discuss mechanisms which can be used to deliver service dependability
To introduce you to some research work in service-oriented software engineering
Topics covered
Reliable message interchange
Service fault tolerance
A generic mechanism to implement fault-tolerant services
Dependability
The dependability of a system reflects the extent to which the system can be trusted to deliver its required services in a safe and secure way
Dependability (in general) will be covered in the 2nd semester.
• Today, I’ll simply focus on two aspects of dependability as applied to services
• Availability - the service is available to service requests
• Reliability - the service delivers results according to its specification
Dependability requirements
Requests for services from service clients and responses to these requests are reliably delivered
Advertised services are available
• Both the service software and the servers must be available
Services deliver results as advertised
• Services must be reliable
Reliable messaging
When one service sends a message to another, the sending service should be confident that that message will (eventually) reach its destination
Sometimes, it is essential that one and only one copy of each message is delivered
• E.g. a withdrawal from a bank account
Sometimes, messages must be received in exactly the order in which they were sent
• E.g. a sequence of database updates
Messaging standards
Two similar but incompatible standards have been proposed
• WS-Reliability
  • Has the backing of most companies except Microsoft and IBM
• WS-ReliableMessaging
  • Has the backing of Microsoft and IBM
It is not yet clear which of these will emerge as the dominant standard.
I will illustrate the topic using WS-ReliableMessaging
Reliable message processors
[Diagram: Service A -> Sending RMP -> Receiving RMP -> Service B]
Reliable message processors
Part of the middleware that facilitates service interaction, so may be used by any service
They are NOT part of the services themselves
Sending RMP
• Submit. Transfers a message from the producer to the sending RMP
• Notify. Transfers a response message from the sending RMP to the producer
Receiving RMP
• Deliver. Transfers a message from the receiving RMP to a consumer
• Respond. Transfers a response message from the consumer to the receiving RMP
RMP operation
RMPs add information to the SOAP message header that allows message exchanges to be correlated
This allows the middleware to identify duplicate messages, acknowledge message delivery and ensure that messages are delivered in the correct sequence.
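The behaviour described above can be sketched roughly as follows. This is a toy model only: it assumes each message carries an integer sequence number (standing in for the correlation information in the SOAP header), and the class name `ReceivingRMP` and its fields are hypothetical rather than taken from WS-ReliableMessaging.

```python
from dataclasses import dataclass, field


@dataclass
class ReceivingRMP:
    """Toy receiving RMP: de-duplicates, acknowledges and orders messages."""
    next_seq: int = 1                              # next sequence number to deliver
    pending: dict = field(default_factory=dict)    # out-of-order messages, by number
    delivered: list = field(default_factory=list)  # what the consumer has seen

    def receive(self, seq: int, body: str) -> int:
        """Accept one message and return the sequence number being acknowledged."""
        if seq >= self.next_seq and seq not in self.pending:
            self.pending[seq] = body               # first copy: keep it
        # Deliver every message that is now in sequence.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return seq                                 # duplicates are acknowledged again


rmp = ReceivingRMP()
# Message 2 arrives first and is later re-sent as a duplicate.
for seq, body in [(2, "debit 50"), (1, "open account"), (2, "debit 50"), (3, "close")]:
    rmp.receive(seq, body)
print(rmp.delivered)
```

The consumer sees each message once and in the order sent, even though the network delivered them out of order and with a duplicate.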
Service availability and reliability
To ensure availability, it must be possible to continue to deliver a service when software and/or hardware has failed
• This suggests that there has to be more than one instance of a service plus a mechanism to switch between them in the event of a failure
To ensure reliability, it must be possible to check if the results of a service are consistent with its specification
Fault tolerance
Fault tolerance is:
• The ability of a system to continue execution (perhaps offering a degraded service) without system failure in the presence of system faults.
Fault tolerance is mostly used to provide enhanced system availability but may also be used where a system has critical reliability requirements.
For web services, fault tolerance mechanisms may be used to ensure service availability and reliability.
Redundancy and diversity
Redundancy
• A system includes functionality that replicates other functionality provided elsewhere in the system.
Diversity
• Redundant functionality is provided in different ways.
Providing diversity and redundancy in systems is expensive. Generally confined to critical systems, i.e. systems where the costs associated with system failure are high.
Fault tolerance policies
Re-try (with/without diversion)
• Simple re-try (of a service) handles transient failures; re-try with diversion handles platform failures.
N-version execution with voting
• Simultaneous execution of multiple versions of a service. Handles system failure: either lack of service or inconsistent computation.
Constrained execution
• Execution of a service with checks on inputs. Handles out-of-range computation.
Alternative version execution with acceptance tests
• Handles system failure: either lack of service or out-of-range computation.
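Two of these policies can be sketched in a few lines. This is an illustrative sketch, not the container's implementation: services are modelled as plain Python callables, a `ConnectionError` stands in for "lack of service", and the function names and the temperature range used as an acceptance test are assumptions.

```python
from typing import Callable, Sequence


def retry_with_diversion(endpoints: Sequence[Callable[[str], int]],
                         request: str, attempts_per_endpoint: int = 2) -> int:
    """Re-try each endpoint a few times, then divert to the next one."""
    last_error = None
    for call in endpoints:
        for _ in range(attempts_per_endpoint):
            try:
                return call(request)
            except ConnectionError as err:   # treated here as a possibly transient failure
                last_error = err
    raise RuntimeError("all endpoints failed") from last_error


def alternative_versions(versions, request, acceptable):
    """Try versions in order; return the first result that passes the acceptance test."""
    for call in versions:
        try:
            result = call(request)
        except ConnectionError:
            continue                         # lack of service: move to the next version
        if acceptable(result):
            return result
    raise RuntimeError("no version produced an acceptable result")


# Hypothetical stand-ins for three versions of a temperature service.
def down(_request):
    raise ConnectionError("service unavailable")

def out_of_range(_request):
    return 999

def healthy(_request):
    return 21


print(retry_with_diversion([down, healthy], "maxTemp?"))
print(alternative_versions([out_of_range, healthy], "maxTemp?",
                           acceptable=lambda t: -60 <= t <= 60))
```

Re-try with diversion only reacts to a missing response, whereas the acceptance test also rejects a response that arrives but is out of range.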
Service fault tolerance
The services used in an application may be provided by unknown providers on unknown hardware.
Service users do not know anything about how well these services
• Meet their specification
• Have been tested for defects
It therefore makes sense to try to ensure that a service-oriented system can continue to operate even when faulty services are used
This means that the system should be fault-tolerant.
Fault tolerant services
In its most general form, this approach relies on the existence of multiple versions of a service.
The vision of competing services offered by different providers provides redundancy and diversity so fault tolerance can be widely deployed.
• Service provider
  • May use fault tolerance to help achieve an advertised quality of service.
• Service client
  • May use fault tolerance to achieve a required quality of service or to enhance service trust.
Ideally, a common mechanism should be available that can be used by both providers and clients.
The container model
Used in component-based software engineering, e.g. EJB
• Common services (e.g. transaction management) required by different components are provided by a ‘container’.
• To deploy a component, it is included in a container and thus it ‘inherits’ access to common services.
We decided to adopt a comparable approach where services are deployed in a container which is configured to provide fault tolerance support.
Container components
Policies
• A generic container is configured by a fault tolerance policy.
• This is an XML description of the strategy to be used to achieve fault tolerance.
Procedures
• The container includes a set of procedures (currently in Java) that implements the defined policy.
A proxy service
• This manages access to the actual services that are used.
Policy models
An XML description of the fault tolerance policy to be adopted.
This includes references to the procedures to be executed, the conditions for fault detection and the mechanisms for fault recovery.
This description is then interpreted by the container to implement the fault tolerance procedures.
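As a rough illustration of what "interpreting" a policy description might involve, the sketch below reads a cut-down policy with Python's standard `xml.etree.ElementTree`. The element and attribute names follow the policy model listing in this deck; the shortened document, the `load_policy` function and the dictionary it returns are assumptions made for the example, and the real container is written in Java.

```python
import xml.etree.ElementTree as ET

# Cut-down policy, modelled on the deck's policy model representation.
POLICY = """
<policyModel>
  <procedure name="VotingProcedure" start="true">
    <voting requirement="3">
      <vote>
        <xpath>//maxTemp</xpath>
        <majority>2</majority>
        <tolerance>2</tolerance>
      </vote>
    </voting>
    <connections>
      <connection id="Proxy1" procedure="Proxy1"/>
      <connection id="Proxy2" procedure="Proxy2"/>
      <connection id="Proxy3" procedure="Proxy3"/>
    </connections>
  </procedure>
</policyModel>
"""


def load_policy(text: str) -> dict:
    """Extract the start procedure, its voting rules and the procedures it connects to."""
    root = ET.fromstring(text)
    start = root.find("procedure[@start='true']")
    rules = [
        {
            "xpath": vote.findtext("xpath"),
            "majority": int(vote.findtext("majority")),
            "tolerance": int(vote.findtext("tolerance")),
        }
        for vote in start.iter("vote")
    ]
    targets = [conn.get("procedure") for conn in start.iter("connection")]
    return {"start": start.get("name"), "rules": rules, "targets": targets}


policy = load_policy(POLICY)
print(policy["start"], policy["targets"])
```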
Fault-tolerant service provision
Research project on service dependability to investigate how to provide service fault tolerance
Goals
• The F/T controller should not require any change in existing services or service usage by clients.
• Services should not have to be aware that they are accessed through a F/T controller.
• The F/T controller should accommodate a range of fault tolerance policies.
Policy model representation
<?xml version="1.0" encoding="UTF-8"?>
<policyModel>
  <procedure name="VotingProcedure" class="net.sourceforge.digs.endpoints.voting.VotingProcedure" start="true">
    <voting requirement="5">
      <vote>
        <xpath>//maxTemp</xpath>
        <voteClass>net.sourceforge.digs.endpoints.voting.vote.IntegerVote</voteClass>
        <majority>3</majority>
        <tolerance>2</tolerance>
      </vote>
      <vote>
        <xpath>//minTemp</xpath>
        <voteClass>net.sourceforge.digs.endpoints.voting.vote.IntegerVote</voteClass>
        <majority>3</majority>
        <tolerance>2</tolerance>
      </vote>
    </voting>
    <connections>
      <connection id="Proxy1" procedure="Proxy1"/>
      <connection id="Proxy2" procedure="Proxy2"/>
      <connection id="Proxy3" procedure="Proxy3"/>
      <connection id="Proxy4" procedure="Proxy4"/>
      <connection id="Proxy5" procedure="Proxy5"/>
    </connections>
  </procedure>
  <procedure name="Proxy1" class="net.sourceforge.digs.endpoints.proxy.ProxyProcedure">
    <endpoint url="http://in-ega051000012.lancs.ac.uk:8080/weather/weather.bbc.co.uk" proxyHost="wwwcache.lancs.ac.uk" proxyPort="8080"/>
  </procedure>
  ….
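The slides do not spell out how `majority` and `tolerance` are applied, so the following is only one plausible reading: a value wins if at least `majority` of the replies lie within `tolerance` of it. The `vote` function and the sample readings are hypothetical.

```python
def vote(values, majority, tolerance):
    """Return an agreed value, or None if no group of `majority` values agrees.

    Assumed rule: replies agree when they are within `tolerance` of a candidate.
    """
    for candidate in sorted(values):
        group = [v for v in values if abs(v - candidate) <= tolerance]
        if len(group) >= majority:
            return sorted(group)[len(group) // 2]   # middle value of the agreeing group
    return None


readings = [21, 22, 20, 35, 21]   # hypothetical maxTemp replies from five proxies
print(vote(readings, majority=3, tolerance=2))
```

With these readings the outlier 35 is outvoted by the four replies that agree to within 2 degrees; if the replies are too scattered for any three to agree, the function returns `None` and the caller would have to treat the request as failed.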
Procedures
We provide a set of generic procedures that can be adapted for use in different policy models
Flow control
• Implement the fault tolerance pattern
Redirection
• Handle redirection to actual services
Caching
• Store messages within the container
Query
• Used for message querying and manipulation
Container hosting
Containers are not stand-alone entities but are hosted on a Java servlet engine (currently Apache Tomcat).
This means that we need to be less concerned with issues of performance, security and stability and can focus on fault tolerance.
It also allows the dynamic deployment of containers thus opening up the possibility of responsive fault tolerance policies that change according to the services available.
Deployment support
To simplify the development of fault tolerant services, we have developed a deployment tool that allows:
• Graphical editing of F/T policies and generation of the associated XML description.
• Access to reusable components
• Automated support for deployment to the servlet engine by creating a Web Archive (WAR) file.
Policy and procedure reuse
We do not envisage that developers will normally develop policies and procedures from scratch.
Our support tool is geared to supporting policy and procedure reuse and gives users access to a library of existing policies and procedures that can be modified.
We are currently developing a ‘wizard’ that will allow commonly used policies (e.g. N-version execution with voting) to be configured with no requirement for the user to be aware of the underlying implementation.
The deployment process
Assuming that policies and procedures can be reused, the steps involved are:
• Create or edit the fault tolerance policy using the graphical editing system and generate the policy model.
• Choose and adapt, or implement, the procedures to implement the policy.
• Deploy the results as a WAR file.
Depending on the extent of reuse, creating a fault tolerant service can take between a few minutes and a few hours to complete.
Dynamic adaptation
The model used allows for the possibility of dynamic adaptation of the fault tolerance policy to be used
• As the policy is simply an XML document, clients could specify their own policies in the SOAP message that initiates the service call.
• The container could dynamically switch policy models depending on the QoS provided.
• Dynamic discovery and replacement of services could be provided.
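One way to picture policy switching driven by QoS is a selector that watches recent call outcomes and names the policy to use next. This is a speculative sketch: the sliding window, the failure-rate threshold and the policy names are all assumptions, not features of the container described here.

```python
from collections import deque


class PolicySelector:
    """Pick a policy name from the failure rate seen over a sliding window."""

    def __init__(self, window: int = 10, threshold: float = 0.3):
        self.outcomes = deque(maxlen=window)   # True = call succeeded
        self.threshold = threshold

    def record(self, succeeded: bool) -> None:
        self.outcomes.append(succeeded)

    def current_policy(self) -> str:
        if not self.outcomes:
            return "retry"                     # no evidence yet: use the cheap policy
        failure_rate = self.outcomes.count(False) / len(self.outcomes)
        return "n-version-voting" if failure_rate > self.threshold else "retry"


selector = PolicySelector(window=4, threshold=0.25)
for ok in [True, True, True, True]:
    selector.record(ok)
print(selector.current_policy())
for ok in [False, False]:
    selector.record(ok)
print(selector.current_policy())
```

The cheaper re-try policy is used while the service behaves well, and the more expensive voting policy is selected only once the observed failure rate rises above the threshold.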
Current issues
Semantic equivalence of services
• Services provide similar services but with different interfaces. We need to be able to specify semantic equivalences across services.
Stateful services
• Services with state offer both problems and opportunities when implementing fault tolerance policies.
Checkpointing
• How should checkpointing be supported for composed and computationally intensive services?
Container failure
• How should container failure be handled?
Policy extensions
Service failure simulation
• One of the problems we have faced is in testing our system - failures of externally provided services are uncontrollable.
• Failure policies could be embedded thus allowing simulators for service testing to be created.
Service monitoring
• Monitoring policies (see next slide) could be developed.
Service access control
• Rather than embedding access control in the service itself, access control policies could be defined.
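The failure simulation idea can be illustrated with a proxy that deliberately fails a configurable fraction of calls, which makes otherwise uncontrollable service failures repeatable in tests. The class name, the `failure_rate` parameter and the use of `ConnectionError` are assumptions made for this sketch.

```python
import random


class FailureSimulatingProxy:
    """Wrap a service and fail a chosen fraction of calls on purpose."""

    def __init__(self, service, failure_rate: float, seed: int = 0):
        self.service = service
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)         # seeded so that test runs are repeatable

    def call(self, request):
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("simulated service failure")
        return self.service(request)


def weather(_request):
    return 21                                  # hypothetical always-healthy service


always_up = FailureSimulatingProxy(weather, failure_rate=0.0)
always_down = FailureSimulatingProxy(weather, failure_rate=1.0)
print(always_up.call("maxTemp?"))
try:
    always_down.call("maxTemp?")
except ConnectionError as err:
    print(err)
```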
Service monitoring
Monitoring involves maintaining information about the quality of the service.
In general, monitoring should be separated from the service provision.
Provider-side monitoring
• Providers use monitoring to assess the effectiveness of their service and to inform re-configuration for service improvement.
Client-side monitoring
• Clients use monitoring to assess the actual quality of service which THEY receive.
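A minimal sketch of client-side monitoring that stays separate from service provision is a wrapper recording what the client actually experiences. The class name, the counters kept and the definition of availability as the fraction of successful calls are assumptions for illustration.

```python
import time


class MonitoredService:
    """Wrap a service callable and record the quality of service the client sees."""

    def __init__(self, service):
        self.service = service
        self.calls = 0
        self.failures = 0
        self.total_seconds = 0.0

    def __call__(self, request):
        started = time.perf_counter()
        self.calls += 1
        try:
            return self.service(request)
        except Exception:
            self.failures += 1
            raise                              # monitoring must not hide the failure
        finally:
            self.total_seconds += time.perf_counter() - started

    def availability(self) -> float:
        """Fraction of calls that returned a result."""
        return 1.0 if self.calls == 0 else 1 - self.failures / self.calls


def flaky(request):
    """Hypothetical service that fails on one particular request."""
    if request == "bad":
        raise ConnectionError("service unavailable")
    return 21


monitored = MonitoredService(flaky)
for request in ["a", "b", "bad", "c"]:
    try:
        monitored(request)
    except ConnectionError:
        pass
print(monitored.calls, monitored.failures, monitored.availability())
```

Because the wrapper sits outside the service, the same code could be used unchanged on the provider side or the client side.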
Conclusions
Our goal of developing a flexible and adaptable fault-tolerance mechanism for services has been achieved (for atomic services).
• Current work is concerned with extending the model to cope with composite services and long transactions.
The mechanism is a generic mechanism that has significant potential for use in other areas
• Work is about to start on how it can be adapted for service monitoring.
Key Points
Service dependability relies on
• Reliable interchange of messages
• Service availability
• Service reliability
Reliable messaging is supported by adding message delivery information to the SOAP header, with middleware using this information to ensure reliable message interchange
Availability and reliability can be enhanced by using fault tolerance techniques
Container-based fault tolerance provides a fault tolerance mechanism in a container, which can be used by all services.
By deploying several services in a container, a range of fault tolerance policies may be supported.