svcc services presentation (silicon valley code camp 2011)
Proprietary and Confidential
Learning to Love Services
Experiences from the Transition to a Service-oriented Architecture
Jeff Green, Staff Software Engineer
Background
Ingenuity Systems
• Collect large curated, computable knowledge base of biological information
  – Molecules (genes, chemicals)
  – Relationships between molecules
  – Pathways
  – Biological processes
• Develop software applications to leverage knowledge base for researchers
  – Ingenuity Pathways Analysis (IPA)
  – iReports
  – Next Generation Sequencing application
IPA
Analyze dataset of molecules uploaded by user
IPA
Visualize and explore relationships between molecules
IPA
Search and browse through biological content
Technologies
• Client: Swing-based primary client with some browser-based visualizations
• Server: Multi-tier Java web application in Tomcat and Jetty servlet containers
• Oracle database
Legacy Architecture
• Servers separated for scalability
• Communicate through multicasting
• Early notion of services except:
  – Significant overlap of code and data access
  – Tightly coupled build, package, deployment
  – Entire system released quarterly
Why move to Service-oriented Architecture?
• Code/data reuse
  – Single definition of services and model objects
  – Common implementation
• More precise scalability
  – Scale only parts of system under stress
• More rapid releases
  – Push new features and fixes out to customers faster
• Simplify testing
• Separation of responsibilities
  – Untangle unnecessary dependencies
  – Reduce learning curve of code
Constraints
• IPA enhancements must continue simultaneously
  – Changes cannot impact Production system prematurely
  – Majority of engineering resources still focused on new development
• Homogeneous implementation
  – All services will be in Java, although external clients may be Swing or browser-based
• Majority of services will be private
  – Existing public APIs will remain but are the exception
Key Considerations
• High-level architecture
  – What parts of the system should be services?
• Implementation
  – How is a service defined?
• Communication
  – How will consumers interact with providers?
• Versioning
  – How can changes be managed?
• Testing
  – How can quality be ensured?
High-level Architecture
Architecture Questions
• What existing capabilities should become separate services?
• What is the responsibility of each service?
• What data does each service own?
• What are the dependencies?
  – No circularity
• How granular should the system be?
  – If too granular, hard to maintain
  – If not granular enough, lose benefits of services (reuse, scalability, modularity, etc.)
Ingenuity Service Categories
• Biological content
  – Provide biological content from a single source in a consistent model
• Algorithms
  – Optimized for executing algorithms on data
• User management
  – User records
  – Authentication/authorization
  – Account creation
  – License management
Key Driver: Reuse functionality to reduce development cost and enforce consistency
Ingenuity Service Categories (cont.)
• Views of biological entities
  – Common rendering for reports of all biological information about a molecule, chemical, etc.
• Utilities
  – Event tracking
  – Document generation
Previous Design
IPA with Services
Implementation
Strategy
• Adopt a common nomenclature, ideally somewhat similar to existing system
• Define simple, brief standards to guide development and ensure consistency
  – Code organization
  – Server directory structure
• For each service, pick best implementation approach
  – Implement from scratch
  – Customize third-party application
  – Refactor existing code
What exactly is a service anyway?
• A set of operations available to consumers
• Defined by an Application Programming Interface (API): the set of operations and the data structures that serve as their input and output
• Java implementation: an interface and the model objects (Plain Old Java Objects, POJOs) that serve as arguments and return values
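As a minimal sketch of that definition, the block below pairs a service interface with a model object. The names GraphService, GraphServiceImpl, GraphObj, and getGraph(id) appear in the diagrams later in the deck; the fields, the String id type, and the sample data are assumptions made only for illustration.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Model object (POJO) used as a return value. The fields are illustrative assumptions.
class GraphObj implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String id;
    private final List<String> moleculeNames;

    GraphObj(String id, List<String> moleculeNames) {
        this.id = id;
        // Defensive copy so the model object cannot be changed through the caller's list.
        this.moleculeNames = Collections.unmodifiableList(new ArrayList<String>(moleculeNames));
    }

    public String getId() { return id; }
    public List<String> getMoleculeNames() { return moleculeNames; }
}

// The service API: the set of operations available to consumers.
interface GraphService {
    GraphObj getGraph(String id);
}

// One possible provider-side implementation. Consumers depend only on the interface.
class GraphServiceImpl implements GraphService {
    @Override
    public GraphObj getGraph(String id) {
        // A real provider would load the graph from the knowledge base (hypothetical data here).
        List<String> names = new ArrayList<String>();
        names.add("molecule-A");
        names.add("molecule-B");
        return new GraphObj(id, names);
    }
}
```

A consumer would hold only a GraphService reference, which leaves the provider free to change its implementation without touching the API.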
Module
• Encapsulates one or more related services and their implementations
  – Source code: API, business logic, data access
  – XML and property files
  – Web resources
  – Test code (both unit and integration)
• Vertical slice of functionality
  – Improved modularity, reuse
• Buildable unit resulting in <module>.jar and other artifacts
• Existed in previous architecture, albeit different in nature
Module Source Packages
• Defined new basic package structure to organize code within modules
  – Consistency helps with finding, understanding code
Module Directories
• Defined new basic directory structure to organize resources within modules
  – Consistency helps with finding resources
  – Simplifies tools
Component
• Encapsulates one or more modules for packaging/deployment (i.e. a releasable unit)
• In practice, one per JVM
• Unit of scale and redundancy
• Owner of modules and schemas
[Diagram: Biological Content Component containing the Molecule, Finding, and Graph Modules, which provide the Molecule, Finding, Notation, and Graph Services]
Why have a module?
• In smaller components, component = single module
• In larger components, additional unit of encapsulation helps to further modularize the code
• Clarifies code organization
• Forces consideration of dependencies
  – Package boundaries not inherently enforced
• An attempt to fight spaghetti
Tactics: Rebuild or Refactor?
• Rebuild where existing approach was not sufficient
  – New licensing schemes
  – SSO authentication (CAS implementation)
• Refactor where behavior should not change significantly
  – Content, algorithms, views
• Changes must not impact ongoing enhancements
  – Rebuilt components have no immediate impact
  – Refactor on separate branch; schedule merge at appropriate time
Build, Package, Deploy
• Before
  – Ant script built entire system at once
  – All components deployed together across multiple JVMs
• After
  – Modified Ant script builds a single component
  – Resulting component package is union of resources in modules it owns
  – Each component deployed separately
• Current Tools
  – Hudson used to manage build/package tasks
  – Custom installer
Communication
Considerations
• How does a consumer locate a service provider?
• Once found, what is protocol for communication?
• Requirements
  – Consumers and providers must be loosely coupled
  – Flexibility to add providers without impacting consumers and vice-versa
  – Synchronous communication is acceptable
Spring Remoting
• Already in use in existing application
• Abstract data transfer from consumer and provider
• Provides flexibility to switch from remote to local services through configuration
[Diagram, within single component: GraphAlgorithm calls getGraph(id) on the GraphService interface, which is backed directly by GraphServiceImpl and returns a GraphObj]
[Diagram, across components: GraphAlgorithm calls getGraph(id) on a GraphService HttpProxy (URL from configuration); the proxy sends an Http request w/ serialized objects to an HttpService in the other component, which calls GraphServiceImpl and returns the GraphObj in an Http response w/ serialized objects]
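The local-versus-remote switch can be sketched with JDK classes alone. This is not Spring's API: a java.lang.reflect.Proxy stands in for the remoting proxy bean, and an in-memory serialization round trip stands in for the HTTP hop. GraphServiceFactory, SerializingHandler, and the boolean "remote" flag are hypothetical names introduced for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class GraphObj implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String id;
    GraphObj(String id) { this.id = id; }
    public String getId() { return id; }
}

interface GraphService {
    GraphObj getGraph(String id);
}

class GraphServiceImpl implements GraphService {
    @Override
    public GraphObj getGraph(String id) { return new GraphObj(id); }
}

// Stand-in for the remote hop: arguments and results are copied through
// Java serialization, as they would be when sent over HTTP.
class SerializingHandler implements InvocationHandler {
    private final Object target;
    SerializingHandler(Object target) { this.target = target; }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        Object[] wireArgs = (Object[]) copy(args);
        try {
            return copy(method.invoke(target, wireArgs));
        } catch (InvocationTargetException e) {
            throw e.getCause(); // surface the provider's own exception
        }
    }

    private static Object copy(Object value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(value);
        out.close();
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        try {
            return in.readObject();
        } finally {
            in.close();
        }
    }
}

class GraphServiceFactory {
    // In a real deployment the flag would come from configuration, not from code.
    static GraphService create(boolean remote) {
        GraphService local = new GraphServiceImpl();
        if (!remote) {
            return local;
        }
        return (GraphService) Proxy.newProxyInstance(
                GraphService.class.getClassLoader(),
                new Class<?>[] { GraphService.class },
                new SerializingHandler(local));
    }
}
```

Either way the consumer calls getGraph(id) on a GraphService reference; only the wiring decides whether the call stays in the JVM.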
Service Discovery Manager (SDM)
• Simple, custom utility provides a single (logical) point for service discovery
  – Provider registers service
  – Consumer requests service location
[Diagram sequence: SDM; Content Component providing GraphService on ServerB; Algorithm Component on ServerA]
  – Provider registers at startup: GraphService@ServerB
  – 1a) Consumer requests service location (GraphService)
  – 1b) SDM responds with location (ServerB)
  – 2a) Consumer requests service directly from ServerB
  – 2b) Provider responds with data
• Additional overhead with multiple requests, so stickiness is available
  – In practice, have not needed it
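The register and locate steps above can be sketched as an in-memory registry. The SDM described here is a custom utility whose real interface is not shown in the slides, so every class and method name below is hypothetical. StickyConsumer is one possible reading of "stickiness": the consumer remembers a location instead of asking the SDM on every call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the SDM.
class ServiceDiscoveryManager {
    private final Map<String, List<String>> locationsByService = new HashMap<String, List<String>>();
    private int lookups = 0;

    // Called by a provider at startup, e.g. register("GraphService", "ServerB").
    synchronized void register(String serviceName, String location) {
        List<String> locations = locationsByService.get(serviceName);
        if (locations == null) {
            locations = new ArrayList<String>();
            locationsByService.put(serviceName, locations);
        }
        if (!locations.contains(location)) {
            locations.add(location);
        }
    }

    // Called by a consumer (step 1a); returns a location (step 1b) or null if none is registered.
    synchronized String locate(String serviceName) {
        lookups++;
        List<String> locations = locationsByService.get(serviceName);
        return (locations == null || locations.isEmpty()) ? null : locations.get(0);
    }

    synchronized int lookupCount() { return lookups; }
}

// Optional consumer-side cache that avoids a discovery request per service call.
class StickyConsumer {
    private final ServiceDiscoveryManager sdm;
    private final Map<String, String> cachedLocations = new HashMap<String, String>();

    StickyConsumer(ServiceDiscoveryManager sdm) { this.sdm = sdm; }

    String locationOf(String serviceName) {
        String location = cachedLocations.get(serviceName);
        if (location == null) {
            location = sdm.locate(serviceName);
            if (location != null) {
                cachedLocations.put(serviceName, location);
            }
        }
        return location;
    }
}
```

Steps 2a and 2b (the actual service call) go straight to the returned location and never pass through the SDM.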
Service Discovery Manager (SDM)
• Spring remoting bean replaced with SDM implementation
  – Still abstracted from consumer and provider code
[Diagram: GraphAlgorithm calls getGraph(id) on a GraphService SDMConsumer (URL from SDM); it sends an Http request w/ serialized objects to an SDMProvider, which calls GraphServiceImpl and returns the GraphObj in an Http response w/ serialized objects]
api.jar
• Consumers and providers must share API definition
• <module>-api.jar encapsulates module service interfaces and model object classes
• Built/published along with component providing the service
• Consumer picks up published jar at build time
Versioning
Change is Inevitable
• A service is defined by API, which will change over time
• Consumers and providers may change at different rates
• Goal is to minimize impact of change on consumers
Types of Service Changes
• Implementation only
  – No change to API; therefore no impact on consumer
• API addition
  – New method or member in model object
• API change or deletion
  – Service contract has changed
Version Label
<major>.<minor>.<build>
• <build> = implementation change
  – Updated every build (svn revision number)
  – No impact on consumer
• <minor> = API addition
  – All releases with same major version are backward compatible
  – Consumer of older service can ignore
• <major> = API change
  – Consumer must adapt
Usage during Service Discovery
• api.jar stamped with specific version
• Consumer includes versioned API when built
• Consumer requests service with specific version
• SDM finds provider with = major version and >= minor version
• Multiple versions can coexist
[Diagram: Content Components register GraphService 2.2@ServerB and GraphService 2.3@ServerC with the SDM. An Algorithm Component on ServerA that requests GraphService 2.2 is sent to ServerB or ServerC; one that requests GraphService 2.3 is sent to ServerC]
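The matching rule (same major version, provider minor version at least the requested one) is small enough to sketch directly. ServiceVersion and canServe are hypothetical names; the label format follows the version label described earlier, with the build number treated as optional and ignored for matching.

```java
// Hypothetical value class for a <major>.<minor>[.<build>] version label.
final class ServiceVersion {
    private final int major;
    private final int minor;
    private final int build;

    ServiceVersion(int major, int minor, int build) {
        this.major = major;
        this.minor = minor;
        this.build = build;
    }

    static ServiceVersion parse(String label) {
        String[] parts = label.trim().split("\\.");
        if (parts.length < 2 || parts.length > 3) {
            throw new IllegalArgumentException("Expected <major>.<minor>[.<build>]: " + label);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        int build = parts.length == 3 ? Integer.parseInt(parts[2]) : 0;
        return new ServiceVersion(major, minor, build);
    }

    // True if a provider at this version can serve a consumer built against 'requested':
    // the major versions must be equal and the provider's minor version must not be older.
    // The build number reflects implementation-only changes, so it is not compared.
    boolean canServe(ServiceVersion requested) {
        return major == requested.major && minor >= requested.minor;
    }

    @Override
    public String toString() {
        return major + "." + minor + "." + build;
    }
}
```

With this rule a consumer built against 2.2 can be routed to a 2.2 or a 2.3 provider, while a consumer built against 2.3 can only be routed to 2.3, matching the two diagram cases.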
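Java Serialization Id
• Every serializable Java class has a serialization id (serialVersionUID)
  – Declared by the programmer or, if undeclared, computed by the serialization runtime from the class structure
  – Intent is to identify change
• If unspecified, API model object gets a new id whenever its structure changes (e.g. a field or method is added)
  – Potentially breaks backwards compatibility of minor version change
• Solution is to specify id for all model objects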
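A minimal sketch of that solution: the model object declares serialVersionUID explicitly, so adding a member in a later minor version does not change the id that deserialization checks. MoleculeModel and SerializationRoundTrip are hypothetical names used only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical API model object with a pinned serialization id.
class MoleculeModel implements Serializable {
    // Declared explicitly so the id stays the same when fields or methods are added later.
    private static final long serialVersionUID = 1L;

    private final String name;

    MoleculeModel(String name) { this.name = name; }

    public String getName() { return name; }
}

class SerializationRoundTrip {
    // Serializes and deserializes a value in memory, as a remote call would over the wire.
    static Object roundTrip(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(value);
            out.close();
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            try {
                return in.readObject();
            } finally {
                in.close();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }
}
```

Without the explicit field, the runtime derives the id from the class structure, and a consumer compiled against the older class would fail with an InvalidClassException when it receives an object from a newer provider.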
Stability Level
• Need to separate components at different stages of lifecycle
  – Ongoing development should not impact testing and demos
• Versioning not sufficient because only reflects API changes
Option: Duplicate Environments
• Duplicate entire system for each stage of stability
• Significant redundancy
• Unmanageable when several environments
Better Option: Zones
• Establish zones: logical categories of stability
  – Development vs. testing vs. stable
  – In practice, zone can be String
• Providers broadcast service in appropriate zone
• Consumers look for service in desired zone
Full Service Discovery Protocol
• Discovery protocol involves three pieces of info
  – Service name (i.e. full interface name)
  – Version
  – Zone
• Same SDM instance can be used across all zones
[Diagram: one SDM holds GraphService 2.3 registrations in three zones: Stable@ServerB, Test@ServerC, and Dev@ServerD. The Algorithm Component on ServerA requests GraphService 2.3 in the Stable zone and is sent to ServerB]
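A lookup keyed on all three pieces of information might look like the sketch below. ZonedServiceDiscoveryManager, its method signatures, and the interface name com.example.graph.GraphService are assumptions; only the (name, version, zone) triple and the version matching rule come from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry in which one instance serves every zone.
class ZonedServiceDiscoveryManager {

    private static final class Registration {
        final String serviceName;
        final int major;
        final int minor;
        final String zone;
        final String location;

        Registration(String serviceName, int major, int minor, String zone, String location) {
            this.serviceName = serviceName;
            this.major = major;
            this.minor = minor;
            this.zone = zone;
            this.location = location;
        }
    }

    private final List<Registration> registrations = new ArrayList<Registration>();

    // Provider side: announce a service at a version in a zone.
    synchronized void register(String serviceName, int major, int minor, String zone, String location) {
        registrations.add(new Registration(serviceName, major, minor, zone, location));
    }

    // Consumer side: the first provider with the same name and zone, an equal major version,
    // and a minor version that is not older than the one requested. Returns null if none match.
    synchronized String locate(String serviceName, int major, int minor, String zone) {
        for (Registration r : registrations) {
            if (r.serviceName.equals(serviceName)
                    && r.zone.equals(zone)
                    && r.major == major
                    && r.minor >= minor) {
                return r.location;
            }
        }
        return null;
    }
}
```

Because the zone is just part of the key, adding a new stability level needs no new SDM instance, only providers that register under the new zone name.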
Testing
Goals
• Ensure component meets contract defined by API
  – Opportunity for more granular testing
  – No user interface available
  – Critical for individual release cycles
• Ensure component is backwards compatible
• Test component as a whole
  – “Unit test” where component is the testable unit
  – Not comfortable relying solely on class unit tests
  – Requires realistic data
• Automate test execution
  – Enable more frequent updates with less cost
FitNesse
• Framework for writing, organizing, and executing tests
• Acts as a consumer of the component under test
• Tests written in wiki-like syntax in browser client
• Classes called Fixtures transform data to/from API model objects
• Results are displayed and history is maintained
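To make the fixture idea concrete, here is a sketch written in the style of a FitNesse Slim decision table, where input columns map to setters and output columns (marked with "?") map to methods. The slides do not say whether Fit or Slim was used, so that choice is an assumption, as are GraphLookupFixture, StubGraphService, and the sample data. A fixture used by FitNesse itself would be a public class in its own file and would obtain the service from the component under test rather than from a stub.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class GraphObj {
    private final List<String> moleculeNames;
    GraphObj(List<String> moleculeNames) { this.moleculeNames = moleculeNames; }
    public List<String> getMoleculeNames() { return moleculeNames; }
}

interface GraphService {
    GraphObj getGraph(String id);
}

// Stub provider so the sketch runs without a deployed component.
class StubGraphService implements GraphService {
    @Override
    public GraphObj getGraph(String id) {
        if ("g1".equals(id)) {
            return new GraphObj(Arrays.asList("molecule-A", "molecule-B"));
        }
        return new GraphObj(Collections.<String>emptyList());
    }
}

// A wiki table such as the following would drive this fixture:
//   |graph lookup fixture          |
//   |graph id|molecule count?      |
//   |g1      |2                    |
class GraphLookupFixture {
    private final GraphService service;
    private String graphId;
    private GraphObj graph;

    public GraphLookupFixture() { this(new StubGraphService()); }

    GraphLookupFixture(GraphService service) { this.service = service; }

    // Input column "graph id": a plain string from the wiki table.
    public void setGraphId(String graphId) { this.graphId = graphId; }

    // Called once per row after the inputs are set: invoke the service through its API.
    public void execute() { graph = service.getGraph(graphId); }

    // Output column "molecule count?": the model object reduced to a comparable cell value.
    public int moleculeCount() { return graph.getMoleculeNames().size(); }
}
```

The fixture is the only place that knows both the table layout and the API model objects, which keeps the wiki pages readable by non-developers.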
Test Resources
• Modules that encapsulate code for services also contain the testing-related resources
  – FitNesse wiki pages
  – Fixtures
• Fixtures built at same time as component
• All test resources encapsulated in a separate package
  – Package includes API jars of the service it tests, just like other consumers of the service
• For every component, there is a suite of FitNesse tests to validate it
Continuous Integration (CI)
• Goal: rapidly iterate through the develop, build, test, and release cycle in as automated a way as possible
• Basic workflow
  – Developer commits change to component
  – Altered component is built and deployed
  – Suite of tests is executed against the component
  – Assuming tests pass, build is released
• Process is as automated as possible
• Ideally frequency is limited only by hardware constraints
CI in Practice
[Diagram sequence: zones Dev, Test, Stable, and CI; a Content Component; a FitNesse server; an Algorithm Component as consumer]
  1) Component is implemented in a Dev zone
  2) Component is built and deployed in the CI zone
  3) Component tests are deployed to the FitNesse server
  4) Tests are executed against the component
  5) If tests pass, component is promoted to Stable zone
• Consumers use services in the Stable zone, even during development and testing
• Consumers go through the same CI workflow to be promoted to Stable
• The process is initiated automatically once per day and the steps are automated from build to Stable promotion. Builds are still manually released to Production for now.
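The promotion rule in steps 2 through 5 can be sketched as follows. CiPipeline, ComponentTestSuite, and the build identifiers are hypothetical; the actual workflow was driven by build tooling rather than by a class like this. The point of the sketch is that a failing build never displaces the build that consumers see in the Stable zone.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the FitNesse suite that validates a component deployed in the CI zone.
interface ComponentTestSuite {
    boolean passes(String component, String buildId);
}

class CiPipeline {
    // Latest build of each component that has been promoted to the Stable zone.
    private final Map<String, String> stableBuildByComponent = new HashMap<String, String>();

    // Deploys the build to the CI zone (implicit here), runs its tests, and promotes it
    // to Stable only if they pass. Returns true when the build was promoted.
    boolean run(String component, String buildId, ComponentTestSuite suite) {
        boolean passed = suite.passes(component, buildId);
        if (passed) {
            stableBuildByComponent.put(component, buildId);
        }
        return passed;
    }

    // What consumers in the Stable zone currently see; null if nothing was ever promoted.
    String stableBuild(String component) {
        return stableBuildByComponent.get(component);
    }
}
```

Consumers such as the Algorithm Component would run through the same pipeline with their own test suite before being promoted.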
Backwards Compatibility
• Services must be backwards compatible
  – Support consumers of all previous minor versions
  – Ensures provider can be updated independently of consumer
• To validate, execute test bundles built with previous minor versions
  – Confirms previous contract is still met
  – Confirms API jars are still compatible
Results
Successes
• New applications utilized services immediately
  – Enabled teams to focus on new innovation
  – Previously code/capability had to be duplicated
  – Apps used services even before IPA did
• Faster, more frequent releases
  – Legacy system previously updated quarterly
  – User management service now updated within iteration (< 2 weeks)
  – Fixes released within a couple hours
• More modular code base
• Consistency in code structure, server layout
• Several servers smaller than before
  – Improved startup time
  – Now feasible to run/debug within IDE
Challenges and Do-overs
• Difficult to overhaul so many aspects of the system simultaneously
• Major version in API package names
  – Potentially results in “false positive” change to significant numbers of files
  – 20% case complicated the 80% case
• Jar dependencies
  – Attempt at simplification ended up being more difficult
Next Steps
• Recently achieved milestone of porting legacy application to use services it spawned
• Improve service APIs, possibly rewrite
  – Feasible now that components have clearly defined boundaries
• Improve test coverage
• Performance testing of individual components
  – Understand load profiles to guide scalability decisions
• Transition to standard tools
  – Maven
  – Puppet
Key Takeaways
• Identify early what is necessary and what is not; focus on what is necessary
• Take incremental steps where possible
  – Important for remaining agile
• Establish simple patterns or standards to follow in implementation, packaging, and testing
  – Helps with tools creation, understanding of system
• Be flexible
  – Try new approaches and process; adapt if first try does not work out
Questions?
Jeff Green
Staff Software Engineer
Ingenuity Systems
www.ingenuity.com
Slides available at: http://www.slideshare.net/jenlwong/svcc-services-presentation-201110
Photo Credits: Petronas Towers: Franco Pecchio (Ai@ce, flickr.com); Coffee beans: Al Gebra (photl.com); Telegraph: Bill Bradford (mrbill, flickr.com); Chimp: Afrika Expeditionary Force (flickr.com); Mug shot: Jimmy Prescott (flickr.com); Crash test: NASA (everystockphoto.com); Trophies: Roberto Arias (Ariaski, flickr.com)