How and when to migrate analytics workloads to the cloud
Strategies to help businesses avoid the risks and understand the complexities
White Paper
Unfortunately, migrating analytics workloads also carries risks of delays, service interruptions and project failures. Many organizations don’t fully grasp the complexities and can take months or even years to decide what to move, where to move it, and how to move their data and applications without negatively affecting business operations. Before embarking on a migration, IT departments need to understand the cloud’s dependencies and constraints, potential applications and the advantages of infrastructure as a service (IaaS).
Although companies have different needs, they also face common challenges as they plan, scope and execute an analytics migration to the cloud. Understanding the qualifications, migration patterns, key decision points, best practices and tools that are essential for success will help launch you on the path that works best for your company.
Should we or shouldn’t we?
How can you tell whether your organization would benefit from migrating analytics to the cloud? Here are some clues:
• Does your CFO complain about the cost of analytics?
• Do users grumble about quality, speed or missing features, such as analytics that don’t use the latest algorithms?
• Does your CTO report the need for a software upgrade or hardware replacement during the next 24 months?
If you answered “yes” to any of these questions, we strongly suggest that you consider moving at least some of your analytics to the cloud. The chart in Figure 1 offers a thumbnail of other questions DXC Technology recommends asking before taking that step.
Organizations are moving to public and hybrid clouds to increase business and IT agility, compete more effectively, and replace outdated data centers and applications. Driving this growth is increased demand from compute-intensive workloads such as artificial intelligence, analytics and the internet of things (IoT), as well as infrastructure migrations to the cloud. Analytics workloads are particularly suited for migration because most use cases require the scalability, agility, time to market and reduced costs the cloud can provide.
Figure 1. Data workload assessment
Areas shown: Platform Services, Workload Optimization, Hybrid Data Management Architecture, BI Modernization.
Questions asked:
• What workloads are driving modernization of the architecture?
• What workloads need to be migrated to the cloud? Which cloud?
• What workloads are causing performance degradation?
• What workloads need to be migrated to a lower-cost platform?
• What technologies are best suited for a given workload?
Potential for lower TCO
One strong reason for migrating to the cloud is the potential for significant cost savings. DXC found that companies can save millions of dollars by migrating even half of their analytics solutions to a cloud platform. DXC collected data from a variety of projects, then compiled migration costs. While organizations may have slightly different scenarios, the calculations show how migrating analytics workloads can reduce total cost of ownership (TCO).
Our starting point for the TCO comparison was a typical “as is” (present) condition:
• The current analytics solution stores 200TB of data, but the volume is growing.
• Users complain about delays and availability of data analysis results.
• Data scientists are not satisfied with algorithms provided by the current solution.
• The CTO will have to make a few strategic decisions, because software needs to be upgraded and hardware must be replaced.
We looked at two scenarios for the “to be” (future) condition:
Traditional
• Upgrade software to a higher version
• Replace hardware with new infrastructure
• Develop or buy new algorithms, optimize process
Modern
• Migrate a meaningful part of the analytics solution to an analytics platform in the cloud, assuming (for calculation) migration of 50 percent of data and workload
• Upgrade software and replace hardware as in traditional scenario, but only for the remaining 50 percent of the data and workload
The results of the TCO calculation presented in Figure 2 indicate that the modern scenario, measured by five-year TCO, is less expensive by about $5 million. The bulk of the savings comes in the first year, primarily from lower license and hardware costs, followed by savings of a couple of hundred thousand dollars on maintenance in each subsequent year. Both calculations include hardware and software costs (initial and annual maintenance).
The modern scenario also includes migration project costs, a reserve for risk, additional network connectivity from on-premises to the cloud, the full cost of managed services for the DXC Analytics Platform solution in the cloud and the cloud provider’s subscription fee.
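The comparison can be reproduced with simple arithmetic. The sketch below is illustrative only. It assumes the bars in Figure 2 read as $12,432k (traditional) and $7,925k (modern) in Year 1, and $2,711k versus $2,362k in each of Years 2 to 5, which is the reading that matches the description of where the savings fall. The function and variable names are hypothetical and not part of any DXC tool.

```python
# Per-year costs in thousands of dollars, as read from Figure 2.
# The mapping of bars to scenarios is an assumption (see lead-in).
TRADITIONAL = [12_432, 2_711, 2_711, 2_711, 2_711]
MODERN = [7_925, 2_362, 2_362, 2_362, 2_362]


def five_year_tco(costs_per_year):
    """Sum yearly costs (in $k) into a total cost of ownership."""
    return sum(costs_per_year)


def yearly_savings(baseline, alternative):
    """Savings per year (in $k) of the alternative over the baseline."""
    return [b - a for b, a in zip(baseline, alternative)]


traditional_total = five_year_tco(TRADITIONAL)
modern_total = five_year_tco(MODERN)
savings = yearly_savings(TRADITIONAL, MODERN)
year_one_share = savings[0] / sum(savings)

print(f"Traditional 5-year TCO: ${traditional_total:,}k")
print(f"Modern 5-year TCO:      ${modern_total:,}k")
print(f"Savings per year ($k):  {savings}")
print(f"Share of savings in Year 1: {year_one_share:.0%}")
```

With these inputs, roughly three quarters of the difference falls in Year 1, consistent with the license and hardware savings described above. Swapping in your own yearly figures is the quickest way to test whether the same pattern holds for a different starting point.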
Faster, better and more scalable analytics
Cost reduction is a strong argument for migrating analytics to the cloud, but it is only one side of the story. The right cloud-based analytics platform can also provide many features and capabilities usually not available in traditional analytics solutions or available only as additionally licensed modules. Figure 3 is a snapshot of the leading platform features offered by the DXC Analytics Platform, a solution that can be deployed on a variety of cloud types and on premises.
Figure 2. Total cost of ownership, cost per year (in $k)
Scenario: Year 1 | Year 2 | Year 3 | Year 4 | Year 5
Traditional: $12,432 | $2,711 | $2,711 | $2,711 | $2,711
Modern: $7,925 | $2,362 | $2,362 | $2,362 | $2,362
Figure 3. DXC Analytics Platform features
Data structure
• Un/semi-structured → lower cost of new source integration
• Schema on read → less investment in data transformation
• Time to market → quick first results (agile)
Data in motion
• Data streaming → data analysis and recommendation in real time
• IoT → reach new data sources for monetization
Data enrichment
• Flexible integration of many data sources and formats → new insights and correlations hidden before
Analytics
• Advanced analytics → more accurate results
• Real-time analytics → online scoring
• Deep learning → access to hidden complex correlations
Migration patterns for moving workloads to the cloud
Exactly which approach is best for migrating analytics to the cloud varies with the scope of the workloads as well as with a company’s overall IT strategy and level of cloud readiness. To organize the process of cloud adoption, DXC has codified three primary migration patterns (shown in Figure 4): augmentation, partial migration and full replacement.
Augmentation
The most popular starting point is augmentation. In this pattern, the existing analytics fabric stays unchanged and is surrounded by modern technology that enhances its functionality by:
• Adding semi-structured, unstructured or other new data sources
• Introducing an additional data transformation approach, such as stream processing
• Providing new data mining and machine learning algorithms, such as deep learning, and implementing online scoring and learning models on full datasets
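To make the idea of stream processing with online scoring and learning concrete, here is a minimal sketch in plain Python. It is a toy, not the DXC Analytics Platform: the "model" is just a running mean and standard deviation (Welford's algorithm), each event is scored as it arrives, and the readings and the threshold of 3.0 are invented for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Incrementally updated mean and variance (Welford's algorithm)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def std(self) -> float:
        # Sample standard deviation; 0.0 until at least two values are seen.
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))


def score_stream(events, threshold=3.0):
    """Yield (value, z_score, is_outlier) for each event as it arrives.

    Each event is scored against the statistics learned so far and only
    then folded into the model, so no event is scored against itself.
    """
    stats = RunningStats()
    for value in events:
        std = stats.std
        z = (value - stats.mean) / std if std > 0 else 0.0
        yield value, z, abs(z) > threshold
        stats.update(value)


# Hypothetical sensor readings; the last one is an obvious spike.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 25.0]
results = list(score_stream(readings))
for value, z, is_outlier in results:
    print(f"{value:5.1f}  z={z:8.2f}  outlier={is_outlier}")
```

The point of the design is that scoring and learning happen in the same pass over the data, with constant memory, which is what distinguishes this from a batch job that recomputes a model from a stored dataset.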
Partial migration
The second most popular starting point is partial migration. This pattern covers a few levels of migration, from simple offloading of “cold data” (archival data that users rarely or never access) to complex migration of extract, transform and load (ETL) processing and the staging layer to the DXC Analytics Platform, which then loads data that is now ready for reporting or analysis back to the existing enterprise data warehouse (EDW).
In addition to the features offered in augmentation, the partial migration pattern provides an opportunity to reduce the costs of a traditional analytics solution by reducing the environment’s size and complexity.
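The simplest level of partial migration, cold data offload, comes down to deciding which data is rarely accessed. The sketch below shows one way to make that split. Everything in it is an assumption for illustration: the partition catalog, the dates, and the one-year threshold are hypothetical, and a real assessment would draw on actual access logs.

```python
from datetime import date, timedelta


def split_hot_cold(partitions, today, cold_after_days=365):
    """Split partitions into (hot, cold) by time since last access."""
    cutoff = today - timedelta(days=cold_after_days)
    hot = [p for p in partitions if p["last_accessed"] >= cutoff]
    cold = [p for p in partitions if p["last_accessed"] < cutoff]
    return hot, cold


# Hypothetical partition catalog: name, size in TB, last access date.
catalog = [
    {"name": "sales_2015", "size_tb": 40, "last_accessed": date(2016, 3, 1)},
    {"name": "sales_2016", "size_tb": 60, "last_accessed": date(2017, 1, 15)},
    {"name": "sales_2017", "size_tb": 100, "last_accessed": date(2018, 1, 10)},
]

hot, cold = split_hot_cold(catalog, today=date(2018, 1, 31))
cold_tb = sum(p["size_tb"] for p in cold)
total_tb = sum(p["size_tb"] for p in catalog)

print("Stay in EDW:", [p["name"] for p in hot])
print("Offload:    ", [p["name"] for p in cold])
print(f"Cold share of volume: {cold_tb / total_tb:.0%}")
```

The share of volume that turns out to be cold is a useful early input to the cost case, because it bounds how much the existing environment can shrink without touching ETL or reporting.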
EDW integration
Following a major merger or corporate reorganization, companies may want to consider an interim EDW integration pattern to manage two or more data warehouses and business intelligence or analytics solutions. In an EDW integration pattern, we assume a quick and easy fix for hiding the complexity of two or more analytics systems through a modern approach using solutions such as “schema on read” and “document databases.”
As a result, the user may see a single system even if the underlying data are not integrated. This pattern needs to be treated as a temporary solution, because it adds complexity and an additional layer to the architecture. It does, however, provide an immediate win and buys some time to analyze and develop the final architecture, which might be partial migration or full replacement.
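A small sketch may help show what "schema on read" means in this pattern. Records from two warehouses are stored as they arrive, in different shapes, and a common shape is applied only when the data is read. The source names, field names and values below are hypothetical, and the in-memory list stands in for a document database.

```python
import json

# Raw records kept as they arrived from two hypothetical warehouses.
RAW_DOCUMENTS = [
    '{"source": "edw_a", "cust_id": 17, "revenue_eur": 1200.0}',
    '{"source": "edw_b", "customer": {"id": "C-17"},'
    ' "sales": {"amount": 950.5, "currency": "EUR"}}',
]


def read_edw_a(doc):
    """Project a record from warehouse A onto the common shape."""
    return {"customer_key": f"A:{doc['cust_id']}",
            "revenue_eur": doc["revenue_eur"]}


def read_edw_b(doc):
    """Project a record from warehouse B onto the common shape."""
    return {"customer_key": f"B:{doc['customer']['id']}",
            "revenue_eur": doc["sales"]["amount"]}


# The "schema" lives in the readers, not in the stored data.
READERS = {"edw_a": read_edw_a, "edw_b": read_edw_b}


def unified_view(raw_documents):
    """Parse each raw document and apply its reader at query time."""
    for raw in raw_documents:
        doc = json.loads(raw)
        yield READERS[doc["source"]](doc)


rows = list(unified_view(RAW_DOCUMENTS))
for row in rows:
    print(row)
```

Note that the customer keys stay prefixed by source: the user sees one view, but the two customer records are not matched to each other. That is the sense in which this pattern hides complexity without integrating the underlying data.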
Full replacement
The full replacement migration pattern assumes that a traditional analytics solution will be turned off. Obviously, the migration project needs to go through phases and will migrate step by step, but the final architecture needs to be optimized for full replacement. This pattern is the biggest challenge to organizations from a change-adoption perspective; however, it provides the greatest opportunity for reducing cost and complexity.
Moving analytics applications to the cloud removes the need for expensive enterprise-licensed software. In addition, full migration gives the organization the chance to focus on analytics business outcomes rather than on technical complexity and maintenance.
Figure 4. Migration patterns
Each panel shows data sources, ETL, the enterprise data warehouse (EDW), connectors, big data storage, the big data platform, and the visualization and analytics front end.
Partial migration
• Cold data offload (e.g., history)
• Heavy area offload (detailed data in big data, aggregates in EDW)
• ETL replacement
Augmentation
• New data sources (e.g., semi-structured)
• New transformation approach (e.g., stream)
• New analytics possibilities (e.g., deep learning)
Full replacement
• Leaving current ETL tool (database forklift)
• Capturing data directly from the source (ETL replacement)
EDW integration
• Two different EDWs need to be integrated (company merger or reorganization)
• Different technology
• Different data model
Figure 5. Migration pattern scenarios
Evolving migration journey
DXC supports an approach where organizations start small in migrating analytics to the cloud yet have the agility to grow rapidly in many directions. The migration patterns can fit together as part of a whole process, from augmentation to full replacement, as shown in Figure 5.
No one-size-fits-all solution
Even after making the strategic decision to migrate analytics to a cloud infrastructure, key architectural and other decisions remain about how to best manage the data and the underlying processes that use or transform that data. In supporting customers through these decisions, DXC has found that no single solution meets every need. Every organization has different strategies, internal regulations, structures, maturity and analytics IQ. Often the most effective way to determine the best approach is a workshop or short assessment project, finalized with a roadmap.
Pattern scenarios by project size
Augmentation
• Small: New batch data source without integration with EDW
• Medium: Streaming processing with lookups to EDW
• Large: High volume, velocity and variety of new data
Partial migration
• Small: Cold data: Offload without changes in ETL and reporting
• Medium: Heavy area offload: Aggregated data may return to EDW
• Large: ETL replacement: Offload of ETL and stage area
Full replacement
• Replacement of small EDW without ETL replacement
• Full replacement, including ETL
Project size
• Small: duration 3-5 months, team size 5-10 FTE
• Medium: duration 6-9 months, team size 11-20 FTE
• Large: duration 10-36 months, team size 20+ FTE
(Diagram panels: Augmentation, Partial migration, Full replacement.)
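The duration and team-size ranges above can be combined into a rough effort range. The sketch below does only that arithmetic. It assumes, purely for illustration, that the whole team is staffed for the whole duration, which will overstate effort on most real projects. The dictionary and function names are hypothetical.

```python
# Ranges taken from the project size table; None marks an open upper bound.
PROJECT_SIZES = {
    "small": {"months": (3, 5), "fte": (5, 10)},
    "medium": {"months": (6, 9), "fte": (11, 20)},
    "large": {"months": (10, 36), "fte": (20, None)},
}


def effort_range(size):
    """Return (min, max) person-months; max is None if team size is open-ended."""
    months_lo, months_hi = PROJECT_SIZES[size]["months"]
    fte_lo, fte_hi = PROJECT_SIZES[size]["fte"]
    low = months_lo * fte_lo
    high = months_hi * fte_hi if fte_hi is not None else None
    return low, high


for size in PROJECT_SIZES:
    low, high = effort_range(size)
    upper = "open-ended" if high is None else str(high)
    print(f"{size:<6} {low} to {upper} person-months")
```

The wide spread within each size class is one reason a short assessment project is usually worth doing before committing to a budget.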
Figure 6. Architectural decision points
• ETL: replace, or remain and forklift
• BI: replace, or remain and integrate
• ML: replace, or remain and integrate
Here are a few examples of architectural decision points we typically ask clients to consider:
ETL data from source and inside analytic solution:
• Retain existing solution and reroute some data to a platform in the cloud
• Replace using big data platform computing power and resource flexibility
• Find a solution between those two possibilities and define the approach
Business intelligence insights, reporting and dashboarding:
• Retain existing solution and connect to a new platform in cloud
• Replace using modern tools integrated with a platform in cloud
• Find a solution between those two possibilities and define the approach
Machine learning, data mining and other artificial intelligence methods:
• Retain existing methods and gather data from new platform in cloud
• Replace and leverage the possibilities of modern cloud solutions, such as new algorithms and “infinite” computing power, online learning and scoring
As a trusted partner, DXC specializes in developing workload-specific transformation and migration strategies aligned with our clients’ business objectives. DXC offers a broad set of migration capabilities to support a diverse array of technologies, geographies, regulatory requirements, operating models and target environments. The infrastructure-agnostic DXC Analytics Platform makes it easy and efficient to follow the most suitable migration path to the cloud for your analytics workload.
Our experience in managing enterprise hybrid environments and our knowledge of traditional and next-gen infrastructures, combined with our comprehensive services, has enabled us to successfully deliver a range of analytics workload migration projects on cloud platforms around the world.
Learn more at www.dxc.technology/analytics
About DXC Technology
DXC Technology (DXC: NYSE) is the world’s leading independent, end-to-end IT services company, helping clients harness the power of innovation to thrive on change. Created by the merger of CSC and the Enterprise Services business of Hewlett Packard Enterprise, DXC Technology serves nearly 6,000 private and public sector clients across 70 countries. The company’s technology independence, global talent and extensive partner network combine to deliver powerful next-generation IT services and solutions. DXC Technology is recognized among the best corporate citizens globally. For more information, visit www.dxc.technology.
© 2018 DXC Technology Company. All rights reserved. MD_7701a-18. January 2018
www.dxc.technology
About the authors
Slawomir Folwarski is the senior architect, DXC Analytics Platform, focused on data workload optimization and big data platform architecture. He has 17 years of experience in the telco, public sector, automotive, logistics and finance industries with expertise in data warehousing, business intelligence and Hadoop technologies. Slawomir has a double master’s degree in IT and economics and keeps himself updated by taking online training courses from Coursera, edX and Udacity. He is also certified in Architecting Microsoft Azure Solutions and in Agile PM and PMP.
Sunil Samantaray is the principal architect, DXC Analytics Platform. He has 11 years of enterprise information management systems experience. His ability to bridge the gap between business and technology has enabled him to provide exceptional solutions for various complex platform/data integration and big data analytics projects in the healthcare, insurance, digital marketing, political consulting, railway and manufacturing industries.